To be a solid implementation the reference counting would need to
happen in the core database layer, I think. It's the same as hard
links in filesystems.
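
To make the hard-link analogy concrete, here is a minimal sketch in
Python. It is not how CouchDB stores attachments; the AttachmentStore
class, its method names and the choice of SHA-256 are all assumptions
made for illustration. Each distinct attachment body is kept once,
keyed by its digest, and is only freed when the last reference goes away.

```python
import hashlib


class AttachmentStore:
    """Toy content-addressed store with reference counts (hypothetical)."""

    def __init__(self):
        self._blobs = {}  # digest -> bytes; each distinct body is stored once
        self._refs = {}   # digest -> number of documents pointing at it

    def put(self, data: bytes) -> str:
        """Store data (or reuse an identical copy) and return its digest."""
        digest = hashlib.sha256(data).hexdigest()  # SHA-256 is an assumption
        if digest not in self._blobs:
            self._blobs[digest] = data
        self._refs[digest] = self._refs.get(digest, 0) + 1
        return digest

    def get(self, digest: str) -> bytes:
        """Return the body for a digest handed out by put()."""
        return self._blobs[digest]

    def release(self, digest: str) -> None:
        """Drop one reference; free the body when no references remain."""
        remaining = self._refs[digest] - 1
        if remaining == 0:
            del self._refs[digest]
            del self._blobs[digest]
        else:
            self._refs[digest] = remaining

    def stored_bytes(self) -> int:
        """Total size of the bodies actually kept."""
        return sum(len(body) for body in self._blobs.values())
```

Two documents attaching the same bytes would get the same digest back
from put(), and the body would stay available until both have called
release(), much like a file with two hard links.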

B.

On 28 October 2011 13:31, Benoit Chesneau <[email protected]> wrote:
> On Fri, Oct 28, 2011 at 1:25 PM, Robert Newson <[email protected]> wrote:
>> The approach would be to teach couchdb how to deduplicate
>> byte-identical attachments (or chunks thereof) within a file. Sounds a
>> bit tricky but not impossible.
>>
>> B.
>
> Another way would be saving attachments in one place and checking their
> signature to detect duplication. At least per db it could work,
> couldn't it?
>
> - benoit
>
>>
>> On 28 October 2011 12:22, Gregor Martynus <[email protected]> wrote:
>>> Thanks for your responses!
>>>
>>> I'm not sure if there is any approach to minimize the disadvantage of
>>> replicated attachments eating up space and performance. If there is, please
>>> let me know.
>>>
>>> My approach would be to set up a backend server that listens for new
>>> attachments coming in, transfers these to an external store like S3 and
>>> then replaces the doc attachment in the DB with some kind of pointer to the
>>> new location of the attachments.
>>>
>>> Not sure if that makes sense, I'm open to suggestions.
>>>
>>> And once more thanks for your help!
>>>
>>> On Fri, Oct 28, 2011 at 1:14 PM, CGS <[email protected]> wrote:
>>>
>>>> Hi Gregor,
>>>>
>>>> I might be wrong because I am no expert in that field. But from the
>>>> documentation, one can deduce that all the attachments are inserted into
>>>> the document and do not point to a physical file (quite logical if you
>>>> consider the main purpose of CouchDB: a web-oriented database). As the
>>>> replication mechanism is the same for local replication and replication
>>>> over the network (just transferring the content of data from the source
>>>> file to the target file), my guess is that your attachment is copied
>>>> into all the physical files to which a replication operation was applied.
>>>>
>>>> However, depending on your project requirements, instead of an attachment
>>>> you can use a pointer, which you can use in shows (at the user's end). The
>>>> limitations of such a method are imposed by the cross-domain limitations
>>>> (if you use AJAX).
>>>>
>>>> I hope this answer will help you in designing your project, and if somebody
>>>> notices any mistake in my answer, please correct me.
>>>>
>>>> Cheers,
>>>> CGS
>>>>
>>>>
>>>>
>>>>
>>>> On 10/28/2011 12:32 PM, Gregor Martynus wrote:
>>>>
>>>>> I wonder how CouchDB stores document attachments internally. In
>>>>> particular, I'd like to know: if I replicate a document with attachments
>>>>> from one database to another, will the attachments be stored twice
>>>>> internally, or will CouchDB be smart enough to understand that the
>>>>> attachment already exists and only needs to be linked to?
>>>>>
>>>>> I hope my question is clear. In my case, each account has its own database
>>>>> with its own documents. Now documents can be shared between accounts, which
>>>>> will be done using replication. But if attachments get stored
>>>>> multiple times although they are exactly the same, I fear that it would use
>>>>> up too much space and eventually slow down replications etc.
>>>>>
>>>>>
>>>>
>>>
>>
>
