The approach would be to teach CouchDB how to deduplicate byte-identical attachments (or chunks thereof) within a file. Sounds a bit tricky but not impossible.
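
To make the deduplication idea concrete, here is a minimal sketch in Python of a content-addressed store: identical bytes hash to the same digest, so they are kept only once no matter how many documents reference them. This is not CouchDB code and says nothing about how CouchDB lays out its files; `DedupStore`, `put`, and `get` are hypothetical names, and the in-memory dicts merely stand in for on-disk storage.

```python
import hashlib


class DedupStore:
    """Toy content-addressed attachment store (hypothetical, not CouchDB)."""

    def __init__(self):
        self.blobs = {}  # digest -> attachment bytes, stored once
        self.refs = {}   # (doc_id, attachment name) -> digest

    def put(self, doc_id, name, data):
        # Byte-identical attachments produce the same SHA-256 digest.
        digest = hashlib.sha256(data).hexdigest()
        # Keep the bytes only if this digest has not been seen before.
        self.blobs.setdefault(digest, data)
        self.refs[(doc_id, name)] = digest
        return digest

    def get(self, doc_id, name):
        # Resolve the reference, then fetch the shared bytes.
        return self.blobs[self.refs[(doc_id, name)]]


store = DedupStore()
first = store.put("doc-a", "photo.png", b"same bytes")
second = store.put("doc-b", "photo.png", b"same bytes")
```

After the two `put` calls both documents point at the same digest and `store.blobs` holds a single entry. Doing the same per chunk rather than per attachment would follow the same pattern, with a list of digests per attachment.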
B.

On 28 October 2011 12:22, Gregor Martynus <[email protected]> wrote:
> Thanks for your responses!
>
> I'm not sure if there is any approach to go minimize the disadvantage of
> replicated attachments eating up space and performance, if there is, please
> let me know.
>
> My approach would be to setup a backend server that listens to new
> attachments coming in, transferring these to an external store like S3 and
> then replace the doc attachment in the DB with some kind of pointer to the
> new location of the attachments.
>
> Not sure if that makes sense, I'm open for suggestions.
>
> And once more thanks for your help!
>
> On Fri, Oct 28, 2011 at 1:14 PM, CGS <[email protected]> wrote:
>
>> Hi Gregor,
>>
>> I might be wrong because I am no expert in that field. But from the
>> documentation, one can deduce that all the attachments are inserted into the
>> document and not pointing toward a physical file (quite logic if you
>> consider the main purpose of CouchDB: web-oriented database). As replication
>> mechanism is the same for local replication and replication over the network
>> (just transferring the content of data from source file to the target file),
>> my guess is that your attachment is copied in all the physical files for
>> which a replication operation was applied.
>>
>> However, depending on your project requests, instead of attachment you can
>> use a pointer which you can use it in shows (at the user's end). The
>> limitations of such a method are imposed by the cross-domain limitations (if
>> you use AJAX).
>>
>> I hope this answer will help you in designing your project and if somebody
>> notice any mistake in my answer, please, correct me.
>>
>> Cheers,
>> CGS
>>
>> On 10/28/2011 12:32 PM, Gregor Martynus wrote:
>>
>>> I wonder how couchDB stores document attachments internally. In particular,
>>> I'd like to know if I replicate a document with attachments from one
>>> database to another, will the attachments be stored twice internally or will
>>> the couchDB be smart enough to understand that the attachment does already
>>> exist and only needs to link to it?
>>>
>>> I hope my question is clear. In my case, each account has an own database
>>> with its own documents. Now documents can be shared between accounts which
>>> will be done using replication. But when attachments would get stored
>>> multiple times although they are exactly the same I fear that it would use
>>> up too much space and eventually slow down replications etc?
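
The external-store approach Gregor describes in the thread (move each attachment out of the document and leave a pointer behind) could look roughly like the Python sketch below. It assumes the document carries its attachments inline, as base64 `data` with a `content_type` under `_attachments`. Everything else is made up for illustration: `externalize_attachments`, the `attachment_pointers` field, and `external_key` are hypothetical names, and a plain dict stands in for S3 or any other external store.

```python
import base64
import hashlib


def externalize_attachments(doc, external_store):
    """Return a copy of doc with inline attachments replaced by pointers.

    external_store is a dict standing in for S3 or a similar service.
    """
    # Copy every field except the inline attachments.
    new_doc = {k: v for k, v in doc.items() if k != "_attachments"}
    pointers = {}
    for name, att in doc.get("_attachments", {}).items():
        data = base64.b64decode(att["data"])
        # Keying by content hash means identical files are uploaded once.
        key = hashlib.sha256(data).hexdigest()
        external_store[key] = data  # stand-in for the real upload
        pointers[name] = {
            "content_type": att["content_type"],
            "length": len(data),
            "external_key": key,  # hypothetical pointer field
        }
    if pointers:
        new_doc["attachment_pointers"] = pointers
    return new_doc


s3_stand_in = {}
original = {
    "_id": "doc-a",
    "title": "Shared report",
    "_attachments": {
        "report.txt": {
            "content_type": "text/plain",
            "data": base64.b64encode(b"report body").decode("ascii"),
        }
    },
}
slim = externalize_attachments(original, s3_stand_in)
```

Replicating `slim` between account databases then copies only the small pointer, while the bytes live once in the external store. The trade-off is that clients have to fetch the attachment from a second location.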
