On 25 March 2013 22:44, Jens Alfke <[email protected]> wrote:
>
> On Mar 25, 2013, at 1:13 PM, svilen <[email protected]> wrote:
>
> As i don't really need more than 1 version back, i'm playing with idea
> of using couchdb for that. Either putting the files as attachments, or
> if not possible, using it as filesystem-miming synchronised metadata,
> with appropriate listeners reacting on changes (like rename, mv, etc).

+1 to everything Jens & Nils said, with 2 more points.

If you store only metadata in couch, and use a hash (e.g. md5) of the
data rather than the actual filename to point to the files stored on
disk, the result is quite attractive. Renames and moves are all internal
to couchdb, as the data hasn't changed. It will also deduplicate itself
if you have multiple copies (e.g. revisions of docs).

The downside of putting stuff outside couch is that you need to manage
the things you otherwise get for free:

- easy replication model
- deletion handling (how many docs have this file, should I delete this
  file now because the document attachment was deleted, etc)
- streaming of data from within couchdb
- inbuilt compression
- keeping replication partners in sync ("I don't need this doc anymore
  but the others don't yet have the updated copy" type problems, esp. in
  a mesh replication topology)

The other nasty thing about attachments in couch is that during
replication, if there is a failure, we can't restart part-way through.
And as they're stored directly on disk, that waste is duplicated both on
the network and in storage inside the DB file. This may or may not be a
problem for your use case.

A+
Dave
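
P.S. A minimal sketch of the hash-addressed layout described above, in
case it helps. It is only an illustration under some assumptions: plain
Python dicts stand in for the couch metadata docs, and the helper names
(store_blob, make_doc, rename, delete) and the "md5" field name are
hypothetical, not anything couchdb provides.

```python
import hashlib
import os


def store_blob(root, data):
    """Write data under its md5 hex digest and return the digest."""
    digest = hashlib.md5(data).hexdigest()
    path = os.path.join(root, digest)
    if not os.path.exists(path):  # identical content is stored only once
        with open(path, "wb") as f:
            f.write(data)
    return digest


def make_doc(name, digest):
    # Hypothetical metadata doc: the name lives here, not on disk.
    return {"_id": name, "md5": digest}


def rename(docs, old, new):
    """A rename touches only the metadata; the blob on disk is untouched."""
    doc = docs.pop(old)
    doc["_id"] = new
    docs[new] = doc


def delete(docs, root, name):
    """Drop the doc, and remove the blob only if no other doc still uses it."""
    digest = docs.pop(name)["md5"]
    if not any(d["md5"] == digest for d in docs.values()):
        os.remove(os.path.join(root, digest))
```

The delete() helper is the "how many docs have this file" bookkeeping
from the list above: it is exactly the part you have to write yourself
once the data lives outside couch.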
