On Fri, Apr 17, 2009 at 02:01:51AM +0200, André Warnier wrote:
> I would be interested to understand if CouchDB would provide a reliable
> and efficient replacement for our self-developed and self-maintained
> storage structure.

Probably, but there are a few caveats to beware of.

1. The couchdb database is a single append-only file. Your filesystem
   needs to support huge files.

2. Once you get up to terabytes of documents, it may become impractical
   and/or too slow to compact the database, which involves reading the
   entire database from start to end and writing a completely new copy.

In your case it sounds like you normally just append documents and leave
them there forever. However, suppose you have a customer who leaves, and
is no longer paying you for the half terabyte of storage they are using?
Or another who, for legal reasons, requires a document to be purged?
(Deleting a document in couchdb just marks it as deleted; it can still be
retrieved until a compaction has been done.)

I would suggest that the easiest way round these problems - and also a
good way to improve security - is to have a separate couchdb database for
each customer. This still only requires running a single instance of the
couchdb server.

> I also seem to have understood that if one of these repositories
> suddenly became unavailable because the big one just hit, a document
> request would automatically be satisfied by the next available one in
> line. Yes ?

Not as yet. It's up to you to proxy to the appropriate database in your
application.

> Would there be some way in CouchDB to store one such document, in some
> logical group containing the original version (say OpenOffice text),
> along with its PDF/A version (which we generate when the document is
> originally stored) and with an image of the first page (ditto), in such
> a way that by using the "main key" plus some additional parameter, I can
> retrieve whichever version I need now ?

Document plus multiple attachments. The "document" would just be the
searchable metadata, whilst the original and derived versions would each
be attachments.

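The document-plus-attachments layout maps onto plain HTTP requests. Here
is a rough Python sketch of the requests involved, not tested against a
live server. It assumes couchdb is listening on its default port 5984;
the helper names, and any database, document and attachment names you
pass in, are made up for illustration. The helpers only build the
requests; nothing is sent until you call urllib.request.urlopen().

```python
import json
import urllib.request
from urllib.parse import quote

BASE = "http://localhost:5984"  # assumption: default local couchdb address


def _path(*parts):
    """Join URL path segments, percent-encoding each one."""
    return BASE + "/" + "/".join(quote(p, safe="") for p in parts)


def put_document(db, doc_id, metadata):
    """PUT /db/doc_id : store the searchable metadata as the document."""
    body = json.dumps(metadata).encode("utf-8")
    return urllib.request.Request(
        _path(db, doc_id), data=body, method="PUT",
        headers={"Content-Type": "application/json"})


def put_attachment(db, doc_id, name, data, content_type, rev):
    """PUT /db/doc_id/name?rev=... : add one version as an attachment.

    rev is the current revision of the document, as returned in the
    "rev" field of the JSON response to the previous PUT.
    """
    url = _path(db, doc_id, name) + "?rev=" + quote(rev, safe="")
    return urllib.request.Request(
        url, data=data, method="PUT",
        headers={"Content-Type": content_type})


def get_attachment(db, doc_id, name):
    """GET /db/doc_id/name : the 'main key' plus the version wanted."""
    return urllib.request.Request(_path(db, doc_id, name), method="GET")
```

To use it, pass each request to urllib.request.urlopen() and read the JSON
reply. Each successful PUT returns a new "rev", which you need for the
next attachment on the same document. With one database per customer, the
db argument is simply that customer's database name.
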
> Would I need to become proficient in Erlang before I can store a new
> document or retrieve a stored one

Nope: just HTTP PUT and GET.

Regards,

Brian.
