Re: Not-even-yet-newbie question

Brian Candler Sat, 18 Apr 2009 10:22:53 -0700

On Fri, Apr 17, 2009 at 02:01:51AM +0200, André Warnier wrote:
> I would be interested to understand if CouchDB would provide a reliable  
> and efficient replacement for our self-developed and self-maintained  
> storage structure.


Probably, but there are a few caveats to be aware of.

1. The couchdb database is a single append-only file. Your filesystem needs
to support huge files.

2. Once you get up to terabytes of documents, it may become impractical
and/or too slow to compact the database, which involves reading the entire
database from start to end and writing a completely new copy.

In your case it sounds like you normally just append documents and leave
them there forever. However, suppose you have a customer who leaves, and is
no longer paying you for the half terabyte of storage they are using? Or
another who, for legal reasons, requires a document to be purged? (Deleting
a document in couchdb just marks it as deleted; it can still be retrieved
until a compaction has been done.)
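
A rough sketch of that delete-then-compact sequence, in Python with only the
standard library. The database URL, document id and revision below are made
up for illustration, and the functions only build the requests; nothing is
sent until you pass them to urllib.request.urlopen().

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def delete_doc_request(db_url: str, doc_id: str, rev: str) -> Request:
    """Build DELETE /db/docid?rev=... (this only marks the document deleted)."""
    query = urlencode({"rev": rev})
    return Request(f"{db_url}/{quote(doc_id, safe='')}?{query}", method="DELETE")


def compact_request(db_url: str) -> Request:
    """Build POST /db/_compact, which rewrites the database file without old data."""
    return Request(
        f"{db_url}/_compact",
        data=b"",
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical database and document, for illustration only.
DB_URL = "http://localhost:5984/archive"
delete_req = delete_doc_request(DB_URL, "doc1", "1-abc")
compact_req = compact_request(DB_URL)
```

The compaction step is the expensive one: on a multi-terabyte database it is
the part that may take too long to be practical.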

I would suggest that the easiest way round these problems - and also a good
way to improve security - is to have a separate couchdb database for each
customer. This still only requires running a single instance of the couchdb
server.
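
As a sketch of what one database per customer could look like from the
client side (Python, standard library only): the server address, the "cust_"
prefix and the naming scheme are my assumptions, not anything CouchDB
prescribes. Creating a database is a PUT to its URL and dropping it is a
DELETE, so a departing customer's storage can go away without compacting
anyone else's data.

```python
import re
from urllib.request import Request

COUCH_URL = "http://localhost:5984"  # assumed server address


def customer_db_name(customer_id: str) -> str:
    """Map a customer id to a database name (hypothetical naming scheme).

    CouchDB database names must start with a lowercase letter, so a fixed
    prefix is added and anything outside [a-z0-9_] is replaced by "_".
    Note that distinct ids can collide after this replacement.
    """
    slug = re.sub(r"[^a-z0-9_]", "_", customer_id.lower())
    return "cust_" + slug


def create_db_request(customer_id: str) -> Request:
    """Build PUT /dbname, which creates the customer's database."""
    return Request(f"{COUCH_URL}/{customer_db_name(customer_id)}", method="PUT")


def drop_db_request(customer_id: str) -> Request:
    """Build DELETE /dbname, which removes the whole database."""
    return Request(f"{COUCH_URL}/{customer_db_name(customer_id)}", method="DELETE")
```

Pass either request to urllib.request.urlopen() to actually send it.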

> I also seem to have understood that if one of these repositories  
> suddenly became unavailable because the big one just hit, a document  
> request would automatically be satisfied by the next available one in  
> line. Yes ?

Not as yet. It's up to you to proxy to the appropriate database in your
application.
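
A minimal sketch of doing that in the application, assuming you keep an
ordered list of URLs for the same document on different servers. The
function names are hypothetical, and this is plain client-side retry logic,
not a CouchDB feature.

```python
from typing import Callable, Optional, Sequence
from urllib.request import urlopen


def http_get(url: str, timeout: float = 5.0) -> bytes:
    """Fetch a URL and return the response body."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_failover(
    urls: Sequence[str], fetch: Callable[[str], bytes] = http_get
) -> bytes:
    """Try each URL in order and return the first successful response."""
    last_error: Optional[Exception] = None
    for url in urls:
        try:
            return fetch(url)
        except OSError as exc:
            # urllib's URLError and HTTPError are OSError subclasses, so
            # network failures and HTTP error statuses both move us on.
            last_error = exc
    raise ConnectionError("no replica answered") from last_error
```

Because HTTP error statuses also trigger the fallback, a document that is
missing on one server will be looked up on the next one, which may or may
not be what you want.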

> Would there be some way in CouchDB to store one such document, in some  
> logical group containing the original version (say OpenOffice text),  
> along with its PDF/A version (which we generate when the document is  
> originally stored) and with an image of the first page (ditto), in such  
> a way that by using the "main key" plus some additional parameter, I can  
> retrieve whichever version I need now ?

Yes: one document plus multiple attachments. The "document" would just be
the searchable metadata, whilst the original and derived versions would each
be attachments.
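
To illustrate the "main key plus some additional parameter" idea: an
attachment is addressed as /database/docid/attachmentname, so the extra
parameter can simply select the attachment name. The version labels, file
names and metadata fields below are invented for the example.

```python
from urllib.parse import quote

# Hypothetical mapping from a version label to an attachment name.
VERSIONS = {
    "original": "original.odt",
    "pdfa": "archive.pdf",
    "preview": "page1.png",
}

# Example metadata document; the field names are illustrative only.
metadata = {
    "_id": "contract-0042",
    "title": "Supply contract",
    "customer": "acme",
    "stored": "2009-04-18",
}


def version_url(db_url: str, doc_id: str, version: str) -> str:
    """Return the URL of one stored version of a document."""
    name = VERSIONS[version]
    return f"{db_url}/{quote(doc_id, safe='')}/{quote(name, safe='')}"


pdf_url = version_url("http://localhost:5984/cust_acme", metadata["_id"], "pdfa")
```

A GET on that URL returns the attachment itself, while a GET on the document
URL returns the metadata.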

> Would I need to become proficient in Erlang before I can store a new  
> document or retrieve a stored one

Nope: just HTTP PUT and GET.
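
For instance, a sketch in Python using only the standard library. The
database URL and document contents are made up, and the functions only
build the requests; pass them to urllib.request.urlopen() to send them.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def put_document(db_url: str, doc_id: str, doc: dict) -> Request:
    """Build PUT /db/docid with the document as a JSON body."""
    body = json.dumps(doc).encode("utf-8")
    return Request(
        f"{db_url}/{quote(doc_id, safe='')}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def get_document(db_url: str, doc_id: str) -> Request:
    """Build GET /db/docid, which returns the stored JSON document."""
    return Request(f"{db_url}/{quote(doc_id, safe='')}", method="GET")


# Hypothetical database and document, for illustration only.
DB_URL = "http://localhost:5984/cust_acme"
put_req = put_document(DB_URL, "contract-0042", {"title": "Supply contract"})
get_req = get_document(DB_URL, "contract-0042")
```

Any language with an HTTP client and a JSON library can do the same.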

Regards,

Brian.
