Bryan,

I have a similar situation. As you are probably seeing, when you do a read
it takes a while for the map/reduce view to catch up (even a basic 2 x NVP
emit with no reduce; document sizes range from 100k to 7.5MB and we have
90K of them). I've got round this by using the "?stale=update_after"
parameter (rough example after the summary below), but that is not my real
issue. I want to discover useful information from our logs, i.e. I don't
yet know what I need from the logs, and the feedback loop for doing this
discovery seems to be very long: review a document, write the JavaScript,
build the view (on a very, very small and unrepresentative data set, e.g.
10 documents), and repeat. This takes a long time even before I apply it
to the real data set, which then takes 6 hours to build the view. So in
summary:

1. It takes way too long to rebuild and update views
2. The feedback loop for information discovery is long
3. High write / low read on largish documents means high query latency
(map/reduce catch-up)
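
For reference, the sort of view and query I mean looks roughly like this
(the database, design doc, view and field names are just placeholders, not
our real schema):

    // Basic map-only view: two name/value emits per log document, no reduce.
    function (doc) {
      if (doc.timestamp && doc.level) {
        emit(doc.timestamp, doc.level);
        emit(doc.level, doc.timestamp);
      }
    }

    // Query without waiting for the index to rebuild; CouchDB returns the
    // (possibly stale) view and kicks off the update afterwards:
    //   GET /logs/_design/reports/_view/by_time?stale=update_after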

As you say, there's a lot to like about CouchDB (schemaless, replication,
JavaScript queries, incremental map/reduce, RESTful API, it's just plain
simple), plus all my data is in it now! So what I'm currently looking to
do is use Couch as a message/document store but add the following on top
of it:

1. couchdb-lucene, to speed up information discovery (rough sketch below)
2. LucidDB, to pull the coarse-grained view data created in the discovery
phase into LucidDB for ad-hoc querying
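
For (1), the couchdb-lucene side is just a fulltext index function in a
design doc, something along these lines (the design doc name, the index
name and the doc.message field are my own placeholders, and the exact
query URL depends on how you've deployed couchdb-lucene):

    {
      "_id": "_design/search",
      "fulltext": {
        "by_message": {
          "index": "function(doc) { var ret = new Document(); if (doc.message) { ret.add(doc.message); } return ret; }"
        }
      }
    }

and then querying with something like

    GET .../_fti/_design/search/by_message?q=error

which should give keyword search over the logs without having to build a
new view for every question.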

No idea if this helps you or not, but you're not alone :-)

Mike 

-----Original Message-----
From: bryan rasmussen [mailto:[email protected]] 
Sent: 10 May 2012 08:46
To: [email protected]
Subject: when couchdb is not right for my use case

Hi,

I really like working with CouchDB. One of the benefits it gives at the
beginning of a project is the ability to play with data and determine the
data structure that one actually needs (since I'm an XML guy this is
pretty important to me; I also think CouchDB does this much better than
XQuery-based DBs, which are too strongly typed).

So anyway, because I like CouchDB I have embarked on an Apache/Solr
log-analysis project for which CouchDB does not seem to be well suited
(which I knew beforehand, but I was using CouchDB as a quick proof of
concept for some of the things I wanted to do).

The drawbacks are:

- Logs pile up quickly, so the project is write-intensive. Since the data
  is being used internally for reports, it is not likely to be
  read-intensive.
- It should not need any revision management.
- A lot of the benefits of DB replication will not be useful.
- Lots of views of the data need to be provided.

So has anyone ever had a similar situation, and what did you move to as
your DB? Or how did you structure your CouchDB solution to make it more
suitable?

Thanks,
Bryan Rasmussen
