Bryan, I have a similar situation. As you are probably seeing, when you do a read it takes a while for the map/reduce view to catch up (even with a basic 2 x NVP emit and no reduce; document sizes range from 100 KB to 7.5 MB and we have 90K of them). I've got round this by using the "?stale=update_after" parameter, but this is not my real issue.

I want to discover useful information from our logs, i.e. I don't know what I need from the logs yet, and the feedback loop for doing this discovery seems to be very long: review a document, write JavaScript, build the view (on a very, very small and unrepresentative data set, i.e. 10 documents) and repeat. This takes a long time even before I apply it to the real data set, which then takes 6 hours to build the view.
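In case it helps to see it concretely, below is roughly the shape of the view and the stale query I mean. It's only a minimal sketch in Python: the database name ("logs"), design doc ("discovery"), view ("by_host"), the doc.host/doc.status fields and the use of the requests library are all placeholders of mine, not anything from this thread.

# Minimal sketch: a simple two-name/value-pair map (no reduce), plus a read
# that uses stale=update_after so queries don't block on the view catch-up.
# All names here (logs, discovery, by_host, doc.host, doc.status) are
# placeholders; CouchDB is assumed to be on localhost:5984.
import requests

COUCH = "http://localhost:5984"
DB = "logs"

design_doc = {
    "_id": "_design/discovery",
    "language": "javascript",
    "views": {
        "by_host": {
            # Emit two name/value pairs per document, no reduce.
            "map": """
                function (doc) {
                  if (doc.host)   emit(["host", doc.host], null);
                  if (doc.status) emit(["status", doc.status], null);
                }
            """
        }
    },
}
# First-time creation only; updating an existing design doc needs its current _rev.
requests.put(f"{COUCH}/{DB}/{design_doc['_id']}", json=design_doc)

# stale=update_after answers from whatever is already indexed and kicks off
# the index update in the background, instead of making the reader wait.
resp = requests.get(
    f"{COUCH}/{DB}/_design/discovery/_view/by_host",
    params={"stale": "update_after", "limit": 10},
)
resp.raise_for_status()
for row in resp.json()["rows"]:
    print(row["key"], row["value"])

The obvious trade-off is that results can lag the latest writes until the background update finishes, which is tolerable for reporting-style reads but worth keeping in mind.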
So in summary:

1. It takes way too long to rebuild and update views.
2. The feedback loop is long for information discovery.
3. High write / low read on largish documents means high query latency (map/reduce catch-up).

As you say, there's a lot to like about CouchDB (schema-less, replication, JavaScript queries, incremental map/reduce, RESTful API, it's just plain simple), plus all my data is in it now! So what I'm currently looking to do is use Couch as a message/document store but add the following on top of it:

1. couchdb-lucene, to speed up information discovery (a rough sketch of what I mean is at the bottom of this mail, below Bryan's original message);
2. LucidDB, to pull the coarse-grained view data created in the discovery phase into LucidDB and enable ad-hoc querying.

No idea if this helps you or not, but you're not alone :-)

Mike

-----Original Message-----
From: bryan rasmussen [mailto:[email protected]]
Sent: 10 May 2012 08:46
To: [email protected]
Subject: when couchdb is not right for my use case

Hi,

I really like working with CouchDB. One of the benefits it gives at the beginning of a project is the ability to play with data, to determine the right data structure that one actually needs (since I'm an XML guy this is pretty important to me; I also think CouchDB does this much better than XQuery-based DBs, which are too strongly typed).

So anyway, because I like CouchDB I have embarked on an Apache/Solr log analysis project for which CouchDB does not seem to be well suited (which I knew beforehand, but I was using CouchDB as a quick proof of concept for some of the things I wanted to do). The drawbacks are: logs pile up quickly, so the project is write intensive. Since the data is being used internally for reports, it is not likely to be read intensive. It should not need any revision management. A lot of the benefits of DB replication will not be useful. Lots of views of the data need to be provided.

So has anyone ever had a similar situation, and what did you move to as your DB? Or how did you structure your CouchDB solution to make it more suitable?

Thanks,
Bryan Rasmussen
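PS: here is the rough sketch of the couchdb-lucene piece I mentioned above. The fulltext section and the index function follow the couchdb-lucene README; the database and field names, the "_design/search" doc, the "everything" index and the standalone service on port 5985 are my assumptions, so check them against your own setup and couchdb-lucene version.

# Rough sketch of a couchdb-lucene fulltext index plus a query against the
# standalone couchdb-lucene server. Names and ports are assumptions.
import requests

COUCH = "http://localhost:5984"          # CouchDB itself
LUCENE = "http://localhost:5985/local"   # standalone couchdb-lucene (assumed setup)
DB = "logs"

# A fulltext index lives in a design doc under "fulltext"; the index function
# builds a Lucene Document from each CouchDB document. Field names are placeholders.
design_doc = {
    "_id": "_design/search",
    "fulltext": {
        "everything": {
            "index": """
                function (doc) {
                  var ret = new Document();
                  ret.add(doc.message);                    // default field
                  ret.add(doc.host, {"field": "host"});    // named field
                  return ret;
                }
            """
        }
    },
}
requests.put(f"{COUCH}/{DB}/{design_doc['_id']}", json=design_doc)

# Query with normal Lucene syntax; the URL layout follows the couchdb-lucene
# README for the standalone server, so adjust it to your installation.
resp = requests.get(
    f"{LUCENE}/{DB}/_design/search/everything",
    params={"q": "host:web01 AND error", "limit": 10},
)
resp.raise_for_status()
for row in resp.json().get("rows", []):
    print(row["id"], row.get("score"))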
