Thanks Chris,
This is extremely helpful.
-Talib
On Jul 23, 2010, at 6:42 PM, J Chris Anderson wrote:
On Jul 23, 2010, at 5:01 PM, Talib Sharif wrote:
Hi All,
As I am playing more and more with CouchDB (it is relaxing and
fun), I am trying to understand the limits and the expectations
in a large production environment.
Right now I have about 100K documents and about 10 different
views; one of the views does about 100 emits per document.
Building the view indexes is taking about 7-8 hours.
This is about right for 10 million rows. That works out to about
350 rows per second (maybe more, depending on what your other
views are doing), which is a bit slower than I'm used to seeing,
but it depends on the size of your emitted keys and values. If
you can shrink the keys or the values you should see some speedup
(marginal, not an order of magnitude).
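For example, a map function along these lines keeps each emitted
row small (the tags field is made up for illustration; emitting
null as the value, and using include_docs=true at query time if
you need the documents, is a common way to shrink the index):

  // A sketch of a compact map function: one emitted row per tag.
  // Emitting null instead of a copy of the document keeps the
  // index rows as small as possible.
  function(doc) {
    if (doc.tags) {
      for (var i = 0; i < doc.tags.length; i++) {
        emit(doc.tags[i], null);
      }
    }
  }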
Because view generation is incremental, in production the 7-8
hours isn't the big issue; it's whether view generation can keep
up with the insert rate. So if you are generating fewer than a
few documents per second (x 100 emitted rows) you should be able
to keep the indexes current. If the indexes start to fall behind,
I'd suggest either upgrading hardware or moving to a clustered
solution like CouchDB-Lounge.
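One pattern that can help here (a sketch, not something from the
original mail; the database and design doc names are made up):
query with ?stale=ok so reads return immediately from the
last-built index, and refresh the index outside the request path:

  curl 'http://127.0.0.1:5984/mydb/_design/app/_view/by_tag?stale=ok&limit=1'

A cron job that hits the same view without stale=ok every minute
or so will then do the actual index catch-up in the background.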
For purposes of prototyping, you will probably be happier working
on a subset of the documents.
I would like to know how other people are using it. Is 7-8 (or
even 24) hours of view generation, with checkpointing, typical?
How many documents do people have? What is other people's
experience generating a view on, let's say, 1 million documents?
I have switched to the native _sum function for reduce. What else
is taking so long? Is it the map function written in JavaScript?
Is it the index that's getting too big?
Using an Erlang view function could potentially speed things up
(but my guess is you are more likely disk-I/O bound than CPU
bound, so maybe it won't make much difference).
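For reference, the built-in reduce mentioned above looks like
this in a design document (a minimal sketch; the design doc and
view names are made up):

  {
    "_id": "_design/stats",
    "views": {
      "by_tag": {
        "map": "function(doc) { if (doc.tags) { for (var i = 0; i < doc.tags.length; i++) { emit(doc.tags[i], 1); } } }",
        "reduce": "_sum"
      }
    }
  }

The built-in _sum, _count, and _stats reduces run inside the
Erlang server and skip the round-trip to the JavaScript query
server, which is why they are much faster than a hand-written
JavaScript reduce.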
Is the view generation linear, or does it get worse when you have
more documents?
The btree should get slower at roughly O(log n), where n is the
number of rows. The base of the log is pretty big, too. Once you
get up into billion-row territory you'll probably want to look
more closely at CouchDB-Lounge or Cloudant's clustering.
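To make the big-base point concrete (the fanout here is an
assumed order of magnitude for illustration, not CouchDB's actual
btree parameter):

  // Rough depth estimate for a btree: depth ~= log_fanout(rows).
  var fanout = 100;                               // assumed
  var rows = 1e9;
  var depth = Math.log(rows) / Math.log(fanout);  // ~4.5 levels

So even at a billion rows, a lookup touches only a handful of
btree nodes.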
I would greatly appreciate help in answering or discussing these
questions.
Thanks in advance,
Talib