On 10/30/2010 03:52 PM, Anand Chitipothu wrote:
I'm trying to set up a CouchDB database with 14M documents. The view
generation is taking too long. It is running at the rate of 22
docs/sec right now. At this rate it will take 7 days to build the view,
which is too slow, and I expect the speed to go down further as the
view file size increases.


Hi,

What is the size of the design document's view files on disk?

I noticed that large views use quite large files ;).

I also noticed that the view group indexers take a long time to finish the last 30% of the task: at least twice as long as the first 70%.

In my case I have a 'small' database containing 400K docs. I also have a design doc that indexes 80% of the docs with 8 views. The map functions emit only a single property per doc and a null value, so the index should be compact.
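
For illustration, a map function of the kind described above could look like the sketch below. The property name "name" and the sample docs are placeholders, not taken from the real design doc, and the small emit() stand-in exists only so the function can be tried outside CouchDB (the view server provides the real emit()).

```javascript
// Stand-in for the emit() that CouchDB's view server provides.
// It just collects the emitted rows so the sketch can run locally.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Sketch of one view's map function: a single property as the key
// and null as the value. "name" is a hypothetical property.
function map(doc) {
  if (doc.name !== undefined) {
    emit(doc.name, null);
  }
}

// Two made-up docs; only the first has the property, so only it is indexed.
[{ _id: "a", name: "foo" }, { _id: "b" }].forEach(map);
console.log(JSON.stringify(rows));
```

With a null value, each index row carries only the key and the doc id, which is why the size of the resulting file is surprising.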

The overall size of this design doc's .view file on disk is 17G ;).

I don't know how CouchDB handles updates to such large files, but maybe the problem is related to updating large files...

Concerning performance, I use the standard JavaScript interpreter and get a rate of ~60 changes/sec at the beginning of the process.

Then it drops to 15 c/s after 70%, and to about 6 c/s after 85%.

The first 70% took 52 minutes and the whole process ran for 3h21m on a small standalone dedicated server.
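
As a quick check of those figures, here is a small sketch of the arithmetic. It uses only the two durations given above; the variable names are just for illustration.

```javascript
// Durations from the run described above, in minutes.
const firstPartMinutes = 52;        // first 70% of the indexing
const totalMinutes = 3 * 60 + 21;   // whole run: 3h21m
const lastPartMinutes = totalMinutes - firstPartMinutes;

// Minutes spent per percentage point of progress in each phase.
const firstPace = firstPartMinutes / 70;
const lastPace = lastPartMinutes / 30;

console.log(lastPartMinutes);                                  // 149
console.log((lastPartMinutes / firstPartMinutes).toFixed(1));  // 2.9
console.log((lastPace / firstPace).toFixed(1));                // 6.7
```

So the last 30% took 149 minutes, close to three times as long as the first 70%, and each percentage point of progress was almost seven times slower at the end than at the start.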

So I get the feeling that the issue is not with the view "calculation" algorithm, but probably something related to disk I/O.

I have no Erlang knowledge, and I might be quite wrong about this, but if you guys know this part of the CouchDB code a little, maybe there is something that could be checked that would improve the overall design doc refresh performance?

Regards,

cdrx


