On 10/30/2010 03:52 PM, Anand Chitipothu wrote:
I'm trying to set up a couchdb database with 14M documents. The view
generation is taking too long. It is running at the rate of 22
docs/sec right now. At this rate it will take 7 days to build the view,
which is too slow and I expect the speed to go down further as the
view file size increases.
Hi,
What is the size of the design document's view files on disk?
I noticed that large views use quite large files ;).
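If you want to check this quickly, here is a rough Python sketch that
adds up the size of every .view file under a directory. The helper name
view_files_size is made up for illustration, and the directory to pass in
is an assumption: it should be wherever your install keeps its view
indexes (the view_index_dir setting), which may differ from the path in
the usage comment.

```python
import os

def view_files_size(root):
    """Return the total size in bytes of all *.view files under root.

    Hypothetical helper: 'root' is assumed to be the CouchDB view index
    directory, which depends on your local configuration.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".view"):
                total += os.path.getsize(os.path.join(dirpath, name))
    return total

# Usage (the path is only an example, adjust it to your setup):
# print("%.2f GiB" % (view_files_size("/var/lib/couchdb") / 2**30))
```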
I also noticed that the view group indexers take a long time to get
through the last 30% of the task: at least twice as long as the
first 70%.
In my case I have a 'small' database containing 400K docs. I also have
a design doc that indexes 80% of the docs with 8 views. The map functions
only emit a single property per doc and a null value, so the indexes
should be compact.
The overall size of this design doc's .view file on disk is 17G ;).
I don't know how CouchDB handles updates to such large files, but
maybe the slowdown has something to do with updating large files...
Concerning performance, I use the standard JavaScript interpreter and get
a rate of ~60 changes/sec at the beginning of the process.
Then it drops to 15 c/s after 70%,
and to about 6 c/s after 85%.
The first 70% took 52 minutes and the whole process ran for 3h21m on a
small standalone dedicated server.
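To put those two timings side by side, here is a small Python sketch of
the arithmetic. It uses only the figures quoted in this mail; the variable
names are just for illustration.

```python
# Figures reported in this mail: first 70% took 52 min, whole run 3h21m.
first_phase_min = 52
total_min = 3 * 60 + 21                         # 201 minutes in total
last_phase_min = total_min - first_phase_min    # 149 minutes for the last 30%

# How much longer the last 30% took compared with the first 70%.
time_ratio = last_phase_min / first_phase_min   # about 2.9

# Minutes spent per percentage point of progress in each phase.
first_rate = first_phase_min / 70               # about 0.74 min per percent
last_rate = last_phase_min / 30                 # about 4.97 min per percent

# Per unit of progress, the last phase is roughly 6.7 times slower.
slowdown = last_rate / first_rate

print("last 30%%: %d min, %.1fx the first 70%%" % (last_phase_min, time_ratio))
print("slowdown per percent of progress: %.1fx" % slowdown)
```

So the last 30% took nearly three times as long as the first 70%, which
fits the "at least twice" estimate above.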
So I get the feeling that the issue is not the view "calculation"
algorithm itself, but more likely something related to disk I/O.
I have no Erlang knowledge, and my feeling may well be wrong,
but if you guys know this part of the couch code a little, maybe
there is something that could be checked and would improve the overall
design doc refresh performance?
Regards,
cdrx