On 12 October 2011 14:22, Arnaud Bailly <[email protected]> wrote:
> Hello,
> We have started experimenting with CouchDB as our backend, being especially
> interested in the changes API, and we ran into performance issues.
> We have a DB containing around 3.5M docs, each about 10K in size. Running
> the following query on the database:
>
> http://192.168.1.166:5984/infowarehouse/_changes?since=0
>
> takes about 30 minutes on a 4-core, Windows 7 box, which seems rather high.
>
> Is this expected? Are there any benchmarks available for this API?
I'm not too surprised - CouchDB is probably building a massive JSON changes
response containing 3.5M items ;-)

Instead, you should use the since=<start> and limit=<batch-size> args together
to fetch the items in sensibly-sized batches, stopping when a response
contains fewer items than the batch size.

Alternatively, you might be able to use feed=continuous with timeout=0 to
stream the changes as fast as possible. The timeout=0 arg is just there to
shut down the changes feed as soon as you've seen everything. My laptop takes
about 50s to stream about 1M changes using this technique (sending the output
to /dev/null).

- Matt
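P.S. The batching loop could be sketched in Python roughly like this. Note
that fetch_changes here is a stand-in serving from an in-memory list so the
sketch runs without a live CouchDB; in practice you'd replace it with an HTTP
GET against /{db}/_changes?since=<seq>&limit=<n> and parse the JSON body.

```python
# Sketch of paging through a changes feed with since/limit.
# ALL_CHANGES simulates the server's feed: 10 docs, seq numbers 1..10.
ALL_CHANGES = [{"seq": i, "id": "doc%d" % i} for i in range(1, 11)]

def fetch_changes(since, limit):
    # Stand-in for GET /{db}/_changes?since=<since>&limit=<limit>.
    # A real implementation would issue the HTTP request and json-decode
    # the response, which has the same "results"/"last_seq" shape.
    results = [r for r in ALL_CHANGES if r["seq"] > since][:limit]
    last_seq = results[-1]["seq"] if results else since
    return {"results": results, "last_seq": last_seq}

def all_changes_batched(batch_size=4):
    since = 0
    seen = []
    while True:
        resp = fetch_changes(since, batch_size)
        seen.extend(resp["results"])
        if len(resp["results"]) < batch_size:
            break  # short batch: we've caught up with the feed
        since = resp["last_seq"]  # resume from the last seq we saw
    return seen

changes = all_changes_batched()
```

Each iteration resumes from the last_seq of the previous batch, so the
client never holds more than one batch in memory at a time.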
