On 12 Mar 2010, at 11:56, Julian Stahnke wrote:
> On 12.03.2010 at 17:24, J Chris Anderson wrote:
>
>>
>> On Mar 12, 2010, at 7:10 AM, Julian Stahnke wrote:
>>
>>> Hello!
>>>
>>> I have a problem with a view being slow, even though it’s indexed and
>>> cached and so on. I have a database of books (~120,000 documents) and a
>>> map/reduce function that counts how many books there are per author. I’m
>>> then calling the view with ?group=true to get the list. I’m neither
>>> emitting nor outputting any actual documents, only the counts. This results
>>> in an output of about 78,000 key/value pairs that look like the following:
>>> {"key":"Albert Kapr","value":3}.
>>>
>>> Now, even when the view is indexed and cached, it still takes 60 seconds to
>>> receive the output, using PHP’s cURL functions, the browser, whatever I’ve
>>> tried. Getting the same output served from a static file takes only a
>>> fraction of a second.
>>>
>>> When I set limit=100, it’s basically instantaneous. I want to sort the
>>> output by value though, so I can’t really limit it or use ranges. Trying it
>>> with about 7,000 books, the request takes about 5 seconds, so it seems to
>>> be linear in the number of lines being output?
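Inline note on the sort-by-value step: since the view is keyed by author, the ordering by count has to happen after the full grouped result is fetched. A minimal client-side sketch, assuming the response has the usual {"rows": [...]} shape. The second row and the function name sortByValueDesc are made up for illustration.

```javascript
// Assumed shape of a ?group=true response. Only the first row comes from
// the thread; the second is a made-up placeholder.
const response = {
  rows: [
    { key: "Albert Kapr", value: 3 },
    { key: "Example Author", value: 7 },
  ],
};

// Hypothetical helper: returns a new array sorted by count, highest first.
// Ties are broken by key so the order is deterministic.
function sortByValueDesc(rows) {
  return rows.slice().sort((a, b) => {
    if (b.value !== a.value) return b.value - a.value;
    return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
  });
}

const sorted = sortByValueDesc(response.rows);
console.log(sorted.map((row) => row.key + ": " + row.value).join("\n"));
```

The sort itself is cheap for tens of thousands of rows; the cost discussed in this thread is in producing the rows, not in ordering them.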
>>
>> For each line of output in the group reduce view, CouchDB must calculate 1
>> final reduction (even when the intermediate reductions are already cached in
>> the btree). This is because the btree nodes might not have the exact same
>> boundaries as your group keys.
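Inline note: the boundary point above can be pictured with a toy model. This is only a sketch of the idea, not CouchDB's actual btree code. The node size, the cachedSum field, and the second author name are all assumptions made for illustration.

```javascript
// Sorted map rows, as they would sit in the view index, ordered by key.
const rows = [
  ["Albert Kapr", 1], ["Albert Kapr", 1], ["Albert Kapr", 1],
  ["Example Author", 1], ["Example Author", 1],
];

// Pretend every node holds two rows and caches the sum of its own rows.
const NODE_SIZE = 2;
const nodes = [];
for (let i = 0; i < rows.length; i += NODE_SIZE) {
  const chunk = rows.slice(i, i + NODE_SIZE);
  nodes.push({ rows: chunk, cachedSum: chunk.reduce((s, r) => s + r[1], 0) });
}

// A node's cached sum can be reused only when the node lies entirely inside
// one group. A node that straddles two group keys has to be reduced again
// row by row, which is the extra work done per output line.
function groupReduce(allNodes) {
  const totals = new Map();
  for (const node of allNodes) {
    const keys = new Set(node.rows.map(([key]) => key));
    if (keys.size === 1) {
      const [key] = keys;
      totals.set(key, (totals.get(key) || 0) + node.cachedSum);
    } else {
      for (const [key, value] of node.rows) {
        totals.set(key, (totals.get(key) || 0) + value);
      }
    }
  }
  return totals;
}

const totals = groupReduce(nodes);
console.log(Array.from(totals.entries()));
```

Here the middle node contains rows for both authors, so its cached sum is of no use to either group on its own.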
>>
>> There is a remedy. You can replace your simple summing reduce with the text
>> "_sum" (without quotes). This triggers the same function, but implemented in
>> Erlang by CouchDB. Most of your slowness is probably due to IO between
>> CouchDB and serverside JavaScript. Using the _sum function will help with
>> this.
>>
>> There will still be a calculation per group reduce row, but the cost is much
>> lower.
>>
>> Let us know how much faster this is!
>>
>> Chris
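Inline note: for anyone finding this in the archives, the change described above amounts to swapping one string in the design document. A rough sketch, assuming each book document has an "author" field. The design document name "books" and view name "by_author" are made up here, not taken from Julian's setup.

```javascript
// Map function source as stored in the design document.
// Assumption: documents look like { "author": "..." }.
const mapSource =
  "function (doc) { if (doc.author) { emit(doc.author, 1); } }";

// Before: a hand-written summing reduce, run by the JavaScript view server.
const before = {
  _id: "_design/books",
  views: {
    by_author: {
      map: mapSource,
      reduce: "function (keys, values, rereduce) { return sum(values); }",
    },
  },
};

// After: the reduce is just the text _sum, which CouchDB runs in Erlang.
const after = {
  _id: "_design/books",
  views: {
    by_author: {
      map: mapSource,
      reduce: "_sum",
    },
  },
};

// JSON body that would be saved as the design document.
console.log(JSON.stringify(after, null, 2));
```

The view is queried the same way as before, with ?group=true, and the rows should be unchanged; only where the summing runs is different.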
>
> Oh wow, thanks! It’s now taking about 4 seconds instead of a minute!
>
> Is this function documented somewhere? I didn’t come across it anywhere, so I
> added it to the Performance page in the wiki:
> http://wiki.apache.org/couchdb/Performance I hope that is okay.
Thanks for adding it :)
Cheers
Jan
--