Hi Calle,

At the very least, you could move the post-processing to the server side using a list function: http://wiki.apache.org/couchdb/Formatting_with_Show_and_List
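To make the list-function idea concrete, here is a minimal sketch that keeps only the N rows with the highest values and sends those back as JSON. The function name `topList` and the `n` query parameter are my own placeholders, not something from your setup; `getRow()` and `send()` are the helpers CouchDB makes available to list functions. In a design document it would be stored as a string under `lists`, usually written as an anonymous function.

```javascript
// Hypothetical list function: keep only the n rows with the largest values.
// "topList" and the "n" query parameter are illustrative assumptions.
// getRow() and send() are supplied by the CouchDB query server at run time.
function topList(head, req) {
  var n = parseInt(req.query.n, 10) || 10; // how many rows to keep (default 10)
  var top = [];                            // kept sorted, largest value first
  var row;
  while ((row = getRow())) {
    top.push({ key: row.key, value: row.value });
    top.sort(function (a, b) { return b.value - a.value; });
    if (top.length > n) {
      top.pop();                           // drop the current smallest entry
    }
  }
  send(JSON.stringify(top));               // only the survivors reach the client
}
```

Assuming it is saved as `top` in a design document, it would be queried with something like `/yourdb/_design/yourddoc/_list/top/yourview?group=true&n=10` (database, design document and view names are placeholders). CouchDB still feeds every reduced row through the function, but the client only receives and parses the handful it cares about.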
A better option for performance is to do the filtering inside the reduce function. Take a look at this snippet, which looks close to what you are trying to achieve: http://wiki.apache.org/couchdb/View_Snippets#Retrieve_the_top_N_tags

Good luck,
Mehdi

On Mon, Sep 19, 2011 at 1:27 PM, Calle Dybedahl <[email protected]> wrote:

> Hello.
>
> I have a pretty simple pair of map and reduce functions. The first is
> basically just emitting a key and a 1, and the reduce is the built-in _sum
> function. This works fine, and tells me how many times every key has been
> seen.
>
> Now, the problem is that I'm actually only interested in the handful of
> keys that have been seen the most often. The data fits a power-law
> distribution, which means that there is a long tail that I'm not at all
> interested in. And by "long" here I'm talking about tens of thousands of
> rows. At the moment, my client-side code spends more than 99.9% of its
> runtime receiving and parsing JSON from the CouchDB server, very nearly all
> of which it will promptly throw away as soon as it's been parsed. This is
> annoying and silly.
>
> Is there any way at all to filter the results of a reduced query on the
> CouchDB end? Alternatively, is there a way for a reduce function to know
> that it's the final stage in the re-reduce chain (if I could drop all keys
> with a final value of 1, I'd save an order of magnitude of runtime)?
>
> I can't be the first one ever to run into a problem like this, but I've
> failed to find any solutions on the net.
> --
> Calle Dybedahl
> [email protected] -*- +46 703 - 970 612

--
Mehdi El Fadil
twitter: @mango_info <http://www.twitter.com/mango_info>
website: http://www.mango-is.com
linkedin: http://be.linkedin.com/in/elfadme
