Have you tried a map function that emits 'a_name' as a key, a reduce function that uses the built-in sum(), and then querying with group=true?
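To make that concrete, here is a rough sketch rather than a tested view for your data. It assumes you append a_name to the key you already emit, so the existing filtering on the first three fields keeps working, and that you emit 1 as the value. The names map, reduce, queryGrouped and exampleDocs are made up for illustration, and everything below the map/reduce pair is only a local stand-in for what CouchDB does with group=true, so it can be run outside Couch.

```javascript
// Map function as it could appear in the view's map.js. In a real design
// document it would be an anonymous function(doc) {...}; it is named here
// only so the local simulation below can call it.
// Assumption: a_name is appended to the existing three-part key.
function map(doc) {
  if (doc.a_name) {
    emit([doc.one_id, doc.another_id, doc.created_at, doc.a_name], 1);
  }
}

// In the design document the reduce can simply be the string "_sum".
// This is the same summing logic written out in plain JavaScript.
function reduce(keys, values, rereduce) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}

// --- Local stand-in for CouchDB, for illustration only ---
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Mimics ?group=true: one reduced row per distinct key.
function queryGrouped(docs) {
  rows = [];
  docs.forEach(map);
  var groups = {};
  rows.forEach(function (row) {
    var k = JSON.stringify(row.key);
    (groups[k] = groups[k] || []).push(row.value);
  });
  return Object.keys(groups).map(function (k) {
    return { key: JSON.parse(k), value: reduce(null, groups[k], false) };
  });
}

// Hypothetical documents shaped like the one in the original mail.
var exampleDocs = [
  { one_id: 1, another_id: 22, created_at: "2011-05-26", a_name: "Lisa" },
  { one_id: 1, another_id: 22, created_at: "2011-05-26", a_name: "Lisa" },
  { one_id: 1, another_id: 22, created_at: "2011-05-26", a_name: "John" }
];

var grouped = queryGrouped(exampleDocs);
console.log(JSON.stringify(grouped));
```

Against a real view, the equivalent query would be something like ?group=true&startkey=[1,22,"2011-05-26"]&endkey=[1,22,"2011-05-26",{}]. One caveat: group=true groups on the full key, so if your range spans several created_at values you get one row per date and name, and the per-name totals would still have to be added up on the client.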
A little info:
group option: https://wiki.apache.org/couchdb/HTTP_view_API#Querying_Options
Look at the last example before 'Enforcing Uniqueness': http://guide.couchdb.org/draft/cookbook.html#aggregate

On Thu, May 26, 2011 at 3:12 AM, Torstein Krause Johansen <[email protected]> wrote:

> Hi all,
>
> I have problems solving the following problem with CouchDB and am
> wondering if I'm trying to solve something for which Couch isn't
> suitable, if there is something I have misunderstood or if there's
> some hidden feature I haven't discovered yet.
>
> I have documents with the following fields:
> {
>   one_id : 1,
>   another_id : 22,
>   created_at : "2011-05-26",
>   a_name : "Lisa"
> }
>
> I want to search all occurrences with a combination of the three first
> ones as query parameters and then count the number of a_name
> occurrences within each of these search collections. For this reason,
> I put this into my view/map.js:
>
>   emit([one_id, another_id, created_at], a_name);
>
> Now, using these keys and start/end key, I get the result rows I want.
> So far so good.
>
> My next step is that I want to count the number of a_name within each
> of these hits, producing a dictionary like:
> {
>   "John" : 234142,
>   "Dominique" : 21177,
>   "Lisa" : 123
> }
>
> Initially, I tried to do this with a reduce.js, but couldn't work out
> how I'd go about this. The documentation I've read on reduces only
> mentions simple (built in) functions for counting and summing up the
> total rows and what I want here are counts based on the values
> themselves as "keys" in the view's result.
>
> I've managed to get this working using (exploiting?) lists, but this
> doesn't scale well with 100 000s of rows.
>
> For these reasons, I've resorted to doing two view operations, one to
> get the initial results and one to get the count of each a_name within
> the first result. This works, but doesn't feel optimal. Also, the
> returned dataset of the first search is overwhelming, leading to a
> ~5-7 second download of the data (and putting nginx/gzip in front of
> Couch didn't improve matters enough :-)
>
> The total time it takes to do my two queries adds up to ~6-9 seconds,
> something which is not fast enough for my application and I am
> therefore seeking your guidance.
>
> Cheers,
>
> -Torstein

--
“The limits of language are the limits of one's world.”
-Ludwig von Wittgenstein
