After looking at this more, let me restate. I would totally get all of this if the signature of reduce was:

    reduce: function(key, values, rereduce)

What I don't get is: why does reduce get called with an arbitrarily long list of keys? I thought reduce was precisely for reducing all of the mapped inputs that are indexed under the *same* key. I think if I can get that, the rest will come clear.

Thanks again,
A

On Tue, Jan 20, 2009 at 7:52 PM, Adam Wolff <[email protected]> wrote:

> Thanks for the reply!
> I'd seen all of this, though I re-read the wikipedia entry carefully.
> Damien's blog entries don't appear to match the APIs in the version I'm
> running, which is 0.8.1
> The wikipedia entry suggests that reduce is called only with values that
> match a single key. Using the log() function in CouchDB, I can see that's
> not the case for its reduce function -- it's called with multiple different
> keys, though it does appear that the input values are *ordered* by matching
> keys.
>
> Anyway, I totally get how re-reduce (or "combine") works in conventional
> map/reduce, but I'm hazy on the details w/r/t to CouchDB. I'm starting to
> understand the answer to #1, but I'm really unclear on #2 (how/why rereduce
> is run.)
>
> Thanks again,
> A
>
>
> On Tue, Jan 20, 2009 at 6:50 PM, Jeff Hinrichs - DM&T
> <[email protected]> wrote:
>
>> On Tue, Jan 20, 2009 at 7:47 PM, Adam Wolff <[email protected]> wrote:
>> > Hi everyone, I'm really excited about CouchDB and I've started playing
>> > with it. I get all of it, except for reduce, and especially re-reduce.
>> >
>> > My first question is: how does CouchDB maintain all the separate output
>> > for a given key from the map function? I mean: given a simple reduce that
>> > just sums results, how does couch maintain separate results for each
>> > possible key/key range that can be given as input to that view?
>> >
>> > My second question: when and why does rereduce get called? Is this
>> > simply to allow the server to chunk the processing, or is there semantic
>> > meaning to it? I had assumed the former -- it's just a way of limiting
>> > the size of the input to the reduce function -- but then this really
>> > confused me: if I log each time my reduce function gets called, I see
>> > that the last time it's called, it's with rereduce=false. How is this
>> > possible? Don't all the results have to be funneled through rereduce to
>> > produce a single result value?
>> >
>> > Any help here would be much appreciated. If there's a resource on the
>> > web I should look at, please send it my way. Thanks!
>> >
>> > A
>>
>> Being that I just went through the learning process on reduce, I'll
>> point you here:
>> http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views
>> "Reduce Functions"
>>
>> As a good place to start.
>> Also, the mailing list, is an excellent resource.
>>
>> http://mail-archives.apache.org/mod_mbox/couchdb-user/200901.mbox/%[email protected]%3e
>>
>> along with:
>> http://en.wikipedia.org/wiki/MapReduce
>> http://labs.google.com/papers/mapreduce.html
>> and
>> http://damienkatz.net/2008/02/incremental_map.html
>>
>> Regards,
>>
>> Jeff
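
The logging experiment described in this thread can be sketched roughly as follows. This is only an illustration that runs standalone in Node.js, not code taken from CouchDB or from the messages above. It assumes the three-argument form `function(keys, values, rereduce)`, with `keys` as an array of `[key, docid]` pairs on a first pass and `null` on a rereduce; that layout may not match 0.8.1 exactly. The `log` stand-in, the `sumReduce` name, the sample keys and the batch split are all made up for the example.

```javascript
// Hypothetical stand-in for CouchDB's log(); inside a real view function
// the server provides log() itself.
var calls = [];
function log(entry) {
  calls.push(entry);
}

// A sum reduce in the assumed form function(keys, values, rereduce).
// Assumption: on a first pass, keys is an array of [key, docid] pairs and
// values holds the matching emitted values; on a rereduce, keys is null
// and values holds the results of earlier reduce calls.
function sumReduce(keys, values, rereduce) {
  log({
    rereduce: rereduce,
    keys: keys === null ? null : keys.map(function (pair) { return pair[0]; }),
    count: values.length
  });
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Made-up map output, sorted by key and split into two batches.
// The batch boundary deliberately falls in the middle of the "pear" rows,
// so a single call sees several different keys.
var batch1 = {
  keys: [["apple", "doc1"], ["apple", "doc2"], ["pear", "doc3"]],
  values: [1, 2, 4]
};
var batch2 = {
  keys: [["pear", "doc4"], ["plum", "doc5"]],
  values: [8, 16]
};

// First pass over each batch, then one rereduce over the partial results.
var partial1 = sumReduce(batch1.keys, batch1.values, false);
var partial2 = sumReduce(batch2.keys, batch2.values, false);
var grandTotal = sumReduce(null, [partial1, partial2], true);

console.log(calls);
console.log(grandTotal);
```

In this sketch each first-pass call receives whatever rows happen to fall in its batch, so it sees a mix of keys in sorted order, and only the combining call runs with `rereduce` set to `true`. If all the rows fit in a single batch, the sketch would never make a rereduce call at all; whether that is what lies behind the `rereduce=false` observation in CouchDB is not something this example establishes.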
