On Mar 4, 2009, at 3:28 AM, Chris Anderson wrote:
> On Tue, Mar 3, 2009 at 1:32 PM, Wout Mertens <[email protected]> wrote:
>> Would the problem be alleviated if you could specify for views that
>> couch should not reduce past the group level? In other words, only
>> calculate what's needed for views with group=true?
>
> Sort of. Essentially this would require an entirely different
> map/reduce implementation. It would probably only provide reductions
> at the group level (like Hadoop reduce). CouchDB is open to /
> interested in alternate view engines, and something like this could
> probably be created in a not-too-overwhelming amount of Erlang, on top
> of CouchDB's btree storage engine. Patches welcome! (Also, there are
> some patches floating around - once 0.9.0 is off our plate we'll
> probably have more spare cycles available for evaluating/consolidating
> them.)

Actually, I just did some tests around this, and it turns out that if
you always query with group=true, CouchDB never runs the final reduce!
I tested it by making a temporary view in Futon:
map:    function(doc) { emit(doc._rev % 10, doc._id); }
reduce: function(k, v, r) {
          if (r) { log(["rereduce", v]); } else { log(["reduce", v]); }
          return v;
        }
Just interacting with the view in Futon doesn't run rereduce on all
view keys. Once you access the view directly, without the group=true
parameter, CouchDB calculates the (re)reduce. I hadn't realized it
before, but for small databases it never calls rereduce at all. That
makes sense now.
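
To make the observation concrete, here is a toy model in plain JavaScript. It is only a sketch and not CouchDB's actual implementation: a fixed-size chunk of rows stands in for a btree node, and the names makeCountingReduce, reduceRows, queryGrouped, queryUngrouped and makeRows are all made up for illustration. The reduce function is the one from the temporary view, with log() replaced by a call counter.

```javascript
// Toy model only: a fixed-size chunk of rows stands in for a btree node.
// All helper names here are hypothetical, not part of CouchDB.

// The reduce function from the post, counting calls instead of logging them.
function makeCountingReduce() {
  const counts = { reduce: 0, rereduce: 0 };
  function reduce(keys, values, rereduce) {
    if (rereduce) { counts.rereduce++; } else { counts.reduce++; }
    return values;
  }
  return { reduce, counts };
}

// Reduce a run of rows: one reduce call per chunk, followed by a single
// rereduce over the partial results if there was more than one chunk.
function reduceRows(rows, reduce, chunkSize) {
  const partials = [];
  for (let i = 0; i < rows.length; i += chunkSize) {
    const chunk = rows.slice(i, i + chunkSize);
    partials.push(reduce(chunk.map(r => [r.key, r.id]), chunk.map(r => r.value), false));
  }
  return partials.length === 1 ? partials[0] : reduce(null, partials, true);
}

// group=true: every distinct key is reduced on its own.
function queryGrouped(rows, reduce, chunkSize) {
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row.key)) groups.set(row.key, []);
    groups.get(row.key).push(row);
  }
  return [...groups].map(([key, group]) => ({ key, value: reduceRows(group, reduce, chunkSize) }));
}

// No group parameter: all rows collapse into one final value.
function queryUngrouped(rows, reduce, chunkSize) {
  return reduceRows(rows, reduce, chunkSize);
}

// Rows shaped like the map output: key is a digit 0-9, value is the doc id.
function makeRows(n) {
  return Array.from({ length: n }, (_, i) => ({ id: "doc" + i, key: i % 10, value: "doc" + i }));
}

const grouped = makeCountingReduce();
queryGrouped(makeRows(20), grouped.reduce, 8);
console.log("grouped, 20 rows:", grouped.counts);     // 10 reduce calls, 0 rereduce calls

const ungrouped = makeCountingReduce();
queryUngrouped(makeRows(20), ungrouped.reduce, 8);
console.log("ungrouped, 20 rows:", ungrouped.counts); // 3 reduce calls, 1 rereduce call

const small = makeCountingReduce();
queryUngrouped(makeRows(5), small.reduce, 8);
console.log("ungrouped, 5 rows:", small.counts);      // 1 reduce call, 0 rereduce calls
```

In this model the grouped query never reaches rereduce because each key's rows fit in one chunk, and the small ungrouped query avoids it for the same reason. A key with more rows than one chunk would still trigger a rereduce within its group, so the model only illustrates the final, all-keys step being skipped.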
So as long as you promise never to run a particular "wide view"
without the group=true parameter, and the "wideness" of your view
results is manageable, it looks like you should be ok.
Of course, some attacker could DoS your server by calling the view
without group=true :-/
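
On the client side, the "promise" could be kept by building every view URL through one helper that always sets group=true. This is only a sketch: groupedViewUrl is a made-up name, the host, database and view names are placeholders, and it assumes the /{db}/_design/{ddoc}/_view/{view} URL layout. It does nothing about someone else calling the view directly.

```javascript
// Hypothetical helper: build a view URL that always carries group=true.
// Assumes the /{db}/_design/{ddoc}/_view/{view} URL layout.
function groupedViewUrl(base, db, ddoc, view, params = {}) {
  // group is set last, so a caller-supplied group value is overridden.
  const query = new URLSearchParams({ ...params, group: "true" });
  const path = [db, "_design", ddoc, "_view", view].map(encodeURIComponent).join("/");
  return `${base}/${path}?${query}`;
}

// Placeholder host and names, for illustration only.
console.log(groupedViewUrl("http://localhost:5984", "mydb", "wide", "by_digit"));
```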
Let's say I'm 70% certain of the above being true. I think I'm still
missing some subtleties in map/reduce. Any opinions?
Anyway, CouchDB rocks :-)
Wout.