On Mar 4, 2009, at 3:28 AM, Chris Anderson wrote:

On Tue, Mar 3, 2009 at 1:32 PM, Wout Mertens <[email protected]> wrote:
Would the problem be alleviated if you could specify for views that couch should not reduce past the group level? In other words, only calculate
what's needed for views with group=true?


Sort of. Essentially this would require an entirely different
map/reduce implementation. It would probably only provide reductions
at the group level (like Hadoop reduce). CouchDB is open to /
interested in alternate view engines, and something like this could
probably be created in a not-too-overwhelming amount of Erlang, on top
of CouchDB's btree storage engine. Patches welcome! (Also, there are
some patches floating around - once 0.9.0 is off our plate we'll
probably have more spare cycles available for evaluating/consolidating
them.)

Actually, I just did some tests around this, and it turns out that if you always query with group=true, CouchDB never runs the final reduce!

I tested it by making a temporary view in Futon:

map:    function(doc) { emit(doc._rev%10,doc._id); }
reduce: function(k,v,r) { if(r) { log(["rereduce",v]); } else { log(["reduce",v]); } return(v); }

Just interacting with the view in Futon doesn't run rereduce on all view keys. Once you access the view directly, without the group=true parameter, CouchDB calculates the (re)reduce. I hadn't realized this before, but for small databases it never calls rereduce at all. That makes sense now.
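
To make the difference concrete, here is a toy model in plain JavaScript of a grouped versus an ungrouped query. It is only a sketch and not how CouchDB actually does it: it ignores the btree entirely, so unlike what I saw on small databases it always rereduces for the ungrouped query. The helpers queryGrouped and queryUngrouped and the calls recorder are made up for illustration; only the reduce(keys, values, rereduce) signature comes from CouchDB.

```javascript
// Toy model only: NOT CouchDB's implementation, and it ignores the btree.
// Records which kind of reduce call happened, in place of CouchDB's log().
const calls = [];

// Same shape as the reduce function in the test above.
function reduce(keys, values, rereduce) {
  calls.push(rereduce ? "rereduce" : "reduce");
  return values;
}

// Hypothetical helper: group the emitted rows by key and reduce each
// group exactly once, roughly what a group=true query has to produce.
function queryGrouped(rows) {
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row.key)) groups.set(row.key, []);
    groups.get(row.key).push(row);
  }
  return Array.from(groups, ([key, group]) => ({
    key: key,
    value: reduce(
      group.map((r) => [r.key, r.id]), // keys arrive as [key, docid] pairs
      group.map((r) => r.value),
      false
    ),
  }));
}

// Hypothetical helper: combine the per-key results into a single value,
// which is where the rereduce call comes in.
function queryUngrouped(rows) {
  const partials = queryGrouped(rows).map((r) => r.value);
  return reduce(null, partials, true); // keys are null on rereduce
}

// Sample rows as the map function would emit them (key, doc id as value).
const rows = [
  { key: 1, id: "a", value: "a" },
  { key: 2, id: "b", value: "b" },
  { key: 1, id: "c", value: "c" },
];
queryGrouped(rows);   // records only "reduce" calls
queryUngrouped(rows); // records the same reduces plus one "rereduce"
```

In this model the grouped query never reaches the rereduce step, which matches what the log output suggested.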

So as long as you promise never to run a particular "wide view" without the group=true parameter, and the "wideness" of your view results is manageable, it looks like you should be ok.

Of course, some attacker could DoS your server by calling the view without group=true :-/
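
One way to keep that from happening might be to put a small check in whatever proxy sits in front of CouchDB. This is only a sketch: the allowRequest function is hypothetical, it assumes view URLs contain "/_view/", and it only looks at the group parameter.

```javascript
// Hypothetical guard for a proxy in front of CouchDB (not a CouchDB API).
// Lets non-view requests through and rejects view requests that do not
// ask for grouped results.
function allowRequest(requestPath) {
  // The base URL is a placeholder; only the path and query string matter.
  const url = new URL(requestPath, "http://localhost");

  // Assumption: view requests are recognisable by "/_view/" in the path.
  if (!url.pathname.includes("/_view/")) return true;

  return url.searchParams.get("group") === "true";
}

allowRequest("/db/_design/stats/_view/wide?group=true"); // allowed
allowRequest("/db/_design/stats/_view/wide");            // rejected
```

The database and design document names in the example calls are made up.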

Let's say I'm 70% certain of the above being true. I think I'm still missing some subtleties in map/reduce. Any opinions?

Anyway, CouchDB rocks :-)

Wout.
