I'd like to read some clarifying comments about _why_ we want to disable it by default. Maybe they've already been made, and I missed them.

I gather that we perceive (probably rightly so) that most custom reduce functions are poorly written and hurt performance. Is this the primary/only reason to disable it? Just to give users one more hoop to jump through before shooting themselves in the foot?

The original message on 10/13 from Robert Newson began by explaining some technical difficulties surrounding custom reduce functions, but then explained that those have been resolved.

Maybe this muddied the issue in my mind--are there still any _technical_ reasons to disable custom reduce by default, or is it purely the (mis)usability issue?

While it may change with more information, at the moment, my feelings on the two issues below are:

-0.5 for deprecating custom reduce functions
-0.0 for disabling custom reduce functions by default

Jonathan


On 10/13/20 10:21 PM, Robert Samuel Newson wrote:
Nick, let's broaden the thread to two questions then:

1) Deprecate custom reduce functions
2) Disable custom reduce functions by default, but don't deprecate them.



On 13 Oct 2020, at 21:16, Nick Vatamaniuc <vatam...@gmail.com> wrote:

In case of _sum, like Joan mentioned, we can emit objects or arrays
and the built-in _sum will add the values of the fields together:

So  {"map": 'function(d){ emit(d._id, {"bar":1, "foo":2, "baz":3});
}',  "reduce" : '_sum' } for 10 docs would produce {"bar": 10, "baz":
30, "foo": 20}.
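
A minimal sketch of the arithmetic in that example, in plain JavaScript. It only mimics the field-wise addition described above; it is not CouchDB's implementation of _sum, and sumObjects is a hypothetical helper name used for illustration.

```javascript
// Illustrative stand-in for the field-wise addition that the built-in _sum
// is described as performing on object values. NOT CouchDB's own code;
// sumObjects is a hypothetical helper.
function sumObjects(values) {
  const totals = {};
  for (const value of values) {
    for (const [field, n] of Object.entries(value)) {
      totals[field] = (totals[field] || 0) + n;
    }
  }
  return totals;
}

// Ten documents, each emitting the same object as in the example above.
const emitted = Array.from({ length: 10 }, () => ({ bar: 1, foo: 2, baz: 3 }));
const result = sumObjects(emitted);
console.log(result); // bar: 10, foo: 20, baz: 30
```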

As for the deprecation, I wouldn't necessarily call for it, but I can
see leaving custom reduce disabled by default and letting users enable
it if they want to. If we see that there is good demand for custom
functions, and it is annoying for users to have to enable them, we
could revert to enabling them by default or, as was discussed, try to
add more built-in reducers.

Cheers,
-Nick

On Tue, Oct 13, 2020 at 3:38 PM Jonathan Hall <fli...@flimzy.com> wrote:
So looking through the code that uses this, it looks like the main use
I've had for custom reduce functions is summing multiple values at
once.  A rough equivalent of 'SELECT SUM(foo),SUM(bar),SUM(baz)'.
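
A minimal sketch of what such a custom reduce might look like, assuming the map function emits values shaped like {foo: n, bar: n, baz: n}. The name sumFields and the sample data are hypothetical; in a design document the function would be anonymous, and this is not taken from the code being described.

```javascript
// Hypothetical custom reduce that sums three fields at once.
// Assumes every value has numeric foo, bar and baz fields.
function sumFields(keys, values, rereduce) {
  // On the first pass, values are map outputs; on rereduce they are earlier
  // results of this function. Both have the same shape here, so a single
  // code path handles both cases.
  var totals = { foo: 0, bar: 0, baz: 0 };
  for (var i = 0; i < values.length; i++) {
    totals.foo += values[i].foo;
    totals.bar += values[i].bar;
    totals.baz += values[i].baz;
  }
  return totals;
}

// Simulate two first-pass batches followed by a rereduce over their results.
var batch1 = sumFields(null, [{ foo: 1, bar: 2, baz: 3 }, { foo: 4, bar: 5, baz: 6 }], false);
var batch2 = sumFields(null, [{ foo: 10, bar: 20, baz: 30 }], false);
var total = sumFields(null, [batch1, batch2], true);
console.log(total); // foo: 15, bar: 27, baz: 39
```

The output is always three numbers, however many rows go in, which is the property a reduce function is expected to have.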

The first thing that comes to mind to duplicate this functionality
without a custom reduce function would mean building one unique index
for each value that needs to be summed, which I expect would be a lot
less efficient.

But maybe I'm overlooking a more clever and efficient alternative.

Jonathan


On 10/13/20 6:31 PM, Robert Samuel Newson wrote:
Hi,

Yes, that's what I'm referring to, the JavaScript reduce function.

I'm curious what you do with custom reduce that isn't covered by the built-in 
reduces?

I also think if custom reduce was disabled by default that we would be 
motivated to expand this set of built-in reduce functions.

B.

On 13 Oct 2020, at 17:06, Jonathan Hall <fli...@flimzy.com> wrote:

To be clear, by "custom reduce functions" you mean this 
(https://docs.couchdb.org/en/stable/ddocs/ddocs.html#reduce-and-rereduce-functions)?

So by default, only built-in reduce functions could be used 
(https://docs.couchdb.org/en/stable/ddocs/ddocs.html#built-in-reduce-functions)?

If my understanding is correct, I guess I find it a bit surprising. I've always
thought of map/reduce as one of the core features of CouchDB, so to see half of
that turned off (even if it can be re-enabled) makes me squint a bit. And it is
a feature I use, so I would not be in favor of deprecating it entirely without
a clear proposal/documentation for an alternative/work-around.

Based on the explanation below, it doesn't sound like there's a technical 
reason to deprecate it, but rather a user-experience reason. Is this correct?

If my understanding is correct, I'm not excited about the proposal, but before 
I dive further into my thoughts, I'd like confirmation that I actually 
understand the proposal, and am not worried about something else ;)

Jonathan


On 10/13/20 5:48 PM, Robert Samuel Newson wrote:
Hi All,

As part of CouchDB 4.0, which moves the storage tier of CouchDB into
FoundationDB, we have struggled to reproduce the full map/reduce functionality.
Happily, that functionality has now been reproduced, and the work is merged to
the couchdb main branch.

This functionality includes the use of custom (JavaScript) reduce functions. It
is my experience that these are very often problematic: more often than not,
the functions do not significantly reduce their input into a smaller result
(indeed, sometimes the output is the same size as the input, or larger).
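
To illustrate the distinction, here is a minimal sketch contrasting a reduce whose output grows with its input against one that collapses it to a single number. Both functions (collectAll and countAll) and the sample rows are hypothetical, written for illustration rather than taken from any real design document.

```javascript
// Hypothetical reduce that merely collects its input: output grows with input.
function collectAll(keys, values, rereduce) {
  // On rereduce, values is an array of earlier arrays; flatten one level.
  return rereduce ? [].concat.apply([], values) : values;
}

// Hypothetical reduce that collapses its input to one number.
function countAll(keys, values, rereduce) {
  if (rereduce) {
    return values.reduce(function (a, b) { return a + b; }, 0);
  }
  return values.length;
}

// 1000 mapped rows, used only to compare serialized output sizes.
var mapped = [];
for (var i = 0; i < 1000; i++) mapped.push({ id: i });

var collectedSize = JSON.stringify(collectAll(null, mapped, false)).length;
var countedSize = JSON.stringify(countAll(null, mapped, false)).length;
console.log(collectedSize, countedSize);
```

The first function returns something as large as what it was given, so the intermediate values stored for the view keep growing; the second stays a few bytes regardless of row count.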

To that end, I'm asking if we should deprecate the feature entirely.

In scope for this thread is the middle-ground proposal that Paul Davis has
written up here:

https://github.com/apache/couchdb/pull/3214

Under that proposal, custom reduces are not allowed by default but can be
enabled.

The core _ability_ to do custom reduces will always be maintained; this is
intrinsic to the design of ebtree, the structure we use on top of FoundationDB
to hold and maintain intermediate reduce values.

My view is that we should merge #3214 and disable custom reduces by default.

B.
