On 13/10/2020 11:48, Robert Samuel Newson wrote:
Hi All,

As part of CouchDB 4.0, which moves the storage tier of CouchDB into 
FoundationDB, we have struggled to reproduce the full map/reduce functionality. 
Happily this has now happened, and that work is now merged to the couchdb main 
branch.

\o/

This functionality includes the use of custom (javascript) reduce functions. It 
is my experience that these are very often problematic, in that much more often 
than not the functions do not significantly reduce the input parameters into a 
smaller result (indeed, sometimes the output is the same or larger than the 
input).
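
To illustrate the difference being described, here is a minimal sketch using the standard `function(keys, values, rereduce)` reduce signature. The function names `collectAll` and `maxValue` are hypothetical and chosen only for this example; neither comes from the thread or from CouchDB itself.

```javascript
// Anti-pattern (hypothetical example): the output is as large as the input,
// so nothing is actually reduced.
function collectAll(keys, values, rereduce) {
  if (rereduce) {
    // On rereduce, values is an array of arrays returned by earlier calls;
    // flattening them keeps every original value around.
    return values.reduce(function (acc, v) { return acc.concat(v); }, []);
  }
  return values;
}

// A reduce that does reduce (hypothetical example): the output is a single
// number no matter how many values come in. The same logic works on rereduce,
// because the max of partial maxima is the overall max.
function maxValue(keys, values, rereduce) {
  return Math.max.apply(null, values);
}
```

The first function's result grows with the number of rows in the view, which is the problematic case; the second stays constant in size.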

Agreed, it is very rare that I find a well-written custom reduce function. It happens, though, and the people who write them are also advanced or expert CouchDB users. They would know how to toggle the default.
To that end, I'm asking if we should deprecate the feature entirely.

and, from the reply to Jonathan:

I also think if custom reduce was disabled by default that we would be 
motivated to expand this set of built-in reduce functions.
If deprecation means eventual removal, we need to take additional steps.

What would help inform this decision would be a survey of the community for custom reduce functions. If this can then inform writing more built-in _reduces that we ship in various 4.x releases, and remove the feature in 5.0, that could work.

There needs to be a concerted effort to reach out to users and understand these use cases, followed by a similar effort to write replacements and have the community vet them. To date we've only added two built-in enhancements that I can remember: the HyperLogLog stuff, and the ability to do _sum / _count / _stats on lists and objects (which was a Cloudant donation about 6 years ago, IIRC).

Here are some examples of custom reduces I've seen recently that could not be satisfied by our current built-ins:

* wallet/balance calculation, based on transactional data
* _stats like functionality, but derived from complex documents that
  have lists of objects that must be iterated over
* advanced statistical calculation: ANOVA, t-test, linear regression,
  bayesian, etc.
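
As a rough sketch of the second case, assume a map function that emits, per document, an array of objects as the value, each with a numeric `amount` field. That document shape, the field name, and the function name `statsOverItems` are assumptions made for illustration, not something taken from a real user's design document. The result mirrors the fields that the built-in _stats returns (sum, count, min, max, sumsqr).

```javascript
// Hypothetical custom reduce: _stats-like output over values that are
// arrays of objects, e.g. [{amount: 3}, {amount: 5}] per emitted row.
function statsOverItems(keys, values, rereduce) {
  var acc = { sum: 0, count: 0, min: Infinity, max: -Infinity, sumsqr: 0 };
  values.forEach(function (v) {
    if (rereduce) {
      // v is a partial result produced by an earlier call to this function.
      acc.sum += v.sum;
      acc.count += v.count;
      acc.sumsqr += v.sumsqr;
      acc.min = Math.min(acc.min, v.min);
      acc.max = Math.max(acc.max, v.max);
    } else {
      // v is the array of objects emitted by the (assumed) map function.
      v.forEach(function (item) {
        acc.sum += item.amount;
        acc.count += 1;
        acc.sumsqr += item.amount * item.amount;
        acc.min = Math.min(acc.min, item.amount);
        acc.max = Math.max(acc.max, item.amount);
      });
    }
  });
  // Caveat: if every emitted array is empty, min/max remain +/-Infinity,
  // which does not survive JSON encoding; a real version would handle that.
  return acc;
}
```

The output is a fixed-size object, so this one does reduce; the difficulty is only that the iteration over nested lists is not something the current built-ins express.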

None of these are unsolvable, but they will require effort. I'm ready to help talk to users if this is the direction we want to go, but I want to see a firm commitment by other developers to help implement new built-in reduces brought to the table before +1'ing this decision. Companies like IBM/Cloudant and Neighbourhoodie have special access here, and would be key players in helping get this work done.

Let's contrast this with a famous deprecation that didn't go as well: list/show/rewrites removal. Most of us agree that this functionality is much better served by parallel servers that have a plethora of functionality available to them, plus a wide base of support outside of our own ecosystem. Critically, these functions are purely transformative: none store new data into the database. I don't think a similar approach makes sense for custom reduce, since those results *are* pre-calculated and stored.

One more contrast. Two years ago, I wrote up a spec to introduce VDU and update handler functionality into Mango[1]. Here's a situation where there was broad user acceptance, and general agreement on the direction to move forward. We could arguably deprecate our current approach for these once this functionality has been built. The problem has been finding someone willing to develop it -- I don't have the time.

Looking forward to others' thoughts.

-Joan "developers, developers, developers" Touzet

[1]: https://github.com/apache/couchdb/issues/1554



In scope for this thread is the middle ground proposal that Paul Davis has 
written up here:

https://github.com/apache/couchdb/pull/3214

Where custom reduces are not allowed by default but can be enabled.

The core _ability_ to do custom reduces will always be maintained; this is 
intrinsic to the design of ebtree, the structure we use on top of FoundationDB 
to hold and maintain intermediate reduce values.

My view is that we should merge #3214 and disable custom reduces by default.

B.
