Hi All,

Digging up this thread again just to let you know I wrote an RFC on
implementing reduce functions on top of FoundationDB:
https://github.com/apache/couchdb-documentation/pull/441
Cheers
Garren

On Thu, Apr 25, 2019 at 6:05 PM Adam Kocoloski <kocol...@apache.org> wrote:

> Hiya Garren, thanks for starting this one. A few comments inline:
>
> > On Apr 23, 2019, at 11:38 AM, Garren Smith <gar...@apache.org> wrote:
> >
> > Hi All,
> >
> > Following on from the map discussion, I want to start the discussion on
> > built-in reduce indexes.
> >
> > ## Builtin reduces
> >
> > Builtin reduces are definitely the easier of the two reduce options to
> > reason about and design. The one factor to keep in mind for reduces is
> > that we need to be able to reduce at different group levels. So a data
> > model for that would look like this:
> >
> > {?DATABASE, ?VIEWS, ?VIEW_SIGNATURE, ?VIEWS, <view_id>, ?REDUCE,
> > <group_level>, <group_key>, <_reduce_function_name>} -> <aggregate_value>
>
> - I think this should be <view_signature>, not ?VIEW_SIGNATURE — you’re
> intending for the actual checksum to go in that element of the tuple,
> right? I wonder about using a Directory in that element of the tuple;
> adding the full signature to every key will blow up the index
> substantially.
>
> - I don’t think the second ?VIEWS is needed — you already scoped to
> ?VIEWS above it.
>
> - Why is the <_reduce_function_name> after the <group_key>? If someone
> defines multiple reduce functions on a single view and then wants to do a
> range query on the grouped result, this will mean reading and discarding
> the output from the other reduce function. It seems to me that it would
> make more sense to put this directly after ?REDUCE.
>
> - You could consider having specific constants for each builtin reduce
> function, e.g. ?SUM, instead of the full “_sum” byte string in
> <_reduce_function_name>, which would save several bytes per key.
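[Editor's note: to make the key-ordering point concrete, here is a small
sketch of the layout with the reordering suggested above (the reduce
function tag placed before the group key). Plain Python tuples stand in
for FDB tuple-encoded keys, and every constant name here (DB_PREFIX,
VIEWS, REDUCE, SUM) is hypothetical, not taken from the CouchDB code.]

```python
# Sketch of the reduce-index key layout under discussion, using plain
# Python tuples in place of FoundationDB tuple-encoded keys. All the
# constants below are made-up stand-ins for the ?-macros in the thread.

DB_PREFIX = 1   # stands in for ?DATABASE
VIEWS = 2       # stands in for ?VIEWS
REDUCE = 3      # stands in for ?REDUCE
SUM = 0         # compact constant for "_sum", per the suggestion above

def reduce_key(view_sig, view_id, reduce_fun, group_level, group_key):
    # reduce_fun sits before group_key, so a range read over one
    # function's grouped rows never touches another function's rows.
    return (DB_PREFIX, VIEWS, view_sig, view_id, REDUCE,
            reduce_fun, group_level, group_key)

# Rows for one reduce function at one group level sort contiguously:
k1 = reduce_key(b"sig", "v1", SUM, 1, ("2019",))
k2 = reduce_key(b"sig", "v1", SUM, 1, ("2020",))
assert k1 < k2  # Python tuple ordering mirrors FDB key ordering here
```

Because the function tag precedes the group key, a grouped range query
for one reduce function is a single contiguous scan, which is the point
of the reordering suggested above.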
> > Most of that is similar to the map data model. Where it changes is from
> > the ?REDUCE subspace: we add the group_level (from 1 -> number of keys
> > emitted in the map function), then the group key used in the reduce,
> > the reduce function name, e.g. _sum or _count, and then we store the
> > aggregated value as the FDB value.
>
> I think we need a better definition of <group_level> here. It’s not
> really the number of keys emitted from the map function, it’s the number
> of elements of an array key emitted by the map function that are used to
> determine the degree of aggregation.
>
> Also, before we even get to group_level we should describe how reduce
> functions work with keys that are not arrays. In the current API this is
> group=true (separate aggregations for each set of KV pairs that share
> the same key) or group=false (one aggregated result for the entire
> view). Internally in the current codebase we map these two settings to
> group_level=exact and group_level=0, respectively. Basically, we need a
> way to represent “exact” in the <group_level>.
>
> > ### Index management
> >
> > To update the reduce indexes, we will rely on the `id_index` and the
> > `update_seq` defined in the map discussion. Then to apply changes, we
> > calculate the change of an aggregate value for the keys at the highest
> > group level, then apply that change to all the group levels lower than
> > it using fdb’s atomic operations [1].
>
> This is a really important point and worth calling out more explicitly.
> One reason most of the builtins are significantly simpler than the
> custom functions is that we know exactly how the aggregate values change
> when we get the map output for each updated doc in isolation. We don’t
> need to go retrieve all the rest of the map output that happens to share
> the same map key.
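[Editor's note: a minimal sketch of the update path described above,
assuming _sum semantics. The delta from one updated document's map row
is applied at every group level, with a plain dict and `+=` standing in
for FoundationDB's atomic ADD mutation. The EXACT marker and helper
names are invented for illustration.]

```python
# Sketch (not the actual implementation) of propagating a _sum delta
# from one document's map row to every group level. A dict with "+="
# stands in for FDB's conflict-free atomic ADD.

index = {}  # (group_level, group_key) -> aggregate value

EXACT = "exact"  # distinct marker for group=true, per the note above

def group_key_at(map_key, level):
    # group_level truncates an array key to its first `level` elements;
    # EXACT keeps the whole key (and also covers non-array keys).
    if level == EXACT:
        return map_key
    return tuple(map_key[:level])

def apply_delta(map_key, delta):
    levels = [EXACT] + list(range(len(map_key) + 1))  # 0 .. len(key)
    for level in levels:
        k = (level, group_key_at(map_key, level))
        # With FDB's ADD mutation this is conflict-free even when many
        # docs emit the same map key in concurrent transactions.
        index[k] = index.get(k, 0) + delta

apply_delta(("2019", "04", "25"), 7)   # doc emits key [2019, 04, 25]
apply_delta(("2019", "05", "01"), 3)
assert index[(1, ("2019",))] == 10     # yearly aggregate
assert index[(2, ("2019", "04"))] == 7 # monthly aggregate
assert index[(0, ())] == 10            # group_level=0: whole view
```

The point Adam highlights falls out of the sketch: each document update
touches only its own key's group-level rows, with no reads of other
documents' map output sharing the same key.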
> Moreover, the change to the aggregate value can be applied using atomic
> operations, so we don’t need to do anything special to avoid txn
> conflicts for views where lots of documents emit the same map key.
>
> > ### Reducer functions
> >
> > FDB’s atomic operations support all the built-in reduce functions
> > CouchDB supports, so we can use those as part of our reduce function.
> > For the `_stats` reduce function, we will have to split that across
> > multiple key values. So its data model will have an extra key element
> > to record which stat it is for the _stats reducer:
> >
> > {?DATABASE, ?VIEWS, ?VIEW_SIGNATURE, ?VIEWS, <view_id>, ?REDUCE,
> > <group_level>, <group_key>, <_reduce_function_name>, <_stat_field>} ->
> > <aggregate_value>
> >
> > We do have some problems with `_approx_count_distinct` because it does
> > not support removing keys from the filter. So we have three options:
> >
> > 1. We can ignore key removal entirely in the filter, since this is
> > just an estimate
> > 2. Implement a real COUNT DISTINCT function, which we can do because
> > we’re not trying to merge results from different local shards
> > 3. Don’t support it going forward
>
> There’s actually a fourth option here, which is to maintain
> _approx_count_distinct using the future machinery for custom reduce
> functions, generating a new filter from the raw map output at the lowest
> level when keys are removed. I don’t have an informed opinion yet about
> that approach.
>
> > ### Group_level=0
> >
> > One trickier situation is if a user does a group_level=0 query with a
> > key range; this would require us to do some client-level aggregation.
> > We would have to get the aggregate values for a `group_level=1` for
> > the supplied key range and then aggregate those values together.
>
> This isn’t unique to group_level=0, it could happen for any
> group_level!=exact. It can also require descending more than one
> group_level depending on the precision of the supplied key range; e.g.
> for a date-based key like [YYYY, MM, DD]: someone could ask for yearly
> aggregations with group_level=1 but supply a startkey and endkey with
> specific dates rather than months.
>
> Good start! Cheers, Adam
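[Editor's note: a sketch of the client-side aggregation described in
this last exchange, using Adam's date example: a yearly (group_level=1)
query whose startkey/endkey name specific dates, so the stored level-1
totals cannot be used directly and finer-grained rows must be folded
instead. The data and function names are made up for illustration.]

```python
# Stored level-3 (daily) _sum aggregates: group_key -> value.
daily = {
    ("2019", "04", "25"): 7,
    ("2019", "05", "01"): 3,
    ("2019", "12", "31"): 5,
    ("2020", "01", "01"): 2,
}

def range_aggregate(rows, startkey, endkey, group_level):
    # Read finer-level rows in [startkey, endkey] and re-group them at
    # the requested (coarser) level, as the layer would have to do when
    # the key range is more precise than the requested group_level.
    out = {}
    for key, value in sorted(rows.items()):
        if startkey <= key <= endkey:
            g = key[:group_level]  # group_level=0 collapses to ()
            out[g] = out.get(g, 0) + value
    return out

# Yearly totals, but only for 2019-04-25 .. 2019-12-31:
result = range_aggregate(daily, ("2019", "04", "25"),
                         ("2019", "12", "31"), group_level=1)
assert result == {("2019",): 15}
```

The stored yearly row for 2019 would include all of 2019, so it cannot
answer this query; the sketch instead descends to the daily rows, which
is the "descending more than one group_level" case noted above.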