Hi All,

Digging up this thread again just to let you know I wrote an RFC on
implementing reduce functions on top of FoundationDB:
https://github.com/apache/couchdb-documentation/pull/441
Cheers
Garren

On Thu, Apr 25, 2019 at 6:05 PM Adam Kocoloski <kocol...@apache.org> wrote:

> Hiya Garren, thanks for starting this one. A few comments inline:
>
> > On Apr 23, 2019, at 11:38 AM, Garren Smith <gar...@apache.org> wrote:
> >
> > Hi All,
> >
> > Following on from the map discussion, I want to start the discussion on
> > built-in reduce indexes.
> >
> > ## Builtin reduces
> >
> > Builtin reduces are definitely the easier of the two reduce options to
> > reason about and design. The one factor to keep in mind for reduces is
> > that we need to be able to reduce at different group levels. So a data
> > model for that would look like this:
> >
> > {?DATABASE, ?VIEWS, ?VIEW_SIGNATURE, ?VIEWS, <view_id>, ?REDUCE,
> > <group_level>, <group_key>, <_reduce_function_name>} -> <aggregate_value>
>
> - I think this should be <view_signature>, not ?VIEW_SIGNATURE — you’re
> intending for the actual checksum to go in that element of the tuple,
> right? I wonder about using a Directory in that element of the tuple;
> adding the full signature to every key will blow up the index
> substantially.
>
> - I don’t think the second ?VIEWS is needed — you already scoped to
> ?VIEWS above it.
>
> - Why is the <_reduce_function_name> after the <group_key>? If someone
> defines multiple reduce functions on a single view and then wants to do a
> range query on the grouped result, this will mean reading and discarding
> the output from the other reduce function. It seems to me that it would
> make more sense to put this directly after ?REDUCE.
>
> - You could consider having specific constants for each builtin reduce
> function, e.g. ?SUM, instead of the full “_sum” byte string in
> <_reduce_function_name>, which would save several bytes per key.
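[Editor's note: to make the key-ordering point concrete, here is a small
sketch of the layout with the reordering suggested above (the reduce
function tag placed before the group key). Plain Python tuples stand in
for FDB tuple-encoded keys, and every constant name here (DB_PREFIX,
VIEWS, REDUCE, SUM) is hypothetical, not taken from the CouchDB code.]

```python
# Sketch of the reduce-index key layout under discussion, using plain
# Python tuples in place of FoundationDB tuple-encoded keys. All the
# constants below are made-up stand-ins for the ?-macros in the thread.

DB_PREFIX = 1   # stands in for ?DATABASE
VIEWS = 2       # stands in for ?VIEWS
REDUCE = 3      # stands in for ?REDUCE
SUM = 0         # compact constant for "_sum", per the suggestion above

def reduce_key(view_sig, view_id, reduce_fun, group_level, group_key):
    # reduce_fun sits before group_key, so a range read over one
    # function's grouped rows never touches another function's rows.
    return (DB_PREFIX, VIEWS, view_sig, view_id, REDUCE,
            reduce_fun, group_level, group_key)

# Rows for one reduce function at one group level sort contiguously:
k1 = reduce_key(b"sig", "v1", SUM, 1, ("2019",))
k2 = reduce_key(b"sig", "v1", SUM, 1, ("2020",))
assert k1 < k2  # Python tuple ordering mirrors FDB key ordering here
```

Because the function tag precedes the group key, a grouped range query
for one reduce function is a single contiguous scan, which is the point
of the reordering suggested above.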
> > Most of that is similar to the map data model. Where it changes is from
> > the ?REDUCE subspace: we add the group_level (from 1 -> number of keys
> > emitted in the map function), then the group key used in the reduce,
> > the reduce function name, e.g. _sum or _count, and then we store the
> > aggregated value as the FDB value.
>
> I think we need a better definition of <group_level> here. It’s not
> really the number of keys emitted from the map function, it’s the number
> of elements of an array key emitted by the map function that are used to
> determine the degree of aggregation.
>
> Also, before we even get to group_level we should describe how reduce
> functions work with keys that are not arrays. In the current API this is
> group=true (separate aggregations for each set of KV pairs that share
> the same key) or group=false (one aggregated result for the entire
> view). Internally in the current codebase we map these two settings to
> group_level=exact and group_level=0, respectively. Basically, we need a
> way to represent “exact” in the <group_level>.
>
> > ### Index management
> >
> > To update the reduce indexes, we will rely on the `id_index` and the
> > `update_seq` defined in the map discussion. Then to apply changes, we
> > calculate the change of an aggregate value for the keys at the highest
> > group level, then apply that change to all the group levels lower than
> > it using fdb’s atomic operations [1].
>
> This is a really important point and worth calling out more explicitly.
> One reason most of the builtins are significantly simpler than the
> custom functions is that we know exactly how the aggregate values change
> when we get the map output for each updated doc in isolation. We don’t
> need to go retrieve all the rest of the map output that happens to share
> the same map key.
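[Editor's note: a minimal sketch of the update path described above,
assuming _sum semantics. The delta from one updated document's map row
is applied at every group level, with a plain dict and `+=` standing in
for FoundationDB's atomic ADD mutation. The EXACT marker and helper
names are invented for illustration.]

```python
# Sketch (not the actual implementation) of propagating a _sum delta
# from one document's map row to every group level. A dict with "+="
# stands in for FDB's conflict-free atomic ADD.

index = {}  # (group_level, group_key) -> aggregate value

EXACT = "exact"  # distinct marker for group=true, per the note above

def group_key_at(map_key, level):
    # group_level truncates an array key to its first `level` elements;
    # EXACT keeps the whole key (and also covers non-array keys).
    if level == EXACT:
        return map_key
    return tuple(map_key[:level])

def apply_delta(map_key, delta):
    levels = [EXACT] + list(range(len(map_key) + 1))  # 0 .. len(key)
    for level in levels:
        k = (level, group_key_at(map_key, level))
        # With FDB's ADD mutation this is conflict-free even when many
        # docs emit the same map key in concurrent transactions.
        index[k] = index.get(k, 0) + delta

apply_delta(("2019", "04", "25"), 7)   # doc emits key [2019, 04, 25]
apply_delta(("2019", "05", "01"), 3)
assert index[(1, ("2019",))] == 10     # yearly aggregate
assert index[(2, ("2019", "04"))] == 7 # monthly aggregate
assert index[(0, ())] == 10            # group_level=0: whole view
```

The point Adam highlights falls out of the sketch: each document update
touches only its own key's group-level rows, with no reads of other
documents' map output sharing the same key.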
> Moreover, the change to the aggregate value can be applied using atomic
> operations, so we don’t need to do anything special to avoid txn
> conflicts for views where lots of documents emit the same map key.
>
> > ### Reducer functions
> >
> > FDB’s atomic operations support all the built-in reduce functions
> > CouchDB supports, so we can use those as part of our reduce function.
> > For the `_stats` reduce function, we will have to split that across
> > multiple key values. So its data model will have an extra key element
> > to record which stat it is for the _stats reducer:
> >
> > {?DATABASE, ?VIEWS, ?VIEW_SIGNATURE, ?VIEWS, <view_id>, ?REDUCE,
> > <group_level>, <group_key>, <_reduce_function_name>, <_stat_field>} ->
> > <aggregate_value>
> >
> > We do have some problems with `_approx_count_distinct` because it does
> > not support removing keys from the filter. So we have three options:
> >
> > 1. We can ignore key removal entirely in the filter, since this is
> > just an estimate
> > 2. Implement a real COUNT DISTINCT function, which we can do because
> > we’re not trying to merge results from different local shards
> > 3. Don’t support it going forward
>
> There’s actually a fourth option here, which is to maintain
> _approx_count_distinct using the future machinery for custom reduce
> functions, generating a new filter from the raw map output at the lowest
> level when keys are removed. I don’t have an informed opinion yet about
> that approach.
>
> > ### Group_level=0
> >
> > One trickier situation is if a user does a group_level=0 query with a
> > key range; this would require us to do some client-level aggregation.
> > We would have to get the aggregate values for a `group_level=1` for
> > the supplied key range and then aggregate those values together.
>
> This isn’t unique to group_level=0, it could happen for any
> group_level!=exact. It can also require descending more than one
> group_level depending on the precision of the supplied key range; e.g.
> for a date-based key like [YYYY, MM, DD]: someone could ask for yearly
> aggregations with group_level=1 but supply a startkey and endkey with
> specific dates rather than months.
>
> Good start! Cheers, Adam
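[Editor's note: a sketch of the client-side aggregation described in
this last exchange, using Adam's date example: a yearly (group_level=1)
query whose startkey/endkey name specific dates, so the stored level-1
totals cannot be used directly and finer-grained rows must be folded
instead. The data and function names are made up for illustration.]

```python
# Stored level-3 (daily) _sum aggregates: group_key -> value.
daily = {
    ("2019", "04", "25"): 7,
    ("2019", "05", "01"): 3,
    ("2019", "12", "31"): 5,
    ("2020", "01", "01"): 2,
}

def range_aggregate(rows, startkey, endkey, group_level):
    # Read finer-level rows in [startkey, endkey] and re-group them at
    # the requested (coarser) level, as the layer would have to do when
    # the key range is more precise than the requested group_level.
    out = {}
    for key, value in sorted(rows.items()):
        if startkey <= key <= endkey:
            g = key[:group_level]  # group_level=0 collapses to ()
            out[g] = out.get(g, 0) + value
    return out

# Yearly totals, but only for 2019-04-25 .. 2019-12-31:
result = range_aggregate(daily, ("2019", "04", "25"),
                         ("2019", "12", "31"), group_level=1)
assert result == {("2019",): 15}
```

The stored yearly row for 2019 would include all of 2019, so it cannot
answer this query; the sketch instead descends to the daily rows, which
is the "descending more than one group_level" case noted above.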