+1, as I’m almost always in favour of simplification
> On 21. Apr 2017, at 19:59, Robert Bradshaw <rober...@google.com.INVALID>
> wrote:
>
> Strongly in favor of removing this. If it's actually needed one can
> incorporate the key into the value for inspection in the various
> phases of the CombineFn, so it's no loss of expressiveness. It's
> perfectly reasonable to make this (rare) usecase more complicated to
> greatly simplify the common API. Also, the (very few) legitimate
> instances of the internal google equivalent could be just as well
> written as a standard CombineFn followed by a separate DoFn that then
> acts on the key.
>
> +1
>
>
> On Fri, Apr 21, 2017 at 10:48 AM, Davor Bonaci <da...@apache.org> wrote:
>> +1 -- this is a good simplification.
>>
>> On Fri, Apr 21, 2017 at 10:24 AM, Kenneth Knowles <k...@google.com.invalid>
>> wrote:
>>
>>> Hi all,
>>>
>>> I propose that we remove KeyedCombineFn before the first stable release.
>>>
>>> I don't think it adds enough value for the complexity it adds to e.g.
>>> CombineWithContext [1] and state [2, 3], and it doesn't seem to me that
>>> users really use it when we might expect. I am happy to be demonstrated
>>> wrong.
>>>
>>> It is very likely that you have never written [4, 5] or thought about
>>> KeyedCombineFn. So for context, here are excepts from signatures just to
>>> show the difference from CombineFn:
>>>
>>> CombineFn<InputT, AccumT, OutputT> {
>>> AccumT createAccumulator();
>>> AccumT addInput(AccumT accum, InputT input);
>>> AccumT mergeAccumulators(Iterable<AccumT> accums);
>>> OutputT extractOutput(AccumT accum);
>>> }
>>>
>>> KeyedCombineFn<K, InputT, AccumT, OutputT> {
>>> AccumT createAccumulator(K key);
>>> AccumT addInput(K key, AccumT accum, InputT input);
>>> AccumT mergeAccumulators(K key, Iterable<AccumT> accums);
>>> OutputT extractOutput(K key, AccumT accum);
>>> }
>>>
>>> So what are the particular reasons for this, versus a CombineFn that has
>>> KVs as its input and accumulator types?
>>>
>>> - There are some performance improvements potentially from not passing
>>> keys around, based on the assumption they are always available.
>>>
>>> - There is also a spec difference because it only has to be associative
>>> and commutative per key, cannot be applied in a global combine, and
>>> addInput is automatically key preserving.
>>>
>>> But in fact, in all of my code crawling the class is almost never used
>>> (even over the course of its history at Google) and even the few uses I
>>> found were often mistakes where the key is totally ignored, probably
>>> because a user thinks "I am doing a keyed combine so I need a keyed combine
>>> function". So the number of users actually affected is about zero.
>>>
>>> I would be curious if anyone has a compelling case for keeping
>>> KeyedCombineFn.
>>>
>>> Kenn
>>>
>>> [1]
>>> https://github.com/yafengguo/Apache-beam/blob/master/sdks/
>>> java/core/src/main/java/org/apache/beam/sdk/transforms/
>>> CombineWithContext.java
>>> [2] https://issues.apache.org/jira/browse/BEAM-1336
>>> [3] https://github.com/apache/beam/pull/2627
>>> [4]
>>> https://github.com/search?l=Java&q=KeyedCombineFn&ref=
>>> advsearch&type=Code&utf8=%E2%9C%93
>>> [5] https://www.google.com/search?q=KeyedCombineFn
>>>