I believe that foldl in Haskell (https://www.haskell.org/hoogle/?hoogle=foldl) admits an accumulator type that is separate from the element type of the data structure being "folded".

And, well, Python lets you have your way with mixing types, but this certainly works as another example:

    python -c "from functools import reduce; print(reduce(lambda ac, elem: '%s%d' % (ac, elem), [1,2,3,4,5], ''))"

Is there anything special about the AccumT->OutputT conversion that requires extractOutput() to be in the same interface as createAccumulator(), addInput() and mergeAccumulators()?

If the interface were segregated such that one interface managed the InputT->AccumT conversion and a second managed the AccumT->OutputT conversion, it seems like maybe the AccumT->OutputT conversion could even be replaced with MapElements? And then the full current "Combine" functionality could be implemented as a composition of the lower-level primitives?

I haven't dug that deeply into Combine yet, so I may be missing something obvious.

---
Wesley Tanaka
https://wtanaka.com/

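The split asked about in the message above can be sketched in plain Python (Python 3 assumed). This is only an illustration of the idea, not Beam's API: MeanAccumulating, accumulate and extract_mean are hypothetical names, and the method names merely mirror the ones mentioned in the message. The accumulating half stops at AccumT, and the AccumT->OutputT step is an ordinary function applied with map, standing in for MapElements.

```python
from functools import reduce


class MeanAccumulating:
    """Hypothetical 'InputT -> AccumT' half; it has no extract_output."""

    def create_accumulator(self):
        return (0, 0)  # (running sum, element count)

    def add_input(self, acc, elem):
        return (acc[0] + elem, acc[1] + 1)

    def merge_accumulators(self, accs):
        # Merge any number of partial accumulators into one.
        return reduce(
            lambda a, b: (a[0] + b[0], a[1] + b[1]),
            accs,
            self.create_accumulator(),
        )


def accumulate(fn, bundles):
    """Fold each bundle into its own accumulator, then merge the partials."""
    partials = []
    for bundle in bundles:
        acc = fn.create_accumulator()
        for elem in bundle:
            acc = fn.add_input(acc, elem)
        partials.append(acc)
    return fn.merge_accumulators(partials)


def extract_mean(acc):
    """Hypothetical 'AccumT -> OutputT' half, usable as a plain map function."""
    total, count = acc
    return total / count if count else float("nan")


# Two bundles stand in for data processed on two different workers.
bundles = [[1, 2], [3, 4, 5]]
accumulated = accumulate(MeanAccumulating(), bundles)

# The extraction step is just a map over the single merged accumulator.
mean = list(map(extract_mean, [accumulated]))[0]
print(accumulated, mean)
```

The sketch only shows that the two halves compose. It does not address whether a runner benefits from seeing both halves in one interface.
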
On Monday, April 17, 2017, 11:32:29 PM HST, Aljoscha Krettek <aljos...@apache.org> wrote:

Hi,

I think both fold and reduce fail to capture all the power of (what we call) combine.

Reduce requires a function of type (T, T) -> T, so the output type must be the same as the input type.

Fold takes a function (T, A) -> A, where T is the input type and A is the accumulation type. Here, the output type can be different from the input type. However, there is no way of combining these accumulators, so the operation is not distributive, i.e. we cannot hierarchically apply the operation.

Combine is the generalisation of this: we have three types, T (input), A (accumulator) and O (output), and we require a function that can merge accumulators. The operation is distributive, meaning we can efficiently execute it, and we can also have an output type that is different from the input type.

Quick FYI: in Flink the CombineFn is called AggregatingFunction and CombiningState is AggregatingState.

Best,
Aljoscha

> On 18. Apr 2017, at 04:29, Wesley Tanaka <wtan...@yahoo.com.INVALID> wrote:
>
> As I start to understand Combine.Globally, it seems that it is, in spirit,
> Beam's implementation of the "fold" higher-order function
> https://en.wikipedia.org/wiki/Fold_(higher-order_function)#Folds_in_various_languages
>
> Was there a reason the word "combine" was picked instead of either "fold" or
> "reduce"? From the wikipedia list above, it seems as though "fold" and
> "reduce" are in much more common usage, so either of those might be easier
> for newcomers to understand.
> ---
> Wesley Tanaka
> http://wtanaka.com/
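The distinction between reduce, fold and combine drawn in the quoted reply can be illustrated with a small Python sketch (Python 3 assumed). The function names below are hypothetical and only echo the CombineFn method names from this thread; nothing here is Beam or Flink API. The point of the last part is that, once accumulators can be merged, folding separate partitions and merging the partial results gives the same answer as one sequential fold.

```python
from functools import reduce

data = [1, 2, 3, 4, 5]

# reduce: a function (T, T) -> T, so the result has the element type.
total = reduce(lambda a, b: a + b, data)

# fold: the accumulator type (str) differs from the element type (int).
# Python's reduce passes (accumulator, element), the reverse of the
# (T, A) -> A argument order written in the message.
digits = reduce(lambda acc, elem: acc + str(elem), data, "")


# combine: T = int (input), A = (sum, count) (accumulator), O = float (output).
def create_accumulator():
    return (0, 0)


def add_input(acc, elem):
    return (acc[0] + elem, acc[1] + 1)


def merge_accumulators(a, b):
    return (a[0] + b[0], a[1] + b[1])


def extract_output(acc):
    return acc[0] / acc[1]


def fold_partition(partition):
    """Sequentially fold one partition into a single accumulator."""
    return reduce(add_input, partition, create_accumulator())


# One sequential pass over all of the data.
sequential = extract_output(fold_partition(data))

# Hierarchical application: fold three partitions independently, then merge.
partials = [fold_partition(p) for p in ([1, 2], [3], [4, 5])]
merged = reduce(merge_accumulators, partials, create_accumulator())
hierarchical = extract_output(merged)

print(total, digits, sequential, hierarchical)
```

A plain fold has no counterpart to merge_accumulators, which is why the partitions above could not be processed independently with fold alone.
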