The definition of foldl in Haskell is the same as the description I gave earlier:
foldl :: (a -> b -> a) -> a -> [b] -> a The function (a -> b -> a) is what I described as (T, A) -> A and it’s used to fold a list of b’s into an a (the accumulator type). You’re right that the mapping AccumT->OutputT is not important and could be delegated to a separate method. The important part of the interface is mergeAccumulators() since this makes the operation distributive: we can “fold” a bunch of Ts into As in parallel (even on different machines) and then merge them together. This is what is missing from a functional fold. Best, Aljoscha > On 18. Apr 2017, at 12:03, Wesley Tanaka <wtan...@yahoo.com.INVALID> wrote: > > I believe that foldl in Haskell https://www.haskell.org/hoogle/?hoogle=foldl > admits a separate accumulator type from the type of the data structure being > "folded" > And, well, python lets you have your way with mixing types, but this > certainly works as another example:python -c "print(reduce(lambda ac, elem: > '%s%d' % (ac,elem), [1,2,3,4,5], ''))" > Is there anything special about the AccumT->OutputT conversion that > extractOutput() needs to be in the same interface as createAccumulator(), > addInput() and mergeAccumulators()? If the interface were segregated such > that one interface managed the InputT->AccumT conversion, and the second > managed the AccumT->InputT conversion, it seems like maybe the > AccumT->OutputT conversion could even get replaced with MapElements? And > then the full current "Combine" functionality could be implemented as a > composition of the lower-level primitives? > I haven't dug that deeply into Combine yet, so I may be missing something > obvious. > --- > Wesley Tanaka > https://wtanaka.com/ > > On Monday, April 17, 2017, 11:32:29 PM HST, Aljoscha Krettek > <aljos...@apache.org> wrote:Hi, > I think both fold and reduce fail to capture all the power or (what we call) > combine. Reduce requires a function of type (T, T) -> T. It requires that the > output type be the same as the input type. Fold takes a function (T, A) -> A > where T is the input type and A is the accumulation type. Here, the output > type can be different from the input type. However, there is no way of > combining these aggregators so the operation is not distributive, i.e. we > cannot hierarchically apply the operation. > > Combine is the generalisation of this: We have three types, T (input), A > (accumulator), O (output) and we require a function that can merge > accumulators. The operation is distributive, meaning we can efficiently > execute it and we can also have an output type that is different from the > input type. > > Quick FYI: in Flink the CombineFn is called AggregatingFunction and > CombiningState is AggregatingState. > > Best, > Aljoscha >> On 18. Apr 2017, at 04:29, Wesley Tanaka <wtan...@yahoo.com.INVALID> wrote: >> >> As I start to understand Combine.Globally, it seems that it is, in spirit, >> Beam's implementation of the "fold" higher-order function >> https://en.wikipedia.org/wiki/Fold_(higher-order_function)#Folds_in_various_languages >> >> Was there a reason the word "combine" was picked instead of either "fold" or >> "reduce"? From the wikipedia list above, it seems as though "fold" and >> "reduce" are in much more common usage, so either of those might be easier >> for newcomers to understand. >> --- >> Wesley Tanaka >> http://wtanaka.com/