Max Michels created FLINK-1621:
----------------------------------

             Summary: Create a generalized combine function
                 Key: FLINK-1621
                 URL: https://issues.apache.org/jira/browse/FLINK-1621
             Project: Flink
          Issue Type: Improvement
          Components: Distributed Runtime
    Affects Versions: 0.9
            Reporter: Max Michels
             Fix For: 0.9


Flink allows combiners which accept a type {{I}} and combine the values of this 
type into type {{O}}. In Google Dataflow, combiners are more generalized. They 
accept an Input {{I}}, produce an intermediate combine value of {{T}}, and 
finally an output {{O}}. Flink's combiners are like the {{SimpleCombineFn}} in 
Google Dataflow.

Right now, we translate the {{KeyedCombineFn}} into a {{SortPartition}} 
followed by a {{MapPartition}} to emulate the Combiner's behavior. Rudimentary 
performance tests showed that this behavior causes a significant increase in 
run time compared to the proper Combine implementation.

Let's implement a more generalized Combiner to create a better mapping from 
Google Dataflow to Flink.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to