+1 for the FLIP, thank vino for your efforts.

Best,
Leesf

vino yang <yanghua1...@gmail.com> 于2019年6月12日周三 下午5:46写道:

> Hi folks,
>
> I would like to start the FLIP discussion thread about supporting local
> aggregation in Flink.
>
> In short, this feature can effectively alleviate data skew. This is the
> FLIP:
>
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-44%3A+Support+Local+Aggregation+in+Flink
>
>
> *Motivation* (copied from FLIP)
>
> Currently, keyed streams are widely used to perform aggregating operations
> (e.g., reduce, sum and window) on the elements that have the same key. When
> executed at runtime, the elements with the same key will be sent to and
> aggregated by the same task.
>
> The performance of these aggregating operations is very sensitive to the
> distribution of keys. In the cases where the distribution of keys follows a
> powerful law, the performance will be significantly downgraded. More
> unluckily, increasing the degree of parallelism does not help when a task
> is overloaded by a single key.
>
> Local aggregation is a widely-adopted method to reduce the performance
> degraded by data skew. We can decompose the aggregating operations into two
> phases. In the first phase, we aggregate the elements of the same key at
> the sender side to obtain partial results. Then at the second phase, these
> partial results are sent to receivers according to their keys and are
> combined to obtain the final result. Since the number of partial results
> received by each receiver is limited by the number of senders, the
> imbalance among receivers can be reduced. Besides, by reducing the amount
> of transferred data the performance can be further improved.
>
> *More details*:
>
> Design documentation:
>
> https://docs.google.com/document/d/1gizbbFPVtkPZPRS8AIuH8596BmgkfEa7NRwR6n3pQes/edit?usp=sharing
>
> Old discussion thread:
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-Local-Aggregation-in-Flink-td29307.html#a29308
>
> JIRA: FLINK-12786 <https://issues.apache.org/jira/browse/FLINK-12786>
>
> We are looking forwards to your feedback!
>
> Best,
> Vino
>

Reply via email to