[ https://issues.apache.org/jira/browse/FLINK-22887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368803#comment-17368803 ]
Jiayi Liao commented on FLINK-22887:
------------------------------------
[~akalashnikov] I've implemented this optimization in our internal version.
Thanks for asking; you can take this issue after we reach a consensus on the
design.
> Backlog based optimizations for RebalancePartitioner and RescalePartitioner
> ---------------------------------------------------------------------------
>
> Key: FLINK-22887
> URL: https://issues.apache.org/jira/browse/FLINK-22887
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Network
> Affects Versions: 1.13.1
> Reporter: Jiayi Liao
> Priority: Major
>
> {{RebalancePartitioner}} uses round-robin to distribute records, but this
> may not work as expected, because the environments and the processing
> capacity of the downstream tasks may differ from each other. In such cases,
> the throughput of the whole job is limited by the slowest downstream
> subtask, which is very similar to the "HASH" scenario.
> Instead, now that the credit-based mechanism has been introduced, we can
> leverage the {{backlog}} on the sender side to identify the "load" on each
> receiver, which helps us distribute the data more fairly in
> {{RebalancePartitioner}} and {{RescalePartitioner}}.
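
The idea in the description can be illustrated with a minimal sketch. This is
not the implementation mentioned in the comment and it does not use Flink's
actual partitioner API; the class name BacklogAwareSelector, the method
selectChannel, and the per-channel backlog array are all hypothetical and
chosen only for illustration. It assumes the sender can read a backlog count
for each output channel and picks the channel with the smallest backlog,
falling back to round-robin order when backlogs are equal.

```java
/**
 * Hypothetical sketch: choose the output channel with the smallest backlog.
 * Names and structure are assumptions, not Flink's real partitioner API.
 */
public class BacklogAwareSelector {

    // Index of the channel chosen by the previous call; -1 before the first call.
    private int lastChannel = -1;

    /**
     * @param backlogs assumed per-channel backlog counts on the sender side
     * @return index of the channel that should receive the next record
     */
    public int selectChannel(int[] backlogs) {
        if (backlogs.length == 0) {
            throw new IllegalArgumentException("at least one channel is required");
        }
        int n = backlogs.length;
        int best = -1;
        // Scan channels starting right after the last chosen one, so that
        // ties are broken in round-robin order.
        for (int i = 1; i <= n; i++) {
            int candidate = (lastChannel + i) % n;
            if (best == -1 || backlogs[candidate] < backlogs[best]) {
                best = candidate;
            }
        }
        lastChannel = best;
        return best;
    }
}
```

With equal backlogs on every channel, this sketch behaves like plain
round-robin; a channel whose backlog grows (a slow receiver) is skipped until
it drains.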
--
This message was sent by Atlassian Jira
(v8.3.4#803005)