[jira] [Commented] (FLINK-22887) Backlog based optimizations for RebalancePartitioner and RescalePartitioner

Piotr Nowojski (Jira) Tue, 22 Jun 2021 00:27:05 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-22887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17367064#comment-17367064
 ]


Piotr Nowojski commented on FLINK-22887:
----------------------------------------

I think the question which metric to use is a secondary issue. The main 
question/concern is how to update it? Even with unsafe access, checking all 
sub-partitions per every record would be too high overhead for high throughput 
cases. It would still be useful, but it's use-fullness would be limited. Hence 
I would suggest first to focus how can we cheaply and lazily update/maintain 
some kind of form of best effort prioritised queue of channels.

I'm not sure if using "best effort" unsafe access will be working good enough? 
It would require quite a bit of testing. But definitely that would reduce the 
overhead of picking channel/subpartition to write.

[~akalashnikov] and [~wind_ljy] can you write down the gists of your 
ideas/proposals how to do it?

> Backlog based optimizations for RebalancePartitioner and RescalePartitioner
> ---------------------------------------------------------------------------
>
>                 Key: FLINK-22887
>                 URL: https://issues.apache.org/jira/browse/FLINK-22887
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>    Affects Versions: 1.13.1
>            Reporter: Jiayi Liao
>            Priority: Major
>
> {\{RebalancePartitioner}} uses round-robin to distribute the records but this 
> may not work as expected, because the environments and the processing ability 
> of the downstream tasks may differ from each other. In such cases, the 
> throughput of the whole job will be limited by the slowest downstream 
> subtask, which is very similar with the "HASH" scenario.
> Instead, after the credit-based mechanism is introduced, we can leverage the 
> {{backlog}} on the sender side to identify the "load" on each receiver side, 
> which help us distribute the data more fairly in {{RebalancePartitioner}} and 
> {{RescalePartitioner}}. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (FLINK-22887) Backlog based optimizations for RebalancePartitioner and RescalePartitioner

Reply via email to