[ 
https://issues.apache.org/jira/browse/FLINK-31655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17736668#comment-17736668
 ] 

Rui Fan commented on FLINK-31655:
---------------------------------

Hi [~tartarus], thanks for the thorough testing.

From the results, I think maxTraverseSize is a good choice! The *Time 
consumption* won't increase when the subpartition number is high.

I still have some questions:
 * What's the difference between solution 1 (Global Optimal) and optimized 
solution 1 (Most of the best)?
 * When maxTraverseSize > the subpartition number, the effective 
maxTraverseSize should be capped at the subpartition number. For example, with 
maxTraverseSize = 20 and a subpartition number of 10, it should be 10.
 * Do we need to set maxTraverseSize through the API? A configuration option 
should be enough for most cases.
 * How would a SQL job use adaptiveRebalance? The plain rebalance should 
remain the default.
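
To make the second question concrete, here is a minimal sketch of what I have 
in mind. It is not the code from the patch: the class AdaptiveChannelSketch, 
the method names, and the use of a per-channel backlog as the load signal are 
all assumptions for illustration, not Flink APIs.

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical sketch; not Flink's actual partitioner API. */
final class AdaptiveChannelSketch {
    private final int numberOfSubpartitions;
    private final int traverseSize;
    private int nextChannel = 0;

    AdaptiveChannelSketch(int numberOfSubpartitions, int maxTraverseSize) {
        if (numberOfSubpartitions < 1 || maxTraverseSize < 1) {
            throw new IllegalArgumentException("both arguments must be >= 1");
        }
        this.numberOfSubpartitions = numberOfSubpartitions;
        this.traverseSize = effectiveTraverseSize(maxTraverseSize, numberOfSubpartitions);
    }

    /** Caps the configured traverse size at the number of subpartitions. */
    static int effectiveTraverseSize(int maxTraverseSize, int numberOfSubpartitions) {
        return Math.min(maxTraverseSize, numberOfSubpartitions);
    }

    /**
     * Inspects at most traverseSize channels, starting at the round-robin
     * position, and returns the one with the smallest backlog.
     * The backlog function (channel index -> queued buffers) is an assumed load signal.
     */
    int selectChannel(IntUnaryOperator backlog) {
        int best = nextChannel;
        int bestBacklog = backlog.applyAsInt(best);
        for (int i = 1; i < traverseSize; i++) {
            int candidate = (nextChannel + i) % numberOfSubpartitions;
            int candidateBacklog = backlog.applyAsInt(candidate);
            if (candidateBacklog < bestBacklog) {
                best = candidate;
                bestBacklog = candidateBacklog;
            }
        }
        // With equal backlogs this degenerates to plain round-robin.
        nextChannel = (best + 1) % numberOfSubpartitions;
        return best;
    }
}
```

With maxTraverseSize = 20 and 10 subpartitions, the sketch inspects 10 
channels per record, never more. With a traverse size of 1 it behaves like the 
existing round-robin rebalance.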

Anyway, it's time to start a FLIP, so that anyone can discuss 
adaptiveRebalance on the mailing list~

> Adaptive Channel selection for partitioner
> ------------------------------------------
>
>                 Key: FLINK-31655
>                 URL: https://issues.apache.org/jira/browse/FLINK-31655
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Task
>            Reporter: tartarus
>            Assignee: tartarus
>            Priority: Major
>
> In Flink, if the upstream and downstream operator parallelism is not the 
> same, then by default the RebalancePartitioner will be used to select the 
> target channel.
> In our company, users often use Flink to access Redis, HBase, or other RPC 
> services. If some of the operators are slow to return requests (for external 
> service reasons), the job easily becomes backpressured, because 
> Rebalance/Rescale select channels in round-robin order.
> Because the Rebalance/Rescale policy does not care which downstream subtask 
> the data is sent to, we expect Rebalance/Rescale to take the processing 
> power of the downstream subtasks into account when choosing a channel.
> Sending more data to the idle subtasks ensures the best possible throughput 
> of the job!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
