[ 
https://issues.apache.org/jira/browse/FLINK-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao updated FLINK-25774:
--------------------------------
    Description: Currently, for sort-shuffle, the maximum number of buffers can 
be used per result partition is Integer.MAX_VALUE. However, if too many buffers 
are taken by one result partition, other result partitions and input gates may 
spend too much time waiting for buffers which can influence performance. This 
ticket aims to restrict the maximum number of buffers can be used per result 
partition and the selected value is an empirical one based on the TPC-DS test 
results.  (was: Currently, for blocking shuffle, the maximum number of buffers 
can be used per result partition is Integer.MAX_VALUE. For hash-shuffle, the 
maximum number of buffers to be used is (numSubpartition + 1), because the 
hash-shuffle implementation always flush the previous buffer after a new buffer 
is added, so setting the maximum number of buffers can be used to 
Integer.MAX_VALUE is meaningless. For sort-shuffle, if too many buffers are 
taken by one result partition, other result partitions and input gates may 
spend too much time waiting for buffers which can influence performance. This 
ticket aims to restrict the maximum number of buffers can be used per result 
partition and the selected value is an empirical one based on the TPC-DS test 
results.)

> Restrict the maximum number of buffers can be used per result partition for 
> sort-shuffle
> ----------------------------------------------------------------------------------------
>
>                 Key: FLINK-25774
>                 URL: https://issues.apache.org/jira/browse/FLINK-25774
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Network
>            Reporter: Yingjie Cao
>            Priority: Major
>             Fix For: 1.15.0
>
>
> Currently, for sort-shuffle, the maximum number of buffers can be used per 
> result partition is Integer.MAX_VALUE. However, if too many buffers are taken 
> by one result partition, other result partitions and input gates may spend 
> too much time waiting for buffers which can influence performance. This 
> ticket aims to restrict the maximum number of buffers can be used per result 
> partition and the selected value is an empirical one based on the TPC-DS test 
> results.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to