[
https://issues.apache.org/jira/browse/FLINK-28512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yingjie Cao updated FLINK-28512:
--------------------------------
Description: Currently, SortMergeResultPartition chooses between
HashBasedDataBuffer and SortBasedDataBuffer based on the number of required
buffers per result partition, which is set by
'taskmanager.network.sort-shuffle.min-buffers'. If the configured value is
large enough, HashBasedDataBuffer is used; otherwise, SortBasedDataBuffer
is used. HashBasedDataBuffer usually performs better. However, this value
is hard to tune: a user who increases it for better performance can easily
hit the 'Insufficient number of network buffers' error. This patch improves
the situation by choosing between HashBasedDataBuffer and SortBasedDataBuffer
dynamically, based on the number of network buffers that can be allocated.
More specifically, if there are enough buffers at runtime, HashBasedDataBuffer
is used; otherwise, SortBasedDataBuffer is used. To achieve better
performance, the user only needs to increase the total amount of network
memory per task manager. (was: Currently, the )
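
The selection described above can be illustrated with a minimal sketch.
This is not the code from the patch: the class DataBufferSelector, the enum
BufferKind, and the method select are hypothetical names, and the threshold
(at least one network buffer per subpartition for the hash-based buffer) is
an assumption made only for illustration.

```java
/** Hypothetical sketch of runtime buffer selection; not Flink's actual implementation. */
final class DataBufferSelector {

    /** The two data buffer strategies named in the issue description. */
    enum BufferKind { HASH_BASED, SORT_BASED }

    private DataBufferSelector() {}

    /**
     * Chooses a strategy from the number of network buffers that could
     * actually be allocated at runtime, rather than from a fixed config value.
     *
     * Assumption (illustrative only): the hash-based buffer needs at least
     * one network buffer per subpartition; otherwise fall back to sorting.
     */
    static BufferKind select(int allocatedBuffers, int numSubpartitions) {
        if (allocatedBuffers < 0 || numSubpartitions <= 0) {
            throw new IllegalArgumentException(
                    "allocatedBuffers must be >= 0 and numSubpartitions must be > 0");
        }
        return allocatedBuffers >= numSubpartitions
                ? BufferKind.HASH_BASED
                : BufferKind.SORT_BASED;
    }
}
```

Under these assumptions, select(512, 100) yields HASH_BASED and
select(64, 100) yields SORT_BASED. The job never fails because of the
choice; it only falls back to the slower strategy when buffers are scarce.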
> Select HashBasedDataBuffer and SortBasedDataBuffer dynamically based on the
> number of network buffers can be allocated for SortMergeResultPartition
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-28512
> URL: https://issues.apache.org/jira/browse/FLINK-28512
> Project: Flink
> Issue Type: Sub-task
> Reporter: Yingjie Cao
> Priority: Major
> Fix For: 1.16.0
>