[
https://issues.apache.org/jira/browse/FLINK-28561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yingjie Cao updated FLINK-28561:
--------------------------------
Fix Version/s: (was: 1.17.0)
> Merge subpartition shuffle data read request for better sequential IO
> ---------------------------------------------------------------------
>
> Key: FLINK-28561
> URL: https://issues.apache.org/jira/browse/FLINK-28561
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Network
> Reporter: Yingjie Cao
> Priority: Major
>
> Currently, the shuffle data of each subpartition for blocking shuffle is read
> separately. To achieve better performance and reduce IOPS, we can merge
> consecutive data requests for the same file and serve them in one IO
> request. More specifically:
> 1) If multiple data requests are reading the same data, for example
> broadcast data, the reader will read the data only once and send the same
> piece of data to multiple downstream consumers.
> 2) If multiple data requests are reading consecutive data in one file, we
> will merge those requests into one large request and read a larger chunk
> of data sequentially, which is good for file IO performance.
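>
> The two cases above can be sketched roughly as follows. This is only an
> illustration, not Flink's actual implementation: the class
> ReadRequestMerger, the Region type and the merge method are hypothetical
> names, and the sketch assumes each read request can be described as a byte
> range (offset, length) within a single shuffle data file. Identical ranges
> (case 1) collapse into one read, and ranges that touch or overlap (case 2)
> are combined into one larger sequential read.
>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of read request merging; these are not Flink classes. */
public class ReadRequestMerger {

    /** A byte range [offset, offset + length) in one shuffle data file (assumed model). */
    public static final class Region {
        public final long offset;
        public final long length;

        public Region(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }

        /** Exclusive end position of this range. */
        public long end() {
            return offset + length;
        }

        @Override
        public String toString() {
            return "[" + offset + ", " + end() + ")";
        }
    }

    /**
     * Merges requests for the same file. Duplicate ranges become one read,
     * and ranges that are adjacent or overlapping become one larger read.
     */
    public static List<Region> merge(List<Region> requests) {
        // Sort a copy by offset so that mergeable ranges end up next to each other.
        List<Region> sorted = new ArrayList<>(requests);
        sorted.sort(Comparator.comparingLong((Region r) -> r.offset));

        List<Region> merged = new ArrayList<>();
        for (Region r : sorted) {
            if (!merged.isEmpty()) {
                Region last = merged.get(merged.size() - 1);
                // Starts inside or exactly at the end of the previous range: extend it.
                if (r.offset <= last.end()) {
                    long end = Math.max(last.end(), r.end());
                    merged.set(merged.size() - 1, new Region(last.offset, end - last.offset));
                    continue;
                }
            }
            // A gap precedes this range, so it needs its own IO request.
            merged.add(r);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Region> requests = Arrays.asList(
                new Region(0, 100),   // e.g. broadcast data requested by consumer A
                new Region(0, 100),   // the same data requested by consumer B
                new Region(100, 50),  // directly follows the first range
                new Region(400, 10)); // separated by a gap
        System.out.println(merge(requests));
    }
}
```

> In this sketch a merged range would still have to be split up again and
> dispatched to the individual consumers after the read; that bookkeeping is
> left out here.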
--
This message was sent by Atlassian Jira
(v8.20.10#820010)