[ 
https://issues.apache.org/jira/browse/FLINK-28889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-28889:
-------------------------------
    Summary: Hybrid shuffle should support multiple consumers  (was: Hybrid 
shuffle writes multiple copies of broadcast data)

> Hybrid shuffle should support multiple consumers
> ------------------------------------------------
>
>                 Key: FLINK-28889
>                 URL: https://issues.apache.org/jira/browse/FLINK-28889
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>    Affects Versions: 1.16.0
>            Reporter: Weijie Guo
>            Assignee: Weijie Guo
>            Priority: Critical
>             Fix For: 1.17.0
>
>
> Hybrid shuffle does not support multiple consumers for the data of a single 
> subpartition. This causes several defects, such as the inability to support 
> partition reuse and speculative execution. In particular, it cannot support 
> the broadcast optimization: hybrid shuffle currently writes multiple copies 
> of broadcast data, which wastes memory and disk space and hurts the 
> performance of the shuffle write phase. Ideally, under the full spilling 
> strategy, any broadcast data (record or event) should be written only once 
> in memory, and likewise only once on disk.
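
The idea of "one stored copy, multiple consumers" can be illustrated with a minimal sketch. This is not Flink's actual implementation or API: the class names SharedSubpartitionData and ConsumerView, and all of their methods, are hypothetical and exist only for illustration. Each buffer is appended once, and every consumer keeps its own read position over the shared data.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not Flink's API: data is stored once and read by many consumers. */
final class SharedSubpartitionData {
    // Each buffer is appended exactly once, regardless of how many consumers exist.
    private final List<byte[]> buffers = new ArrayList<>();

    void append(byte[] buffer) {
        buffers.add(buffer);
    }

    int storedBufferCount() {
        return buffers.size();
    }

    /** Creates a consumer view with its own read position over the shared buffers. */
    ConsumerView createConsumer() {
        return new ConsumerView();
    }

    final class ConsumerView {
        // Per-consumer read offset; consumers advance independently of each other.
        private int nextIndex = 0;

        /** Returns the next buffer for this consumer, or null if it has caught up. */
        byte[] poll() {
            return nextIndex < buffers.size() ? buffers.get(nextIndex++) : null;
        }
    }
}
```

In this sketch two consumers created from the same SharedSubpartitionData both receive the same buffer instance, so a broadcast record occupies storage only once. A real implementation would additionally have to track when every consumer has read a buffer before releasing it.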



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
