Weijie Guo created FLINK-28889:
----------------------------------

             Summary: Hybrid shuffle writes multiple copies of broadcast data
                 Key: FLINK-28889
                 URL: https://issues.apache.org/jira/browse/FLINK-28889
             Project: Flink
          Issue Type: Bug
            Reporter: Weijie Guo


Hybrid shuffle writes multiple copies of broadcast data. This wastes memory 
and disk space and degrades the performance of the shuffle write phase. 
Ideally, under the full spilling strategy, any broadcast data (record or 
event) should be written only once in memory, and likewise only once on disk. 
Under the selective spilling strategy, when a broadcast edge is encountered, 
we should consider either converting it directly to a HYBRID_FULL edge or 
introducing a configuration option that decides whether to make this switch. 
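
To make the proposal concrete, here is a minimal sketch of the two ideas. It 
is not Flink code: every name in it (BroadcastBufferSketch, SharedBroadcastLog, 
effectiveType, the switchEnabled flag, and the HYBRID_SELECTIVE constant) is a 
hypothetical stand-in chosen for illustration. It contrasts keeping one copy 
of each broadcast record per subpartition with keeping a single shared copy 
plus a per-subpartition read index, and shows how a selective broadcast edge 
might be switched to HYBRID_FULL behind an option.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names are actual Flink classes.
final class BroadcastBufferSketch {

    // HYBRID_FULL comes from the issue text; HYBRID_SELECTIVE is an assumed name.
    enum EdgeType { HYBRID_FULL, HYBRID_SELECTIVE }

    // Assumed policy: a selective broadcast edge becomes HYBRID_FULL when the
    // (hypothetical) switch option is enabled; every other edge is unchanged.
    static EdgeType effectiveType(EdgeType requested, boolean isBroadcast, boolean switchEnabled) {
        if (isBroadcast && switchEnabled && requested == EdgeType.HYBRID_SELECTIVE) {
            return EdgeType.HYBRID_FULL;
        }
        return requested;
    }

    // Cost of the layout described as wasteful: one copy per subpartition.
    static long bytesWithPerSubpartitionCopies(List<byte[]> records, int numSubpartitions) {
        long total = 0;
        for (byte[] record : records) {
            total += (long) record.length * numSubpartitions;
        }
        return total;
    }

    // Shared layout: each record is stored once; a subpartition only keeps
    // the index of the next record it has not consumed yet.
    static final class SharedBroadcastLog {
        private final List<byte[]> records = new ArrayList<>();
        private final int[] nextIndex;

        SharedBroadcastLog(int numSubpartitions) {
            this.nextIndex = new int[numSubpartitions];
        }

        void append(byte[] record) {
            records.add(record);
        }

        // Returns the next unread record for this subpartition, or null if it has caught up.
        byte[] poll(int subpartition) {
            int index = nextIndex[subpartition];
            if (index >= records.size()) {
                return null;
            }
            nextIndex[subpartition] = index + 1;
            return records.get(index);
        }

        long storedBytes() {
            long total = 0;
            for (byte[] record : records) {
                total += record.length;
            }
            return total;
        }
    }

    public static void main(String[] args) {
        List<byte[]> records = Arrays.asList(new byte[100], new byte[50]);
        int numSubpartitions = 4;

        SharedBroadcastLog log = new SharedBroadcastLog(numSubpartitions);
        for (byte[] record : records) {
            log.append(record);
        }

        System.out.println("per-subpartition copies: "
                + bytesWithPerSubpartitionCopies(records, numSubpartitions) + " bytes");
        System.out.println("single shared copy: " + log.storedBytes() + " bytes");
        System.out.println("selective broadcast edge becomes: "
                + effectiveType(EdgeType.HYBRID_SELECTIVE, true, true));
    }
}
```

In this sketch the stored size of the shared layout does not depend on the 
number of subpartitions, which is the saving the issue asks for. A real 
implementation would also have to decide when a shared record can be released 
or spilled, which the sketch leaves out.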



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
