[jira] [Commented] (FLINK-33668) Decoupling Shuffle network memory and job topology

dalongliu (Jira) Tue, 28 Nov 2023 17:50:05 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-33668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17790844#comment-17790844
 ]


dalongliu commented on FLINK-33668:
-----------------------------------

Big +1. There is also a duplicate issue:
https://issues.apache.org/jira/browse/FLINK-31643

> Decoupling Shuffle network memory and job topology
> --------------------------------------------------
>
>                 Key: FLINK-33668
>                 URL: https://issues.apache.org/jira/browse/FLINK-33668
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>            Reporter: Jiang Xin
>            Priority: Major
>             Fix For: 1.19.0
>
>
> With FLINK-30469 and FLINK-31643, we have decoupled the shuffle network
> memory and the parallelism of tasks by limiting the number of buffers for 
> each InputGate and ResultPartition. However, when too many shuffle tasks are 
> running simultaneously on the same TaskManager, "Insufficient number of 
> network buffers" errors would still occur. This usually happens when Slot 
> Sharing Group is enabled or a TaskManager contains multiple slots.
> We want to make sure that the TaskManager does not encounter "Insufficient 
> number of network buffers" even if there are dozens of InputGates and 
> ResultPartitions running on the same TaskManager simultaneously.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
