Jiang Xin created FLINK-33668:
---------------------------------

             Summary: Decoupling Shuffle network memory and job topology
                 Key: FLINK-33668
                 URL: https://issues.apache.org/jira/browse/FLINK-33668
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Network
            Reporter: Jiang Xin
             Fix For: 1.19.0

With [FLINK-30469|https://issues.apache.org/jira/browse/FLINK-30469] and [FLINK-31643|https://issues.apache.org/jira/browse/FLINK-31643], we decoupled the shuffle network memory from task parallelism by limiting the number of buffers for each InputGate and ResultPartition. However, when too many shuffle tasks run simultaneously on the same TaskManager, "Insufficient number of network buffers" errors can still occur, because the total demand grows with the number of InputGates and ResultPartitions even though each one is bounded. This usually happens when slot sharing groups are used or when a TaskManager contains multiple slots.

We therefore need to make sure that the TaskManager does not encounter "Insufficient number of network buffers" errors even when dozens of InputGates and ResultPartitions are running on it simultaneously.
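
To illustrate the arithmetic behind the problem, here is a minimal sketch, not Flink code. The class `NetworkBufferBudget`, its methods, and all numbers (64 MiB of network memory, 32 KiB segments, a cap of 100 buffers per gate and per partition) are hypothetical values chosen for illustration; they are not Flink defaults or Flink APIs. It shows that a per-gate cap bounds each gate's demand but the total still scales with the number of gates and partitions on the TaskManager.

```java
public class NetworkBufferBudget {

    /** Total buffers in the pool: network memory divided by the segment (buffer) size. */
    static long availableBuffers(long networkMemoryBytes, int segmentSizeBytes) {
        return networkMemoryBytes / segmentSizeBytes;
    }

    /**
     * Total demand when every InputGate and ResultPartition claims its capped share.
     * The caps are assumptions standing in for the limits introduced by
     * FLINK-30469 and FLINK-31643.
     */
    static long requiredBuffers(int inputGates, int resultPartitions,
                                int buffersPerGate, int buffersPerPartition) {
        return (long) inputGates * buffersPerGate
             + (long) resultPartitions * buffersPerPartition;
    }

    /** True if the pool can satisfy every gate and partition at once. */
    static boolean fits(long available, long required) {
        return required <= available;
    }

    public static void main(String[] args) {
        // Hypothetical TaskManager: 64 MiB network memory, 32 KiB segments.
        long available = availableBuffers(64L * 1024 * 1024, 32 * 1024);

        // 8 concurrent tasks, each with one gate and one partition.
        long few = requiredBuffers(8, 8, 100, 100);
        // 24 concurrent tasks, e.g. several slots with slot sharing.
        long many = requiredBuffers(24, 24, 100, 100);

        System.out.println("available = " + available);                        // 2048
        System.out.println("8 tasks   = " + few + " fits=" + fits(available, few));   // 1600 fits=true
        System.out.println("24 tasks  = " + many + " fits=" + fits(available, many)); // 4800 fits=false
    }
}
```

Under these assumed numbers the same per-gate cap works for 8 concurrent tasks but not for 24, which is the situation this issue aims to eliminate.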