[ 
https://issues.apache.org/jira/browse/FLINK-33954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiang Xin updated FLINK-33954:
------------------------------
    Description: 
In some cases, the job may hang when there are not enough buffers in the local 
buffer pool. For instance, when the parallelism is 10, the HashBufferAccumulator 
is used. The size of the local buffer pool is parallelism + 1.

1. The local buffer pool size can be very small when the parallelism is small. 
So when a large record arrives that needs more buffers than the buffer pool 
has, the job hangs.
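
The condition above can be illustrated with a rough sketch. It is not Flink 
code: the class and method names are hypothetical, the 32 KiB buffer size is an 
assumption, and it ignores buffers already held by other subpartitions, which 
would only make the shortage worse. It only compares the number of buffers a 
record needs against a pool of parallelism + 1 buffers.

```java
// Hypothetical illustration only; none of these names exist in Flink.
public class LocalBufferPoolCheck {

    // Number of fixed-size buffers needed to hold one record (ceiling division).
    public static int buffersNeeded(long recordBytes, int bufferBytes) {
        return (int) ((recordBytes + bufferBytes - 1) / bufferBytes);
    }

    // Pool size as stated in this issue: parallelism + 1.
    public static int poolSize(int parallelism) {
        return parallelism + 1;
    }

    // True when a single record needs more buffers than the pool can ever provide.
    public static boolean mayHang(long recordBytes, int bufferBytes, int parallelism) {
        return buffersNeeded(recordBytes, bufferBytes) > poolSize(parallelism);
    }

    public static void main(String[] args) {
        int bufferBytes = 32 * 1024; // assumed buffer size
        int parallelism = 10;        // parallelism from the example in this issue

        long smallRecord = 100L * 1024;  // 100 KiB
        long largeRecord = 1024L * 1024; // 1 MiB

        System.out.println("pool size: " + poolSize(parallelism));
        System.out.println("small record may hang: "
                + mayHang(smallRecord, bufferBytes, parallelism));
        System.out.println("large record may hang: "
                + mayHang(largeRecord, bufferBytes, parallelism));
    }
}
```

Under these assumptions, a record fits only if it needs at most 
parallelism + 1 buffers, so the largest safe record shrinks as the parallelism 
shrinks.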

  was:The local buffer pool size can be very small when the parallelism is 
small. So when a large record comes and it needs more buffers than the buffer 
pool has, a hang would happen.


> Large record may cause the hybrid shuffle hang
> ----------------------------------------------
>
>                 Key: FLINK-33954
>                 URL: https://issues.apache.org/jira/browse/FLINK-33954
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Network
>            Reporter: Jiang Xin
>            Priority: Major
>
> In some cases, the job may hang when there are not enough buffers in the 
> local buffer pool. For instance, when the parallelism is 10, the 
> HashBufferAccumulator is used. The size of the local buffer pool is 
> parallelism + 1.
> 1. The local buffer pool size can be very small when the parallelism is 
> small. So when a large record arrives that needs more buffers than the buffer 
> pool has, the job hangs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
