[ https://issues.apache.org/jira/browse/FLINK-35073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837259#comment-17837259 ]

Julien Tournay edited comment on FLINK-35073 at 4/15/24 1:32 PM:
-----------------------------------------------------------------

Closing this issue because even though I captured a deadlock, that was not what 
was causing my job to block. Instead it looks like there's a scheduling bug 
when using the default scheduler with hybrid shuffle. 
Adaptive scheduler + hybrid shuffle works fine.


was (Author: jto):
Closing this issue because even though I captured a deadlock, that was not what 
was causing my job to block. Instead it looks like there's a scheduling bug 
when using the default scheduler with hybrid shuffle.

> Deadlock in LocalBufferPool when segments become available in the global pool
> -----------------------------------------------------------------------------
>
>                 Key: FLINK-35073
>                 URL: https://issues.apache.org/jira/browse/FLINK-35073
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Network
>    Affects Versions: 1.19.0
>            Reporter: Julien Tournay
>            Priority: Critical
>              Labels: deadlock, network
>         Attachments: deadlock_threaddump_extract.json
>
>
> The reported issue is easy to reproduce in batch mode using hybrid shuffle 
> and a somewhat large total number of slots in the cluster. Even a low 
> parallelism (60) still triggers it.
> Note: I attached a partial thread dump to illustrate the issue.
> When `NetworkBufferPool.internalRecycleMemorySegments` is called, the 
> following chain of calls may happen:
> {code:java}
> NetworkBufferPool.internalRecycleMemorySegments -> 
> LocalBufferPool.onGlobalPoolAvailable ->
> LocalBufferPool.checkAndUpdateAvailability -> 
> LocalBufferPool.requestMemorySegmentFromGlobalWhenAvailable{code}
> Several instances of `LocalBufferPool` will be notified as soon as a segment 
> becomes available in the global pool because of the implementation of 
> `requestMemorySegmentFromGlobalWhenAvailable`:
> {code:java}
> networkBufferPool.getAvailableFuture().thenRun(this::onGlobalPoolAvailable);{code}
> The issue arises when two or more threads go through this specific code path 
> at the same time.
> Each thread will notify the same instances of `LocalBufferPool` by invoking 
> `onGlobalPoolAvailable` on each of them, and in the process tries to acquire 
> a series of locks.
> As an example, assume there are 6 `LocalBufferPool` instances A, B, C, D, E 
> and F:
> Thread 1 calls `onGlobalPoolAvailable` on A, B, C and D. It locks A, B and 
> C, then tries to lock D.
> Thread 2 calls `onGlobalPoolAvailable` on D, E, F and A. It locks D, E and 
> F, then tries to lock A.
> ==> Threads 1 and 2 are mutually blocked.
> The attached thread dump captures this issue:
> The first thread locked java.util.ArrayDeque@41d6a3bb and is blocked on 
> java.util.ArrayDeque@e2b5e34.
> The second thread locked java.util.ArrayDeque@e2b5e34 and is blocked on 
> java.util.ArrayDeque@41d6a3bb.
>  
> Note that I'm not familiar enough with Flink internals to know what the fix 
> should be, but I'm happy to submit a PR if someone tells me what the correct 
> behaviour should be.
>  
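The lock-ordering cycle described in the issue does not depend on Flink internals. As a minimal standalone sketch (hypothetical class and variable names, with `ReentrantLock` standing in for the pools' internal monitors), two threads that each hold one lock and request the other's reproduce the mutual blocking; a non-blocking `tryLock` is used here only so the demo terminates instead of hanging:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Standalone sketch of the reported lock-ordering cycle.
// Names are hypothetical; ReentrantLock stands in for the pools' monitors.
public class LockOrderDemo {

    // Returns how many threads found their second lock already taken.
    static int runDemo() {
        ReentrantLock poolA = new ReentrantLock(); // stands in for LocalBufferPool A
        ReentrantLock poolD = new ReentrantLock(); // stands in for LocalBufferPool D
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();

        // Thread 1 mirrors "locks A, B, C and tries to lock D".
        Thread t1 = new Thread(() -> acquire(poolA, poolD, bothHoldFirst, blocked));
        // Thread 2 mirrors "locks D, E, F and tries to lock A".
        Thread t2 = new Thread(() -> acquire(poolD, poolA, bothHoldFirst, blocked));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return blocked.get();
    }

    static void acquire(ReentrantLock first, ReentrantLock second,
                        CountDownLatch bothHoldFirst, AtomicInteger blocked) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await(); // both threads now hold their first lock
            // Non-blocking tryLock so the demo terminates; the real code path
            // blocks inside a synchronized section, so the cycle never resolves.
            if (second.tryLock()) {
                second.unlock();
            } else {
                blocked.incrementAndGet();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo() + " thread(s) blocked on their second lock");
    }
}
```

With plain `synchronized` sections, both threads would wait forever, matching the two ArrayDeque monitors in the attached thread dump; enforcing one consistent acquisition order across pools, or notifying pools outside any pool lock, are common ways to break such a cycle.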



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
