[ 
https://issues.apache.org/jira/browse/FLINK-29396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610038#comment-17610038
 ] 

Chesnay Schepler commented on FLINK-29396:
------------------------------------------

The question is when the RM can send that notification.

The disconnect calls are weirdly cyclic and have no notion of who initiated 
the disconnect.
The JM calls {{ResourceManagerGateway#disconnectJobManager}} which ends with 
the RM calling {{JobMasterGateway#disconnectResourceManager}}.
But this sequence can also happen in reverse, so who waits for whom?

Maybe we should rework these methods to actually be an {{ask}}.
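
A rough sketch of what an ask-style disconnect could look like. Everything 
here (the class, the {{Ack}} type, the toy RM) is a hypothetical stand-in and 
not Flink's actual gateway API. The only point is that the caller gets a 
future back and waits on it, so the initiator is unambiguous and the callee 
replies instead of calling the reverse disconnect.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-ins for illustration; NOT Flink's actual gateway interfaces.
final class DisconnectAskSketch {

    /** Acknowledgement returned once the peer has finished its cleanup. */
    enum Ack { DONE }

    /** Assumed ask-style view of the RM: the caller receives a future. */
    interface ResourceManagerSide {
        CompletableFuture<Ack> disconnectJobManager(String jobId, String cause);
    }

    /** Toy RM that drops its bookkeeping for the job and then acknowledges. */
    static final class ToyResourceManager implements ResourceManagerSide {
        final Set<String> registeredJobs = new HashSet<>();

        @Override
        public CompletableFuture<Ack> disconnectJobManager(String jobId, String cause) {
            registeredJobs.remove(jobId);
            // Reply to the initiator; no reverse disconnect call is made.
            return CompletableFuture.completedFuture(Ack.DONE);
        }
    }

    /** Toy JM shutdown: completes only after the RM has acknowledged. */
    static CompletableFuture<String> stopJobMaster(ResourceManagerSide rm, String jobId) {
        return rm.disconnectJobManager(jobId, "job finished")
                .thenApply(ack -> "stopped " + jobId);
    }
}
```

With this shape the JM-initiated and the RM-initiated disconnect would be two 
separate asks, each with one clear waiter.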

The shutdown sequence is actually super annoying in general because even an 
orderly shutdown by the mini cluster invariably leads to someone getting an 
exception because the other party has already shut down.

> Race condition in JobMaster shutdown can leak resource requirements
> -------------------------------------------------------------------
>
>                 Key: FLINK-29396
>                 URL: https://issues.apache.org/jira/browse/FLINK-29396
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.15.0
>            Reporter: Chesnay Schepler
>            Priority: Blocker
>
> When a JobMaster is stopped it
> a) sends a message to the RM informing it of the final job status
> b) removes itself as the leader.
> Once the JM loses leadership the RM is also informed about that.
> With that we have 2 messages being sent to the RM at about the same time.
> If the shutdown notification arrives first (and the job is in a terminal 
> state) we wipe the resource requirements, and the leader loss notification 
> is effectively ignored.
> If the leader loss notification arrives first we keep the resource 
> requirements, assuming that another JM will pick the job up later on, and the 
> shutdown notification will be ignored.
> This can cause a session cluster to essentially do nothing until the job 
> timeout is triggered due to no leader being present (default 5 minutes).
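
The two orderings described in the issue can be illustrated with a small toy 
model of the RM-side bookkeeping. This is not Flink code: the class and method 
names are made up, and the reason the late shutdown notification is ignored 
(the sender is no longer the known leader) is an assumption made for the 
sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the RM bookkeeping described in the issue; not Flink code.
final class ShutdownRaceSketch {
    private final Map<String, Integer> requirements = new HashMap<>();
    private final Map<String, Boolean> hasLeader = new HashMap<>();

    void registerJob(String jobId, int slots) {
        requirements.put(jobId, slots);
        hasLeader.put(jobId, true);
    }

    /** Shutdown notification: honoured only while the JM is still the known leader (assumption). */
    void onJobShutdown(String jobId, boolean terminal) {
        if (Boolean.TRUE.equals(hasLeader.get(jobId)) && terminal) {
            requirements.remove(jobId);
            hasLeader.remove(jobId);
        }
    }

    /** Leader loss: requirements are kept in case another JM picks the job up. */
    void onLeaderLost(String jobId) {
        if (hasLeader.containsKey(jobId)) {
            hasLeader.put(jobId, false);
        }
    }

    boolean holdsRequirements(String jobId) {
        return requirements.containsKey(jobId);
    }
}
```

Calling {{onJobShutdown}} before {{onLeaderLost}} leaves no requirements 
behind; calling them in the opposite order leaves the requirements in place, 
which is the leak.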



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
