reswqa commented on PR #21137:
URL: https://github.com/apache/flink/pull/21137#issuecomment-1292379575

   > Thanks @reswqa for this PR. I'm wondering how executing the leadership granting/revocation call from within another thread would help fix the issue. The locks might still be acquired concurrently in opposite orders, leading to the deadlock situation.
   > 
   > The use case that was described in [FLINK-29234](https://issues.apache.org/jira/browse/FLINK-29234) essentially happens because the Dispatcher is stopped (which, as a consequence, would stop the `JobMasterServiceLeadershipRunner`) while the `JobMasterServiceLeadershipRunner` is granted leadership, causing the locks to be acquired in the opposite order.
   > 
   > I think the problem is that we're still trying to acquire the lock in [JobMasterServiceLeadershipRunner#runIfStateRunning:453](https://github.com/apache/flink/blob/bfe4f9cc3d67d37a2258ab4226d70b6a7d24f22c/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMasterServiceLeadershipRunner.java#L453) even though the `JobMasterServiceLeadershipRunner` has already switched to the `STOPPED` state. I'm wondering whether we could make `JobMasterServiceLeadershipRunner#state` volatile and check that the instance is in the `RUNNING` state outside of the lock. But this wouldn't solve the issue entirely because there's still a slight chance that the state changes after the state check is processed but before entering the lock... Essentially, we would need another lock guarding the state field. 🤔
   
   Thanks @XComp for your feedback. I agree that the root cause of the deadlock is that the `JobMasterServiceLeadershipRunner` is granted leadership while it is being closed.
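   To make the race at the end of your quote concrete: even with a volatile `state` checked outside the lock, there is still a window between the check and entering the lock. A minimal sketch (hypothetical class and names, not actual Flink code):

```java
// Sketch of the check-then-lock window: the fast-path check on the
// volatile field can pass just before another thread stops the runner.
class VolatileCheckSketch {
    enum State { RUNNING, STOPPED }

    private volatile State state = State.RUNNING;
    private final Object lock = new Object();

    void grantLeadership(Runnable grantLogic) {
        if (state != State.RUNNING) {
            return; // fast path: bail out without touching the lock
        }
        // <-- closeAsync() may flip state to STOPPED right here
        synchronized (lock) {
            grantLogic.run(); // may now run against an already-stopped runner
        }
    }

    void closeAsync() {
        synchronized (lock) {
            state = State.STOPPED;
        }
    }
}
```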
   At first, I tried moving the `leaderContender.grantLeadership(issuedLeaderSessionID)` call in the `onGrantLeadership` method outside the lock of `DefaultLeaderElectionService` to avoid the nested locking. But I then found that there is another layer of lock nesting: when executing `isLeader` and other callbacks, the Curator framework acquires a lock of its own, so the deadlock can still be reproduced.
   So I began to consider whether there was a way to completely avoid nested locks when calling `grantLeadership` on the `JobMasterServiceLeadershipRunner`. Inspired by `ResourceManagerServiceImpl` (which executes leader events in a separate single-thread executor), I thought the same approach could be used here. When we hand a piece of code off to another thread for execution, the locks held by the current thread do not follow it, which gives the `grantLeadership` logic in `JobMasterServiceLeadershipRunner` a chance to run free of whatever locks the caller may hold (whether from the `DefaultLeaderElectionService` or from Curator).
   Let's denote the lock in `DefaultLeaderElectionService` as A and the lock in `JobMasterServiceLeadershipRunner` as B.
   FLINK-29234's case is as follows (a minimal sketch reproducing it comes right after this list):
   - Thread t1 invokes `JobMasterServiceLeadershipRunner#closeAsync`, holds lock B, and then attempts to call `DefaultLeaderElectionService#stop`, which requires lock A.
   - Thread t2 invokes `DefaultLeaderElectionService#onGrantLeadership`, holds lock A, and then tries to call `JobMasterServiceLeadershipRunner#grantLeadership`, which requires lock B.
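   Here is a minimal standalone sketch of this scenario (hypothetical names, not actual Flink code); running it deadlocks reliably because each thread waits for the lock the other one holds:

```java
// Opposite-order lock acquisition, as in FLINK-29234.
public class DeadlockSketch {
    private static final Object lockA = new Object(); // DefaultLeaderElectionService's lock
    private static final Object lockB = new Object(); // JobMasterServiceLeadershipRunner's lock

    public static void main(String[] args) {
        Thread t1 = new Thread(() -> {
            synchronized (lockB) {        // closeAsync() takes B first...
                sleepQuietly();           // widen the race window
                synchronized (lockA) { }  // ...then stop() needs A -> blocks
            }
        });
        Thread t2 = new Thread(() -> {
            synchronized (lockA) {        // onGrantLeadership() takes A first...
                sleepQuietly();
                synchronized (lockB) { }  // ...then grantLeadership() needs B -> blocks
            }
        });
        t1.start();
        t2.start();
    }

    private static void sleepQuietly() {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```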
   After introducing an executor to process `grantLeadership`, t2 still holds lock A when `onGrantLeadership` is called, but it only hands the work off to the executor and then releases lock A, without ever attempting to acquire lock B. It is the executor thread's responsibility to obtain B and do the real work.
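   Roughly like the following sketch (hypothetical names, not the actual diff of this PR):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// The caller releases lock A before lock B is ever touched; the executor
// thread acquires B on its own, so the A -> B nesting disappears.
class HandoffSketch {
    private final Object lockA = new Object(); // election service lock
    private final Object lockB = new Object(); // leadership runner lock
    private final ExecutorService leadershipExecutor =
            Executors.newSingleThreadExecutor();

    void onGrantLeadership() {
        synchronized (lockA) {
            // while holding A, only enqueue the work...
            leadershipExecutor.execute(this::grantLeadership);
        } // ...A is released here; this thread never asks for B
    }

    private void grantLeadership() {
        synchronized (lockB) {
            // the real grant logic runs on the executor thread, free of A
        }
    }

    public static void main(String[] args) {
        HandoffSketch sketch = new HandoffSketch();
        sketch.onGrantLeadership(); // returns promptly; grant runs asynchronously
        sketch.leadershipExecutor.shutdown();
    }
}
```

Since the executor is single-threaded, leadership events would still be processed one at a time and in the order they were received.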
   If I missed something, please let me know. I'm happy to hear your opinion, as I'm not particularly familiar with this part of the code.
   