reswqa commented on PR #21137: URL: https://github.com/apache/flink/pull/21137#issuecomment-1292379575
> Thanks @reswqa for this PR. I'm wondering how executing the leadership granting/revocation call from within another thread would help fixing the issue. The locks might still be acquired concurrently in opposite orders, leading to the deadlock situation.
>
> The use case that was described in [FLINK-29234](https://issues.apache.org/jira/browse/FLINK-29234) essentially happens because the Dispatcher is stopped (which, as a consequence, would stop the `JobMasterServiceLeadershipRunner`) while the `JobMasterServiceLeadershipRunner` is granted leadership, causing the locks to be acquired in the opposite order.
>
> I think the problem is that we're still trying to acquire the lock in [JobMasterServiceLeadershipRunner#runIfStateRunning:453](https://github.com/apache/flink/blob/bfe4f9cc3d67d37a2258ab4226d70b6a7d24f22c/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMasterServiceLeadershipRunner.java#L453) even though the `JobMasterServiceLeadershipRunner` has already switched to the `STOPPED` state. I'm wondering whether we could make `JobMasterServiceLeadershipRunner#state` volatile and check that the instance is in the `RUNNING` state outside of the lock. But this wouldn't solve the issue entirely, because there's still a slight chance that the state changes after the state check is processed but before entering the lock... Essentially, we would need another lock guarding the state field. 🤔

Thanks @XComp for your feedback. I agree that the cause of the deadlock is that the `JobMasterServiceLeadershipRunner` was granted leadership while it was being closed.

At first, I tried to move the `leaderContender.grantLeadership(issuedLeaderSessionID)` call in the `onGrantLeadership` method outside the lock of `DefaultLeaderElectionService` to avoid nested locks. But I then found that there is another layer of lock nesting: when executing `isLeader` and similar methods, the Curator framework takes a lock of its own, and the deadlock could still be reproduced. So I started to consider whether there was a way to completely avoid nested locks when calling `grantLeadership` on the `JobMasterServiceLeadershipRunner`.

Inspired by `ResourceManagerServiceImpl` (which executes the leader event in a separate single-thread executor), I thought the same approach could be used here to avoid the nested-lock structure. When we hand a piece of code over to another thread for execution, the locks held by the current thread are not carried over to that thread, which gives the `grantLeadership` logic in `JobMasterServiceLeadershipRunner` a chance to run free of any locks that may be imposed by the caller (whether from the leader election service or from Curator).

Let's denote the lock in `DefaultLeaderElectionService` as A and the lock in `JobMasterServiceLeadershipRunner` as B. The case in FLINK-29234 is as follows:

- Thread t1 invokes `JobMasterServiceLeadershipRunner#closeAsync`, holds lock B, and then attempts to call `DefaultLeaderElectionService#stop`, which requires lock A.
- Thread t2 invokes `DefaultLeaderElectionService#onGrantLeadership`, holds lock A, and then tries to call `JobMasterServiceLeadershipRunner#grantLeadership`, which requires lock B.

After introducing an executor to process `grantLeadership`, the thread calling `onGrantLeadership` holds lock A, hands the code over to the executor, and then releases lock A without ever attempting to acquire lock B. It is the executor thread's responsibility to acquire B and do the real work, as the sketch below shows.
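To make the lock ordering concrete, here is a minimal, self-contained Java sketch of the idea. It is not the actual Flink code: `lockA`, `lockB`, and the method bodies are placeholder stand-ins for the locks and logic in `DefaultLeaderElectionService` and `JobMasterServiceLeadershipRunner`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Simplified stand-in for the two-lock scenario; not the real Flink classes.
public class LeadershipDeadlockSketch {

    private final Object lockA = new Object(); // plays the role of DefaultLeaderElectionService's lock
    private final Object lockB = new Object(); // plays the role of JobMasterServiceLeadershipRunner's lock

    // Single-thread executor used by the fixed variant, mirroring ResourceManagerServiceImpl.
    private final ExecutorService leadershipExecutor = Executors.newSingleThreadExecutor();

    // Before the fix: t2 acquires B while still holding A.
    void onGrantLeadershipDeadlockProne() {
        synchronized (lockA) {         // t2 holds A ...
            synchronized (lockB) {     // ... and blocks here if t1 already holds B
                grantLeadershipInternal();
            }
        }
    }

    // closeAsync acquires the locks in the opposite order: A while holding B.
    void closeAsync() {
        synchronized (lockB) {         // t1 holds B ...
            synchronized (lockA) {     // ... and blocks here if t2 already holds A
                stopElectionServiceInternal();
            }
        }
    }

    // After the fix: the work that needs B is handed off to the executor, so the
    // calling thread releases A without ever touching B. Locks are not inherited
    // by the executor thread, so the A -> B nesting disappears.
    void onGrantLeadershipFixed() {
        synchronized (lockA) {
            leadershipExecutor.execute(() -> {
                synchronized (lockB) { // acquired on the executor thread; A is no longer held
                    grantLeadershipInternal();
                }
            });
        } // A is released here; B is acquired later and independently
    }

    private void grantLeadershipInternal() { /* start the JobMaster, etc. */ }

    private void stopElectionServiceInternal() { /* revoke leadership, etc. */ }
}
```

A single-thread executor (rather than a pool) also keeps the leader events processed in the order they were submitted, which matches how `ResourceManagerServiceImpl` handles them.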
Please let me know if I missed something. I'm open to your opinions, because I'm not particularly familiar with this part of the code.
