Weijie Guo created FLINK-30195:
----------------------------------

             Summary: LeaderElectionService should avoid potential deadlock with leaderContender
                 Key: FLINK-30195
                 URL: https://issues.apache.org/jira/browse/FLINK-30195
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Coordination
            Reporter: Weijie Guo

As discussed in https://github.com/apache/flink/pull/21137, the leader election service should not call `contender#grant/revokeLeadership` while holding a lock that the contender can also acquire.

We can fix this issue with a dedicated executor to get rid of the nested lock structure. This would affect all contenders, so we need to carefully check that no existing contender relies on the current behavior that `grant/revokeLeadership` is called under the lock.

We should also clean up things like `ResourceManagerServiceImpl.handleLeaderEventExecutor`.
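
To make the proposal concrete, here is a minimal sketch of the dedicated-executor idea. It is not Flink's actual code: `Contender`, `SketchLeaderElectionService`, `onGrantLeadership`, `onRevokeLeadership` and `shutdownAndAwait` are hypothetical, simplified names used only for illustration. The service still updates its own state under its lock, but it only enqueues the contender callback there. The callback itself runs on a single-threaded executor without the service lock held, so a contender that takes its own lock and calls back into the service can no longer form a lock cycle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical, simplified stand-in for a leader contender. */
interface Contender {
    void grantLeadership(UUID sessionId);
    void revokeLeadership();
}

/** Hypothetical election service; a sketch, not Flink's implementation. */
final class SketchLeaderElectionService {
    private final Object lock = new Object();
    // A single thread keeps grant/revoke callbacks in submission order.
    private final ExecutorService leaderEventExecutor = Executors.newSingleThreadExecutor();
    private final Contender contender;
    private UUID issuedSessionId; // guarded by lock

    SketchLeaderElectionService(Contender contender) {
        this.contender = contender;
    }

    void onGrantLeadership() {
        synchronized (lock) {
            UUID sessionId = UUID.randomUUID();
            issuedSessionId = sessionId;
            // Only enqueue under the lock; the callback runs later, outside the lock.
            leaderEventExecutor.execute(() -> contender.grantLeadership(sessionId));
        }
    }

    void onRevokeLeadership() {
        synchronized (lock) {
            issuedSessionId = null;
            leaderEventExecutor.execute(contender::revokeLeadership);
        }
    }

    /** May be called by the contender, including from inside its callbacks. */
    boolean hasLeadership(UUID sessionId) {
        synchronized (lock) {
            return sessionId.equals(issuedSessionId);
        }
    }

    /** Stops accepting callbacks and waits for the queued ones to finish. */
    boolean shutdownAndAwait() {
        leaderEventExecutor.shutdown();
        try {
            return leaderEventExecutor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}

final class LeaderElectionSketch {
    public static void main(String[] args) {
        Object contenderLock = new Object();
        SketchLeaderElectionService[] service = new SketchLeaderElectionService[1];
        Contender contender = new Contender() {
            @Override
            public void grantLeadership(UUID sessionId) {
                synchronized (contenderLock) {
                    // Re-enters the service while holding the contender's own lock.
                    System.out.println("granted, still leader: " + service[0].hasLeadership(sessionId));
                }
            }

            @Override
            public void revokeLeadership() {
                synchronized (contenderLock) {
                    System.out.println("revoked");
                }
            }
        };
        service[0] = new SketchLeaderElectionService(contender);
        service[0].onGrantLeadership();
        synchronized (contenderLock) {
            // Contender-side code calling into the service under its own lock.
            service[0].onRevokeLeadership();
        }
        System.out.println("terminated: " + service[0].shutdownAndAwait());
    }
}
```

With this structure every thread acquires the contender lock before the service lock, never the reverse. The trade-off is that callbacks become asynchronous: by the time `grantLeadership` runs, leadership may already have been revoked, which is why any contender that relies on being called under the lock needs to be checked.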