[GitHub] [flink] XComp commented on pull request #21137: [FLINK-29234][runtime] JobMasterServiceLeadershipRunner handle leader event in a separate executor to avoid dead lock

GitBox Wed, 26 Oct 2022 08:15:48 -0700


XComp commented on PR #21137:
URL: https://github.com/apache/flink/pull/21137#issuecomment-1292206053

Thanks @reswqa for this PR. I'm wondering how handling the leadership
granting/revocation in a separate thread would help fix the issue. The locks
could still be acquired concurrently in opposite orders, leading to the same
deadlock.

The scenario described in FLINK-29234 essentially happens because the
Dispatcher is stopped (which, as a consequence, stops the
`JobMasterServiceLeadershipRunner`) while the
`JobMasterServiceLeadershipRunner` is being granted leadership, causing the
locks to be acquired in opposite orders.
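
To make the opposite-order acquisition concrete, here is a minimal, self-contained sketch. It is not Flink code: the names `runnerLock`, `electionLock`, `closePath` and `grantPath` are hypothetical stand-ins for the two locks and the two code paths involved, and it uses `tryLock` with a timeout instead of `synchronized` so that the demo terminates rather than hanging.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderInversionSketch {

    /** Returns how many of the two threads could not get their second lock. */
    public static int countBlockedThreads() throws InterruptedException {
        // Hypothetical stand-ins for the two locks involved in the deadlock.
        ReentrantLock runnerLock = new ReentrantLock();
        ReentrantLock electionLock = new ReentrantLock();
        CyclicBarrier barrier = new CyclicBarrier(2);
        AtomicInteger blocked = new AtomicInteger();

        // Stop path: runner lock first, then the other lock.
        Thread closePath = new Thread(
                () -> acquireInOrder(runnerLock, electionLock, barrier, blocked));
        // Grant path: the same two locks in the opposite order.
        Thread grantPath = new Thread(
                () -> acquireInOrder(electionLock, runnerLock, barrier, blocked));

        closePath.start();
        grantPath.start();
        closePath.join();
        grantPath.join();
        return blocked.get();
    }

    private static void acquireInOrder(
            ReentrantLock first,
            ReentrantLock second,
            CyclicBarrier barrier,
            AtomicInteger blocked) {
        first.lock();
        try {
            // Wait until both threads hold their first lock.
            barrier.await();
            // With synchronized this would block forever; here it times out.
            if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                blocked.incrementAndGet();
            }
            // Keep the first lock until the peer has finished its attempt.
            barrier.await();
        } catch (InterruptedException | BrokenBarrierException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("blocked threads: " + countBlockedThreads());
    }
}
```

Both threads end up blocked no matter which thread runs which path, which is why moving one path onto a different executor does not, by itself, change the lock order.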

I think the problem is that we're still trying to acquire the lock in
[JobMasterServiceLeadershipRunner#runIfStateRunning:453](https://github.com/apache/flink/blob/bfe4f9cc3d67d37a2258ab4226d70b6a7d24f22c/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMasterServiceLeadershipRunner.java#L453)
even though the `JobMasterServiceLeadershipRunner` has already switched to the
`STOPPED` state. I'm wondering whether we could make
`JobMasterServiceLeadershipRunner#state` volatile and check that the instance
is in the `RUNNING` state outside of the lock. But this wouldn't solve the
issue entirely because there's still a slight chance that the state changes
after the state check has passed but before the lock is entered... :thinking:
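
A rough sketch of that idea, under stated assumptions: this is a simplified stand-in class, not the actual `JobMasterServiceLeadershipRunner`, and the `raceWindowHook` parameter is purely hypothetical. It exists only to make the gap between the unlocked check and the lock acquisition observable in a deterministic way.

```java
public class VolatileStateCheckSketch {

    public enum State { RUNNING, STOPPED }

    public enum Outcome { RAN, SKIPPED_WITHOUT_LOCK, SKIPPED_UNDER_LOCK }

    private final Object lock = new Object();

    // volatile so the unlocked read below sees the latest write.
    private volatile State state = State.RUNNING;

    public void close() {
        synchronized (lock) {
            state = State.STOPPED;
        }
    }

    public Outcome runIfStateRunning(Runnable action, Runnable raceWindowHook) {
        // Fast path: a stopped instance never touches the lock.
        if (state != State.RUNNING) {
            return Outcome.SKIPPED_WITHOUT_LOCK;
        }

        // Hypothetical hook marking the window in which the state can
        // still change before the lock is entered.
        raceWindowHook.run();

        synchronized (lock) {
            // The state must be re-checked because it may have changed.
            if (state != State.RUNNING) {
                return Outcome.SKIPPED_UNDER_LOCK;
            }
            action.run();
            return Outcome.RAN;
        }
    }

    public static void main(String[] args) {
        VolatileStateCheckSketch stopped = new VolatileStateCheckSketch();
        stopped.close();
        System.out.println(stopped.runIfStateRunning(() -> {}, () -> {}));

        // Simulate the state changing right after the unlocked check.
        VolatileStateCheckSketch racing = new VolatileStateCheckSketch();
        System.out.println(racing.runIfStateRunning(() -> {}, racing::close));
    }
}
```

The second call in `main` passes the unlocked check and still has to enter the lock. If another thread were holding that lock at this point while waiting for the caller's lock, the deadlock would still be possible, so the volatile check narrows the window but does not close it.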
