tillrohrmann commented on issue #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session
URL: https://github.com/apache/flink/pull/9832#issuecomment-541699593

@TisonKun the scenario you are describing cannot happen because `JobGraphWriter#putJobGraph` is called from the `Dispatcher`'s main thread. This of course has other problems, but it prevents the thread interleaving in your example from occurring. In the future I would like to improve this so that persisting the `JobGraph` happens asynchronously.

I think the general contract should be that if the `Dispatcher` is shut down, then no other operation spawned by the `Dispatcher` should still be running. Hence, the `Dispatcher` makes sure that it properly terminates every concurrent process. Together with `DefaultDispatcherRunner#previousDispatcherLeaderProcessTerminationFuture`, this will make sure that, if there is a single `DispatcherRunner`, the new leader only starts executing once the previous one has completely shut down.

If there are multiple `DispatcherRunner`s, then you might observe the situation where the new leader has been started while the old leader still writes a `JobGraph` to the `JobGraphStore`. I think this problem can only be solved by making the `JobGraph` storage transactional with respect to the leader session.
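
To illustrate what "transactional with respect to the leader session" could mean, here is a minimal in-memory sketch. It is not Flink's `JobGraphStore` API: the class `FencedJobGraphStore`, its methods, and the use of a `UUID` as the leader session id are all hypothetical names chosen for this example. The idea is that every write carries the writer's leader session id, and the store rejects the write if that id is no longer the current one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

/**
 * Hypothetical in-memory store that fences writes by leader session id.
 * Illustrative only; this is NOT Flink's JobGraphStore interface.
 */
class FencedJobGraphStore {
    private final Map<String, String> jobGraphs = new HashMap<>();
    private UUID currentLeaderSession; // null until leadership is first granted

    /** Records the session id of the newly elected leader. */
    synchronized void grantLeadership(UUID leaderSession) {
        currentLeaderSession = leaderSession;
    }

    /**
     * Stores the job graph only if the caller still holds the current
     * leader session. The check and the write happen under one lock,
     * which stands in for an atomic check-and-write in a real store.
     *
     * @return true if the write was applied, false if it was rejected
     */
    synchronized boolean putJobGraph(UUID callerSession, String jobId, String serializedJobGraph) {
        if (!callerSession.equals(currentLeaderSession)) {
            return false; // stale leader: reject the write
        }
        jobGraphs.put(jobId, serializedJobGraph);
        return true;
    }

    /** Returns the stored job graph for the given id, if any. */
    synchronized Optional<String> getJobGraph(String jobId) {
        return Optional.ofNullable(jobGraphs.get(jobId));
    }

    public static void main(String[] args) {
        FencedJobGraphStore store = new FencedJobGraphStore();
        UUID oldLeader = UUID.randomUUID();
        UUID newLeader = UUID.randomUUID();

        store.grantLeadership(oldLeader);
        System.out.println(store.putJobGraph(oldLeader, "job-1", "graph-from-old-leader"));

        // Leadership moves on while the old leader is still running.
        store.grantLeadership(newLeader);
        System.out.println(store.putJobGraph(newLeader, "job-1", "graph-from-new-leader"));

        // The late write from the old leader is rejected, not applied.
        System.out.println(store.putJobGraph(oldLeader, "job-1", "late-write-from-old-leader"));
        System.out.println(store.getJobGraph("job-1").orElse("<none>"));
    }
}
```

In this sketch the late write from the old leader cannot overwrite what the new leader stored, even though both run concurrently. In a distributed setting the session check and the write would have to be atomic inside the backing store itself; a lock in the client process, as used here, would not be enough.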
