Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/4220
We also have to consider another aspect: changing this from a synchronous
call to an asynchronous callback makes new kinds of races possible, and we
need to guard against them.
We need to make sure that the `restart()` call cannot succeed if another
failure cycle has put the job back into `RESTARTING` in the meantime.
I would suggest addressing https://issues.apache.org/jira/browse/FLINK-6667
first, so that the callback can check that the `globalModVersion` is unchanged
upon restart. Then there is no danger in moving this to an asynchronous callback.
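
To illustrate the kind of guard meant here, a minimal sketch follows. It is not Flink's actual code: the class `RestartGuardSketch`, its `failGlobal()` method, and the `restart(long)` signature are hypothetical stand-ins, and only the names `restart()`, `RESTARTING`, and `globalModVersion` come from the discussion above. The idea is that the asynchronous callback captures the version at the time of the failure and the restart is rejected if the version has moved on since.

```java
// Hypothetical stand-in for the execution graph; not Flink's actual class.
class RestartGuardSketch {

    enum JobStatus { RUNNING, RESTARTING }

    private final Object lock = new Object();
    private JobStatus state = JobStatus.RUNNING;
    private long globalModVersion = 0L;

    /**
     * Simulates one global failure cycle: bumps the version and enters
     * RESTARTING. The returned version is what an asynchronous restart
     * callback would capture.
     */
    long failGlobal() {
        synchronized (lock) {
            globalModVersion++;
            state = JobStatus.RESTARTING;
            return globalModVersion;
        }
    }

    /**
     * Restarts only if the job is still RESTARTING and no other failure
     * cycle has happened since expectedVersion was captured.
     * Returns false for a stale callback.
     */
    boolean restart(long expectedVersion) {
        synchronized (lock) {
            if (state != JobStatus.RESTARTING || globalModVersion != expectedVersion) {
                return false;
            }
            state = JobStatus.RUNNING;
            return true;
        }
    }

    JobStatus getState() {
        synchronized (lock) {
            return state;
        }
    }
}
```

With this shape, a callback scheduled for the first failure becomes a no-op if a second failure happens before it runs, because the state check alone (`RESTARTING`) would not distinguish the two cycles but the version check does.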