[
https://issues.apache.org/jira/browse/FLINK-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16073437#comment-16073437
]
ASF GitHub Bot commented on FLINK-6665:
---------------------------------------
Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/4220
We also have to consider another aspect: Changing this from a synchronous
call to an asynchronous callback makes new types of races possible, against
which we need to guard.
We need to make sure that the `restart()` call cannot succeed if another
failure cycle has put the job back into `RESTARTING` in the meantime.
I would suggest addressing https://issues.apache.org/jira/browse/FLINK-6667
first, so that the callback can check that the `globalModVersion` is unchanged
upon restart. Then there is no danger in moving this to an asynchronous callback.
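A minimal sketch of such a guard, to illustrate the idea rather than the actual Flink code: `GraphSketch`, `DelayedRestartSketch`, `failGlobal()`, `scheduleRestart()` and the `restart(long)` signature are hypothetical names, and only `globalModVersion` and the use of a `ScheduledExecutorService` come from this thread. The callback captures the version seen at failure time and is rejected if a later failure has bumped it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the ExecutionGraph; not Flink's actual class.
class GraphSketch {
    private long globalModVersion;
    private int restarts;

    // Every global failure bumps the version and returns the new value.
    synchronized long failGlobal() {
        return ++globalModVersion;
    }

    // Restart only if no further failure happened since expectedVersion was read.
    synchronized boolean restart(long expectedVersion) {
        if (globalModVersion != expectedVersion) {
            return false; // stale callback from an earlier failure cycle
        }
        restarts++;
        return true;
    }

    synchronized int restartCount() {
        return restarts;
    }
}

class DelayedRestartSketch {
    // Schedules the restart on the executor instead of sleeping on the calling thread.
    static ScheduledFuture<Boolean> scheduleRestart(
            GraphSketch graph, long expectedVersion, long delayMillis,
            ScheduledExecutorService executor) {
        Callable<Boolean> task = () -> graph.restart(expectedVersion);
        return executor.schedule(task, delayMillis, TimeUnit.MILLISECONDS);
    }

    // Simulates two failure cycles; the second one happens before either
    // delayed restart fires, so only the second callback may succeed.
    static String demo() {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        try {
            GraphSketch graph = new GraphSketch();
            long first = graph.failGlobal();
            long second = graph.failGlobal();
            ScheduledFuture<Boolean> stale = scheduleRestart(graph, first, 20, executor);
            ScheduledFuture<Boolean> fresh = scheduleRestart(graph, second, 20, executor);
            return "stale=" + stale.get() + " fresh=" + fresh.get()
                    + " restarts=" + graph.restartCount();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "stale=false fresh=true restarts=1"
    }
}
```

In this sketch the version check and the restart happen under the same lock, so a failure cannot slip in between the check and the restart itself.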
> Pass a ScheduledExecutorService to the RestartStrategy
> ------------------------------------------------------
>
> Key: FLINK-6665
> URL: https://issues.apache.org/jira/browse/FLINK-6665
> Project: Flink
> Issue Type: Sub-task
> Components: Distributed Coordination
> Reporter: Stephan Ewen
> Assignee: Fang Yong
> Fix For: 1.4.0
>
>
> Currently, the {{RestartStrategy}} is called when the {{ExecutionGraph}}
> should be restarted.
> To facilitate delays before restarting, the strategy simply sleeps, blocking
> the thread that runs the ExecutionGraph's recovery method.
> I suggest passing a {{ScheduledExecutorService}} to the {{RestartStrategy}}
> and letting it schedule the restart call that way, avoiding any sleeps.