[
https://issues.apache.org/jira/browse/FLINK-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15087394#comment-15087394
]
ASF GitHub Bot commented on FLINK-3187:
---------------------------------------
Github user tillrohrmann commented on the pull request:
https://github.com/apache/flink/pull/1470#issuecomment-169675092
Thanks for the review @rmetzger.
I think this is not a problem, because users cannot define their own restart
strategies. To set a `RestartStrategy`, the user has to provide a
`RestartStrategyConfiguration`. The `RestartStrategyConfiguration` cannot be
extended outside the `RestartStrategies` class, so users cannot define their
own `RestartStrategyConfiguration`. Additionally, the strategy itself is only
instantiated from this configuration on the `JobManager` via the
`RestartStrategyFactory`, which is also code that the user cannot change via
the API.
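To illustrate how a configuration type can be closed to user extension, here is
a minimal sketch and not the code from the pull request. It assumes the base
class hides its constructor, so that only classes nested in the same outer
class can extend it. `RestartStrategiesSketch`, `FixedDelayConfiguration`,
`fixedDelay` and `describe` are hypothetical names chosen for this example.

```java
// Hypothetical sketch: names echo the comment above, but this is not Flink's code.
public final class RestartStrategiesSketch {

    private RestartStrategiesSketch() {}

    /**
     * Base configuration type. The private constructor is only reachable from
     * code inside RestartStrategiesSketch, so user code cannot subclass it.
     */
    public abstract static class RestartStrategyConfiguration {
        private RestartStrategyConfiguration() {}

        public abstract String describe();
    }

    /** One of the fixed set of configurations that users may choose from. */
    public static final class FixedDelayConfiguration extends RestartStrategyConfiguration {
        private final int attempts;
        private final long delayMillis;

        private FixedDelayConfiguration(int attempts, long delayMillis) {
            this.attempts = attempts;
            this.delayMillis = delayMillis;
        }

        public int getAttempts() {
            return attempts;
        }

        public long getDelayMillis() {
            return delayMillis;
        }

        @Override
        public String describe() {
            return "fixed delay: " + attempts + " attempts, " + delayMillis + " ms";
        }
    }

    /** The only way for user code to obtain a configuration instance. */
    public static RestartStrategyConfiguration fixedDelay(int attempts, long delayMillis) {
        return new FixedDelayConfiguration(attempts, delayMillis);
    }
}
```

With this layout, user code can only pick one of the predefined configurations
through the static factory method. The component that turns a configuration
into an actual strategy can therefore rely on a known, closed set of types.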
> Decouple restart strategy from ExecutionGraph
> ---------------------------------------------
>
> Key: FLINK-3187
> URL: https://issues.apache.org/jira/browse/FLINK-3187
> Project: Flink
> Issue Type: Improvement
> Affects Versions: 1.0.0
> Reporter: Till Rohrmann
> Assignee: Till Rohrmann
> Priority: Minor
>
> Currently, the {{ExecutionGraph}} supports the following restart logic:
> Whenever a failure occurs and the restart attempts are not yet exhausted,
> wait for a fixed amount of time and then try to restart. This behaviour can
> be controlled by the configuration parameters {{execution-retries.default}}
> and {{execution-retries.delay}}.
> I propose to decouple the restart logic from the {{ExecutionGraph}} a bit by
> introducing a strategy pattern. That would allow us not only to define
> job-specific restart behaviour but also to implement different restart
> strategies. Conceivable strategies could be: Fixed timeout restart,
> exponential backoff restart, partial topology restarts, etc.
> This change is a preliminary step towards having a restart strategy which
> will scale the parallelism of a job down if not enough slots are
> available.
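
The strategy pattern proposed in the issue could look roughly like the
following sketch. It is an assumption-laden illustration, not the interface
that was merged: `RestartLogicSketch`, `canRestart`, `nextDelayMillis` and
`onFailure` are hypothetical names, and `onFailure` merely stands in for the
place where the `ExecutionGraph` would consult the strategy.

```java
// Hypothetical sketch of the proposed strategy pattern; not Flink's actual API.
public final class RestartLogicSketch {

    private RestartLogicSketch() {}

    /** Decides whether and when a failed job should be restarted. */
    public interface RestartStrategy {
        boolean canRestart();

        /** Records one restart attempt and returns the delay to wait before it. */
        long nextDelayMillis();
    }

    /** Always waits the same amount of time between attempts. */
    public static final class FixedDelayRestart implements RestartStrategy {
        private final int maxAttempts;
        private final long delayMillis;
        private int attempts;

        public FixedDelayRestart(int maxAttempts, long delayMillis) {
            this.maxAttempts = maxAttempts;
            this.delayMillis = delayMillis;
        }

        @Override
        public boolean canRestart() {
            return attempts < maxAttempts;
        }

        @Override
        public long nextDelayMillis() {
            attempts++;
            return delayMillis;
        }
    }

    /** Doubles the delay after every attempt, up to a fixed upper bound. */
    public static final class ExponentialBackoffRestart implements RestartStrategy {
        private final int maxAttempts;
        private final long initialDelayMillis;
        private final long maxDelayMillis;
        private int attempts;

        public ExponentialBackoffRestart(int maxAttempts, long initialDelayMillis, long maxDelayMillis) {
            this.maxAttempts = maxAttempts;
            this.initialDelayMillis = initialDelayMillis;
            this.maxDelayMillis = maxDelayMillis;
        }

        @Override
        public boolean canRestart() {
            return attempts < maxAttempts;
        }

        @Override
        public long nextDelayMillis() {
            // Cap the shift so the multiplication by 2^attempts stays small.
            long delay = Math.min(maxDelayMillis, initialDelayMillis << Math.min(attempts, 20));
            attempts++;
            return delay;
        }
    }

    /**
     * Stand-in for the failure handling in the execution graph: returns the
     * delay before the next restart, or -1 if the job should fail for good.
     */
    public static long onFailure(RestartStrategy strategy) {
        return strategy.canRestart() ? strategy.nextDelayMillis() : -1L;
    }
}
```

Because the caller only depends on the `RestartStrategy` interface, a job could
be given a different strategy without touching the failure-handling code.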