[
https://issues.apache.org/jira/browse/FLINK-13921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated FLINK-13921:
-----------------------------------
Labels: pull-request-available (was: )
> Simplify cluster level RestartStrategy configuration
> ----------------------------------------------------
>
> Key: FLINK-13921
> URL: https://issues.apache.org/jira/browse/FLINK-13921
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.10.0
> Reporter: Till Rohrmann
> Assignee: Till Rohrmann
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.10.0
>
>
> Currently, configuring the cluster-level {{RestartStrategies}} in Flink is
> complicated. The reason is that the configuration mechanism evolved over
> time while we kept it backwards compatible. As a result, Flink currently
> behaves as follows:
> * If the config option {{restart-strategy}} is configured, then Flink uses
> this {{RestartStrategy}} (so far so simple)
> * If the config option {{restart-strategy}} is not configured, then
> ** If {{restart-strategy.fixed-delay.attempts}} or
> {{restart-strategy.fixed-delay.delay}} are defined, then instantiate
> {{FixedDelayRestartStrategy(restart-strategy.fixed-delay.attempts,
> restart-strategy.fixed-delay.delay)}}
> ** If {{restart-strategy.fixed-delay.attempts}} and
> {{restart-strategy.fixed-delay.delay}} are not defined, then
> *** If checkpointing is disabled, then choose {{NoRestartStrategy}}
> *** If checkpointing is enabled, then choose
> {{FixedDelayRestartStrategy(Integer.MAX_VALUE, "0 s")}}
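The resolution order listed above can be sketched roughly as follows. This is a minimal illustration, not Flink's actual code: the class name `CurrentRestartStrategyResolution`, the `resolve` helper, the plain `Map` standing in for the cluster configuration, and the `<default ...>` placeholders are all assumptions made for the example. Only the three config keys come from the issue text.

```java
import java.util.Map;

/** Hypothetical sketch of the current resolution order; not Flink's actual code. */
public class CurrentRestartStrategyResolution {

    static final String STRATEGY = "restart-strategy";
    static final String ATTEMPTS = "restart-strategy.fixed-delay.attempts";
    static final String DELAY = "restart-strategy.fixed-delay.delay";

    /** Returns a textual description of the strategy that would be chosen. */
    static String resolve(Map<String, String> config, boolean checkpointingEnabled) {
        String strategy = config.get(STRATEGY);
        if (strategy != null) {
            // An explicitly configured restart-strategy always wins.
            return strategy;
        }
        if (config.containsKey(ATTEMPTS) || config.containsKey(DELAY)) {
            // Either fixed-delay option alone implicitly selects fixed-delay.
            // The fallback values are placeholders, not Flink's real defaults.
            String attempts = config.getOrDefault(ATTEMPTS, "<default attempts>");
            String delay = config.getOrDefault(DELAY, "<default delay>");
            return "FixedDelayRestartStrategy(" + attempts + ", " + delay + ")";
        }
        // Nothing configured: the choice depends on checkpointing.
        return checkpointingEnabled
                ? "FixedDelayRestartStrategy(" + Integer.MAX_VALUE + ", 0 s)"
                : "NoRestartStrategy";
    }

    public static void main(String[] args) {
        // Only the attempts option is set and checkpointing is disabled.
        System.out.println(resolve(Map.of(ATTEMPTS, "3"), false));
    }
}
```

The middle branch is the implicit one: a job restarts with fixed delay even though {{restart-strategy}} itself was never set.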
> I would like to simplify the configuration by removing the "If
> {{restart-strategy.fixed-delay.attempts}} or
> {{restart-strategy.fixed-delay.delay}} are defined, then" condition. That
> way, the logic would be the following:
> * If the config option {{restart-strategy}} is configured, then Flink uses
> this {{RestartStrategy}}
> * If the config option {{restart-strategy}} is not configured, then
> ** If checkpointing is disabled, then choose {{NoRestartStrategy}}
> ** If checkpointing is enabled, then choose
> {{FixedDelayRestartStrategy(Integer.MAX_VALUE, "0 s")}}
> That way we stay user-friendly, because jobs still restart once
> checkpointing is enabled, and we make it clear that any
> {{restart-strategy.fixed-delay}} setting is only respected if
> {{restart-strategy}} has been set to {{fixed-delay}}.
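A rough sketch of the proposed, simplified resolution follows. It is an illustration under assumptions rather than the actual change: the class name `ProposedRestartStrategyResolution`, the `resolve` helper, and the plain `Map` standing in for the cluster configuration are hypothetical. The config keys are the ones named in the issue.

```java
import java.util.Map;

/** Hypothetical sketch of the proposed resolution order; not Flink's actual code. */
public class ProposedRestartStrategyResolution {

    static final String STRATEGY = "restart-strategy";

    /** Returns a textual description of the strategy that would be chosen. */
    static String resolve(Map<String, String> config, boolean checkpointingEnabled) {
        String strategy = config.get(STRATEGY);
        if (strategy != null) {
            // An explicitly configured restart-strategy always wins. Fixed-delay
            // options are only consulted when this value is "fixed-delay".
            return strategy;
        }
        // No explicit strategy: fixed-delay options are ignored entirely and
        // the choice depends only on checkpointing.
        return checkpointingEnabled
                ? "FixedDelayRestartStrategy(" + Integer.MAX_VALUE + ", 0 s)"
                : "NoRestartStrategy";
    }

    public static void main(String[] args) {
        // A setup that sets only the attempts option, with checkpointing disabled.
        Map<String, String> config = Map.of("restart-strategy.fixed-delay.attempts", "3");
        System.out.println(resolve(config, false));
    }
}
```

In this sketch, a configuration that sets only {{restart-strategy.fixed-delay.attempts}} without checkpointing resolves to {{NoRestartStrategy}}, which is the kind of setup whose behaviour would change.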
> This simplification would, however, change Flink's behaviour and might break
> existing setups. Since we introduced {{RestartStrategies}} with Flink
> {{1.0.0}} and deprecated the prior configuration mechanism which enables
> restarting if either the {{attempts}} or the {{delay}} has been set, I think
> that the number of broken jobs should be minimal if not non-existent.