[ 
https://issues.apache.org/jira/browse/FLINK-13921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reassigned FLINK-13921:
-------------------------------------

    Assignee: Till Rohrmann

> Simplify cluster level RestartStrategy configuration
> ----------------------------------------------------
>
>                 Key: FLINK-13921
>                 URL: https://issues.apache.org/jira/browse/FLINK-13921
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.10.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Major
>             Fix For: 1.10.0
>
>
> Currently, the way Flink resolves the {{RestartStrategies}} configuration is 
> complicated and convoluted. The reason for this is that the configuration 
> mechanism evolved over time while being kept backwards compatible. As a 
> result, we currently have the following behaviour:
> * If the config option {{restart-strategy}} is configured, then Flink uses 
> this {{RestartStrategy}} (so far so simple)
> * If the config option {{restart-strategy}} is not configured, then 
> ** If {{restart-strategy.fixed-delay.attempts}} or 
> {{restart-strategy.fixed-delay.delay}} is defined, then instantiate 
> {{FixedDelayRestartStrategy(restart-strategy.fixed-delay.attempts, 
> restart-strategy.fixed-delay.delay)}}
> ** If {{restart-strategy.fixed-delay.attempts}} and 
> {{restart-strategy.fixed-delay.delay}} are not defined, then
> *** If checkpointing is disabled, then choose {{NoRestartStrategy}}
> *** If checkpointing is enabled, then choose 
> {{FixedDelayRestartStrategy(Integer.MAX_VALUE, "0 s")}}
> I would like to simplify the configuration by removing the "If 
> {{restart-strategy.fixed-delay.attempts}} or 
> {{restart-strategy.fixed-delay.delay}} is defined, then" condition. That way, 
> the logic would be the following:
> * If the config option {{restart-strategy}} is configured, then Flink uses 
> this {{RestartStrategy}}
> * If the config option {{restart-strategy}} is not configured, then 
> ** If checkpointing is disabled, then choose {{NoRestartStrategy}}
> ** If checkpointing is enabled, then choose 
> {{FixedDelayRestartStrategy(Integer.MAX_VALUE, "0 s")}}
> That way, we retain the user-friendly default that jobs restart once 
> checkpointing is enabled, and we make it clear that any 
> {{restart-strategy.fixed-delay}} setting is only respected if 
> {{restart-strategy}} has been set to {{fixed-delay}}.
> This simplification would, however, change Flink's behaviour and might break 
> existing setups. Since we introduced {{RestartStrategies}} with Flink 
> {{1.0.0}} and deprecated the prior configuration mechanism, which enables 
> restarting if either the {{attempts}} or the {{delay}} has been set, I think 
> that the number of broken jobs should be minimal, if not zero.
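
The two decision procedures described in the issue can be sketched in Java as
follows. This is only an illustration under stated assumptions, not Flink's
actual implementation: the class name, the Choice enum, and the method names
are hypothetical, and the cluster configuration is modelled as a plain Map.
Only the three config keys are taken from the issue text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names exist in Flink.
class RestartStrategyResolution {

    // Config keys as named in the issue.
    static final String RESTART_STRATEGY = "restart-strategy";
    static final String ATTEMPTS = "restart-strategy.fixed-delay.attempts";
    static final String DELAY = "restart-strategy.fixed-delay.delay";

    // Symbolic outcomes of the resolution (hypothetical labels).
    enum Choice {
        CONFIGURED_STRATEGY,    // whatever restart-strategy names
        FIXED_DELAY_FROM_KEYS,  // built from the fixed-delay.* keys
        NO_RESTART,             // NoRestartStrategy
        FIXED_DELAY_UNBOUNDED   // FixedDelayRestartStrategy(Integer.MAX_VALUE, "0 s")
    }

    // Behaviour before the change, as described in the issue.
    static Choice resolveCurrent(Map<String, String> config, boolean checkpointingEnabled) {
        if (config.containsKey(RESTART_STRATEGY)) {
            return Choice.CONFIGURED_STRATEGY;
        }
        // This is the branch the issue proposes to remove.
        if (config.containsKey(ATTEMPTS) || config.containsKey(DELAY)) {
            return Choice.FIXED_DELAY_FROM_KEYS;
        }
        return defaultFor(checkpointingEnabled);
    }

    // Proposed behaviour: the fixed-delay.* keys alone no longer select a strategy.
    static Choice resolveProposed(Map<String, String> config, boolean checkpointingEnabled) {
        if (config.containsKey(RESTART_STRATEGY)) {
            return Choice.CONFIGURED_STRATEGY;
        }
        return defaultFor(checkpointingEnabled);
    }

    // Fallback shared by both variants when nothing is configured explicitly.
    static Choice defaultFor(boolean checkpointingEnabled) {
        return checkpointingEnabled ? Choice.FIXED_DELAY_UNBOUNDED : Choice.NO_RESTART;
    }

    public static void main(String[] args) {
        // A setup that sets only the attempts key and leaves checkpointing off.
        Map<String, String> config = new HashMap<>();
        config.put(ATTEMPTS, "3");
        System.out.println("current:  " + resolveCurrent(config, false));
        System.out.println("proposed: " + resolveProposed(config, false));
    }
}
```

In this sketch, a configuration that sets only
restart-strategy.fixed-delay.attempts with checkpointing disabled resolves to
FIXED_DELAY_FROM_KEYS under the current logic and to NO_RESTART under the
proposed logic. That is the kind of existing setup the behaviour change could
affect.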



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
