[jira] [Commented] (FLINK-8099) Reduce default restart delay to 1 second

ASF GitHub Bot (JIRA) Sun, 19 Nov 2017 23:56:47 -0800

    [ https://issues.apache.org/jira/browse/FLINK-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258909#comment-16258909 ]


ASF GitHub Bot commented on FLINK-8099:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5031#discussion_r151920171
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/restart/FixedDelayRestartStrategy.java ---
    @@ -81,28 +80,19 @@ public void run() {
        public static FixedDelayRestartStrategyFactory createFactory(Configuration configuration) throws Exception {
                int maxAttempts = configuration.getInteger(ConfigConstants.RESTART_STRATEGY_FIXED_DELAY_ATTEMPTS, 1);
     
    -           String timeoutString = configuration.getString(
    -                   AkkaOptions.WATCH_HEARTBEAT_INTERVAL);
    -
                String delayString = configuration.getString(
                        ConfigConstants.RESTART_STRATEGY_FIXED_DELAY_DELAY,
    -                   timeoutString
    +                   "1 s"
    --- End diff --
    
    What about introducing a `ConfigOption`?
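
To illustrate the suggestion, here is a minimal, self-contained sketch of the idea: bundle the key and its default into one typed constant instead of passing the `"1 s"` literal at the call site. `StringOption` and `RestartDelayOptions` are hypothetical stand-ins written for this example, not Flink classes, and the key string is assumed to match `ConfigConstants.RESTART_STRATEGY_FIXED_DELAY_DELAY`. In the actual change, Flink's own `ConfigOption` and `Configuration` would take their place.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a typed option: a key bundled with its default value. */
final class StringOption {
    final String key;
    final String defaultValue;

    StringOption(String key, String defaultValue) {
        this.key = key;
        this.defaultValue = defaultValue;
    }
}

/** Hypothetical holder class; the default lives next to the key, in one place. */
final class RestartDelayOptions {
    // Assumption: this key matches ConfigConstants.RESTART_STRATEGY_FIXED_DELAY_DELAY.
    static final StringOption FIXED_DELAY_DELAY =
        new StringOption("restart-strategy.fixed-delay.delay", "1 s");

    private RestartDelayOptions() {}

    /** Returns the configured value, or the option's default when the key is absent. */
    static String get(Map<String, String> config, StringOption option) {
        return config.getOrDefault(option.key, option.defaultValue);
    }
}
```

With this shape, `createFactory` would read the delay through the option constant, so the default is declared once and cannot drift between call sites.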


> Reduce default restart delay to 1 second
> ----------------------------------------
>
>                 Key: FLINK-8099
>                 URL: https://issues.apache.org/jira/browse/FLINK-8099
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>            Priority: Blocker
>             Fix For: 1.4.0
>
>
> Currently, when a job fails, Flink waits for 10 seconds before restarting
> the job. Zero delay is also a reasonable setting, but it "floods" the logs
> and quickly increases the restart counter, because at zero delay you will
> always see failures when no standby resources are available.
> Reducing the default to 1 second should make for a nicer out-of-the-box
> experience without flooding the logs too much.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
