[jira] [Commented] (FLINK-3187) Decouple restart strategy from ExecutionGraph

ASF GitHub Bot (JIRA) Mon, 25 Jan 2016 03:00:01 -0800

    [ https://issues.apache.org/jira/browse/FLINK-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15115040#comment-15115040 ]


ASF GitHub Bot commented on FLINK-3187:
---------------------------------------

Github user uce commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1470#discussion_r50679685
  
    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java ---
    @@ -237,53 +236,26 @@ public ExecutionConfig setParallelism(int parallelism) {
        }
     
        /**
    -    * Gets the number of times the system will try to re-execute failed tasks. A value
    -    * of {@code -1} indicates that the system default value (as defined in the configuration)
    -    * should be used.
    +    * Sets the restart strategy configuration which defines which restart strategy shall be used
    +    * for the execution graph of the corresponding job.
    --- End diff --
    
    I would add a `<code></code>` example showing `RestartStrategies`, since
    that will probably be the most common way to configure it.
    
    The end of the text could also be simplified by removing the references to
    the execution graph and the corresponding job; the average user will not
    know what they are. On the other hand, they might be a good pointer for
    someone who wants to work on it.


> Decouple restart strategy from ExecutionGraph
> ---------------------------------------------
>
>                 Key: FLINK-3187
>                 URL: https://issues.apache.org/jira/browse/FLINK-3187
>             Project: Flink
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Minor
>
> Currently, the {{ExecutionGraph}} supports the following restart logic:
> Whenever a failure occurs and the number of restart attempts is not yet
> depleted, wait for a fixed amount of time and then try to restart. This
> behaviour can be controlled by the configuration parameters
> {{execution-retries.default}} and {{execution-retries.delay}}.
> I propose to decouple the restart logic from the {{ExecutionGraph}} a bit by
> introducing a strategy pattern. That would allow us not only to define
> job-specific restart behaviour but also to implement different restart
> strategies. Conceivable strategies could be: fixed timeout restart,
> exponential backoff restart, partial topology restarts, etc.
> This change is a preliminary step towards having a restart strategy which
> will scale the parallelism of a job down when not enough slots are
> available.
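
To make the proposed strategy pattern concrete, here is a minimal, self-contained sketch in plain Java. It is not Flink's actual API: the names RestartStrategy, FixedDelayRestart, ExponentialBackoffRestart and RestartStrategyDemo, and all of their parameters, are assumptions made up for illustration. It only shows how the "attempts not yet depleted, then wait" logic described above could sit behind an interface, with a fixed-delay and an exponential-backoff variant.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical strategy interface (illustrative only, not Flink's API). */
interface RestartStrategy {
    /** Returns true while another restart attempt is still allowed. */
    boolean canRestart();

    /** Records one restart attempt and returns the delay (ms) to wait before it. */
    long nextDelayMillis();
}

/** Waits the same fixed delay before every attempt, up to maxAttempts. */
final class FixedDelayRestart implements RestartStrategy {
    private final int maxAttempts;
    private final long delayMillis;
    private int attempts;

    FixedDelayRestart(int maxAttempts, long delayMillis) {
        this.maxAttempts = maxAttempts;
        this.delayMillis = delayMillis;
    }

    @Override
    public boolean canRestart() {
        return attempts < maxAttempts;
    }

    @Override
    public long nextDelayMillis() {
        attempts++;
        return delayMillis;
    }
}

/** Doubles the delay after every attempt, capped at maxDelayMillis. */
final class ExponentialBackoffRestart implements RestartStrategy {
    private final int maxAttempts;
    private final long maxDelayMillis;
    private long currentDelayMillis;
    private int attempts;

    ExponentialBackoffRestart(int maxAttempts, long initialDelayMillis, long maxDelayMillis) {
        this.maxAttempts = maxAttempts;
        this.maxDelayMillis = maxDelayMillis;
        this.currentDelayMillis = Math.min(initialDelayMillis, maxDelayMillis);
    }

    @Override
    public boolean canRestart() {
        return attempts < maxAttempts;
    }

    @Override
    public long nextDelayMillis() {
        long delay = currentDelayMillis;
        // Double the delay for the next attempt without overflowing past the cap.
        currentDelayMillis = currentDelayMillis > maxDelayMillis / 2
                ? maxDelayMillis
                : currentDelayMillis * 2;
        attempts++;
        return delay;
    }
}

class RestartStrategyDemo {
    /** Simulates repeated failures and collects the delays until attempts run out. */
    static List<Long> delaysUntilExhausted(RestartStrategy strategy) {
        List<Long> delays = new ArrayList<>();
        while (strategy.canRestart()) {
            delays.add(strategy.nextDelayMillis());
        }
        return delays;
    }

    public static void main(String[] args) {
        System.out.println(delaysUntilExhausted(new FixedDelayRestart(3, 1000L)));
        System.out.println(delaysUntilExhausted(new ExponentialBackoffRestart(4, 1000L, 5000L)));
    }
}
```

The caller (here the demo loop, standing in for whatever component drives restarts) only depends on the interface, so a job could be handed a different strategy without changing the restart loop itself.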



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
