[ https://issues.apache.org/jira/browse/FLINK-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15128092#comment-15128092 ]
ASF GitHub Bot commented on FLINK-3187:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1470#discussion_r51556295

    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java ---
    @@ -237,53 +236,26 @@ public ExecutionConfig setParallelism(int parallelism) {
     	}

     	/**
    -	 * Gets the number of times the system will try to re-execute failed tasks. A value
    -	 * of {@code -1} indicates that the system default value (as defined in the configuration)
    -	 * should be used.
    +	 * Sets the restart strategy configuration which defines which restart strategy shall be used
    +	 * for the execution graph of the corresponding job.
    --- End diff --

    Agreed. Good point. I've simplified the description and added a code example.


> Decouple restart strategy from ExecutionGraph
> ---------------------------------------------
>
>                 Key: FLINK-3187
>                 URL: https://issues.apache.org/jira/browse/FLINK-3187
>             Project: Flink
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Minor
>
> Currently, the {{ExecutionGraph}} supports the following restart logic:
> Whenever a failure occurs and the restart attempts aren't depleted,
> wait for a fixed amount of time and then try to restart. This behaviour can
> be controlled by the configuration parameters {{execution-retries.default}}
> and {{execution-retries.delay}}.
> I propose to decouple the restart logic from the {{ExecutionGraph}} a bit by
> introducing a strategy pattern. That would allow us not only to define
> job-specific restart behaviour but also to implement different restart
> strategies. Conceivable strategies could be: Fixed timeout restart,
> exponential backoff restart, partial topology restarts, etc.
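
To make the strategy pattern proposed above concrete, here is a minimal,
self-contained sketch of two of the strategies mentioned (fixed delay and
exponential backoff). All names in it (RestartStrategy,
FixedDelayRestartStrategy, ExponentialBackoffRestartStrategy, RestartDemo,
canRestart, nextDelayMillis) are hypothetical, chosen for illustration only.
They are not taken from the pull request and should not be read as Flink's
actual API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical strategy interface; an illustration, not Flink's API. */
interface RestartStrategy {
    /** Returns true while another restart attempt is still allowed. */
    boolean canRestart();

    /** Consumes one restart attempt and returns the delay (ms) to wait before it. */
    long nextDelayMillis();
}

/** Waits the same fixed delay before every restart attempt. */
final class FixedDelayRestartStrategy implements RestartStrategy {
    private final int maxAttempts;
    private final long delayMillis;
    private int attempts;

    FixedDelayRestartStrategy(int maxAttempts, long delayMillis) {
        this.maxAttempts = maxAttempts;
        this.delayMillis = delayMillis;
    }

    @Override
    public boolean canRestart() {
        return attempts < maxAttempts;
    }

    @Override
    public long nextDelayMillis() {
        if (!canRestart()) {
            throw new IllegalStateException("restart attempts depleted");
        }
        attempts++;
        return delayMillis;
    }
}

/** Doubles the delay after every attempt, capped at maxDelayMillis. */
final class ExponentialBackoffRestartStrategy implements RestartStrategy {
    private final int maxAttempts;
    private final long initialDelayMillis;
    private final long maxDelayMillis;
    private int attempts;

    ExponentialBackoffRestartStrategy(int maxAttempts, long initialDelayMillis, long maxDelayMillis) {
        this.maxAttempts = maxAttempts;
        this.initialDelayMillis = initialDelayMillis;
        this.maxDelayMillis = maxDelayMillis;
    }

    @Override
    public boolean canRestart() {
        return attempts < maxAttempts;
    }

    @Override
    public long nextDelayMillis() {
        if (!canRestart()) {
            throw new IllegalStateException("restart attempts depleted");
        }
        long delay = Math.min(initialDelayMillis, maxDelayMillis);
        for (int i = 0; i < attempts; i++) {
            // Compare against half the cap before doubling to avoid overflow.
            delay = (delay > maxDelayMillis / 2) ? maxDelayMillis : delay * 2;
        }
        attempts++;
        return delay;
    }
}

/** Helper that drains a strategy and records the delays it produced. */
final class RestartDemo {
    static List<Long> delays(RestartStrategy strategy) {
        List<Long> result = new ArrayList<>();
        while (strategy.canRestart()) {
            result.add(strategy.nextDelayMillis());
        }
        return result;
    }
}
```

The caller (in the proposal, the ExecutionGraph) would only depend on the
interface, so a job could be given either implementation without the caller
changing. Partial topology restarts are not covered by this sketch.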
>
> This change is a preliminary step towards having a restart strategy which
> will scale the parallelism of a job down in case not enough slots are
> available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)