[ 
https://issues.apache.org/jira/browse/FLINK-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877814#comment-16877814
 ] 

Chesnay Schepler commented on FLINK-13060:
------------------------------------------

Alternatively, we could piggy-back the entire failover execution on the restart 
strategy's decision to restart by encapsulating it in a {{RestartCallback}} 
and calling {{RestartStrategy#restart}} from the {{FailoverStrategy}}. If the 
strategy wants to restart, the failover logic will be executed; otherwise no 
failover will be attempted.
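
A rough sketch of what this could look like. The interfaces below are simplified, hypothetical stand-ins (SketchRestartCallback, SketchRestartStrategy, LimitedRestartStrategy and SketchFailoverStrategy are made-up names), not the actual Flink classes, and the real {{RestartStrategy#restart}} signature may differ:

```java
// Hypothetical, simplified stand-ins for illustration only; not the Flink API.
interface SketchRestartCallback {
    // Invoked by the restart strategy only if it decides to restart.
    void triggerRecovery();
}

interface SketchRestartStrategy {
    boolean canRestart();

    // Runs the callback if, and only if, the strategy allows a restart.
    void restart(SketchRestartCallback callback);
}

// Example strategy: allows a fixed number of restarts, then refuses.
class LimitedRestartStrategy implements SketchRestartStrategy {
    private final int maxAttempts;
    private int attempts;

    LimitedRestartStrategy(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    @Override
    public boolean canRestart() {
        return attempts < maxAttempts;
    }

    @Override
    public void restart(SketchRestartCallback callback) {
        if (!canRestart()) {
            return; // constraint reached: the callback is never invoked
        }
        attempts++;
        callback.triggerRecovery();
    }
}

// The failover strategy wraps its whole failover logic in the callback,
// so it runs only when the restart strategy agrees to restart.
class SketchFailoverStrategy {
    private final SketchRestartStrategy restartStrategy;
    private int failoverCount;

    SketchFailoverStrategy(SketchRestartStrategy restartStrategy) {
        this.restartStrategy = restartStrategy;
    }

    void onTaskFailure(String taskName) {
        restartStrategy.restart(() -> runLocalFailover(taskName));
    }

    private void runLocalFailover(String taskName) {
        // Placeholder for the actual partial-restart logic.
        failoverCount++;
    }

    int getFailoverCount() {
        return failoverCount;
    }
}
```

With a limit of two restarts, a third call to onTaskFailure would not run the failover logic at all, which is the behavior described above.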

> FailoverStrategies should respect restart constraints
> -----------------------------------------------------
>
>                 Key: FLINK-13060
>                 URL: https://issues.apache.org/jira/browse/FLINK-13060
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>            Reporter: Chesnay Schepler
>            Assignee: Chesnay Schepler
>            Priority: Major
>             Fix For: 1.9.0
>
>
> RestartStrategies can define their own restrictions for whether a job can be 
> restarted or not. For example, they could count the total number of failures 
> or observe failure rates.
> FailoverStrategies are used for partial restarts of jobs, and currently 
> largely bypass the restrictions defined by the restart strategies.
> My proposal is the following:
> Introduce a new method into the {{RestartStrategy}} interface to notify the 
> strategy of failed task executions. Currently, strategies implicitly handle 
> this in {{RestartStrategy#restart}}, so the migration of our existing 
> strategies should be trivial.
> Next, before calling {{RestartStrategy#restart}}, inform the strategy about 
> the task failure. This retains existing behavior.
> In addition, {{FailoverStrategy}} implementations may inform the restart 
> strategy about task failures if and when they perform a local failover. 
> Finally, all implementations have to check {{RestartStrategy#canRestart}} 
> before attempting a failover.
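
For illustration, the proposal quoted above might look roughly like the following. This is only a sketch under assumptions: the notification method name onTaskFailure is hypothetical (the issue does not name it), and NotifiableRestartStrategy, FailureCountingStrategy and LocalFailoverSketch are made-up types rather than Flink classes.

```java
// Hypothetical sketch; the interface and method names are assumptions.
interface NotifiableRestartStrategy {
    // Proposed new method: notify the strategy of a failed task execution.
    void onTaskFailure(Throwable cause);

    boolean canRestart();
}

// Example strategy that counts total failures across the job.
class FailureCountingStrategy implements NotifiableRestartStrategy {
    private final int maxFailures;
    private int failures;

    FailureCountingStrategy(int maxFailures) {
        this.maxFailures = maxFailures;
    }

    @Override
    public void onTaskFailure(Throwable cause) {
        failures++;
    }

    @Override
    public boolean canRestart() {
        // Restarts are allowed until the failure budget is exceeded.
        return failures <= maxFailures;
    }
}

// A failover strategy that reports local failures and respects canRestart.
class LocalFailoverSketch {
    private final NotifiableRestartStrategy restartStrategy;
    private int localFailovers;

    LocalFailoverSketch(NotifiableRestartStrategy restartStrategy) {
        this.restartStrategy = restartStrategy;
    }

    // Returns true if a local failover was performed, false if it was refused.
    boolean handleTaskFailure(Throwable cause) {
        restartStrategy.onTaskFailure(cause); // inform the strategy first
        if (!restartStrategy.canRestart()) {
            return false; // constraint violated: no failover is attempted
        }
        localFailovers++; // placeholder for the actual partial restart
        return true;
    }

    int getLocalFailovers() {
        return localFailovers;
    }
}
```

Because local failovers are reported through the same notification method, they count against the same failure budget that a full restart would.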



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)