[ 
https://issues.apache.org/jira/browse/FLINK-32895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756752#comment-17756752
 ] 

Rui Fan commented on FLINK-32895:
---------------------------------

Thanks [~zhuzh] for the quick feedback!

{quote}A FLIP is required because it includes changes to public interfaces 
(config options). And it is proposing a new feature which needs to be seen and 
set by users.{quote}

Got it, thanks for the clarification!

{quote}And maybe we can reconsider the new config option to make it easier 
to understand, e.g. introduce a 
restart-strategy.exponential-delay.fail-on-exceeding-max-backoff.{quote}

Good suggestion. I will record it, and we can discuss the option on the 
mailing list later.

{quote}I would also suggest not changing RestartStrategies any more because we 
are considering deprecating it later when improving Flink configuration. 
RestartStrategies is not flexible for custom restart strategies and can be 
superseded by config options.{quote}

To be honest, both I and our internal Flink platform always use config options 
instead of Java code for Flink configuration, so I totally agree with 
deprecating RestartStrategies.

{quote}Can we delay this work a bit, waiting for the result of the FLIP and ML 
discussion of the deprecation? It should happen soon.{quote}

Sure, this improvement can wait until RestartStrategies is deprecated. 
Could you ping me when the discussion starts? Thanks a lot :)


> Introduce the max attempts for Exponential Delay Restart Strategy
> -----------------------------------------------------------------
>
>                 Key: FLINK-32895
>                 URL: https://issues.apache.org/jira/browse/FLINK-32895
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>            Reporter: Rui Fan
>            Assignee: Rui Fan
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, Flink has 3 restart strategies: fixed-delay, failure-rate and 
> exponential-delay.
> The exponential-delay strategy is suitable if a job continues to fail for a 
> period of time. The fixed-delay and failure-rate strategies have a max 
> attempts mechanism: the job is no longer restarted and fails once the number 
> of attempts exceeds the max attempts threshold. 
> The max attempts mechanism is reasonable: Flink should not, and does not 
> need to, restart the job indefinitely if the job keeps failing. However, 
> exponential-delay doesn't have a max attempts mechanism.
> I propose introducing 
> `restart-strategy.exponential-delay.max-attempts-before-reset` to support the 
> max attempts mechanism for exponential-delay. It means Flink won't restart 
> the job if the number of job failures before reset exceeds 
> max-attempts-before-reset when exponential-delay is enabled.
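
The semantics proposed in the description above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Flink's actual implementation: the class `ExponentialDelaySketch`, its constructor parameters, and the `-1` return convention are all hypothetical, jitter is omitted, and the "reset" is assumed to happen when the time since the previous failure reaches a reset threshold.

```java
/**
 * Hypothetical sketch of exponential-delay with a max-attempts-before-reset
 * limit. All names are illustrative; this is not Flink code.
 */
final class ExponentialDelaySketch {
    private final long initialBackoffMs;
    private final long maxBackoffMs;
    private final double multiplier;
    private final long resetThresholdMs;
    private final int maxAttemptsBeforeReset;

    private long currentBackoffMs;
    private int attempts;
    private boolean hasFailed;
    private long lastFailureMs;

    ExponentialDelaySketch(
            long initialBackoffMs,
            long maxBackoffMs,
            double multiplier,
            long resetThresholdMs,
            int maxAttemptsBeforeReset) {
        this.initialBackoffMs = initialBackoffMs;
        this.maxBackoffMs = maxBackoffMs;
        this.multiplier = multiplier;
        this.resetThresholdMs = resetThresholdMs;
        this.maxAttemptsBeforeReset = maxAttemptsBeforeReset;
        this.currentBackoffMs = initialBackoffMs;
    }

    /**
     * Records a failure observed at nowMs.
     *
     * @return the delay in milliseconds before the next restart, or -1 if the
     *     number of failures since the last reset exceeds the limit and the
     *     job should fail instead of restarting
     */
    long onFailure(long nowMs) {
        // Assumption: a long enough failure-free period resets both the
        // backoff and the attempt counter.
        if (hasFailed && nowMs - lastFailureMs >= resetThresholdMs) {
            currentBackoffMs = initialBackoffMs;
            attempts = 0;
        }
        hasFailed = true;
        lastFailureMs = nowMs;

        attempts++;
        if (attempts > maxAttemptsBeforeReset) {
            return -1L; // give up: do not restart the job
        }

        long delay = currentBackoffMs;
        // Grow the backoff for the next failure, capped at the maximum.
        currentBackoffMs = Math.min(maxBackoffMs, (long) (currentBackoffMs * multiplier));
        return delay;
    }
}
```

With an initial backoff of 1000 ms, a multiplier of 2.0 and a limit of 3, three consecutive failures would yield delays of 1000, 2000 and 4000 ms, and a fourth failure before any reset would return -1.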



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
