[ https://issues.apache.org/jira/browse/FLINK-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258909#comment-16258909 ]
ASF GitHub Bot commented on FLINK-8099:
---------------------------------------
Github user tillrohrmann commented on a diff in the pull request:
https://github.com/apache/flink/pull/5031#discussion_r151920171
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/restart/FixedDelayRestartStrategy.java ---
@@ -81,28 +80,19 @@ public void run() {
     public static FixedDelayRestartStrategyFactory createFactory(Configuration configuration) throws Exception {
         int maxAttempts = configuration.getInteger(ConfigConstants.RESTART_STRATEGY_FIXED_DELAY_ATTEMPTS, 1);
-        String timeoutString = configuration.getString(
-            AkkaOptions.WATCH_HEARTBEAT_INTERVAL);
-
         String delayString = configuration.getString(
             ConfigConstants.RESTART_STRATEGY_FIXED_DELAY_DELAY,
-            timeoutString
+            "1 s"
--- End diff --
What about introducing a `ConfigOption`?
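
A rough sketch of the idea, to show why it might help: the key and its default live together in one constant, so the `"1 s"` literal no longer has to be repeated at each call site. This is a self-contained illustration, not Flink code. `StringOption` and `RestartStrategyOptionsSketch` are hypothetical stand-ins, a plain `Map` stands in for `Configuration`, and the key string is an assumption about what `ConfigConstants.RESTART_STRATEGY_FIXED_DELAY_DELAY` resolves to. In Flink itself this would presumably use the existing `ConfigOption` / `ConfigOptions` classes from `org.apache.flink.configuration`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for a typed config option: a key plus its default. */
final class StringOption {
    final String key;
    final String defaultValue;

    StringOption(String key, String defaultValue) {
        this.key = key;
        this.defaultValue = defaultValue;
    }
}

/** Hypothetical holder class; the name is not taken from the pull request. */
final class RestartStrategyOptionsSketch {

    // Assumed key string; the default matches the "1 s" proposed in the diff.
    static final StringOption FIXED_DELAY_DELAY =
        new StringOption("restart-strategy.fixed-delay.delay", "1 s");

    /** Returns the configured value, or the option's default if the key is absent. */
    static String get(Map<String, String> config, StringOption option) {
        return config.getOrDefault(option.key, option.defaultValue);
    }

    private RestartStrategyOptionsSketch() {
    }
}
```

With something like this, `createFactory` would only need `get(config, FIXED_DELAY_DELAY)`: an empty configuration yields `"1 s"`, and a user-supplied value for the key overrides it.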
> Reduce default restart delay to 1 second
> ----------------------------------------
>
> Key: FLINK-8099
> URL: https://issues.apache.org/jira/browse/FLINK-8099
> Project: Flink
> Issue Type: Improvement
> Components: State Backends, Checkpointing
> Reporter: Aljoscha Krettek
> Assignee: Aljoscha Krettek
> Priority: Blocker
> Fix For: 1.4.0
>
>
> Currently, when a job fails, Flink waits 10 seconds before restarting
> it. A zero delay is also a reasonable setting, but it "floods" the logs
> and quickly increases the restart counter, because with zero delay
> restart attempts always fail while no standby resources are available.
> Reducing the default to 1 second should make for a nicer out-of-the-box
> experience without flooding the logs too much.
--
This message was sent by Atlassian JIRA (v6.4.14#64029)