tillrohrmann commented on a change in pull request #17606:
URL: https://github.com/apache/flink/pull/17606#discussion_r740942116
##########
File path:
flink-tests/src/test/java/org/apache/flink/test/recovery/TaskManagerProcessFailureBatchRecoveryITCase.java
##########
@@ -67,7 +67,7 @@ public void testTaskManagerFailure(Configuration configuration, final File coord
ExecutionEnvironment env = ExecutionEnvironment.createRemoteEnvironment("localhost", 1337, configuration);
env.setParallelism(PARALLELISM);
- env.setRestartStrategy(RestartStrategies.fixedDelayRestart(1, 1500L));
+ env.setRestartStrategy(RestartStrategies.fixedDelayRestart(10, 1500L));
Review comment:
In this test, the heartbeat interval is set to 200ms and the restart delay
to 1.5s, so multiple heartbeats should be sent during the restart delay.
Moreover, we mark TMs as unreachable after a single lost message. Hence, the
only explanation I can think of for this situation is a processing gap on the
test machine.
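
As a rough check of the arithmetic behind this, here is a minimal sketch that counts how many full heartbeat intervals fit into the restart delay. The class and method names are hypothetical and are not part of the test or of Flink. It only divides the two values quoted in the comment and does not model Flink's actual heartbeat handling.

```java
// Hypothetical helper for illustration only; not Flink code.
class HeartbeatWindow {

    /** Returns how many full heartbeat intervals fit into the given delay. */
    static long heartbeatsDuring(long delayMs, long intervalMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        return delayMs / intervalMs;
    }

    public static void main(String[] args) {
        long heartbeatIntervalMs = 200L; // heartbeat interval quoted in the comment
        long restartDelayMs = 1500L;     // delay passed to fixedDelayRestart in the diff
        System.out.println(heartbeatsDuring(restartDelayMs, heartbeatIntervalMs));
    }
}
```

With the values above, about seven heartbeats fall inside one restart delay, so a single lost message should normally be noticed before the restart attempt unless the machine stalls.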