tillrohrmann commented on a change in pull request #17606:
URL: https://github.com/apache/flink/pull/17606#discussion_r740942116



##########
File path: flink-tests/src/test/java/org/apache/flink/test/recovery/TaskManagerProcessFailureBatchRecoveryITCase.java
##########
@@ -67,7 +67,7 @@ public void testTaskManagerFailure(Configuration configuration, final File coord
         ExecutionEnvironment env =
                 ExecutionEnvironment.createRemoteEnvironment("localhost", 1337, configuration);
         env.setParallelism(PARALLELISM);
-        env.setRestartStrategy(RestartStrategies.fixedDelayRestart(1, 1500L));
+        env.setRestartStrategy(RestartStrategies.fixedDelayRestart(10, 1500L));

Review comment:
       For this test, the heartbeat interval is set to 200 ms and the restart 
delay to 1.5 s, so multiple heartbeats should be sent during the restart 
delay. Moreover, we mark TMs as unreachable after a single lost message. 
Hence, the only explanation I can think of for this situation is a processing 
gap on the test machine.
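
       As a rough sanity check of the numbers above, here is a minimal sketch 
of the arithmetic. The class and method names (`HeartbeatMath`, 
`heartbeatsDuringDelay`) are hypothetical and not part of Flink; the sketch 
also assumes heartbeats are sent at a perfectly regular interval, which a 
loaded test machine may not honor.

```java
// Hypothetical helper, not part of Flink: estimates how many heartbeats
// fit into one restart delay, assuming a perfectly regular send interval.
class HeartbeatMath {

    static long heartbeatsDuringDelay(long restartDelayMs, long heartbeatIntervalMs) {
        if (heartbeatIntervalMs <= 0) {
            throw new IllegalArgumentException("heartbeat interval must be positive");
        }
        // Integer division: only complete intervals are counted.
        return restartDelayMs / heartbeatIntervalMs;
    }

    public static void main(String[] args) {
        // Values from this test: 1.5 s restart delay, 200 ms heartbeat interval.
        long count = heartbeatsDuringDelay(1500L, 200L);
        System.out.println("Heartbeats per restart delay: " + count); // prints 7
    }
}
```

With roughly seven heartbeats per restart delay and a single lost message 
being enough to mark a TM as unreachable, the failure should normally be 
detected well before the restart is attempted.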




