zhuzhurk commented on code in PR #21970: URL: https://github.com/apache/flink/pull/21970#discussion_r1113780430
##########
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/DefaultScheduler.java:
##########
@@ -377,6 +377,10 @@ private void restartTasks(
         final Set<ExecutionVertexID> verticesToRestart =
                 executionVertexVersioner.getUnmodifiedExecutionVertices(executionVertexVersions);
 
+        if (verticesToRestart.isEmpty()) {
+            return;

Review Comment:
   A global failover can be superseded by a regional failover with regard to the tasks to restart. Here's an example:

   Consider a job that consists of only one pipelined region. A global failure happens first (caused by the OperatorCoordinator) and needs to restart all the tasks. It also needs `OperatorCoordinatorHolder#resetToCheckpoint()` to be invoked to recover from an inconsistent state.

   However, a task failure happens later, but at almost the same time, and it also needs to restart all the tasks. This bumps the versions of all the vertices, so the versions recorded for the global failure are stale by the time its restart runs. Therefore, `verticesToRestart` would be empty when `restartTasks(...)` is invoked for the global failure. With this early return, `OperatorCoordinatorHolder#resetToCheckpoint()` will not be invoked.
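To make the interleaving concrete, here is a minimal sketch of the version check. It is not Flink code: `FailoverVersionSketch`, its methods, and the string vertex IDs are hypothetical stand-ins that only mimic the idea behind `ExecutionVertexVersioner` (record a version per vertex on each failover, then restart only the vertices whose version is unchanged).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified stand-in for the vertex versioning idea; not the real Flink class. */
final class FailoverVersionSketch {

    // Current version per vertex. Assumption: plain strings instead of ExecutionVertexID.
    private final Map<String, Long> currentVersions = new HashMap<>();

    FailoverVersionSketch(Set<String> vertexIds) {
        for (String id : vertexIds) {
            currentVersions.put(id, 0L);
        }
    }

    /** Bumps the version of every given vertex and returns a snapshot of the new versions. */
    Map<String, Long> recordModifications(Set<String> vertexIds) {
        Map<String, Long> snapshot = new HashMap<>();
        for (String id : vertexIds) {
            snapshot.put(id, currentVersions.merge(id, 1L, Long::sum));
        }
        return snapshot;
    }

    /** Returns the vertices whose current version still matches the snapshot. */
    Set<String> getUnmodified(Map<String, Long> snapshot) {
        Set<String> unmodified = new HashSet<>();
        for (Map.Entry<String, Long> entry : snapshot.entrySet()) {
            if (entry.getValue().equals(currentVersions.get(entry.getKey()))) {
                unmodified.add(entry.getKey());
            }
        }
        return unmodified;
    }

    /**
     * Replays the scenario for a job with a single pipelined region: a global failure is
     * recorded first, and optionally a task failure is recorded before the global restart runs.
     * Returns the vertices the global restart would still see as its own.
     */
    static Set<String> verticesLeftForGlobalFailover(boolean taskFailureInterleaves) {
        Set<String> region = new HashSet<>(Arrays.asList("v1", "v2", "v3"));
        FailoverVersionSketch versioner = new FailoverVersionSketch(region);

        Map<String, Long> globalSnapshot = versioner.recordModifications(region);
        if (taskFailureInterleaves) {
            // The regional failover covers the same vertices and bumps their versions again.
            versioner.recordModifications(region);
        }
        return versioner.getUnmodified(globalSnapshot);
    }

    public static void main(String[] args) {
        System.out.println("global only: " + verticesLeftForGlobalFailover(false).size());
        System.out.println("superseded:  " + verticesLeftForGlobalFailover(true).size());
    }
}
```

In the superseded case the set is empty, so any work that is tied to the global failover but sits after an early return on `isEmpty()` would be skipped in this sketch, which is the concern with the coordinator reset.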