zentol opened a new pull request #12948: URL: https://github.com/apache/flink/pull/12948
Fixes an issue where a task would be canceled if a TaskExecutor reported the execution via heartbeat before the JobMaster had processed the deployment acknowledgement. This can happen because the Acknowledge was lost, or due to message re-ordering on the JobMaster side.

The `ExecutionDeploymentTracker` and `ExecutionDeploymentReconciler` now distinguish between PENDING and DEPLOYED executions. Executions in the PENDING state are ignored for reconciliation purposes. An execution moves to the DEPLOYED state once the JobMaster has processed the acknowledgement from the TaskExecutor.

If the task is never acknowledged, the JobMaster fails the task as before, and the task is then removed from the tracker through the terminal state transition, also as before.
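To make the PENDING/DEPLOYED distinction concrete, here is a minimal sketch of how such tracking and reconciliation could work. It is not the code from this PR: the class `DeploymentTrackerSketch`, its method names, and the use of plain `String` execution IDs are all hypothetical. It also assumes one particular reading of "ignored": a PENDING execution is never treated as unknown when a TaskExecutor reports it, and never treated as missing when a report omits it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; hypothetical names, not Flink's actual classes. */
class DeploymentTrackerSketch {

    enum DeploymentState { PENDING, DEPLOYED }

    // Executions the JobMaster side has submitted and not yet seen terminate.
    private final Map<String, DeploymentState> tracked = new HashMap<>();

    /** Called when the deployment is submitted, before any acknowledgement. */
    void startTrackingPending(String executionId) {
        tracked.put(executionId, DeploymentState.PENDING);
    }

    /** Called once the acknowledgement has been processed. */
    void completeDeployment(String executionId) {
        // Only upgrade executions that are still tracked.
        tracked.computeIfPresent(executionId, (id, state) -> DeploymentState.DEPLOYED);
    }

    /** Called on a terminal state transition (finished, failed, canceled). */
    void stopTracking(String executionId) {
        tracked.remove(executionId);
    }

    /** Reported executions that are not tracked at all; candidates for cancellation. */
    Set<String> unknownExecutions(Set<String> reported) {
        Set<String> unknown = new HashSet<>(reported);
        unknown.removeAll(tracked.keySet()); // PENDING entries count as known
        return unknown;
    }

    /** DEPLOYED executions absent from the report; PENDING ones are skipped. */
    Set<String> missingExecutions(Set<String> reported) {
        Set<String> missing = new HashSet<>();
        for (Map.Entry<String, DeploymentState> entry : tracked.entrySet()) {
            if (entry.getValue() == DeploymentState.DEPLOYED
                    && !reported.contains(entry.getKey())) {
                missing.add(entry.getKey());
            }
        }
        return missing;
    }
}
```

In this sketch, a heartbeat that arrives before the acknowledgement is processed finds the execution already tracked as PENDING, so it is not flagged as unknown. A heartbeat that omits a PENDING execution does not flag it as missing either, since the deployment may still be in flight.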
