[
https://issues.apache.org/jira/browse/FLINK-20222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Metzger updated FLINK-20222:
-----------------------------------
Fix Version/s: 1.12.0
> The CheckpointCoordinator should reset the OperatorCoordinators when a job
> fails before the first checkpoint.
> ------------------------------------------------------------------------------------------------------
>
> Key: FLINK-20222
> URL: https://issues.apache.org/jira/browse/FLINK-20222
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Reporter: Jiangjie Qin
> Assignee: Stephan Ewen
> Priority: Critical
> Fix For: 1.12.0
>
>
> Right now, if a job fails before the first successful checkpoint, the
> CheckpointCoordinator does not reset the OperatorCoordinator state. This may
> leave the OperatorCoordinators in an inconsistent state.
> The CheckpointCoordinator should also reset the OperatorCoordinator state in
> this case, just like it does for the master hooks. In effect, this is a
> "reset to no checkpoint". There are two options for the fix:
> 1. Add a reset() method to the OperatorCoordinator.
> 2. Call resetToCheckpoint(null) on the OperatorCoordinator.
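
For illustration, the second option could look roughly like the sketch below. This is a minimal sketch under stated assumptions, not Flink's actual code: SketchCoordinator, CountingCoordinator, ResetSketch and restoreAfterFailure are hypothetical names, and the single-argument resetToCheckpoint signature is a simplified stand-in for the method named in the issue. The assumption is that a null payload means "no checkpoint has completed yet", so the coordinator falls back to its initial state.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified stand-in for a coordinator; not Flink's real interface.
interface SketchCoordinator {
    // Assumption: checkpointData == null means "reset to no checkpoint".
    void resetToCheckpoint(byte[] checkpointData);
}

// Toy coordinator whose only state is the number of splits it has handed out.
final class CountingCoordinator implements SketchCoordinator {
    private int assignedSplits;

    void assignSplit() {
        assignedSplits++;
    }

    int assignedSplits() {
        return assignedSplits;
    }

    // Serializes the current state, as a checkpoint would.
    byte[] snapshot() {
        return Integer.toString(assignedSplits).getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public void resetToCheckpoint(byte[] checkpointData) {
        if (checkpointData == null) {
            // No completed checkpoint: return to the initial state.
            assignedSplits = 0;
        } else {
            assignedSplits =
                Integer.parseInt(new String(checkpointData, StandardCharsets.UTF_8));
        }
    }
}

final class ResetSketch {
    // Hypothetical restore path. latestCheckpoint is null when the job failed
    // before the first successful checkpoint; the reset is still invoked.
    static void restoreAfterFailure(SketchCoordinator coordinator, byte[] latestCheckpoint) {
        coordinator.resetToCheckpoint(latestCheckpoint);
    }

    public static void main(String[] args) {
        CountingCoordinator coordinator = new CountingCoordinator();
        coordinator.assignSplit();
        coordinator.assignSplit();

        // Failure before any checkpoint completed: reset with a null payload.
        restoreAfterFailure(coordinator, null);
        System.out.println("splits after reset: " + coordinator.assignedSplits());
    }
}
```

The first option would instead add a separate reset() method for this case. Compared with that, passing null keeps a single restore path, but every implementation then has to handle a null payload.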