[ https://issues.apache.org/jira/browse/FLINK-31077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhu Zhu closed FLINK-31077.
---------------------------
Resolution: Fixed
master: eb17ec3f05d4bd512bc70ee79296d0b884894eaf
release-1.17: dca819556fb9b675852df99ada45e0f22262cb28
release-1.16: 4c8159140028cd0654a93dcb7c25fe074ad1f059
> Trigger checkpoint failed but it were shown as COMPLETED by rest API
> --------------------------------------------------------------------
>
> Key: FLINK-31077
> URL: https://issues.apache.org/jira/browse/FLINK-31077
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.17.0, 1.15.3, 1.16.1
> Reporter: Junrui Li
> Assignee: Junrui Li
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.17.0, 1.16.2
>
>
> Currently, we can trigger a checkpoint and poll its status via the REST API
> until it finishes, as introduced in FLINK-27101. However, a status of
> COMPLETED returned by the REST API does not guarantee that the checkpoint
> has really completed. If an exception occurs after the pendingCheckpoint is
> marked as completed
> ([here|https://github.com/apache/flink/blob/bf0ad52cbcb052961c54c94c7013f5ac0110ef8a/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java#L1309]),
> the checkpoint is not written to the HA service and we cannot fail over
> from this checkpoint.
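
The ordering problem described above can be illustrated with a minimal,
self-contained sketch. It is not the actual Flink code or the committed patch:
CheckpointStore, completeThenStore and storeThenComplete are hypothetical names
invented for this example. It only shows why completing a result future before
the HA store write lets a failed checkpoint look completed to whoever polls it.

```java
import java.util.concurrent.CompletableFuture;

public class CheckpointCompletionSketch {

    /** Hypothetical stand-in for the HA checkpoint store; not a Flink class. */
    interface CheckpointStore {
        void add(long checkpointId) throws Exception;
    }

    /** Problematic ordering: the result is completed before the store write. */
    static CompletableFuture<Long> completeThenStore(long id, CheckpointStore store) {
        CompletableFuture<Long> result = new CompletableFuture<>();
        result.complete(id); // anyone polling the result now sees a completed checkpoint
        try {
            store.add(id);
        } catch (Exception e) {
            // Too late: completeExceptionally has no effect on an already completed future.
            result.completeExceptionally(e);
        }
        return result;
    }

    /** Safer ordering: the result is completed only after the store write succeeds. */
    static CompletableFuture<Long> storeThenComplete(long id, CheckpointStore store) {
        CompletableFuture<Long> result = new CompletableFuture<>();
        try {
            store.add(id);
            result.complete(id);
        } catch (Exception e) {
            result.completeExceptionally(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed failure mode: the HA store rejects every write.
        CheckpointStore failing = id -> { throw new Exception("HA store unavailable"); };
        System.out.println("complete-then-store failed: "
                + completeThenStore(1L, failing).isCompletedExceptionally());
        System.out.println("store-then-complete failed: "
                + storeThenComplete(1L, failing).isCompletedExceptionally());
    }
}
```

With the first ordering the caller never learns about the store failure, which
matches the symptom reported here; with the second ordering the failure is
propagated to the result.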