[
https://issues.apache.org/jira/browse/FLINK-29856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17635693#comment-17635693
]
Mason Chen commented on FLINK-29856:
------------------------------------
Yes, the savepoint is reported as completed successfully in the Flink UI,
metrics, and jobmanager logs, but it should be unsuccessful since the operators
did not finish checkpointing. Note that I didn't see any operator failures
during the savepoint process.
It's not only the source: I also confirmed that the stateful sink operator
doesn't call snapshotState/notifyCheckpointComplete.
I haven't checked the savepoint contents, and I didn't notice the effects of
corruption (e.g. missing source splits in state) since another checkpoint
finished before I shut down the job. I will test again tomorrow.
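For anyone retesting, here is a minimal sketch of how the savepoint trigger line
quoted in the issue description could be picked out of the jobmanager logs, so
its checkpoint id can be compared against the operators' snapshotState logs.
The class name SavepointTriggerLine and the method savepointId are hypothetical
helpers, not part of Flink, and the regex assumes the log format shown in the
issue description.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Flink class): extracts the checkpoint id from a
// jobmanager "Triggering checkpoint" line when the type is a savepoint.
public class SavepointTriggerLine {
    // Assumes the format quoted in the issue, e.g.
    // Triggering checkpoint 3 (type=SavepointType{name='Savepoint', ...})
    private static final Pattern TRIGGER =
        Pattern.compile("Triggering checkpoint (\\d+) \\(type=(\\w+)\\{");

    /** Returns the checkpoint id if the line reports a savepoint trigger. */
    public static Optional<Long> savepointId(String line) {
        Matcher m = TRIGGER.matcher(line);
        if (m.find() && m.group(2).equals("SavepointType")) {
            return Optional.of(Long.parseLong(m.group(1)));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String line = "Triggering checkpoint 3 (type=SavepointType{name='Savepoint', "
            + "postCheckpointAction=NONE, formatType=CANONICAL})";
        System.out.println(savepointId(line)); // prints Optional[3]
    }
}
```

If the returned id never shows up in the source or sink logs, that would match
the behaviour described above.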
> Triggering savepoint does not trigger source operator checkpoint
> -----------------------------------------------------------------
>
> Key: FLINK-29856
> URL: https://issues.apache.org/jira/browse/FLINK-29856
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.16.0
> Reporter: Mason Chen
> Priority: Major
>
> When I trigger a savepoint with the Flink K8s operator, I verified that two
> sources (KafkaSource and MultiClusterKafkaSource) do not invoke snapshotState
> or notifyCheckpointComplete. This is easily reproducible in a simple pipeline
> (e.g. KafkaSource -> print). In this case, the savepoint is reported as
> complete and successful, which is verified by the Flink Checkpoint UI tab and
> the jobmanager logs, e.g.
> `Triggering checkpoint 3 (type=SavepointType{name='Savepoint',
> postCheckpointAction=NONE, formatType=CANONICAL})`
>
> However, when the checkpoint is triggered by the periodic interval, I do see
> the sources checkpointing properly and the expected logs in the output.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)