[ https://issues.apache.org/jira/browse/FLINK-29856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17635693#comment-17635693 ]
Mason Chen commented on FLINK-29856:
------------------------------------

Yes, the savepoint is reported as completed successfully in the Flink UI, metrics, and jobmanager logs, but it should have been unsuccessful since the operators did not finish checkpointing. Note that I didn't see any operator failures during the savepoint process.

It's not only the source; I also confirmed that the stateful sink operator doesn't call snapshotState/notifyCheckpointComplete either.

I haven't checked the savepoint contents, and I didn't notice the effects of corruption (e.g. missing source splits in state) since another checkpoint finished before I shut down the job. I will test again tomorrow.

> Triggering savepoint does not trigger source operator checkpoint
> -----------------------------------------------------------------
>
>                 Key: FLINK-29856
>                 URL: https://issues.apache.org/jira/browse/FLINK-29856
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.16.0
>            Reporter: Mason Chen
>            Priority: Major
>
> When I trigger a savepoint with the Flink K8s operator, I verified that two
> sources (KafkaSource and MultiClusterKafkaSource) do not invoke snapshotState
> or notifyCheckpointComplete. This is easily reproducible in a simple pipeline
> (e.g. KafkaSource -> print). In this case, the savepoint is reported as
> complete and successful, which is verified by the Flink Checkpoint UI tab and
> the jobmanager logs, e.g.
> `Triggering checkpoint 3 (type=SavepointType{name='Savepoint',
> postCheckpointAction=NONE, formatType=CANONICAL})`
>
> However, when the checkpoint occurs via the interval, I do see the sources
> checkpointing properly and the expected logs in the output.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)