[
https://issues.apache.org/jira/browse/FLINK-25322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17747840#comment-17747840
]
Piotr Nowojski commented on FLINK-25322:
----------------------------------------
Thanks for pointing this out [~masteryhx].
{quote}
But this will cause data duplication when the job using the transactional sink
resumes from that restored checkpoint
{quote}
I think this is not a viable option from the perspective of users.
{quote}
Another option is to record in the checkpoint metadata which state artifacts
are borrowed from the non-claimed checkpoint, and when the new checkpoint is
used for claim mode recovery, those state artifacts borrowed from the
non-claimed checkpoint will not be deleted.
{quote}
This, on the other hand, I think breaks the {{NO_CLAIM}} mode recovery.
# The user has some checkpoint/savepoint {{A}} that's under their own control
# The user starts a new job from checkpoint/savepoint {{A}}, using {{NO_CLAIM}}
# The job fails more times than the supported number of failovers, or is cancelled, leaving behind
a retained checkpoint {{B}}. {{B}} still depends on {{A}}, as the job hasn't
managed to fully materialize the "borrowed" files.
After this, how should the user know whether it is safe to, for example, manually
delete {{A}}? The retained checkpoint {{B}} might be left unused for an
arbitrarily long time.
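To make the dependency in the scenario above concrete, here is a minimal Python sketch. It is a toy model, not Flink code: the {{Checkpoint}} class, its {{owned}} and {{borrowed}} fields, and the two helper functions are hypothetical names invented for illustration and do not correspond to any Flink API.

```python
from dataclasses import dataclass

# Toy model only. None of these names exist in Flink; they are assumptions
# made up to illustrate the A/B dependency described above.


@dataclass(frozen=True)
class Checkpoint:
    name: str
    owned: frozenset                   # files this checkpoint wrote itself
    borrowed: frozenset = frozenset()  # files it still references from another checkpoint


def is_restorable(cp: Checkpoint, storage: frozenset) -> bool:
    """A checkpoint can be restored only if every file it references still exists."""
    return (cp.owned | cp.borrowed) <= storage


def safe_to_delete(target: Checkpoint, retained: list) -> bool:
    """Deleting `target` is safe only if no other retained checkpoint borrows its files."""
    return all(
        not (cp.borrowed & target.owned)
        for cp in retained
        if cp is not target
    )


# Step 1: the user owns checkpoint/savepoint A.
a = Checkpoint("A", owned=frozenset({"a-1", "a-2"}))

# Steps 2-3: the job restored from A in NO_CLAIM mode and left behind a retained
# checkpoint B before it fully materialized the borrowed file "a-2".
b = Checkpoint("B", owned=frozenset({"b-1"}), borrowed=frozenset({"a-2"}))

storage = a.owned | b.owned

print(safe_to_delete(a, [a, b]))            # is it safe to delete A?
print(is_restorable(b, storage - a.owned))  # can B be restored after A is deleted?
```

In this model, answering the question means inspecting every retained checkpoint that might still borrow from {{A}}, which is bookkeeping the user would have to do on their own.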
In the end, I think the user just has to use one of the workarounds that I mentioned
before, or just accept that recovering with the changelog state backend in
{{NO_CLAIM}} mode will most likely cause the first checkpoint to be quite long.
> Support no-claim mode in changelog state backend
> ------------------------------------------------
>
> Key: FLINK-25322
> URL: https://issues.apache.org/jira/browse/FLINK-25322
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Checkpointing, Runtime / State Backends
> Reporter: Dawid Wysakowicz
> Assignee: Feifan Wang
> Priority: Major
> Fix For: 1.18.0
>
>