[jira] [Commented] (FLINK-25322) Support no-claim mode in changelog state backend

Piotr Nowojski (Jira) Thu, 27 Jul 2023 01:29:07 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-25322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17747840#comment-17747840
 ]


Piotr Nowojski commented on FLINK-25322:
----------------------------------------

Thanks for pointing this out [~masteryhx].
{quote}
But this will cause data duplication when the job using the transactional sink 
resumes from that restored checkpoint
{quote}
I think this is not a viable option from the perspective of users.
{quote}
Another option is to record in the checkpoint metadata which state artifacts 
are borrowed from the non-claimed checkpoint, and when the new checkpoint is 
used for claim mode recovery, those state artifacts borrowed from the 
non-claimed checkpoint will not be deleted. 
{quote}
This, on the other hand, I think breaks the {{NO_CLAIM}} mode recovery.
# The user has some checkpoint/savepoint {{A}} that is under their own control.
# The user starts a new job from checkpoint/savepoint {{A}}, using {{NO_CLAIM}}.
# The job fails more times than the supported number of failovers, or is 
cancelled, leaving behind a retained checkpoint {{B}}. {{B}} still depends on 
{{A}}, as the job hasn't managed to fully materialize the "borrowed" files.
After this, how should the user know whether it is safe to, for example, 
manually delete {{A}}? The retained checkpoint {{B}} might be left unused for 
an arbitrarily long time.
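
To make the dependency problem in the steps above concrete, here is a minimal 
toy model. It is only a sketch and does not use any Flink API: the 
{{Checkpoint}} class, the {{blocks_deletion}} helper and all file names are 
hypothetical, and the idea that checkpoint metadata lists "borrowed" files is 
exactly the proposal quoted above, not existing behaviour.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Toy model (not a Flink class): a name plus the files its metadata references."""
    name: str
    # Files written by this checkpoint itself.
    owned_files: set = field(default_factory=set)
    # Files still referenced from another, non-claimed checkpoint.
    borrowed_files: set = field(default_factory=set)


def blocks_deletion(source, retained):
    """Return the names of retained checkpoints that still reference files owned by `source`."""
    return [c.name for c in retained if c.borrowed_files & source.owned_files]


# Step 1: the user owns checkpoint/savepoint A.
a = Checkpoint("A", owned_files={"a/sst-1", "a/sst-2"})

# Steps 2-3: a job restored from A with NO_CLAIM stops before it has fully
# materialized its state, leaving retained checkpoint B behind.
b = Checkpoint("B", owned_files={"b/changelog-1"}, borrowed_files={"a/sst-2"})
print(blocks_deletion(a, [b]))       # B still depends on A

# Had the job fully materialized its state first, nothing would be borrowed.
b_full = Checkpoint("B'", owned_files={"b/sst-1", "b/changelog-1"})
print(blocks_deletion(a, [b_full]))  # nothing depends on A
```

In this model the user would have to inspect every retained checkpoint ever 
derived from {{A}} before deleting it, which is the bookkeeping that 
{{NO_CLAIM}} is supposed to spare them.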

In the end, I think the user just has to use one of the workarounds that I 
mentioned before, or accept that recovering with the changelog state backend in 
{{NO_CLAIM}} mode will most likely make the first checkpoint take quite long.


> Support no-claim mode in changelog state backend
> ------------------------------------------------
>
>                 Key: FLINK-25322
>                 URL: https://issues.apache.org/jira/browse/FLINK-25322
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Checkpointing, Runtime / State Backends
>            Reporter: Dawid Wysakowicz
>            Assignee: Feifan Wang
>            Priority: Major
>             Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
