[jira] [Commented] (FLINK-25322) Support no-claim mode in changelog state backend

Feifan Wang (Jira) Mon, 24 Jul 2023 20:38:30 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-25322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746750#comment-17746750
 ]


Feifan Wang commented on FLINK-25322:
-------------------------------------

Thanks for the reply, [~masteryhx].
{quote}Users have to check the status of restore mode by the Flink UI or the 
REST API, right ?
{quote}
Yes, if users want to delete the restored, non-claimed checkpoint, they must 
check the Flink UI or REST API to confirm that the new job is state 
self-sustained. Otherwise, a new job that is not yet state self-sustained may 
fail to restart because of missing files. But I don't think this adds 
complexity for the user, because today the user already has to check the Flink 
UI or REST API to determine whether the new job has completed at least one 
checkpoint. If anything, I think checking whether the job is state 
self-sustained is more intuitive.
{quote}If users stop the job before the 'slowest materialization of all 
subtasks', this behaves like LEGACY mode, otherwise this could behave like 
NO_CLAIM mode, right ?
{quote}
In fact, I am also thinking about how to handle retained checkpoints taken 
before the job becomes state self-sustained. One option is not to keep 
checkpoints taken before that point as retained checkpoints. But this will 
cause data duplication when a job using a transactional sink resumes from the 
restored checkpoint. Another option is to record in the checkpoint metadata 
which state artifacts are borrowed from the non-claimed checkpoint, so that 
when the new checkpoint is later used for claim-mode recovery, those borrowed 
state artifacts are not deleted. Do you have any thoughts on this issue, 
[~pnowojski]?
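To make the second option more concrete, here is a rough sketch of the idea. It is only an illustration under my own assumptions: the class, field, and method names (BorrowedArtifactsSketch, CheckpointMeta, deletableOnClaim) and the file names are hypothetical and are not existing Flink APIs. The metadata records which artifacts were borrowed from the non-claimed checkpoint, and claim-mode cleanup skips them.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not Flink code: models checkpoint metadata that
// remembers which state artifacts are borrowed from a non-claimed checkpoint.
class BorrowedArtifactsSketch {

    /** Hypothetical metadata: every referenced artifact plus the borrowed subset. */
    static final class CheckpointMeta {
        final Set<String> artifacts;
        final Set<String> borrowed;

        CheckpointMeta(Set<String> artifacts, Set<String> borrowed) {
            if (!artifacts.containsAll(borrowed)) {
                throw new IllegalArgumentException(
                        "borrowed artifacts must be part of the checkpoint");
            }
            this.artifacts = Collections.unmodifiableSet(new TreeSet<>(artifacts));
            this.borrowed = Collections.unmodifiableSet(new TreeSet<>(borrowed));
        }

        /** A checkpoint that borrows nothing no longer depends on the old one. */
        boolean isSelfSustained() {
            return borrowed.isEmpty();
        }
    }

    /** Artifacts a job restoring in claim mode may delete: all except borrowed ones. */
    static Set<String> deletableOnClaim(CheckpointMeta meta) {
        Set<String> deletable = new TreeSet<>(meta.artifacts);
        deletable.removeAll(meta.borrowed);
        return deletable;
    }

    public static void main(String[] args) {
        // Assumed example: one artifact inherited from the non-claimed
        // checkpoint, one written by the new job.
        CheckpointMeta meta = new CheckpointMeta(
                Set.of("old-1.sst", "new-1.sst"), Set.of("old-1.sst"));
        System.out.println("self-sustained: " + meta.isSelfSustained());
        System.out.println("deletable on claim: " + deletableOnClaim(meta));
    }
}
```

With this shape, a checkpoint taken before the job is state self-sustained can still be retained, because a later claim-mode restore would know not to delete the borrowed files.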
{quote}Of course, IIUC, If users want to use NO_CLAIM mode, they'd like to 
retain a CP to let other jobs use.
{quote}
In fact, even when a user manually redeploys the same job after updating the 
business logic, being able to use no-claim mode is desirable, because no-claim 
mode guarantees that the job can be rolled back if there is a problem with the 
new logic.

> Support no-claim mode in changelog state backend
> ------------------------------------------------
>
>                 Key: FLINK-25322
>                 URL: https://issues.apache.org/jira/browse/FLINK-25322
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Checkpointing, Runtime / State Backends
>            Reporter: Dawid Wysakowicz
>            Assignee: Feifan Wang
>            Priority: Major
>             Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
