[ 
https://issues.apache.org/jira/browse/FLINK-25322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739364#comment-17739364
 ] 

Feifan Wang commented on FLINK-25322:
-------------------------------------

Sorry for the late reply, [~pnowojski].
{quote}just use the native savepoint code path for this
{quote}
Native savepoints can only be triggered by users, and cannot be taken as 
quickly or as frequently as log-based checkpoints. Therefore, they can only be 
used in scenarios where the user actively restarts the job.
{quote}use claim mode for recovery/restarts
{quote}
In claim mode Flink takes ownership of the snapshot and may delete it, which 
prevents users from starting multiple jobs from the same snapshot or rolling a 
job back to old code (which is useful for some important jobs).
{quote}use/implement fast duplicating FS
{quote}
Fast duplication is not always available. For example, the very commonly used 
Hadoop filesystem does not currently support fast copy (see 
[HDFS-2139|https://issues.apache.org/jira/browse/HDFS-2139]).
{quote}nobody has complained about no-claim mode not working for the changelog 
statebackend so far
{quote}
In fact, the lack of no-claim mode support is one of the obstacles to promoting 
log-based checkpoints in our company. Our team provides Flink-related technical 
support to other teams in our company that develop Flink jobs, and some of our 
users want to use both log-based checkpoints and no-claim mode. For example, 
one of our users has many thousands of concurrent jobs. Each job is spread 
across many machines because of its high parallelism, so it often restarts 
because of machine failures. Before using log-based checkpoints, they could not 
set the checkpoint interval below 10 minutes, because checkpointing became 
unstable with anything shorter. Each job restart caused roughly ten minutes of 
data to be reconsumed, which in turn caused tens of minutes of latency. We are 
trying log-based checkpoints on these jobs, and everything works well except 
that no-claim mode is not supported.
{quote}That would require a bigger discussion I think.
{quote}
I'd love to initiate such a discussion. How do I go about it?

> Support no-claim mode in changelog state backend
> ------------------------------------------------
>
>                 Key: FLINK-25322
>                 URL: https://issues.apache.org/jira/browse/FLINK-25322
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Checkpointing, Runtime / State Backends
>            Reporter: Dawid Wysakowicz
>            Assignee: Feifan Wang
>            Priority: Major
>             Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
