Re: Flink Job Restarts if the metadata already exists for some Checkpoint

2023-05-11 Thread Weihua Hu
Hi, checkpoints are only used for failover within a single job. Once a job is cancelled, the related checkpoint-count metadata (stored in HA) is removed, but the checkpoint data can be retained if you configured that. IIUC, redeploying or updating the job cancels the old job and then starts a new one.
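
To make the failure mode in this thread concrete, here is a minimal Python sketch (not Flink code) of how retained checkpoint data can collide with a redeployed job. It assumes the usual on-disk layout `<checkpoint dir>/<job id>/chk-<n>/_metadata` and assumes the new job's checkpoint counter starts again at 1 once the HA metadata is gone. The function names, the NFS path, and the job IDs are hypothetical and only serve the illustration.

```python
from pathlib import PurePosixPath


def checkpoint_metadata_path(checkpoint_dir, job_id, checkpoint_id):
    """Build the assumed location of one checkpoint's _metadata file."""
    return PurePosixPath(checkpoint_dir) / job_id / f"chk-{checkpoint_id}" / "_metadata"


def colliding_checkpoints(retained, checkpoint_dir, job_id, first_id=1, count=3):
    """Return the paths a new job would write that are already present in `retained`."""
    planned = (
        checkpoint_metadata_path(checkpoint_dir, job_id, n)
        for n in range(first_id, first_id + count)
    )
    return [path for path in planned if path in retained]


# Hypothetical values, chosen only for the demonstration.
CHECKPOINT_DIR = "/mnt/nfs/checkpoints"
OLD_JOB_ID = "0" * 32
OTHER_JOB_ID = "a" * 32

# Checkpoints 1 and 2 of the cancelled job were retained on disk.
retained = {checkpoint_metadata_path(CHECKPOINT_DIR, OLD_JOB_ID, n) for n in (1, 2)}

# Same job ID and same dir: the restarted counter points at existing files.
print(colliding_checkpoints(retained, CHECKPOINT_DIR, OLD_JOB_ID))

# A different job ID, or a fresh checkpoint dir, avoids the overlap.
print(colliding_checkpoints(retained, CHECKPOINT_DIR, OTHER_JOB_ID))
print(colliding_checkpoints(retained, "/mnt/nfs/checkpoints-v2", OLD_JOB_ID))
```

Under these assumptions, only the first call reports overlapping paths, which matches the two remedies suggested in the thread: use a new checkpoint dir, or avoid reusing the job ID.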

Re: Flink Job Restarts if the metadata already exists for some Checkpoint

2023-05-10 Thread amenreet sodhi
Hey Hang, I am deploying my Flink job in HA application mode. Whenever I redeploy my job, or deploy an updated version of it, it uses the same job_id. I haven't configured a fixed job ID anywhere; I think it's doing that by default. Can you share where I can configure this? I tried

Re: Flink Job Restarts if the metadata already exists for some Checkpoint

2023-05-10 Thread amenreet sodhi
Hi Weihua, I am deploying my Flink job in HA application mode on a Kubernetes cluster. I am using an external NFS mount for storing checkpoints. For some reason, whenever I deploy an updated version of my application, it uses the same job_id for the new job as for the previous job. Thus the Flink

Re: Flink Job Restarts if the metadata already exists for some Checkpoint

2023-05-09 Thread Weihua Hu
Hi, > if for some reason there exists a checkpoint by same name. Could you give more details about your scenario here? From your description, I guess this problem occurred when the job restarted. Was this restart triggered manually? In normal restart processing, the job will retrieve the

Re: Flink Job Restarts if the metadata already exists for some Checkpoint

2023-05-09 Thread Hang Ruan
Hi amenreet, As Hangxiang said, we should use a new checkpoint dir if the new job has the same jobId as the old one. Otherwise, do not use a fixed jobId, and the checkpoint dirs will not conflict. Best, Hang Hangxiang Yu wrote on Wed, May 10, 2023 at 10:35: > Hi, > I guess you used a fixed JOB_ID, and

Re: Flink Job Restarts if the metadata already exists for some Checkpoint

2023-05-09 Thread Hangxiang Yu
Hi, I guess you used a fixed JOB_ID and configured the same checkpoint dir as before? And you may also have started the job without the previous state? The new job knows nothing about the previous checkpoints, which is why it fails when it tries to create a new checkpoint. I'd like to suggest

Flink Job Restarts if the metadata already exists for some Checkpoint

2023-05-09 Thread amenreet sodhi
Hi all, Is there any way to prevent a Flink job from restarting, or to override the checkpoint metadata, if for some reason a checkpoint by the same name already exists? I get the following exception and my job restarts. I have been trying to find a solution for a very long time but haven't found anything useful yet,