Just sharing some additional information.

When a Flink application deployed on YARN exhausts its restart policy, the
whole YARN application fails. If you then start another instance (a new
YARN application), we cannot recover from the latest checkpoint even if
high availability is configured, because the clusterId (i.e. the
applicationId) has changed.
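
For that last case, a manual recovery is still possible if the checkpoint
survives the failed application. A minimal sketch, assuming the DataStream
API; the class name, interval, and placeholder pipeline are mine, not from
this thread:

import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RetainedCheckpointJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every 60s and retain the latest completed checkpoint
        // after failure/cancellation so it can be restored from manually.
        env.enableCheckpointing(60_000);
        env.getCheckpointConfig().enableExternalizedCheckpoints(
                ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

        // Placeholder pipeline; replace with the real job topology.
        env.fromElements("a", "b", "c").print();

        env.execute("retained-checkpoint-job");
    }
}

The new YARN application can then be pointed at the retained checkpoint
explicitly, e.g. bin/flink run -s <checkpoint-path> ..., where
<checkpoint-path> is the chk-* directory left behind by the old
application.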


Best,
Yang

Zhu Zhu <reed...@gmail.com> wrote on Mon, May 25, 2020 at 11:17 AM:

> Hi M,
>
> Regarding your questions:
> 1. Yes. The id is fixed once the job graph is generated (see the sketch
> below).
> 2. Yes.
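> 
> To illustrate point 1 (a minimal sketch, not from the thread; the
> 10-attempt/10-second values just mirror the question):
> 
> import java.util.concurrent.TimeUnit;
> 
> import org.apache.flink.api.common.restartstrategy.RestartStrategies;
> import org.apache.flink.api.common.time.Time;
> import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
> 
> public class FixedDelayRestartExample {
>     public static void main(String[] args) throws Exception {
>         StreamExecutionEnvironment env =
>                 StreamExecutionEnvironment.getExecutionEnvironment();
> 
>         // Up to 10 restart attempts, 10 seconds apart; every attempt
>         // reuses the same job id because the job graph does not change.
>         env.setRestartStrategy(
>                 RestartStrategies.fixedDelayRestart(10, Time.of(10, TimeUnit.SECONDS)));
> 
>         env.fromElements(1, 2, 3).print();
>         env.execute("fixed-delay-restart-example");
>     }
> }
> 
> The same strategy can also be set cluster-wide with
> restart-strategy: fixed-delay in flink-conf.yaml.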
>
> Regarding yarn mode:
> 1. The job id stays the same, because the job graph is generated once at
> the client side and persisted in DFS for reuse.
> 2. Yes, if high availability is enabled (see the config sketch below).
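> 
> For reference, YARN HA is typically enabled through flink-conf.yaml; a
> sketch with placeholder hosts and paths (not values from this thread):
> 
> high-availability: zookeeper
> high-availability.zookeeper.quorum: zk-host-1:2181,zk-host-2:2181
> high-availability.storageDir: hdfs:///flink/ha
> # Allow YARN to restart the ApplicationMaster on failure.
> yarn.application-attempts: 4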
>
> Thanks,
> Zhu Zhu
>
> M Singh <mans2si...@yahoo.com> wrote on Sat, May 23, 2020 at 4:06 AM:
>
>> Hi Flink Folks:
>>
>> If I have a Flink application configured with 10 restarts, and it fails
>> and restarts:
>>
>> 1. Does the job have the same id?
>> 2. Does the automatically restarted application pick up from the last
>> checkpoint? I assume it does, but just want to confirm.
>>
>> Also, if it is running on AWS EMR, I believe EMR/YARN is configured to
>> restart the job 3 times (after it has exhausted its restart policy). If
>> that is the case:
>> 1. Does the job get a new id? I believe it does, but just want to
>> confirm.
>> 2. Does the YARN restart honor the last checkpoint? I believe it does
>> not, but is there a way to make it restart from the last checkpoint of
>> the failed job (after it has exhausted its restart policy)?
>>
>> Thanks
>>
>>
>>
