Same error again today. Any tips? I'm considering downgrading to Flink 1.14.
On Wed, Dec 14, 2022 at 11:51 AM Lars Skjærven wrote:
As far as I understand, we are not specifying anything regarding the restore mode, so I guess we're using the default (NO_CLAIM).
We're using Ververica Platform to handle deploys, and it's a bit opaque what happens underneath.
It happened again this morning:
Caused by:
Hi Lars,
Have you used any of the new restore modes that were introduced with 1.15?
https://flink.apache.org/2022/05/06/restore-modes.html
Best regards,
Martijn
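For reference, a sketch of how a restore mode can be set explicitly when submitting a job from a snapshot (the savepoint path and jar name below are placeholders, not from this thread):

```shell
# Restore modes are available from Flink 1.15 onwards.
# NO_CLAIM (the default): Flink does not assume ownership of the snapshot
#   and forces a full first checkpoint so it never depends on the old files.
# CLAIM: Flink takes ownership of the snapshot and may delete it once it
#   is no longer needed.
flink run \
  -s gs://my-bucket/savepoints/savepoint-abc123 \
  -restoreMode NO_CLAIM \
  my-job.jar
```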
On Fri, Dec 9, 2022 at 2:52 PM Lars Skjærven wrote:
Lifecycle rules: None
On Fri, Dec 9, 2022 at 3:17 AM Hangxiang Yu wrote:
Hi, Lars.
Could you check whether you have configured lifecycle rules for Google Cloud Storage [1]? Bucket lifecycle rules are not recommended when the bucket is used for Flink checkpoints.
[1] https://cloud.google.com/storage/docs/lifecycle
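If it helps, one way to inspect whether lifecycle rules are active on the bucket, assuming the gsutil CLI is available (the bucket name is a placeholder):

```shell
# Prints the bucket's lifecycle configuration as JSON,
# or a message stating that no lifecycle configuration is set.
gsutil lifecycle get gs://my-checkpoint-bucket
```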
On Fri, Dec 9, 2022 at 2:02 AM Lars Skjærven wrote:
Hello,
We had an incident today with a job that could not restore after a crash (for unknown reasons). Specifically, it fails due to a missing checkpoint file. We've experienced this a total of three times with Flink 1.15.2, but never with 1.14.x. The last time was during a node upgrade, but that was not
> /data/flink1_10/tmp/flink-io-01229972-48d4-4229-ac8c-33f0edfe5b7c/job_5ec178dc885a8f1a64c1925e182562e3_op_KeyedProcessOperator_da2b90ef97e5c844980791c8fe08b926__1_2__uuid_772b4663-f633-4ed5-a67a-d1904760a160/db/001888.sst

Double-check whether this file exists. If it failed to download, you need to find out why the download failed.

Best,
Congxian

chenxyz wrote on Wed, Apr 1, 2020 at 3:02 PM:
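A sketch of how one might verify that the checkpoint state files still exist on HDFS, per the suggestion to double-check (the checkpoint directory is from the configuration below; `<job-id>` is a placeholder, `5ec178dc885a8f1a64c1925e182562e3` in the log above):

```shell
# Recursively list the checkpoint's state files on HDFS to verify
# they are present and non-empty before the job tries to download them.
hdfs dfs -ls -R hdfs://nameservice1/data/flink1_10/checkpoint/<job-id>
```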
The job uses RocksDB as the state backend. When the job restarts after an exception, it frequently fails with "Could not restore keyed state backend for KeyedProcessOperator". How can this problem be solved?

Version: 1.10 standalone

Configuration:

state.backend: rocksdb
state.checkpoints.dir: hdfs://nameservice1/data/flink1_10/checkpoint
state.backend.incremental: true
jobmanager.execution.failover