Re: Failure to restore from last completed checkpoint

2023-09-08 Thread Alexis Sarda-Espinosa
Hello, Just a shot in the dark here, but could it be related to https://issues.apache.org/jira/browse/FLINK-32241 ? Such failures can cause many exceptions, but I think the ones you've included aren't pointing to the root cause, so I'm not sure if that issue applies to you. Regards, Alexis. On

Re: Failure to restore from last completed checkpoint

2023-09-08 Thread Jacqlyn Bender via user
Hi Yanfei, We were never able to restore from a checkpoint; we ended up restoring from a savepoint as a fallback. Would those logs suggest we failed to take a checkpoint before the job manager restarted? Our observability monitors showed no failed checkpoints. Here is an exception that occurred

Re: Failure to restore from last completed checkpoint

2023-09-07 Thread Yanfei Lei
Hey Jacqlyn, According to the stack trace, it seems that there is a problem when the checkpoint is triggered. Does this problem occur after the restore? Could you share some logs related to restoring? Best, Yanfei Jacqlyn Bender via user wrote on Fri, Sep 8, 2023 at 05:11: > > Hey folks, > > > We

Failure to restore from last completed checkpoint

2023-09-07 Thread Jacqlyn Bender via user
Hey folks, We experienced a pipeline failure where our job manager restarted and, for some reason, we were unable to restore from our last successful checkpoint. We had been completing checkpoints regularly every 10 minutes up to this failure, with 0 failed checkpoints logged. Using Flink version 1.17.1.
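
For anyone checking what was actually written to checkpoint storage in a situation like this, here is a minimal sketch of locating the most recent completed checkpoint on disk. It is not from the original post and does not diagnose the failure above. It assumes the usual layout of `<state.checkpoints.dir>/<job-id>/chk-<n>/_metadata` on a locally mounted filesystem, and it treats a `chk-<n>` directory without a `_metadata` file as incomplete. The function name `latest_completed_checkpoint` is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Checkpoint directories are assumed to be named chk-<n>; siblings such as
# "shared" and "taskowned" do not match and are skipped.
CHK_DIR = re.compile(r"^chk-(\d+)$")


def latest_completed_checkpoint(job_checkpoint_dir: str) -> Optional[Path]:
    """Return the highest-numbered chk-<n> directory that contains _metadata.

    Returns None if the directory is missing or holds no such checkpoint.
    """
    root = Path(job_checkpoint_dir)
    if not root.is_dir():
        return None

    best_id, best_path = -1, None
    for child in root.iterdir():
        match = CHK_DIR.match(child.name)
        if not (match and child.is_dir()):
            continue
        # Assumption: a checkpoint without _metadata was never finalized.
        if not (child / "_metadata").is_file():
            continue
        chk_id = int(match.group(1))  # compare numerically, so chk-10 > chk-9
        if chk_id > best_id:
            best_id, best_path = chk_id, child
    return best_path
```

If a path comes back, it can be passed to `flink run -s <path>` to attempt a manual restore from that retained checkpoint. If nothing comes back even though the monitors reported completed checkpoints, that points at retention or storage configuration rather than at the checkpointing itself.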