Hi Robin,

this is a very good observation, and it may even be unintended behavior. Maybe Arvid (in CC) is more familiar with the checkpointing code?

Regards,
Timo


On 02.04.20 15:37, Robin Cassan wrote:
Hi all,

I am wondering if there is a way to make a Flink job fail (not cancel it) when one or several checkpoints have failed because they expired (took longer than the timeout)? I am using Flink 1.9.2 and have set `setTolerableCheckpointFailureNumber(1)`, which doesn't do the trick. Looking into the CheckpointFailureManager.java class, it looks like this only works when the checkpoint failure reason is `CHECKPOINT_DECLINED`, but the number of failures isn't incremented on `CHECKPOINT_EXPIRED`.
Am I missing something?
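
To make the reported behaviour concrete, here is a minimal toy model in plain Java. It is not Flink's code: the class `CheckpointFailureCounterSketch`, its methods, and the rule "fail once the count exceeds the tolerable number" are assumptions made purely for illustration, and only the two failure reasons named above are modelled. It only shows why a counter that ignores `CHECKPOINT_EXPIRED` would never trip on timeouts.

```java
import java.util.EnumSet;
import java.util.Set;

// Toy model only. All names here are hypothetical and NOT Flink's real classes.
public class CheckpointFailureCounterSketch {

    // Only the two reasons mentioned in this thread are modelled.
    enum FailureReason { CHECKPOINT_DECLINED, CHECKPOINT_EXPIRED }

    private final int tolerableFailures;
    private final Set<FailureReason> countedReasons;
    private int failureCount = 0;

    CheckpointFailureCounterSketch(int tolerableFailures, Set<FailureReason> countedReasons) {
        this.tolerableFailures = tolerableFailures;
        this.countedReasons = countedReasons;
    }

    // Returns true when this toy model would fail the job.
    // Assumption: the job fails once the count exceeds the tolerable number.
    boolean onCheckpointFailure(FailureReason reason) {
        if (countedReasons.contains(reason)) {
            failureCount++;
        }
        return failureCount > tolerableFailures;
    }

    public static void main(String[] args) {
        // Counter that only counts declined checkpoints, as described in the thread.
        CheckpointFailureCounterSketch declinedOnly = new CheckpointFailureCounterSketch(
                1, EnumSet.of(FailureReason.CHECKPOINT_DECLINED));
        for (int i = 1; i <= 3; i++) {
            boolean failJob = declinedOnly.onCheckpointFailure(FailureReason.CHECKPOINT_EXPIRED);
            System.out.println("expired #" + i + " -> fail job: " + failJob);
        }

        // Counter that also counts expired checkpoints (the behaviour asked for).
        CheckpointFailureCounterSketch countsExpired = new CheckpointFailureCounterSketch(
                1, EnumSet.allOf(FailureReason.class));
        for (int i = 1; i <= 3; i++) {
            boolean failJob = countsExpired.onCheckpointFailure(FailureReason.CHECKPOINT_EXPIRED);
            System.out.println("expired #" + i + " (counted) -> fail job: " + failJob);
        }
    }
}
```

In this model the first counter never asks for a job failure, however many checkpoints expire, while the second does so on the second expired checkpoint.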

Thanks!
