Hello,

This can have several causes, such as back-pressure, large snapshots,
or bugs.

Could you please share:
- the stats of the previous (successful) checkpoints
- back-pressure metrics for sources
- the Flink version you are using
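
In case it helps, here is a minimal sketch of pulling this information
from the JobManager REST API instead of the web UI. It assumes the REST
endpoint is reachable at http://localhost:8081 (an assumption, adjust to
your setup); the job ID and the vertex ID of the Kafka source are
placeholders you would look up yourself. The /overview response should
include the Flink version, but please verify the paths and fields against
the REST API docs for your release.

```python
import json
from urllib.request import urlopen

# Assumption: address of the JobManager REST endpoint; adjust as needed.
BASE_URL = "http://localhost:8081"


def checkpoints_url(job_id, base_url=BASE_URL):
    """URL of the checkpoint statistics for one job."""
    return f"{base_url}/jobs/{job_id}/checkpoints"


def backpressure_url(job_id, vertex_id, base_url=BASE_URL):
    """URL of the back-pressure sample for one job vertex (e.g. the Kafka source)."""
    return f"{base_url}/jobs/{job_id}/vertices/{vertex_id}/backpressure"


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def collect_diagnostics(job_id, source_vertex_id, base_url=BASE_URL):
    """Gather cluster overview, checkpoint stats and source back-pressure."""
    return {
        "overview": fetch_json(f"{base_url}/overview"),
        "checkpoints": fetch_json(checkpoints_url(job_id, base_url)),
        "source_backpressure": fetch_json(
            backpressure_url(job_id, source_vertex_id, base_url)
        ),
    }
```

Calling collect_diagnostics("<job-id>", "<source-vertex-id>") against a
running cluster and dumping the result with json.dumps(..., indent=2)
gives something that can be attached to a reply.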

Regards,
Roman


On Thu, Mar 11, 2021 at 7:03 AM Alexey Trenikhun <yen...@msn.com> wrote:
>
> Hello,
> We are experiencing a problem with checkpoints failing due to timeout
> (already set to 30 minutes, still failing). The checkpoints were not too
> big before they started to fail, around 1.2 GB. It looks like one of the
> sources (Kafka) never acknowledged (see attached screenshot). What could
> be the reason?
>
> Thanks,
> Alexey
>
>
