We are tuning some of the Flink jobs we run in production and would
like to know which numbers and considerations matter most when choosing a
checkpoint interval. We currently use a default checkpoint interval of 30
seconds, and each checkpoint takes around 2 seconds.
We have also enabled incremental checkpoints. I understand there is a
tradeoff between recovery time after a failure and the performance
degradation caused by an aggressive checkpoint policy, but I would like to
know what you think is a good compromise.
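
To make the tradeoff concrete with the numbers above, here is a rough
back-of-the-envelope sketch in Python. It is only a simplified model, not
a Flink API: the function names are made up for illustration, it assumes
checkpoints are triggered at a fixed interval and that a failure is equally
likely at any moment, and it ignores job restart time and the fact that
Flink checkpoints run largely asynchronously (so time spent checkpointing
is not the same as throughput lost).

```python
def checkpoint_duty_cycle(interval_s: float, duration_s: float) -> float:
    """Fraction of wall-clock time during which a checkpoint is in progress."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return duration_s / interval_s


def replay_window_s(interval_s: float, duration_s: float) -> tuple[float, float]:
    """(average, worst-case) seconds of input to reprocess after a failure.

    Assumption: a checkpoint only becomes usable once it completes, so a
    failure just before completion falls back to the previous checkpoint.
    """
    average = duration_s + interval_s / 2
    worst = duration_s + interval_s
    return average, worst


if __name__ == "__main__":
    # Values from this post: 30 s interval, ~2 s per checkpoint.
    duty = checkpoint_duty_cycle(30, 2)
    avg, worst = replay_window_s(30, 2)
    print(f"checkpoint in progress {duty:.1%} of the time")
    print(f"replay after failure: ~{avg:.0f} s on average, up to ~{worst:.0f} s")
```

Under these assumptions, shortening the interval shrinks the replay window
roughly linearly while the share of time spent checkpointing grows in
proportion, which is the tradeoff I am trying to balance.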

I read this article as reference:

What I would really like, though, is a formula or recipe for finding the
best value for the checkpoint interval.

