Hi devs,

I'd like to start a discussion on FLIP-158: Generalized incremental
checkpoints [1].

FLIP motivation:
Low end-to-end latency is a much-demanded property in many Flink setups.
With exactly-once guarantees, this latency depends on the checkpoint
interval and duration, which in turn are determined by the slowest node
(usually the one doing a full, non-incremental snapshot). In large setups
with many nodes, the probability that at least one node is slow increases,
so almost every checkpoint ends up slow.
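
To put a rough number on the "many nodes" effect, here is a small
back-of-the-envelope sketch. It assumes each node is slow independently
with the same probability, which is a simplification and not something the
FLIP states; the function name and the 1% figure are made up for
illustration.

```python
def p_any_slow(p_slow: float, nodes: int) -> float:
    """Probability that at least one of `nodes` nodes is slow,
    assuming (hypothetically) independent nodes with equal p_slow."""
    return 1.0 - (1.0 - p_slow) ** nodes


# Illustrative only: a 1% chance per node of taking a slow snapshot.
for n in (10, 100, 1000):
    print(f"{n:>5} nodes -> {p_any_slow(0.01, n):.3f}")
```

Under these assumptions, a per-node chance that looks negligible still
makes a slow checkpoint the common case once the job has hundreds of nodes.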

This FLIP proposes a mechanism to deal with this by materializing and
uploading state continuously, so that only the part changed since the last
upload has to be uploaded during the checkpoint itself. It differs from
other approaches in that 1) checkpoints are always incremental; and 2) it
works with any state backend.
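
As a toy illustration of the idea (not the design in the FLIP, see [1] for
that), the sketch below logs every state change, uploads the log in the
background, and lets a checkpoint send only the tail that has not been
uploaded yet. The class and method names are hypothetical, and the "upload"
is simulated by a counter.

```python
class ChangelogSketch:
    """Toy model, not Flink API: changes are logged and uploaded
    continuously; a checkpoint flushes only the not-yet-uploaded tail."""

    def __init__(self):
        self.state = {}    # working state; stands in for any state backend
        self.log = []      # ordered (key, value) changes
        self.uploaded = 0  # number of log entries already uploaded

    def put(self, key, value):
        # Apply the change and record it in the log.
        self.state[key] = value
        self.log.append((key, value))

    def upload_pending(self):
        # Simulated upload: mark all pending entries as uploaded
        # and return how many were sent.
        sent = len(self.log) - self.uploaded
        self.uploaded = len(self.log)
        return sent

    def checkpoint(self):
        # Only changes made since the last upload are sent here.
        return self.upload_pending()


backend = ChangelogSketch()
backend.put("a", 1)
backend.put("b", 2)
backend.upload_pending()     # background upload between checkpoints
backend.put("a", 3)
print(backend.checkpoint())  # entries uploaded at checkpoint time
```

In this model the work done at checkpoint time depends on what changed
since the last background upload, not on the total state size.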

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-158%3A+Generalized+incremental+checkpoints

Any feedback is highly appreciated!

Regards,
Roman
