Just noticed that I forgot to include a reference to the documentation about externalized checkpoints: https://ci.apache.org/projects/flink/flink-docs-master/ops/state/checkpoints.html
> On 14.08.2017 at 14:17, Stefan Richter <s.rich...@data-artisans.com> wrote:
>
> Hi,
>
>> Also, in the same line, can someone detail the difference between State
>> Backend & External checkpoint?
>
> Those are two very different things. When we talk about state backends in
> Flink, we mean the entity that is responsible for storing and managing the
> state inside an operator. This could, for example, be something like the
> FsStateBackend, which is based on hash maps and keeps state on the heap, or
> the RocksDBStateBackend, which uses RocksDB as its internal store and
> operates on native memory and disk.
>
> An externalized checkpoint, like a normal checkpoint, is the collection of
> all state in a job, persisted to stable storage for recovery. More
> concretely, this typically means writing out the contents of the state
> backends to a safe place so that we can restore them from there.
>
>> Also, programmatic API, thru which methods we can configure those.
>
> This explains how to set the backend programmatically:
>
> https://ci.apache.org/projects/flink/flink-docs-master/ops/state/state_backends.html
>
> To activate externalized checkpoints, you activate normal checkpoints and
> add the following line:
>
>   env.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
>
> where env is your StreamExecutionEnvironment.
>
> If you need an example, please take a look at
> org.apache.flink.test.checkpointing.ExternalizedCheckpointITCase. This class
> configures everything you asked about programmatically.
>
> Best,
> Stefan
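
To make the distinction between a state backend and a checkpoint a bit more tangible, here is a toy sketch in plain Java. It is not Flink code and does not use any Flink API: ToyHeapBackend, snapshotTo and restoreFrom are hypothetical names made up for illustration. The hash map stands in for a heap-based state backend that holds the working state, and the file written by snapshotTo stands in for a checkpoint, that is, the backend's contents written out to stable storage so they can be restored later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these classes are Flink APIs.
class ToyHeapBackend {
    // The "state backend" part: working state lives in a hash map on the heap.
    private final Map<String, Long> counts = new HashMap<>();

    void increment(String key) {
        counts.merge(key, 1L, Long::sum);
    }

    Long get(String key) {
        return counts.get(key);
    }

    // The "checkpoint" part: write the backend's contents to a file.
    // Assumes keys contain no tab or newline characters.
    void snapshotTo(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            lines.add(e.getKey() + "\t" + e.getValue());
        }
        Files.write(file, lines);
    }

    // Recovery: build a fresh backend from a previously written file.
    static ToyHeapBackend restoreFrom(Path file) throws IOException {
        ToyHeapBackend restored = new ToyHeapBackend();
        for (String line : Files.readAllLines(file)) {
            String[] parts = line.split("\t", 2);
            restored.counts.put(parts[0], Long.parseLong(parts[1]));
        }
        return restored;
    }
}

class ToyCheckpointDemo {
    public static void main(String[] args) throws IOException {
        ToyHeapBackend backend = new ToyHeapBackend();
        backend.increment("a");
        backend.increment("a");
        backend.increment("b");

        // Persist the current state to a temporary file.
        Path checkpoint = Files.createTempFile("toy-checkpoint", ".tsv");
        backend.snapshotTo(checkpoint);

        // Simulate a failure: rebuild the state from the file alone.
        ToyHeapBackend recovered = ToyHeapBackend.restoreFrom(checkpoint);
        System.out.println("a=" + recovered.get("a") + ", b=" + recovered.get("b"));

        Files.delete(checkpoint);
    }
}
```

Running ToyCheckpointDemo prints a=2, b=1. In the toy model the backend decides how state is held while the job runs, and the snapshot file is what survives a failure. In Flink the two are configured separately, as the quoted message describes.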