Hi,

> Also, in the same line, can someone detail the difference between State 
> Backend & External checkpoint?

Those are two very different things. When we talk about state backends in 
Flink, we mean the entity that is responsible for storing and managing the 
state inside an operator. This could, for example, be the FsStateBackend, 
which is based on hash maps and keeps state on the heap, or the 
RocksDBStateBackend, which uses RocksDB as its internal store and operates 
on native memory and disk.

An externalized checkpoint, like a normal checkpoint, is the collection of all 
state in a job, persisted to stable storage for recovery. More concretely, 
this typically means writing out the contents of the state backends to a safe 
place so that we can restore them from there.
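
To make the distinction a bit more tangible, here is a rough toy model in 
plain Java. It is not Flink code: ToyHeapBackend, snapshotTo and restoreFrom 
are hypothetical names made up for this illustration. The map plays the role 
of a heap-based state backend holding the live state, and the file written by 
snapshotTo plays the role of a checkpoint on stable storage.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names exist in Flink.
final class ToyCheckpointDemo {

    // "State backend": holds the live state while the job runs (here: a heap hash map).
    static final class ToyHeapBackend {
        private final Map<String, Long> state = new HashMap<>();

        void put(String key, long value) {
            state.put(key, value);
        }

        Long get(String key) {
            return state.get(key);
        }

        // "Checkpoint": write the backend's current contents to stable storage.
        void snapshotTo(Path file) throws IOException {
            List<String> lines = new ArrayList<>();
            for (Map.Entry<String, Long> e : state.entrySet()) {
                lines.add(e.getKey() + "=" + e.getValue());
            }
            Files.write(file, lines, StandardCharsets.UTF_8);
        }

        // "Recovery": rebuild a backend from a previously written checkpoint.
        static ToyHeapBackend restoreFrom(Path file) throws IOException {
            ToyHeapBackend backend = new ToyHeapBackend();
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                int sep = line.lastIndexOf('='); // the value never contains '='
                backend.put(line.substring(0, sep), Long.parseLong(line.substring(sep + 1)));
            }
            return backend;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("toy-checkpoint", ".txt");

        ToyHeapBackend live = new ToyHeapBackend();
        live.put("clicks", 42L);
        live.snapshotTo(file); // take a "checkpoint"

        // Simulate a failure: the live backend is gone, only the file survives.
        ToyHeapBackend recovered = ToyHeapBackend.restoreFrom(file);
        System.out.println(recovered.get("clicks"));

        Files.delete(file);
    }
}
```

The real backends are of course far more involved (key groups, serializers, 
asynchronous and incremental snapshots), but the split of responsibilities is 
the same: the backend owns the live state, the checkpoint is its persisted copy.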

> Also, programmatic API, thru which methods we can configure those.

This explains how to set the backend programmatically:

https://ci.apache.org/projects/flink/flink-docs-master/ops/state/state_backends.html

To activate externalized checkpoints, enable regular checkpointing (for 
example via env.enableCheckpointing(...)) and add the following line:

env.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

where env is your StreamExecutionEnvironment.

If you need an example, please take a look at 
org.apache.flink.test.checkpointing.ExternalizedCheckpointITCase. This class 
configures everything you asked about programmatically.

Best,
Stefan
