We are using AWS EMR, where we submit our Flink jobs to a long-running
Flink cluster on YARN.

We wanted to configure RocksDBStateBackend as our state backend to store
our checkpoints.

So we have configured the following properties in our flink-conf.yaml:

   - state.backend.type: rocksdb
   - state.checkpoints.dir: file:///tmp
   - state.backend.incremental: true

My question is about the checkpoint location: what is the difference
between pointing it at a local filesystem and pointing it at the Hadoop
Distributed File System (HDFS)?

What advantages do we get with each of the following settings?

*state.checkpoints.dir*: hdfs://namenode-host:port/flink-checkpoints
*state.checkpoints.dir*: file:///tmp

Also, if we decide to use HDFS, where can we get the value for that
setting, given that we are running Flink on EMR?

