When RocksDB holds a very large state, is there a concern over the time
it takes to checkpoint the RocksDB data to HDFS? Is asynchronous
checkpointing a recommended practice here?


https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/state_backends.html

"The RocksDBStateBackend holds in-flight data in a RocksDB
<http://rocksdb.org/> data base that is (per default) stored in the
TaskManager data directories. Upon checkpointing, the whole RocksDB data
base will be checkpointed into the configured file system and directory.
Minimal metadata is stored in the JobManager’s memory (or, in
high-availability mode, in the metadata checkpoint).

The RocksDBStateBackend is encouraged for:

   - Jobs with very large state, long windows, large key/value states.
   - All high-availability setups."
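
To make the concern concrete, here is a rough back-of-the-envelope sketch.
It assumes, as the quoted docs describe, that the whole RocksDB data base
is copied on every checkpoint, that the state is spread evenly over the
parallel tasks, and that each task uploads to HDFS at a fixed rate. The
class name, method name, and all numbers are hypothetical illustrations,
not measurements or Flink APIs.

```java
public class CheckpointEstimate {

    /**
     * Rough time in seconds to copy the full state once, assuming the state
     * is split evenly across `parallelism` tasks that upload concurrently,
     * each at `bytesPerSecPerTask`. (Hypothetical model, not a Flink API.)
     */
    static double estimateSeconds(long stateBytes, int parallelism, long bytesPerSecPerTask) {
        if (parallelism <= 0 || bytesPerSecPerTask <= 0) {
            throw new IllegalArgumentException("parallelism and throughput must be positive");
        }
        double bytesPerTask = (double) stateBytes / parallelism;
        return bytesPerTask / bytesPerSecPerTask;
    }

    public static void main(String[] args) {
        // Assumed example values: 500 GiB total state, 50 parallel tasks,
        // 100 MiB/s upload throughput per task.
        long stateBytes = 500L * 1024 * 1024 * 1024;
        int parallelism = 50;
        long bytesPerSecPerTask = 100L * 1024 * 1024;

        double seconds = estimateSeconds(stateBytes, parallelism, bytesPerSecPerTask);
        System.out.printf("Estimated full snapshot copy time: %.1f s%n", seconds);
    }
}
```

Under these assumed numbers each task copies 10 GiB, so one full snapshot
takes on the order of a couple of minutes. Whether processing stalls for
that long depends on whether the copy runs synchronously or asynchronously,
which is what I am asking about.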


thx
Daniel
