Myasuka commented on a change in pull request #10987: [FLINK-14495][docs]
(EXTENDED) Add documentation for memory control of RocksDB state backend
URL: https://github.com/apache/flink/pull/10987#discussion_r373792780
##########
File path: docs/ops/state/state_backends.md
##########
@@ -188,8 +188,145 @@ state.backend: filesystem
state.checkpoints.dir: hdfs://namenode:40010/flink/checkpoints
{% endhighlight %}
-#### RocksDB State Backend Config Options
-{% include generated/rocks_db_configuration.html %}
+# RocksDB State Backend Details
+
+*This section describes the RocksDB state backend in more detail.*
+
+### Incremental Checkpoints
+
+RocksDB supports *Incremental Checkpoints*, which can dramatically reduce the
checkpointing time in comparison to full checkpoints.
+Instead of producing a full, self-contained backup of the state backend,
incremental checkpoints only record the changes that happened since the latest
completed checkpoint.
+
+An incremental checkpoint builds upon (typically multiple) previous
checkpoints. Flink leverages RocksDB's internal compaction mechanism in a way
that is self-consolidating over time. As a result, the incremental checkpoint
history in Flink does not grow indefinitely, and old checkpoints are eventually
subsumed and pruned automatically.
+
+Recovery time of incremental checkpoints may be longer or shorter comapared to
full checkpoints. If your network bandwidth is the bottleneck, it may take a
bit longer to restore from an incremental checkpoint, because it implies
fetching more data (more deltas). Restoring from an incremental checkpoint is
faster, if the bottleneck is your CPU or IOPs, because restoring from an
incremental checkpoint means not re-building the local RocksDB tables from
Flink's canonical key/value snapshot format(used in savepoints and full
checkpoints).
+
+While we encourage the use of incremental checkpoints for large state, you
need to enable this feature manually:
+ - Setting a default in your `flink-conf.yaml`: `state.backend.incremental:
true`
+ - Configuring this in code (overrides the config default):
`RocksDBStateBackend backend = new RocksDBStateBackend(filebackend, true);`
Review comment:
As `RocksDBStateBackend(AbstractStateBackend, boolean)` has been tagged as
deprecated, how about using `RocksDBStateBackend(StateBackend, TernaryBoolean)`
to explain?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services