carp84 commented on a change in pull request #10498: [FLINK-14495][docs] Add 
documentation for memory control of RocksDB state backend
URL: https://github.com/apache/flink/pull/10498#discussion_r372819133
 
 

 ##########
 File path: docs/ops/state/large_state_tuning.md
 ##########
 @@ -210,6 +211,71 @@ and not from the JVM. Any memory you assign to RocksDB 
will have to be accounted
 of the TaskManagers by the same amount. Not doing that may result in 
YARN/Mesos/etc terminating the JVM processes for
 allocating more memory than configured.
 
+### Bounding RocksDB Memory Usage
+
+RocksDB allocates native memory outside of the JVM, which could lead the 
process to exceed the total memory budget.
+This can be especially problematic in containerized environments such as 
Kubernetes, which kill processes that exceed their memory budgets.
+By default, Flink limits the total memory usage of the RocksDB instance(s) per slot by sharing a 
[cache](https://github.com/facebook/rocksdb/wiki/Block-Cache)
+and a [write buffer 
manager](https://github.com/facebook/rocksdb/wiki/Write-Buffer-Manager) among 
all instances in a single slot.
+The shared cache will place an upper limit on the [three 
components](https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB) 
that use the majority of memory
+when RocksDB is deployed as a state backend: block cache, index and bloom 
filters, and MemTables.
+This feature is enabled by default and can be controlled in two ways:
+  -  Integrate with managed memory of task manager: set 
`state.backend.rocksdb.memory.managed` to true. The RocksDB state backend 
will then use the managed memory budget of the task slot to set the capacity of that 
shared cache object.
+  This option is enabled by default, which means Flink always tries 
to integrate RocksDB memory usage with the managed memory first.
+  -  Not integrated with managed memory: configure 
`state.backend.rocksdb.memory.fixed-per-slot` to set a fixed total amount of 
memory per slot.
+  When configured, this option overrides `state.backend.rocksdb.memory.managed` 
and ignores the managed memory per slot calculated by the task manager.
+  Users can also configure `taskmanager.memory.task.off-heap.size` to reserve 
additional off-heap memory, which should be equal to 
`taskmanager.numberOfTaskSlots` * 
`state.backend.rocksdb.memory.fixed-per-slot`, to fit in Flink's memory model.
+
+Flink also provides two parameters to tune the memory fraction of MemTable and 
index & filters:
+  - `state.backend.rocksdb.memory.write-buffer-ratio`, by default `0.5`. If 
the RocksDB memory bounding feature is turned on, 50% of the memory is used 
by the write buffer manager by default.
+  - `state.backend.rocksdb.memory.high-prio-pool-ratio`, by default `0.1`.
+  If the RocksDB memory bounding feature is turned on, 10% of the memory is 
set as high priority for index and filters in the shared block cache by default.
+  With this enabled, index and filters do not need to compete against data 
blocks for staying in cache, which minimizes the performance problems caused when index and 
filters are frequently evicted by data blocks.
 
 Review comment:
   Flink also provides two parameters to tune the memory fraction of MemTable 
and index & filters along with the bounding RocksDB memory usage feature:
   - `state.backend.rocksdb.memory.write-buffer-ratio`, by default `0.5`, which 
means 50% of the given memory would be used by the write buffer manager.
   - `state.backend.rocksdb.memory.high-prio-pool-ratio`, by default `0.1`, 
which means 10% of the given memory would be set as high priority for index and 
filters in the shared block cache. We strongly recommend not setting this to zero, 
so that index and filters do not have to compete against data blocks for staying in 
cache, which would cause performance issues.
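
   To make the options above concrete, they could be combined in 
`flink-conf.yaml` roughly as follows. This is only a sketch: the option keys 
are the ones named in this section, while the slot count and memory sizes are 
made-up illustrative values, not recommendations.

   ```yaml
   # Illustrative values only; the slot count and sizes are assumptions.
   taskmanager.numberOfTaskSlots: 4

   # Fixed RocksDB memory budget per slot. Per the text above, setting this
   # overrides state.backend.rocksdb.memory.managed.
   state.backend.rocksdb.memory.fixed-per-slot: 512mb

   # Extra off-heap quota so the budget fits Flink's memory model:
   # numberOfTaskSlots * fixed-per-slot = 4 * 512mb = 2048mb
   taskmanager.memory.task.off-heap.size: 2048mb

   # Share of the budget for the write buffer manager (default 0.5).
   state.backend.rocksdb.memory.write-buffer-ratio: 0.5

   # Share of the budget kept as high priority for index and filters
   # in the shared block cache (default 0.1); should not be zero.
   state.backend.rocksdb.memory.high-prio-pool-ratio: 0.1
   ```

   Leaving out `fixed-per-slot` keeps the default behaviour, where the 
managed memory of the slot sets the capacity of the shared cache.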
