[
https://issues.apache.org/jira/browse/FLINK-22752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375162#comment-17375162
]
Yun Tang commented on FLINK-22752:
----------------------------------
[~sewen], you could refer to
[SPINNING_DISK_OPTIMIZED_HIGH_MEM|https://github.com/apache/flink/blob/92d9a46943da6b995fb7e3f0095bd4bc4d75f451/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.java#L165-L180]
to see how to enable bloom filters.
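
For illustration, one way to pick up that profile without custom code is the
predefined-options setting. This is only a sketch and not necessarily what is
being recommended here: the profile also changes other RocksDB settings tuned
for spinning disks (block size, write buffers, file sizes), so enabling just
the bloom filter would need a custom options factory modeled on the linked
code.

```yaml
# Sketch: select the predefined RocksDB profile referenced above.
# Assumption: the whole profile is acceptable, not only its bloom filter.
state.backend.rocksdb.predefined-options: SPINNING_DISK_OPTIMIZED_HIGH_MEM
```
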
> Add robust default state configuration to StateFun
> --------------------------------------------------
>
> Key: FLINK-22752
> URL: https://issues.apache.org/jira/browse/FLINK-22752
> Project: Flink
> Issue Type: Sub-task
> Reporter: Stephan Ewen
> Priority: Major
>
> We aim to reduce the state configuration complexity by applying a default
> configuration with robust settings, based on lessons learned in Flink.
> *(1) Always use RocksDB.*
> _That is already the case._
> We keep this for now, as long as the only other alternatives are backends with
> objects on the heap, which are tricky in terms of predictable JVM
> performance. RocksDB has a significant performance cost, but offers more
> robust behavior.
> *(2) Activate local recovery by default.*
> That makes recovery cheap for soft task failures and gracefully cancelled
> tasks.
> We need to set these options:
> - {{state.backend.local-recovery: true}}
> - {{taskmanager.state.local.root-dirs: <dir>}} - some local directory that
> will not possibly be wiped by the OS periodically, so typically some local
> directory that is not {{/tmp}}, for example {{/local/state/recovery}}.
> - {{state.backend.rocksdb.localdir: <dir>}} - a directory on the same FS /
> device as above, so that one can create hard links between them (required for
> RocksDB local checkpoints), for example {{/local/state/rocksdb}}.
> Flink will most likely adopt this as a default setting as well in the future.
> It still makes sense to pre-configure a different RocksDB working directory
> than {{/tmp}}.
> *(3) Activate partitioned indexes by default.*
> This may cost minimal performance in some cases, but can avoid massive
> performance regression in cases where the index blocks no longer fit into the
> memory cache (this may happen more often when there are many
> ColumnFamilies, i.e. many states).
> Set {{state.backend.rocksdb.memory.partitioned-index-filters: true}}.
> See FLINK-20496 for details.
> *(4) Increase number of transfer threads by default.*
> This speeds up state recovery in many cases. The default value in Flink is a
> bit conservative, to avoid spamming DFS (like HDFS) by default. The more
> cloud-centric StateFun setups should be safe with a higher default value.
> Set {{state.backend.rocksdb.checkpoint.transfer.thread.num: 8}}.
> *(5) Increase RocksDB compaction threads by default.*
> The number of RocksDB compaction threads is frequently a bottleneck.
> Increasing it costs virtually nothing and mitigates that bottleneck in most
> cases.
> Set {{state.backend.rocksdb.thread.num: 4}}
> _(this value is chosen under the assumption that there is only one slot per
> TM in StateFun)._
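
Taken together, the settings in the description could be collected into a
single flink-conf.yaml fragment. This is a sketch only: the option keys and
values are the ones listed above, the two directories are the example paths
from item (2), and the right values will depend on the deployment. Item (1)
needs no entry, since StateFun already uses RocksDB.

```yaml
# (2) Local recovery. Both directories are the example paths from the
# description; they should be on the same file system / device (so hard
# links work) and should not be under /tmp.
state.backend.local-recovery: true
taskmanager.state.local.root-dirs: /local/state/recovery
state.backend.rocksdb.localdir: /local/state/rocksdb

# (3) Partitioned indexes and filters (see FLINK-20496).
state.backend.rocksdb.memory.partitioned-index-filters: true

# (4) More threads for transferring state files on checkpoint and recovery.
state.backend.rocksdb.checkpoint.transfer.thread.num: 8

# (5) More RocksDB background threads; assumes one slot per TaskManager.
state.backend.rocksdb.thread.num: 4
```
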
--
This message was sent by Atlassian Jira
(v8.3.4#803005)