masteryhx commented on code in PR #24072:
URL: https://github.com/apache/flink/pull/24072#discussion_r1450005923
##########
flink-state-backends/flink-statebackend-rocksdb/src/test/java/org/apache/flink/contrib/streaming/state/RocksDBStateBackendConfigTest.java:
##########
@@ -557,6 +559,9 @@ public void testConfigurableOptionsFromConfig() throws Exception {
configuration.setString(RocksDBConfigurableOptions.LOG_FILE_NUM.key(), "10");
configuration.setString(RocksDBConfigurableOptions.LOG_MAX_FILE_SIZE.key(), "2MB");
configuration.setString(RocksDBConfigurableOptions.COMPACTION_STYLE.key(), "level");
+ configuration.setString(
Review Comment:
Let's also test the other compression types to guarantee that they work.
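A minimal sketch of what I have in mind, looping over every value instead of asserting on a single one. It is self-contained so it can be read in isolation: the nested `CompressionType` enum is a stand-in listing only a subset of `org.rocksdb.CompressionType`, the `Map` stands in for Flink's `Configuration`, and `resolve` is a hypothetical helper, not code from this PR.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class CompressionTypeRoundTrip {
    // Stand-in for org.rocksdb.CompressionType (assumption: only a subset is listed).
    public enum CompressionType {
        NO_COMPRESSION, SNAPPY_COMPRESSION, ZLIB_COMPRESSION, LZ4_COMPRESSION, ZSTD_COMPRESSION
    }

    // Option key taken from the diff under review.
    public static final String KEY = "state.backend.rocksdb.compression.type";

    // Hypothetical resolver: reads the configured name and falls back to SNAPPY when unset.
    public static CompressionType resolve(Map<String, String> conf) {
        String raw = conf.getOrDefault(KEY, CompressionType.SNAPPY_COMPRESSION.name());
        return CompressionType.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Set each candidate in turn and check that it is what comes back out.
        for (CompressionType expected : CompressionType.values()) {
            Map<String, String> conf = new HashMap<>();
            conf.put(KEY, expected.name());
            CompressionType actual = resolve(conf);
            if (actual != expected) {
                throw new AssertionError("expected " + expected + " but got " + actual);
            }
        }
        System.out.println("all compression types round-trip");
    }
}
```

In the real test the loop body would set the option on `configuration` and assert on the resulting column family options instead.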
##########
flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBConfigurableOptions.java:
##########
@@ -139,6 +149,25 @@ public class RocksDBConfigurableOptions implements Serializable {
NONE.name(),
LEVEL.name()));
+ public static final ConfigOption<CompressionType> COMPRESSION_TYPE =
Review Comment:
Could we also introduce CompressionPerLevel?
In my experience, space amplification affects performance more noticeably
once the state grows beyond L2 into the higher levels.
So we may need to configure different compression for different levels.
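A rough sketch of how a per-level setting could be parsed, assuming a comma-separated list with one entry per level. The class name, the list format and the `parse` helper are hypothetical, and the nested enum is a stand-in for a subset of `org.rocksdb.CompressionType`. On the RocksJava side, the resulting list is what `ColumnFamilyOptions#setCompressionPerLevel` accepts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PerLevelCompression {
    // Stand-in for org.rocksdb.CompressionType (assumption: only a subset is listed).
    public enum CompressionType {
        NO_COMPRESSION, SNAPPY_COMPRESSION, LZ4_COMPRESSION, ZSTD_COMPRESSION
    }

    // Hypothetical parser: entry i of the comma-separated list applies to level i.
    // Unknown names fail fast with IllegalArgumentException from valueOf.
    public static List<CompressionType> parse(String raw) {
        List<CompressionType> levels = new ArrayList<>();
        for (String token : raw.split(",")) {
            String name = token.trim();
            if (name.isEmpty()) {
                continue; // tolerate empty input and stray commas
            }
            levels.add(CompressionType.valueOf(name.toUpperCase(Locale.ROOT)));
        }
        return levels;
    }

    public static void main(String[] args) {
        // Leave the small upper levels uncompressed and compress from L2 onwards.
        List<CompressionType> levels =
                parse("NO_COMPRESSION, NO_COMPRESSION, LZ4_COMPRESSION");
        System.out.println(levels);
    }
}
```

Leaving the first levels uncompressed is only one possible layout; the point is that users with large state could pick a different algorithm per level.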
##########
flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBConfigurableOptions.java:
##########
@@ -139,6 +149,25 @@ public class RocksDBConfigurableOptions implements Serializable {
NONE.name(),
LEVEL.name()));
+ public static final ConfigOption<CompressionType> COMPRESSION_TYPE =
+ key("state.backend.rocksdb.compression.type")
+ .enumType(CompressionType.class)
+ .defaultValue(SNAPPY_COMPRESSION)
+ .withDescription(
+ String.format(
+ "The specified compression type for DB. Candidate compression type is %s, %s, %s, %s, %s, "
+ + "%s, %s, %s or %s, and RocksDB choose '%s' as default style.",
+ NO_COMPRESSION.name(),
+ SNAPPY_COMPRESSION.name(),
+ ZLIB_COMPRESSION.name(),
Review Comment:
I'd suggest we:
1. keep only the most commonly used, lightweight algorithms (e.g. NO/SNAPPY/LZ4)
to narrow the choice for users.
2. add a description of how to choose among them.
WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.