dongjoon-hyun commented on a change in pull request #35034:
URL: https://github.com/apache/spark/pull/35034#discussion_r775666511
##########
File path: common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java
##########
@@ -63,15 +59,38 @@
   /** DB key where type aliases are stored. */
   private static final byte[] TYPE_ALIASES_KEY = "__types__".getBytes(UTF_8);
 
+  /**
+   * Use full filter.
+   *
+   * https://github.com/facebook/rocksdb/wiki/RocksDB-Bloom-Filter#full-filters-new-format
+   */
+  private static final BloomFilter fullFilter =
+    new BloomFilter(10.0D /* BloomFilter.DEFAULT_BITS_PER_KEY */, false);
+
+  /** Disable compression in index data. */
   private static final BlockBasedTableConfig tableFormatConfig = new BlockBasedTableConfig()
+    .setFilterPolicy(fullFilter)
+    .setEnableIndexCompression(false)
     .setFormatVersion(5);
 
+  /**
+   * - Use ZSTD at the bottom most level to reduce the disk space
+   * - Use LZ4 at the other levels because it's better than Snappy in general.
+   *
+   * https://github.com/facebook/rocksdb/wiki/Compression#configuration
+   */
   private static final Options dbOptions = new Options()
     .setCreateIfMissing(true)
-    .setTableFormatConfig(tableFormatConfig)
-    .setStatistics(new Statistics());

Review comment:
Thank you for the review, @mridulm. Apache Spark doesn't use `Statistics`, so it was removed because of its non-negligible overhead.
   - https://github.com/facebook/rocksdb/issues/2489
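
For context on the `10.0D` bits-per-key argument passed to `BloomFilter` in the hunk above, here is a minimal sketch of the textbook Bloom filter false-positive estimate, p = (1 - e^(-k/b))^k for b bits per key and k probes. This is the classical approximation only; it is not RocksDB's actual filter implementation, and the class and method names (`BloomFpEstimate`, `optimalProbes`, `falsePositiveRate`) are hypothetical, chosen for illustration.

```java
// Illustrative only: classical Bloom filter math, not RocksDB code.
public class BloomFpEstimate {

  /** Textbook optimal probe count, k = b * ln(2), rounded and at least 1. */
  static int optimalProbes(double bitsPerKey) {
    return Math.max(1, (int) Math.round(bitsPerKey * Math.log(2)));
  }

  /** Textbook false-positive estimate: (1 - e^(-k/b))^k. */
  static double falsePositiveRate(double bitsPerKey, int numProbes) {
    return Math.pow(1.0 - Math.exp(-numProbes / bitsPerKey), numProbes);
  }

  public static void main(String[] args) {
    double bitsPerKey = 10.0; // same value as the BloomFilter argument in the diff
    int k = optimalProbes(bitsPerKey);
    double p = falsePositiveRate(bitsPerKey, k);
    System.out.printf("bitsPerKey=%.1f probes=%d estimatedFpRate=%.4f%n", bitsPerKey, k, p);
  }
}
```

Under this approximation, 10 bits per key works out to a false-positive rate of a little under 1%, in line with the roughly 1% figure the RocksDB wiki quotes for that setting.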
