dongjoon-hyun commented on a change in pull request #35034:
URL: https://github.com/apache/spark/pull/35034#discussion_r775666698



##########
File path: common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java
##########
@@ -63,15 +59,38 @@
   /** DB key where type aliases are stored. */
   private static final byte[] TYPE_ALIASES_KEY = "__types__".getBytes(UTF_8);
 
+  /**
+   * Use full filter.
+   *
+   * https://github.com/facebook/rocksdb/wiki/RocksDB-Bloom-Filter#full-filters-new-format
+   */
+  private static final BloomFilter fullFilter =
+    new BloomFilter(10.0D /* BloomFilter.DEFAULT_BITS_PER_KEY */, false);
+
+  /** Disable compression in index data. */
   private static final BlockBasedTableConfig tableFormatConfig = new BlockBasedTableConfig()
+    .setFilterPolicy(fullFilter)
+    .setEnableIndexCompression(false)
     .setFormatVersion(5);
 
+  /**
+   * - Use ZSTD at the bottom most level to reduce the disk space
+   * - Use LZ4 at the other levels because it's better than Snappy in general.
+   *
+   * https://github.com/facebook/rocksdb/wiki/Compression#configuration
+   */
   private static final Options dbOptions = new Options()
     .setCreateIfMissing(true)
-    .setTableFormatConfig(tableFormatConfig)
-    .setStatistics(new Statistics());

Review comment:
       Our benchmark suite is too small to show the overhead, but some websites mention a 5-10% overhead.
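
One way to check a figure like 5-10% on a larger workload is to time the same workload under each configuration and compare the medians. The following is only a minimal sketch: `OverheadSketch`, `overheadPercent`, and `medianNanos` are hypothetical names, not part of Spark or RocksDB, and the workload passed in is a placeholder for whatever benchmark is actually used.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Spark or RocksDB.
final class OverheadSketch {

  /** Relative overhead of candidateNanos over baselineNanos, in percent. */
  static double overheadPercent(long baselineNanos, long candidateNanos) {
    if (baselineNanos <= 0) {
      throw new IllegalArgumentException("baselineNanos must be positive");
    }
    return (candidateNanos - baselineNanos) * 100.0 / baselineNanos;
  }

  /**
   * Median wall-clock time of `runs` executions of `workload`, in nanoseconds.
   * For an even number of runs this returns the upper of the two middle samples.
   */
  static long medianNanos(Runnable workload, int runs) {
    if (runs <= 0) {
      throw new IllegalArgumentException("runs must be positive");
    }
    long[] samples = new long[runs];
    for (int i = 0; i < runs; i++) {
      long start = System.nanoTime();
      workload.run();
      samples[i] = System.nanoTime() - start;
    }
    Arrays.sort(samples);
    return samples[runs / 2];
  }
}
```

Calling `medianNanos` once per configuration being compared and passing the two results to `overheadPercent` gives a single percentage that can be set against the quoted range. The median is used instead of the mean so that a single slow run (for example, a GC pause) does not dominate the result.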




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


