nsivabalan commented on code in PR #10457:
URL: https://github.com/apache/hudi/pull/10457#discussion_r3269711715


##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieFileWriterFactory.java:
##########
@@ -128,4 +128,14 @@ public static BloomFilter createBloomFilter(HoodieConfig 
config) {
         
config.getIntOrDefault(HoodieStorageConfig.BLOOM_FILTER_DYNAMIC_MAX_ENTRIES),
         config.getStringOrDefault(HoodieStorageConfig.BLOOM_FILTER_TYPE));
   }
+
+  /**
+   * Check if need to enable bloom filter.
+   */
+  public static boolean enableBloomFilter(boolean populateMetaFields, 
HoodieConfig config) {
+    return populateMetaFields && 
(config.getBooleanOrDefault(HoodieStorageConfig.PARQUET_WITH_BLOOM_FILTER_ENABLED)
+        // HoodieIndexConfig is located in the package hudi-client-common, and 
the package hudi-client-common depends on the package hudi-common,
+        // so the class HoodieIndexConfig cannot be accessed in hudi-common, 
otherwise there will be a circular dependency problem
+        || (config.contains("hoodie.index.type") && 
config.getString("hoodie.index.type").contains("BLOOM")));

Review Comment:
   We generally wanted to have bloom filter enabled irrespective of the index 
type. I only see bucket index is an exception to that.
   
   why can't we adjust the conditions here. 
   for bucket index, why can't we go ahead and disable by default. 
   for all other indexes, we can leave it enabled by default unless someone 
explicitly overrides via 
`HoodieStorageConfig.PARQUET_WITH_BLOOM_FILTER_ENABLED` 
   
    @yihua @danny0405 : what do you guys think



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to