nsivabalan opened a new pull request #976: [HUDI-106] Adding support for 
DynamicBloomFilter
URL: https://github.com/apache/incubator-hudi/pull/976
 
 
   - Adds support for DynamicBloomFilter 
([link](https://hadoop.apache.org/docs/r2.4.1/api/org/apache/hadoop/util/bloom/DynamicBloomFilter.html))
 so that the bloom filter size can be tuned to the total number of entries.
     - Added a BloomFilter interface with two implementations: 
SimpleBloomFilter (the existing one) and HudiDynamicBloomFilter (the new one). 
     - Added a BloomFilterFactory that creates the right BloomFilter 
implementation based on the stored version. 
     - The version is stored in the parquet footer metadata. If no version is 
found, a SimpleBloomFilter is created.
     - Introduced a config named "hoodie.bloom.index.auto.tune.enable" in 
HoodieIndexConfig which, when enabled, creates new bloom filters as 
HudiDynamicBloomFilter. 
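
The design described in the list above could be sketched roughly as follows. This is an illustration only, not the code in this PR: every class and method name here (`FilterSketch`, `FixedFilterSketch`, `GrowingFilterSketch`, `FilterFactorySketch`), the version marker `"1"`, and the sizing numbers are assumptions made for the example. The growing filter follows the general idea of Hadoop's DynamicBloomFilter, which starts a new fixed-size filter row once the current row has recorded its quota of keys.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// All names below are hypothetical stand-ins, not the classes added in the PR.
interface FilterSketch {
    void add(String key);
    boolean mightContain(String key);
}

// Fixed-size filter: the bit array is sized once and never grows.
class FixedFilterSketch implements FilterSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    FixedFilterSketch(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Double hashing: derive the i-th bit position from two base hashes.
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1;
        return Math.floorMod(h1 + i * h2, numBits);
    }

    public void add(String key) {
        for (int i = 0; i < numHashes; i++) bits.set(position(key, i));
    }

    public boolean mightContain(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) return false;
        }
        return true;
    }
}

// Growing filter: starts a fresh fixed-size row once the current row is full.
class GrowingFilterSketch implements FilterSketch {
    private final List<FixedFilterSketch> rows = new ArrayList<>();
    private final int bitsPerRow, numHashes, keysPerRow;
    private int keysInCurrentRow = 0;

    GrowingFilterSketch(int bitsPerRow, int numHashes, int keysPerRow) {
        this.bitsPerRow = bitsPerRow;
        this.numHashes = numHashes;
        this.keysPerRow = keysPerRow;
        rows.add(new FixedFilterSketch(bitsPerRow, numHashes));
    }

    public void add(String key) {
        if (keysInCurrentRow >= keysPerRow) {
            rows.add(new FixedFilterSketch(bitsPerRow, numHashes));
            keysInCurrentRow = 0;
        }
        rows.get(rows.size() - 1).add(key);
        keysInCurrentRow++;
    }

    // A key may be present if any row reports it.
    public boolean mightContain(String key) {
        for (FixedFilterSketch row : rows) {
            if (row.mightContain(key)) return true;
        }
        return false;
    }

    int rowCount() { return rows.size(); }
}

final class FilterFactorySketch {
    static final String GROWING_VERSION = "1"; // assumed marker value

    // Read path: pick the implementation from the version found in the footer.
    static FilterSketch forVersion(String footerVersion) {
        if (GROWING_VERSION.equals(footerVersion)) {
            return new GrowingFilterSketch(1 << 16, 3, 1000);
        }
        // No (or unknown) version: fall back to the fixed-size filter.
        return new FixedFilterSketch(1 << 16, 3);
    }

    // Write path: the auto-tune flag decides which kind of filter new files get.
    static FilterSketch forNewFile(boolean autoTuneEnabled) {
        return forVersion(autoTuneEnabled ? GROWING_VERSION : null);
    }
}
```

In this sketch, a file written before the change carries no version, so `FilterFactorySketch.forVersion(null)` yields the fixed-size filter. New files get the growing filter only when the flag passed to `forNewFile` is true, which stands in for "hoodie.bloom.index.auto.tune.enable".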
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
