KarthickAN edited a comment on issue #2178:
URL: https://github.com/apache/hudi/issues/2178#issuecomment-710747888


   @nsivabalan I tried out the dynamic bloom filter and it works fine: it grows 
along with the number of entries. That's a good feature. Thanks.
   
   However, what is the recommended approach to indexing here? I see that 
various features are available out of the box. Given the record size (35 
bytes), I could have more than 3.5 million records in a file with a max size of 
120MB. Since the docs recommend sizing the bloom filter at approximately half 
the total number of records, I went with 1.5M for the bloom filter.
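
   For reference, a rough sketch of the arithmetic behind that estimate. It 
assumes 120MB means 120 * 1024 * 1024 bytes and ignores compression and 
file-format overhead, so the real count per file will differ. The function 
name is only illustrative and is not part of Hudi.

```python
# Back-of-the-envelope estimate only: assumes uncompressed, fixed-size records
# and treats 120MB as 120 * 1024 * 1024 bytes (an assumption).
RECORD_SIZE_BYTES = 35
MAX_FILE_SIZE_BYTES = 120 * 1024 * 1024


def max_records_per_file(file_size_bytes: int, record_size_bytes: int) -> int:
    """Upper bound on how many fixed-size records fit in one file."""
    return file_size_bytes // record_size_bytes


records = max_records_per_file(MAX_FILE_SIZE_BYTES, RECORD_SIZE_BYTES)
half_of_records = records // 2  # the "approximately half" guideline

print(f"records per file (upper bound): {records:,}")
print(f"half of that:                   {half_of_records:,}")
```

   Under these assumptions, half the record count comes out somewhat above the 
1.5M I picked.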
   
   Regarding the index type (hoodie.index.type): how does the SIMPLE type work?
   
   I see that hoodie.bloom.index.prune.by.ranges, hoodie.bloom.index.use.caching, 
hoodie.bloom.index.use.treebased.filter and hoodie.bloom.index.bucketized.checking 
are all enabled by default. Do these really help regardless of the type of 
hoodie key used? In my case I am using ComplexKeyGenerator with five 
different fields, one of which is a timestamp.
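
   To make the question concrete, here is a minimal sketch of the options I am 
referring to, written as a plain Python dict. The keys are the ones named 
above. The "true" values only restate that they are enabled by default, and 
the BLOOM index type is an assumption for illustration, not my actual job 
config.

```python
# Illustrative only: the option keys come from the comment above; the values
# are assumptions that mirror the "enabled by default" behaviour described.
bloom_index_options = {
    "hoodie.index.type": "BLOOM",  # assumed here; SIMPLE is the alternative asked about
    "hoodie.bloom.index.prune.by.ranges": "true",
    "hoodie.bloom.index.use.caching": "true",
    "hoodie.bloom.index.use.treebased.filter": "true",
    "hoodie.bloom.index.bucketized.checking": "true",
}

# Collect the boolean bloom-index flags that are switched on.
enabled_flags = sorted(
    key
    for key, value in bloom_index_options.items()
    if key.startswith("hoodie.bloom.index.") and value == "true"
)

for flag in enabled_flags:
    print(flag)
```

   The open question is whether leaving all four of these on still pays off 
when the record key is a composite that includes a timestamp.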


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

