alexeykudinkin commented on issue #6188:
URL: https://github.com/apache/hudi/issues/6188#issuecomment-1230832162

   Hey, @floriandaniel! Thanks for taking the time to file such a detailed 
description.
   
   First of all, I believe the crux of the problem likely lies in the use of 
the Bloom Index in the Metadata table: we've recently identified a 
performance gap there, and @yihua is currently working on addressing it 
(there's already a PR in progress). 
   
   Second, I'd recommend doing the following in your evaluation:
   
   1. Try Hudi 0.12, which was released recently (we did a lot of 
performance benchmarking and optimization during the last release cycle, 
specifically to make sure Hudi's performance is top of the line).
   2. Disable `hoodie.bloom.index.use.metadata` for now (until the above fix 
lands).
   3. Is there any particular reason you're switching off 
`hoodie.bloom.index.prune.by.ranges`? It's a crucial aspect of using the 
Bloom Index: for update-heavy workloads it prunes the search space 
considerably, so that only the files that could contain the target records 
are checked (and the ones that couldn't are eliminated).
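
   For illustration, suggestions 2 and 3 could be expressed as write options 
roughly like this. This is only a minimal sketch: the helper name 
`bloom_index_options`, the table name and the commented-out write call are 
hypothetical, and setting `hoodie.index.type` to `BLOOM` is an assumption 
based on the Bloom Index being used here.

```python
def bloom_index_options(base_options=None):
    """Return Hudi write options with the Bloom Index settings suggested above.

    Hypothetical helper for illustration; it only builds a dict of options.
    """
    options = dict(base_options or {})
    options.update({
        # Assumption: the table is using the Bloom Index.
        "hoodie.index.type": "BLOOM",
        # Suggestion 2: don't read bloom filters from the Metadata table for now.
        "hoodie.bloom.index.use.metadata": "false",
        # Suggestion 3: keep key-range pruning on so that only files whose
        # key range could contain the record are checked.
        "hoodie.bloom.index.prune.by.ranges": "true",
    })
    return options


# "my_table" is a placeholder table name.
opts = bloom_index_options({"hoodie.table.name": "my_table"})

# With a Spark DataFrame `df` and a table path `base_path` (both assumed),
# the options would typically be passed to the writer like so:
# df.write.format("hudi").options(**opts).mode("append").save(base_path)
```

   Any other options you already pass (record key, precombine field, etc.) 
stay as they are; the helper only overrides the three index-related keys.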
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.