bvaradar commented on issue #1833:
URL: https://github.com/apache/hudi/issues/1833#issuecomment-659892874


   
   @tooptoop4: Could you share the Spark DAGs with timings (job, stage, and 
task level) for 0.5.3 with the bucketized bloom index on and for 0.5.3 with 
the bucketized bloom index off? We need to understand why you are seeing such 
a massive performance difference. 
   
   Regarding your question, please take a look at the comment in 
https://github.com/apache/hudi/blob/master/hudi-client/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java#L249
 
   It is essentially an exploded RDD that pairs each record key with every 
file it has to be compared against.
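
   To make the "exploded" shape concrete, here is a minimal plain-Python sketch (no Spark, and not the actual Hudi code). The names `FileKeyRange` and `explode_keys_to_files` are hypothetical, and picking candidate files by min/max key range is an assumption made only for illustration:

```python
from typing import Dict, List, NamedTuple, Tuple


class FileKeyRange(NamedTuple):
    """Hypothetical stand-in for a data file and the key range it covers."""
    file_id: str
    min_key: str
    max_key: str


def explode_keys_to_files(
    records: List[Tuple[str, str]],
    files_by_partition: Dict[str, List[FileKeyRange]],
) -> List[Tuple[str, str]]:
    """Emit one (file_id, record_key) pair per candidate file.

    `records` holds (partition_path, record_key) tuples. A single incoming
    key can produce several output pairs, which is why the result is
    "exploded" relative to the input.
    """
    pairs = []
    for partition, key in records:
        for f in files_by_partition.get(partition, []):
            # Assumption: a file is a candidate only if the key falls
            # inside the file's [min_key, max_key] range.
            if f.min_key <= key <= f.max_key:
                pairs.append((f.file_id, key))
    return pairs


records = [
    ("2020/07/01", "key05"),
    ("2020/07/01", "key50"),
    ("2020/07/02", "key05"),
]
files_by_partition = {
    "2020/07/01": [
        FileKeyRange("f1", "key00", "key30"),
        FileKeyRange("f2", "key03", "key60"),
    ],
}
exploded = explode_keys_to_files(records, files_by_partition)
```

   In this sketch three incoming keys turn into three (file, key) comparisons, but the count depends entirely on how many files overlap each key, so the exploded set can be much larger than the input.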
   
   Thanks,
   Balaji.V
   

