vinothchandar commented on issue #1694:
URL: https://github.com/apache/hudi/issues/1694#issuecomment-657091983


   If you want partition movement, then a global index is the only option.
   
   >no certain ordering, I can order it only by timestamp.
   GLOBAL_BLOOM (or even the BLOOM index) works best if the files are sorted by
key, so it can skip entire file ranges from being compared and then further
reduce the candidates using bloom filters.
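
To illustrate why sorting matters for range pruning, here is a minimal sketch. It is not Hudi's implementation; `FileRange` and `candidate_files` are hypothetical names, and the key values are made up. It only shows that when each file covers a narrow key range, a lookup can rule out most files before any bloom filter is consulted.

```python
from dataclasses import dataclass


@dataclass
class FileRange:
    """Hypothetical stand-in for the min/max record key tracked per file."""
    name: str
    min_key: str
    max_key: str


def candidate_files(files, key):
    """Return names of files whose [min_key, max_key] range could contain key."""
    return [f.name for f in files if f.min_key <= key <= f.max_key]


# Keys written in sorted order: each file covers a narrow, disjoint range.
sorted_files = [
    FileRange("f1", "a000", "a999"),
    FileRange("f2", "b000", "b999"),
    FileRange("f3", "c000", "c999"),
]

# Keys written in arrival (e.g. timestamp) order: every file spans
# almost the whole key space, so no file can be skipped by range alone.
unsorted_files = [
    FileRange("f1", "a003", "c998"),
    FileRange("f2", "a001", "c999"),
    FileRange("f3", "a000", "c990"),
]

sorted_hits = candidate_files(sorted_files, "b123")
unsorted_hits = candidate_files(unsorted_files, "b123")
print(sorted_hits)
print(unsorted_hits)
```

In the sorted layout only one file survives the range check; in the unsorted layout all three do, and each would still need a bloom filter check.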
   
   >Any suggestion If I can achieve this without global bloom?
   We are working on record-level indexes that should make this much faster in
the mid term, but that's not an immediate option. The `master` branch has a
`GLOBAL_SIMPLE` index, which can be faster than `GLOBAL_BLOOM` in cases where no
range-based pruning can occur. Give that a shot?
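
Switching index types is a writer configuration change. The sketch below builds the options as a plain dictionary. `hudi_index_options` is a hypothetical helper, and the two `update.partition.path` keys are written from memory of the Hudi configuration reference, so treat them as assumptions and verify them against the version you run.

```python
def hudi_index_options(index_type="GLOBAL_SIMPLE", update_partition_path=True):
    """Build Hudi writer options selecting an index type (hypothetical helper)."""
    options = {"hoodie.index.type": index_type}
    flag = "true" if update_partition_path else "false"
    # Assumed key names: these control whether a record whose partition changed
    # is moved to the new partition. Check them against your Hudi version.
    if index_type == "GLOBAL_BLOOM":
        options["hoodie.bloom.index.update.partition.path"] = flag
    elif index_type == "GLOBAL_SIMPLE":
        options["hoodie.simple.index.update.partition.path"] = flag
    return options


opts = hudi_index_options("GLOBAL_SIMPLE")
print(opts)

# With an existing Spark DataFrame `df`, the options would be passed roughly as:
# df.write.format("hudi").options(**opts) ...
```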
   
   Another optimization in the `master` branch is dynamic bloom filters, which
auto-tune themselves to a specific false positive rate.
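
For context on why auto-tuning helps, the sketch below uses the standard Bloom filter sizing formulas (not Hudi's code) to show how a fixed-size filter's false positive rate degrades once more keys are inserted than it was sized for. The entry count and target rate are illustrative assumptions, not Hudi defaults.

```python
import math


def bloom_bits(num_entries, target_fpp):
    """Bits needed for num_entries at target_fpp: m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(-num_entries * math.log(target_fpp) / (math.log(2) ** 2))


def bloom_hashes(num_bits, num_entries):
    """Optimal number of hash functions: k = (m / n) * ln 2."""
    return max(1, round(num_bits / num_entries * math.log(2)))


def expected_fpp(num_bits, num_hashes, inserted):
    """Approximate false positive rate: (1 - e^(-k * n / m))^k."""
    return (1.0 - math.exp(-num_hashes * inserted / num_bits)) ** num_hashes


design_entries = 60_000   # illustrative sizing assumption
target_fpp = 1e-6         # illustrative target rate

bits = bloom_bits(design_entries, target_fpp)
hashes = bloom_hashes(bits, design_entries)

fpp_at_design = expected_fpp(bits, hashes, design_entries)
fpp_at_4x = expected_fpp(bits, hashes, 4 * design_entries)

print(f"bits={bits}, hashes={hashes}")
print(f"fpp at design size: {fpp_at_design:.2e}")
print(f"fpp at 4x design size: {fpp_at_4x:.2e}")
```

A filter that stays at a fixed size drifts far from its target rate as a file grows; a dynamic filter avoids this by growing with the number of entries.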



