Re: [I] [SUPPORT] RFC 63 Functional Index Hudi 0.1.0-beta [hudi]

via GitHub Wed, 07 Aug 2024 23:59:28 -0700


codope commented on issue #10110:
URL: https://github.com/apache/hudi/issues/10110#issuecomment-2275090149


   Hi @soumilshah1995, there are two things you can look at. First, when you 
run EXPLAIN ANALYZE, the plan should show that it is using `HoodieFileIndex`. 
Second, the "number of files read" metric in the Spark UI should show fewer 
files if any files were skipped using the functional index (or any index, for 
that matter). Another way is to enable debug logs for the `org.apache.hudi` 
package when you launch spark-sql; upon executing the query, you will then see 
a "skipping ratio" in the logs, which tells you the percentage of files 
skipped. Note that the skipping ratio can be 0: if partition pruning has 
already narrowed down the files, and every remaining file needs to be scanned 
as per the query predicate, the index skips nothing further. The skipping 
ratio is positive only when there is additional file pruning on top of 
partition pruning.
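
   To make the last point concrete, here is a rough sketch of the arithmetic 
behind a skipping ratio. It is not Hudi's actual code; the function name, its 
parameters, and the exact definition (files skipped by the index divided by 
files left after partition pruning) are assumptions for illustration only.

```python
def skipping_ratio(files_after_partition_pruning: int,
                   files_after_index_pruning: int) -> float:
    """Fraction of partition-pruned candidate files that the index skipped.

    Hypothetical helper: assumes the ratio is measured against the files
    that survive partition pruning, not against the whole table.
    """
    if files_after_partition_pruning <= 0:
        # Nothing left to scan, so there is nothing for the index to skip.
        return 0.0
    skipped = files_after_partition_pruning - files_after_index_pruning
    return skipped / files_after_partition_pruning


# Partition pruning leaves 40 files and the predicate needs all of them:
# the index skips nothing further, so the ratio is 0.
print(skipping_ratio(40, 40))

# Partition pruning leaves 40 files and the index narrows them down to 10:
# 30 of the 40 candidates are skipped.
print(skipping_ratio(40, 10))
```

   So a ratio of 0 does not necessarily mean the index is unused; it can 
simply mean partition pruning already did all the work for that query.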


