codope commented on issue #10110: URL: https://github.com/apache/hudi/issues/10110#issuecomment-2275090149
Hi @soumilshah1995, there are two things you can look at. First, when you run EXPLAIN ANALYZE, the plan should show that it is using `HoodieFileIndex`. Second, the "number of files read" metric in the Spark UI should show fewer files if any files were skipped using the functional index (or any index, for that matter). Another way is to enable debug logs for the `org.apache.hudi` package when you launch spark-sql; after executing the query, you will see something called "skipping ratio", which tells you the percentage of files skipped.

Note that the skipping ratio can be 0 if the files were already narrowed down by partition pruning and every file that remains has to be scanned to satisfy the query predicate. The skipping ratio is positive only when there is additional file pruning on top of partition pruning.
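To make the note about a zero skipping ratio concrete, here is a small arithmetic sketch. It is not Hudi's code: the function name `skipping_ratio` and the file counts are hypothetical, and it assumes the ratio is measured against the files that remain after partition pruning, which is how the behaviour is described above.

```python
def skipping_ratio(files_after_partition_pruning: int,
                   files_after_index_pruning: int) -> float:
    """Fraction of the partition-pruned candidate files that index-based
    pruning skipped in addition. Illustrative model only (assumption),
    not the formula taken from the Hudi source."""
    if files_after_partition_pruning <= 0:
        return 0.0
    skipped = files_after_partition_pruning - files_after_index_pruning
    return skipped / files_after_partition_pruning


# Hypothetical case 1: partition pruning leaves 20 files and the predicate
# needs all 20 of them, so the index skips nothing further.
print(skipping_ratio(20, 20))  # 0.0

# Hypothetical case 2: the index rules out 15 of those 20 files.
print(skipping_ratio(20, 5))   # 0.75
```

So a ratio of 0 in the debug logs does not by itself mean the index is unused; it can also mean partition pruning already removed everything that could be removed.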
