[PR] [HUDI-7841] RLI and secondary index should consider only pruned partitions for file skipping [hudi]


lokeshj1703 opened a new pull request, #11434:
URL: https://github.com/apache/hudi/pull/11434


   ### Change Logs
   
   Although the record-level index (RLI) scans only matching files, it currently 
finds those candidate files by iterating over every file in the file index. See: 
https://github.com/apache/hudi/blob/f4be74c29471fbd6afff472f8db292e6b1f16f05/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/RecordLevelIndexSupport.scala#L47
   
   Instead, it can use `prunedPartitionsAndFileSlices` so that, whenever the 
query has a partition predicate, only files in the pruned partitions are considered.
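
A minimal sketch of the idea, not the actual Hudi code: `FileSlice`, `CandidateFileLookup`, and both method names are simplified, hypothetical stand-ins, and the shape of `prunedPartitionsAndFileSlices` (a sequence of partition path and file slice pairs) is an assumption made for illustration.

```scala
// Hypothetical, simplified stand-in for a Hudi file slice (assumption).
final case class FileSlice(partitionPath: String, fileName: String)

object CandidateFileLookup {

  // Simplified "before": every file in the file index is checked against
  // the set of files that the index lookup reported as containing matches.
  def candidatesFromAllFiles(
      allFiles: Seq[FileSlice],
      filesWithMatchingKeys: Set[String]): Set[String] =
    allFiles.iterator
      .map(_.fileName)
      .filter(filesWithMatchingKeys.contains)
      .toSet

  // Simplified "after": only files that belong to partitions surviving
  // partition pruning are checked.
  def candidatesFromPrunedPartitions(
      prunedPartitionsAndFileSlices: Seq[(String, Seq[FileSlice])],
      filesWithMatchingKeys: Set[String]): Set[String] =
    prunedPartitionsAndFileSlices.iterator
      .flatMap { case (_, slices) => slices.iterator }
      .map(_.fileName)
      .filter(filesWithMatchingKeys.contains)
      .toSet
}
```

With a partition predicate that keeps only one partition, the second method iterates over that partition's file slices alone, so the work scales with the pruned partitions rather than with the whole table.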
   
   ### Impact
   
   NA
   
   ### Risk level (write none, low, medium or high below)
   
   low
   
   ### Documentation Update
   
   NA
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

