sivabalan narayanan created HUDI-5245:
-----------------------------------------
Summary: Honor pruned partitions while looking up in col stats
partition in MDT
Key: HUDI-5245
URL: https://issues.apache.org/jira/browse/HUDI-5245
Project: Apache Hudi
Issue Type: Improvement
Components: metadata
Reporter: sivabalan narayanan
When looking up in col stats for data skipping, we are passing in only the list
of columns in the predicate. We don't leverage the pruned list of partitions in
this call.
For example, if there are 1000 partitions and 100 cols, and only 10 partitions are
matched after pruning,
the existing call will fetch 100 cols * 1000 partitions = 100k entries from the col_stats
partition in MDT to do file skipping,
whereas if we wire in the pruned list of partitions, we only need to do file
skipping from 100 cols * 10 partitions = 1000 entries.
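
A rough sketch of the lookup counts, assuming one col stats entry per (column, partition) pair as in the example with 100 cols, 1000 partitions and 10 pruned partitions. The helper name and the key layout are made up for illustration; they are not Hudi's actual API or MDT key encoding.

```python
from itertools import product


def col_stats_lookup_keys(columns, partitions):
    """Hypothetical helper: build one lookup key per (column, partition) pair."""
    return [(col, part) for col, part in product(columns, partitions)]


# Numbers taken from the example: 100 cols, 1000 partitions, 10 left after pruning.
columns = [f"col_{i}" for i in range(100)]
all_partitions = [f"part_{i}" for i in range(1000)]
pruned_partitions = all_partitions[:10]

# Existing behaviour: only the columns are passed in, so every partition is looked up.
keys_without_pruning = col_stats_lookup_keys(columns, all_partitions)

# Proposed behaviour: the pruned partition list is wired into the lookup as well.
keys_with_pruning = col_stats_lookup_keys(columns, pruned_partitions)

print(len(keys_without_pruning))  # 100000
print(len(keys_with_pruning))     # 1000
```

In practice col stats is tracked per file, so the real entry counts depend on the number of files in each partition, but the reduction from pruning should follow the same ratio.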
--
This message was sent by Atlassian Jira
(v8.20.10#820010)