sivabalan narayanan created HUDI-5245:
-----------------------------------------

             Summary: Honor pruned partitions while looking up in col stats 
partition in MDT
                 Key: HUDI-5245
                 URL: https://issues.apache.org/jira/browse/HUDI-5245
             Project: Apache Hudi
          Issue Type: Improvement
          Components: metadata
            Reporter: sivabalan narayanan


When looking up col stats for data skipping, we pass in only the list of 
columns in the predicate. We don't leverage the pruned list of partitions in 
this call.

 

For example, if there are 1000 partitions and 100 cols, and only 10 partitions 
are matched after pruning,

the existing call will fetch 100 cols * 1000 partitions = 100,000 entries from 
the col_stats partition in MDT to do file skipping,
whereas if we wire in the pruned list of partitions, we only need to do file 
skipping from 100 cols * 10 partitions = 1,000 entries.
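
The arithmetic above can be checked with a minimal sketch. It is a toy 
in-memory model, not Hudi's API: the names build_col_stats_index and lookup 
are hypothetical, and it assumes one col_stats entry per (partition, column) 
pair, which is the simplification the numbers above rely on.

```python
from itertools import product


def build_col_stats_index(partitions, columns):
    """Toy stand-in for the col_stats partition (hypothetical, not Hudi code).

    Assumes one entry per (partition, column) pair, matching the
    simplified arithmetic in the description above.
    """
    return {(p, c): {"min": None, "max": None} for p, c in product(partitions, columns)}


def lookup(index, columns, pruned_partitions=None):
    """Return entries for the given columns.

    When pruned_partitions is None, every partition is scanned (the
    existing behaviour described above); otherwise only entries belonging
    to the pruned partitions are returned.
    """
    cols = set(columns)
    parts = None if pruned_partitions is None else set(pruned_partitions)
    return {
        key: stats
        for key, stats in index.items()
        if key[1] in cols and (parts is None or key[0] in parts)
    }


partitions = [f"p{i}" for i in range(1000)]
columns = [f"c{i}" for i in range(100)]
index = build_col_stats_index(partitions, columns)

# Existing call: columns only, so all 1000 partitions are covered.
all_entries = lookup(index, columns)

# Proposed call: also pass the 10 partitions that survived pruning.
pruned_entries = lookup(index, columns, pruned_partitions=partitions[:10])

print(len(all_entries))     # 100000
print(len(pruned_entries))  # 1000
```

In this model, passing the pruned partitions cuts the number of entries 
fetched by the same factor as the partition pruning itself (100x here).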



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
