Alexey Kudinkin created HUDI-4202:
-------------------------------------

             Summary: Make sure Column Stats partition is cached after first time being read
                 Key: HUDI-4202
                 URL: https://issues.apache.org/jira/browse/HUDI-4202
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Alexey Kudinkin
            Assignee: Alexey Kudinkin


Currently, when Data Skipping is applied, we read the Metadata Table every 
time a query is executed and file listing is performed.

Recent benchmarking shows that reading the Metadata Table takes a 
non-trivial amount of time for complex queries, so these queries would benefit 
from caching the Metadata Table content.

We can approach this similarly to how Spark caches parsed & resolved 
`LogicalPlan`s (see `getCachedPlan` for context), keeping the Metadata Table 
content cached between modifications of the table.
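
To illustrate the idea of keeping the content "in between the modifications 
of the table", here is a minimal sketch of a cache keyed by table path and 
tagged with the table version it was loaded at. Everything in it is 
hypothetical and is not Hudi or Spark API: the `ColumnStatsCache` class, the 
`latestInstant` string standing in for whatever identifies the latest 
completed commit, and the `loader` that would perform the actual Metadata 
Table read are assumptions made for illustration only.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names are Hudi or Spark APIs.
final class ColumnStatsCache<V> {

    // A cached value together with the table version it was loaded at.
    private static final class Entry<V> {
        final String instant;
        final V value;

        Entry(String instant, V value) {
            this.instant = instant;
            this.value = value;
        }
    }

    // One entry per table, keyed by the table's base path.
    private final ConcurrentHashMap<String, Entry<V>> entries = new ConcurrentHashMap<>();

    // Returns the cached value if it was loaded at `latestInstant`;
    // otherwise invokes `loader` once and replaces the stale entry.
    V get(String tablePath, String latestInstant, Supplier<V> loader) {
        Entry<V> entry = entries.compute(tablePath, (path, cached) -> {
            if (cached != null && cached.instant.equals(latestInstant)) {
                return cached; // table unchanged since the last read
            }
            return new Entry<>(latestInstant, loader.get()); // first read or table modified
        });
        return entry.value;
    }

    // Drops the entry for a table, e.g. when the table is dropped or refreshed.
    void invalidate(String tablePath) {
        entries.remove(tablePath);
    }
}
```

With this shape, repeated queries against an unchanged table would reuse the 
first read, and any new commit would change `latestInstant` and force a 
reload. A real implementation would likely also need a size or time bound on 
the cache, which this sketch omits.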



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
