Alexey Kudinkin created HUDI-4202:
-------------------------------------
Summary: Make sure Column Stats partition is cached after first
time being read
Key: HUDI-4202
URL: https://issues.apache.org/jira/browse/HUDI-4202
Project: Apache Hudi
Issue Type: Bug
Reporter: Alexey Kudinkin
Assignee: Alexey Kudinkin
Currently, when applying Data Skipping, we read the Metadata Table every time
we execute a query and perform the file listing.
Recent benchmarking shows that reading the Metadata Table takes a non-trivial
amount of time for complex queries, so such queries would benefit from caching
the Metadata Table content.
We can approach this in a fashion similar to how Spark caches parsed and
resolved `LogicalPlan`s (see `getCachedPlan` for context) and keep the
Metadata Table content cached between modifications of the table.
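
A minimal sketch of what such a cache might look like, assuming the cached
content can be keyed by the table path and tagged with the latest instant on
the table's timeline. `ColumnStatsCache`, its `get` method, and the use of a
plain string for the instant are hypothetical names chosen for illustration;
they are not existing Hudi or Spark APIs. The loader stands in for the actual
Metadata Table read.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch: keeps one loaded value per table and reloads it only
// when the table's latest instant differs from the one it was loaded at.
final class ColumnStatsCache<V> {

    // A cached value together with the instant it was loaded at.
    private static final class Entry<T> {
        final String instant;
        final T value;

        Entry(String instant, T value) {
            this.instant = instant;
            this.value = value;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();

    // Returns the cached value for tablePath if it was loaded at latestInstant;
    // otherwise invokes the loader and replaces the stale entry.
    V get(String tablePath, String latestInstant, Supplier<V> loader) {
        Entry<V> entry = entries.compute(tablePath, (path, cached) -> {
            if (cached != null && cached.instant.equals(latestInstant)) {
                return cached; // table unchanged since the last read
            }
            return new Entry<V>(latestInstant, loader.get());
        });
        return entry.value;
    }
}
```

With this shape, repeated queries against an unmodified table reuse the first
read, and any new commit changes the instant and forces a reload on the next
query. A real implementation would also need to bound the cache size and
decide how the latest instant is obtained cheaply.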