[
https://issues.apache.org/jira/browse/KYLIN-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
liyang closed KYLIN-5693.
-------------------------
> Reduce the number of times Spark reads Parquet Footer to improve query
> performance
> ----------------------------------------------------------------------------------
>
> Key: KYLIN-5693
> URL: https://issues.apache.org/jira/browse/KYLIN-5693
> Project: Kylin
> Issue Type: Improvement
> Components: Query Engine
> Affects Versions: 5.0-beta
> Reporter: Yaguang Jia
> Assignee: Yaguang Jia
> Priority: Critical
> Fix For: 5.0.0
>
>
> h2. Dev Design
> Currently, the vectorized Parquet reader always reads the Parquet footer
> metadata twice. When the NameNode is under heavy load, the second read adds
> avoidable latency. We can skip it by reading the metadata for all row groups
> up front and then filtering the row groups in memory according to the
> pushed-down filters, so the footer never has to be read a second time.
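
The idea in the design note can be illustrated with a small, self-contained sketch. It is not Kylin's or Spark's actual code: `RowGroup`, `FakeParquetFile`, `may_match_range`, and `plan_scan` are hypothetical names invented here, and the footer is modeled as a plain list of per-row-group min/max statistics. The sketch only shows the shape of the change, which is one footer read followed by in-memory row-group pruning.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class RowGroup:
    """Hypothetical stand-in for one row group's footer entry (min/max of one column)."""
    index: int
    min_value: int
    max_value: int


class FakeParquetFile:
    """Hypothetical file object that counts how often its footer is read."""

    def __init__(self, row_groups: List[RowGroup]) -> None:
        self._row_groups = row_groups
        self.footer_reads = 0

    def read_footer(self) -> List[RowGroup]:
        # Each call models one round trip to storage (and to the NameNode on HDFS).
        self.footer_reads += 1
        return list(self._row_groups)


def may_match_range(low: int, high: int) -> Callable[[RowGroup], bool]:
    """Build a pushed-down filter: keep row groups whose [min, max] overlaps [low, high]."""
    def predicate(rg: RowGroup) -> bool:
        return rg.max_value >= low and rg.min_value <= high
    return predicate


def plan_scan(parquet: FakeParquetFile,
              pushed_filter: Callable[[RowGroup], bool]) -> List[RowGroup]:
    """Read the footer once, then prune row groups in memory with the pushed-down filter."""
    all_row_groups = parquet.read_footer()  # the only footer read
    return [rg for rg in all_row_groups if pushed_filter(rg)]


parquet_file = FakeParquetFile([
    RowGroup(0, 0, 99),
    RowGroup(1, 100, 199),
    RowGroup(2, 200, 299),
])
selected = plan_scan(parquet_file, may_match_range(150, 250))
print([rg.index for rg in selected], parquet_file.footer_reads)
```

In this toy model the filter keeps row groups 1 and 2, and the read counter stays at 1 because pruning works on the metadata already held in memory instead of reopening the footer.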