[ 
https://issues.apache.org/jira/browse/KYLIN-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyang closed KYLIN-5693.
-------------------------

> Reduce the number of times Spark reads Parquet Footer to improve query 
> performance
> ----------------------------------------------------------------------------------
>
>                 Key: KYLIN-5693
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5693
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Query Engine
>    Affects Versions: 5.0-beta
>            Reporter: Yaguang Jia
>            Assignee: Yaguang Jia
>            Priority: Critical
>             Fix For: 5.0.0
>
>
> h2. Dev Design
> The vectorized Parquet reader currently always reads the Parquet footer
> metadata twice. When the NameNode is under heavy load, reading the footer
> twice costs time. We can avoid the second read by reading the metadata for
> all row groups up front and then filtering the row groups in memory with the
> filters that need to be pushed down, so the footer metadata does not have to
> be read again.
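
The idea in the design note can be illustrated with a small, self-contained sketch. This is not Kylin or Spark code: `RowGroup`, `FakeParquetFile`, `plan_reading_twice`, and `plan_reading_once` are hypothetical names, and the footer is modeled as a list of per-row-group min/max statistics for a single column. The sketch only shows that filtering an already-loaded footer selects the same row groups as a second footer read would, with one fewer remote read.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class RowGroup:
    """Hypothetical footer entry: min/max statistics of one column in one row group."""
    index: int
    min_value: int
    max_value: int


class FakeParquetFile:
    """Toy file that counts footer reads; each read stands in for a remote file-system call."""

    def __init__(self, row_groups: List[RowGroup]) -> None:
        self._row_groups = row_groups
        self.footer_reads = 0

    def read_footer(self) -> List[RowGroup]:
        self.footer_reads += 1
        return list(self._row_groups)


def plan_reading_twice(file: FakeParquetFile,
                       may_match: Callable[[RowGroup], bool]) -> List[RowGroup]:
    """Assumed 'before' behavior: one read for metadata, a second read to pick row groups."""
    file.read_footer()
    return [rg for rg in file.read_footer() if may_match(rg)]


def plan_reading_once(file: FakeParquetFile,
                      may_match: Callable[[RowGroup], bool]) -> List[RowGroup]:
    """Proposed behavior: read all row groups once, then apply the pushed-down filter in memory."""
    footer = file.read_footer()
    return [rg for rg in footer if may_match(rg)]
```

For a pushed-down predicate such as `value > 50`, the in-memory check would be `lambda rg: rg.max_value > 50`; both functions then return the same row groups, but `plan_reading_once` increments `footer_reads` only once.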



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
