[
https://issues.apache.org/jira/browse/HIVE-26115?focusedWorklogId=795561&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795561
]
ASF GitHub Bot logged work on HIVE-26115:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 27/Jul/22 08:39
Start Date: 27/Jul/22 08:39
Worklog Time Spent: 10m
Work Description: szlta opened a new pull request, #3480:
URL: https://github.com/apache/hive/pull/3480
This change makes LLAP caching available for Parquet-formatted tables stored
with Iceberg.
For Parquet we can only rely on the LlapCacheAwareFS, a wrapper that swaps the
InputStreams opened from files for streams that read from cache buffers.
I also refactored the FileID generation, which was previously invoked
separately from the ORC, SerDe and Parquet readers, to remove some code
duplication.
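To illustrate the stream-swapping idea in general terms, here is a minimal
sketch. It is not the code from this pull request and not the LlapCacheAwareFS
API: the class CacheAwareOpener, the StreamSource interface, and the in-memory
map keyed by a file ID are all hypothetical names chosen for the example. It
assumes whole files fit in memory, which a real buffer cache would not assume.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration only; not Hive's LlapCacheAwareFS. */
public class CacheAwareOpener {

    /** Stand-in for "open the real file"; the name is an assumption. */
    public interface StreamSource {
        InputStream open() throws IOException;
    }

    // Cache keyed by a file identifier (stand-in for a generated FileID).
    private final Map<Object, byte[]> cache = new ConcurrentHashMap<>();

    // Counts how often the underlying source was actually opened.
    private final AtomicInteger underlyingOpens = new AtomicInteger();

    /** Returns the file content, reading the underlying source only on a miss. */
    public byte[] cachedBytes(Object fileId, StreamSource source) {
        byte[] cached = cache.get(fileId);
        if (cached == null) {
            // Cache miss: read the real stream once and remember its bytes.
            // Two concurrent misses may both read; that is fine for a sketch.
            try (InputStream in = source.open()) {
                underlyingOpens.incrementAndGet();
                cached = in.readAllBytes(); // requires Java 9+
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            cache.put(fileId, cached);
        }
        return cached.clone();
    }

    /** Hands the caller a stream backed by cached bytes instead of the file. */
    public InputStream open(Object fileId, StreamSource source) {
        return new ByteArrayInputStream(cachedBytes(fileId, source));
    }

    public int underlyingOpens() {
        return underlyingOpens.get();
    }
}
```

A reader that calls open() twice with the same file ID touches the underlying
source only once; the second stream is served from the cached bytes. This is
also why a stable, shared file ID matters: every reader has to derive the same
key for the same file.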
Issue Time Tracking
-------------------
Worklog Id: (was: 795561)
Remaining Estimate: 0h
Time Spent: 10m
> LLAP cache utilization for Iceberg Parquet files
> ------------------------------------------------
>
> Key: HIVE-26115
> URL: https://issues.apache.org/jira/browse/HIVE-26115
> Project: Hive
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Assignee: Ádám Szita
> Priority: Major
> Attachments: Screenshot 2022-04-05 at 10.08.27 AM.png, Screenshot
> 2022-04-05 at 10.08.35 AM.png, Screenshot 2022-04-05 at 10.08.50 AM.png,
> Screenshot 2022-04-05 at 10.09.03 AM.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Originally:
> The Parquet footer is read 3 times when reading Iceberg data
> !Screenshot 2022-04-05 at 10.08.27 AM.png|width=627,height=331!
> Here is the breakdown of the 3 footer reads per file.
> !Screenshot 2022-04-05 at 10.08.35 AM.png|width=1109,height=500!
>
>
> !Screenshot 2022-04-05 at 10.08.50 AM.png|width=1067,height=447!
>
>
> !Screenshot 2022-04-05 at 10.09.03 AM.png|width=827,height=303!
>
> HIVE-25827 already covers the first 2 footer reads per file.