[ 
https://issues.apache.org/jira/browse/IMPALA-11577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Fürnstáhl updated IMPALA-11577:
---------------------------------------
    Component/s: Frontend
         Labels: impala-iceberg  (was: )

> Optimize getting stored file types for Iceberg tables
> -----------------------------------------------------
>
>                 Key: IMPALA-11577
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11577
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: Gergely Fürnstáhl
>            Priority: Major
>              Labels: impala-iceberg
>
> Spawned from IMPALA-10610.
> Impala supports mixed file formats for Iceberg tables, which means every data
> file can have a different file format, and Impala uses the set of existing
> file formats for planning purposes. Currently Impala goes through the metadata
> of all files to aggregate this information, which can be slow if there are
> lots of data files.
> We could optimize this by storing this aggregated information somewhere
> (e.g. in Iceberg, where this is yet to be implemented:
> [https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/SnapshotSummary.java])
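A minimal sketch of the idea described in the issue, not Impala's or Iceberg's actual code. It contrasts the per-file scan with a lookup of a pre-aggregated value stored in a string-to-string summary map. Every name here is hypothetical: the class FileFormatSketch, the Format enum, and in particular the summary key "file-formats", which does not exist in Iceberg today.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names are Impala or Iceberg API.
final class FileFormatSketch {
  enum Format { PARQUET, ORC, AVRO }

  // Assumed key name; Iceberg does not define such a summary property yet.
  static final String SUMMARY_KEY = "file-formats";

  // Current approach: visit every data file, so cost grows with the file count.
  static Set<Format> scanAllFiles(List<Format> perFileFormats) {
    Set<Format> seen = EnumSet.noneOf(Format.class);
    for (Format f : perFileFormats) {
      seen.add(f);
    }
    return seen;
  }

  // Write side: encode the aggregated set once, e.g. when a snapshot is committed.
  static String encode(Set<Format> formats) {
    StringBuilder sb = new StringBuilder();
    for (Format f : formats) {
      if (sb.length() > 0) sb.append(',');
      sb.append(f.name());
    }
    return sb.toString();
  }

  // Read side: use the stored value if present, otherwise fall back to the scan.
  static Set<Format> fromSummaryOrScan(Map<String, String> summary,
                                       List<Format> perFileFormats) {
    String encoded = summary.get(SUMMARY_KEY);
    if (encoded == null) {
      return scanAllFiles(perFileFormats);
    }
    Set<Format> result = EnumSet.noneOf(Format.class);
    if (!encoded.isEmpty()) {
      for (String name : encoded.split(",")) {
        result.add(Format.valueOf(name));
      }
    }
    return result;
  }
}
```

The fallback to the scan keeps tables readable when the summary was written by an engine that does not populate the (assumed) key, so planning stays correct and only the fast path is lost.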



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
