[
https://issues.apache.org/jira/browse/IMPALA-11577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gergely Fürnstáhl updated IMPALA-11577:
---------------------------------------
Component/s: Frontend
Labels: impala-iceberg (was: )
> Optimize getting stored file types for Iceberg tables
> -----------------------------------------------------
>
> Key: IMPALA-11577
> URL: https://issues.apache.org/jira/browse/IMPALA-11577
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Gergely Fürnstáhl
> Priority: Major
> Labels: impala-iceberg
>
> Spawned from IMPALA-10610
> Impala supports mixed file formats for Iceberg tables, which means each data
> file can have a different file format, and Impala uses the set of file formats
> present in the table for planning. Currently Impala goes through every file's
> metadata to aggregate this information, which can be slow if there are many
> data files. We could optimize this by storing the aggregated information somewhere
> (e.g. in Iceberg - yet to be implemented -
> [https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/SnapshotSummary.java])
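
The idea in the description can be sketched as follows. This is only an illustration under stated assumptions, not Impala's or Iceberg's actual code: the function names, the dict-based file records, and the per-format count summary are all hypothetical. Counts are kept instead of a bare set so that removing the last file of a format also drops that format from the summary.

```python
from collections import Counter


def scan_file_formats(data_files):
    """Baseline: visit every data file's metadata to collect the formats.

    Cost grows with the number of data files in the table.
    """
    return {f["format"] for f in data_files}


def update_format_counts(summary_counts, added=(), removed=()):
    """Hypothetical incremental update of a stored per-format file count.

    Only the files touched by a commit are inspected, not the whole table.
    """
    counts = Counter(summary_counts)
    counts.update(f["format"] for f in added)
    counts.subtract(f["format"] for f in removed)
    # Drop formats that no longer have any files.
    return {fmt: n for fmt, n in counts.items() if n > 0}


def formats_from_summary(summary_counts):
    """Planning-time lookup: read the formats from the stored summary."""
    return set(summary_counts)


# Hypothetical file records; real metadata carries many more fields.
files = [
    {"path": "a.parquet", "format": "PARQUET"},
    {"path": "b.orc", "format": "ORC"},
    {"path": "c.parquet", "format": "PARQUET"},
]
counts = update_format_counts({}, added=files)
counts_after_delete = update_format_counts(counts, removed=[files[1]])
```

With a summary like this, the planner would read the format set directly instead of iterating over all data files; where such a summary would actually live (for example in the Iceberg snapshot summary) is the open question of this issue.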