Gergely Fürnstáhl created IMPALA-11577:
------------------------------------------

             Summary: Optimize getting stored file types for Iceberg tables
                 Key: IMPALA-11577
                 URL: https://issues.apache.org/jira/browse/IMPALA-11577
             Project: IMPALA
          Issue Type: Improvement
            Reporter: Gergely Fürnstáhl


Impala supports mixed file formats for Iceberg tables, which means each data 
file can have a different file format. Impala uses the set of file formats 
present in the table for planning purposes. Currently it goes through the 
metadata of every file to aggregate this information, which can be slow if 
there are many data files.

We could optimize this by storing the aggregated information somewhere, e.g. 
in Iceberg (yet to be implemented; see 
https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/SnapshotSummary.java).
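
To illustrate the idea, here is a minimal sketch in Python. It is not Impala or 
Iceberg code. The DataFile and Snapshot classes, the "file-formats" summary 
key, and all function names are hypothetical and stand in for the real 
metadata structures. It contrasts the current full scan with a lookup of a 
pre-aggregated value that falls back to the scan when the value is missing.

```python
from dataclasses import dataclass, field

# Hypothetical summary key; Iceberg does not define such a property today.
FILE_FORMATS_KEY = "file-formats"


@dataclass(frozen=True)
class DataFile:
    """Stand-in for one data file entry in the table metadata."""
    path: str
    file_format: str  # e.g. "PARQUET", "ORC", "AVRO"


@dataclass
class Snapshot:
    """Stand-in for a table snapshot with a string-to-string summary map."""
    data_files: list
    summary: dict = field(default_factory=dict)


def scan_file_formats(snapshot):
    """Current approach: visit every data file, O(number of files)."""
    return {f.file_format for f in snapshot.data_files}


def write_summary(snapshot):
    """Assumed commit-time step: store the aggregated set in the summary."""
    formats = sorted(scan_file_formats(snapshot))
    snapshot.summary[FILE_FORMATS_KEY] = ",".join(formats)


def get_file_formats(snapshot):
    """Proposed approach: read the summary, fall back to a full scan."""
    stored = snapshot.summary.get(FILE_FORMATS_KEY)
    if stored is not None:
        return set(stored.split(",")) if stored else set()
    return scan_file_formats(snapshot)
```

With something like this, planning would only pay the per-file cost for 
snapshots written before the summary entry existed.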



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
