[ 
https://issues.apache.org/jira/browse/IMPALA-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy resolved IMPALA-12298.
----------------------------------------
    Fix Version/s: Impala 4.3.0
       Resolution: Fixed

> Improve incremental load of Iceberg tables
> ------------------------------------------
>
>                 Key: IMPALA-12298
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12298
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg, performance
>             Fix For: Impala 4.3.0
>
>
> *The following mostly affects HDFS/Ozone, where we need to contact the 
> NameNode to create file descriptors with block locations. On cloud object 
> stores, where there are no block locations, we only need the Iceberg metadata 
> to create the file descriptors.*
> Currently we always reload all the metadata belonging to an Iceberg table.
> This means we recreate all the file descriptors even if only a few of them 
> have changed.
> We could check the number of newly added files, and if there are only a few 
> of them, load the file descriptors for just those files one by one.
> We can fall back to a full reload if a significant number of files have 
> changed, i.e. when a recursive file listing is the cheaper option.
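
The strategy described above could be sketched roughly as follows. This is an illustration only, not Impala's catalog code: `FileDescriptor`, `reload_file_descriptors`, `load_one`, `load_all`, and the threshold value are all hypothetical names and numbers chosen for the example. It also assumes that a data file which keeps its path is unchanged, so its existing descriptor can be reused.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Set, Tuple

# Hypothetical cut-off: the issue does not say how many files count as "a few".
MAX_FILES_FOR_INCREMENTAL_LOAD = 100


@dataclass(frozen=True)
class FileDescriptor:
    """Simplified stand-in for a file descriptor with block locations."""
    path: str
    block_hosts: Tuple[str, ...] = ()


def reload_file_descriptors(
    old_fds: Dict[str, FileDescriptor],
    current_paths: Iterable[str],
    load_one: Callable[[str], FileDescriptor],
    load_all: Callable[[Set[str]], Dict[str, FileDescriptor]],
    threshold: int = MAX_FILES_FOR_INCREMENTAL_LOAD,
) -> Dict[str, FileDescriptor]:
    """Return descriptors for current_paths, reusing old ones where possible.

    load_one stands for a per-file lookup (e.g. one NameNode call per file);
    load_all stands for a full reload based on a recursive file listing.
    """
    current = set(current_paths)
    new_paths = current.difference(old_fds)

    # Many new files: a single full reload is assumed to be cheaper.
    if len(new_paths) > threshold:
        return load_all(current)

    # Few new files: keep descriptors of files that still exist,
    # and load only the newly added ones, one by one.
    fds = {path: fd for path, fd in old_fds.items() if path in current}
    for path in sorted(new_paths):
        fds[path] = load_one(path)
    return fds
```

Files that were removed from the table drop out naturally, because only paths present in the current snapshot are carried over. Where the threshold should sit depends on the relative cost of per-file lookups versus a recursive listing, which the issue leaves open.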



--
This message was sent by Atlassian Jira
(v8.20.10#820010)