[ 
https://issues.apache.org/jira/browse/IMPALA-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Rozsa reassigned IMPALA-13314:
------------------------------------

    Assignee: Nándor Kollár

> Create a store for HadoopCatalogs to avoid creating a new one for each table
> ----------------------------------------------------------------------------
>
>                 Key: IMPALA-13314
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13314
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: Gabor Kaszab
>            Assignee: Nándor Kollár
>            Priority: Minor
>              Labels: impala-iceberg, ramp-up
>
> Currently, when we create a new Iceberg table in HadoopCatalog, we create a 
> new HadoopCatalog instance for each of these tables 
> [here|https://github.com/apache/impala/blob/4b500a55cbfcdd311a1c766e33849f7ae05a1a8e/fe/src/main/java/org/apache/impala/util/IcebergUtil.java#L145].
> The issue is that a catalog object such as HadoopCatalog holds an Iceberg 
> FileIO instance, and the memory consumption of such an instance can be 
> measured in MBs. This can blow up the catalog/localCatalog memory even if 
> the Iceberg tables in HadoopCatalog are empty.
> As a solution, we should have a HadoopCatalog store that, keyed by a 
> location string, returns an already cached HadoopCatalog object or creates 
> a new one and caches it for later use. With this approach, tables under the 
> same HadoopCatalog location would share one HadoopCatalog instance, and we 
> won't end up with as many FileIO instances as there are tables in 
> HadoopCatalog.
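
A minimal sketch of the store idea described above, not the actual Impala change. All names here (CatalogStore, FakeCatalog, normalize) are hypothetical, and FakeCatalog only stands in for a heavyweight object such as HadoopCatalog so that the example runs without Iceberg on the classpath. Treating a trailing slash as insignificant is also an assumption about how locations should be compared.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical store: keeps one catalog object per location string.
final class CatalogStore<C> {
  private final Map<String, C> catalogs = new ConcurrentHashMap<>();
  private final Function<String, C> factory;

  // The factory builds a catalog for a location that is not cached yet.
  CatalogStore(Function<String, C> factory) {
    this.factory = factory;
  }

  // Returns the cached catalog for the location, creating it on first use.
  // computeIfAbsent runs the factory at most once per key, even when
  // several threads ask for the same location concurrently.
  C get(String location) {
    return catalogs.computeIfAbsent(normalize(location), factory);
  }

  // Number of distinct catalog objects currently cached.
  int size() {
    return catalogs.size();
  }

  // Assumption: trailing slashes do not change which catalog is meant.
  static String normalize(String location) {
    String s = location;
    while (s.length() > 1 && s.endsWith("/")) {
      s = s.substring(0, s.length() - 1);
    }
    return s;
  }
}

// Stand-in for an expensive catalog object (for illustration only).
final class FakeCatalog {
  final String location;

  FakeCatalog(String location) {
    this.location = location;
  }
}
```

With a store like this, two tables under the same location would get the same object from store.get(location), so the number of catalog (and FileIO) instances would grow with the number of distinct locations rather than the number of tables. A real implementation would also have to decide whether entries are ever evicted and whether the Hadoop configuration should be part of the key; neither is covered here.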



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
