[
https://issues.apache.org/jira/browse/IMPALA-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Rozsa reassigned IMPALA-13314:
------------------------------------
Assignee: Nándor Kollár
> Create a store for HadoopCatalogs to avoid creating a new one for each table
> ----------------------------------------------------------------------------
>
> Key: IMPALA-13314
> URL: https://issues.apache.org/jira/browse/IMPALA-13314
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Gabor Kaszab
> Assignee: Nándor Kollár
> Priority: Minor
> Labels: impala-iceberg, ramp-up
>
> Currently, when we create a new Iceberg table in HadoopCatalog, we create a
> new HadoopCatalog instance for each table
> [here|https://github.com/apache/impala/blob/4b500a55cbfcdd311a1c766e33849f7ae05a1a8e/fe/src/main/java/org/apache/impala/util/IcebergUtil.java#L145].
> The problem is that a catalog object such as HadoopCatalog holds an Iceberg
> FileIO instance, and each such instance can consume megabytes of memory. This
> can blow up catalog/localCatalog memory usage even if the Iceberg tables in
> HadoopCatalog are empty.
> As a solution, we should have a HadoopCatalog store keyed by the catalog
> location string: it returns the cached HadoopCatalog for a location if one
> exists, and otherwise creates a new HadoopCatalog and caches it. With this
> approach, tables under the same HadoopCatalog location would share one
> HadoopCatalog instance, and we won't end up with as many FileIO instances as
> there are tables in HadoopCatalog.
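
A minimal sketch of the proposed store, to make the idea concrete. It is not Impala's actual implementation: the class name CatalogStore, the factory parameter, and the trailing-slash normalization are all assumptions made for illustration. The type parameter C stands in for HadoopCatalog so that the example compiles without Iceberg on the classpath.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical sketch: keeps one catalog object per catalog location.
// C is a placeholder for the real HadoopCatalog type.
final class CatalogStore<C> {
  private final ConcurrentMap<String, C> catalogs = new ConcurrentHashMap<>();
  // Builds a catalog for a (normalized) location; supplied by the caller.
  private final Function<String, C> factory;

  CatalogStore(Function<String, C> factory) {
    this.factory = factory;
  }

  // Returns the cached catalog for this location, creating it on first use.
  // computeIfAbsent on ConcurrentHashMap runs the factory at most once per key.
  C get(String location) {
    return catalogs.computeIfAbsent(normalize(location), factory);
  }

  // Number of distinct catalog instances currently held.
  int size() {
    return catalogs.size();
  }

  // Assumption: locations that differ only by trailing slashes refer to the
  // same catalog, so they are mapped to one key.
  static String normalize(String location) {
    int end = location.length();
    while (end > 1 && location.charAt(end - 1) == '/') {
      end--;
    }
    return location.substring(0, end);
  }
}
```

In a real change the factory would presumably construct the HadoopCatalog for the given location, and all tables under that location would obtain their catalog through get(), so the number of FileIO instances would track the number of distinct locations rather than the number of tables. Eviction and cache invalidation are left out of this sketch.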