1. Catalogs are responsible for metadata management. If a table from an 
external data source is registered in a catalog, engines integrated with that 
catalog can discover the metadata, but actual query execution depends on 
whether the engine has the appropriate connector/format support.

This is already the case today. For example, when Apache Spark reads metadata 
from Hive Metastore, it may see Apache Iceberg tables. However, querying those 
tables requires the Iceberg catalog/connector to be configured in Spark. 

Therefore this behavior is neither unhealthy nor new; it is simply a consequence of
separating metadata discovery (catalog) from execution capabilities (engine 
connectors). Engines are expected to configure the appropriate connector for 
the table formats or external catalogs they want to query.
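
As an illustration of the Spark/Iceberg case above, here is a minimal sketch of how an Iceberg catalog backed by Hive Metastore might be configured in Spark. The catalog name (my_catalog), metastore host, and artifact versions are placeholders, not values from this thread:

```
# Hypothetical Spark invocation: without this catalog configuration, Spark can
# still list Iceberg tables via Hive Metastore metadata but cannot query them.
spark-sql \
  --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.0 \
  --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.my_catalog.type=hive \
  --conf spark.sql.catalog.my_catalog.uri=thrift://metastore-host:9083
```

With this in place, `SELECT * FROM my_catalog.db.table` resolves through the Iceberg connector rather than failing at execution time.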

2. The second concern, about circular catalogs, is primarily a configuration issue
rather than an architectural problem. Modern query engines like Trino already 
operate with multiple catalogs and connectors. The query planner resolves a 
table reference to one catalog/connector, and that connector is responsible for 
accessing the underlying data source. The engine itself does not recursively 
resolve catalogs through other catalogs.
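
To make the Trino example concrete, a catalog is just a properties file under etc/catalog/, and each file binds one catalog name to one connector. The file name and metastore URI below are illustrative placeholders:

```
# etc/catalog/iceberg.properties (hypothetical): one catalog, one connector.
# Trino resolves a table like iceberg.db.t to this file and stops there;
# the connector talks to the data source directly, with no further catalog
# indirection.
connector.name=iceberg
iceberg.catalog.type=hive_metastore
hive.metastore.uri=thrift://metastore-host:9083
```

Because resolution is a single flat lookup from catalog name to connector, a cycle cannot arise inside the engine itself; at worst, a misconfigured catalog points at the wrong data source.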
