pvary commented on a change in pull request #2129:
URL: https://github.com/apache/iceberg/pull/2129#discussion_r570443398
##########
File path:
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalogs.java
##########
@@ -36,4 +38,11 @@ public static HiveCatalog loadCatalog(Configuration conf) {
String metastoreUri = conf.get(HiveConf.ConfVars.METASTOREURIS.varname,
"");
return CATALOG_CACHE.get(metastoreUri, uri -> new HiveCatalog(conf));
}
+
+ public static HiveCatalog loadCatalog(String catalogName, Map<String,
String> properties, Configuration conf) {
+ // metastore URI can be null in local mode
+ String metastoreUri = conf.get(HiveConf.ConfVars.METASTOREURIS.varname,
"");
+ return CATALOG_CACHE.get(metastoreUri, uri -> (HiveCatalog)
CatalogUtil.loadCatalog(HiveCatalog.class.getName(),
+ catalogName, properties, conf));
+ }
Review comment:
> I'll take a look at what you have here, but it seems like a problem
that needs to be solved in this PR. HiveCatalogs can only solve that problem
for the default Hive catalog, not for other catalogs that this PR adds support
for. I think that caching for alternative catalogs is just as important for the
same reasons.
If we would cache the Catalogs by name+session as it is done in Spark, then
it would mean 4 extra HMS connections (2 HS2, 2 Tez/AM) per user. On bigger
clusters this could cause issues.
Also since Hive users can change the configuration on session level, caching
becomes even more complicated, since old cached Catalog might have a different
configuration than the one the user actually requests.
One solution could be to require the catalog implementation to provide its
own key for caching based on the configuration. Adding this to this PR might
increase the size of it seriously.
Do I overcomplicate things? Do you see a simplier solution?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]