pvary commented on a change in pull request #2129:
URL: https://github.com/apache/iceberg/pull/2129#discussion_r570443398



##########
File path: 
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalogs.java
##########
@@ -36,4 +38,11 @@ public static HiveCatalog loadCatalog(Configuration conf) {
     String metastoreUri = conf.get(HiveConf.ConfVars.METASTOREURIS.varname, 
"");
     return CATALOG_CACHE.get(metastoreUri, uri -> new HiveCatalog(conf));
   }
+
+  public static HiveCatalog loadCatalog(String catalogName, Map<String, 
String> properties, Configuration conf) {
+    // metastore URI can be null in local mode
+    String metastoreUri = conf.get(HiveConf.ConfVars.METASTOREURIS.varname, 
"");
+    return CATALOG_CACHE.get(metastoreUri, uri -> (HiveCatalog) 
CatalogUtil.loadCatalog(HiveCatalog.class.getName(),
+            catalogName, properties, conf));
+  }

Review comment:
       > I'll take a look at what you have here, but it seems like a problem 
that needs to be solved in this PR. HiveCatalogs can only solve that problem 
for the default Hive catalog, not for other catalogs that this PR adds support 
for. I think that caching for alternative catalogs is just as important for the 
same reasons.
   
   If we would cache the Catalogs by name+session as it is done in Spark, then 
it would mean 4 extra HMS connections (2 HS2, 2 Tez/AM) per user. On bigger 
clusters this could cause issues.
   
   Also since Hive users can change the configuration on session level, caching 
becomes even more complicated, since old cached Catalog might have a different 
configuration than the one the user actually requests.
   
   One solution could be to require the catalog implementation to provide its 
own key for caching based on the configuration. Adding this to this PR might 
increase the size of it seriously.
   
   Do I overcomplicate things? Do you see a simplier solution?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to