rdblue commented on a change in pull request #2325:
URL: https://github.com/apache/iceberg/pull/2325#discussion_r594634855



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java
##########
@@ -547,4 +541,42 @@ public void setConf(Configuration conf) {
   public Configuration getConf() {
     return conf;
   }
+
+  @VisibleForTesting
+  HiveClientPool clientPool() {
+    synchronized (CLIENT_POOL_CACHE) {
+      String metastoreUri = conf.get(HiveConf.ConfVars.METASTOREURIS.varname, "");
+      Pair<HiveClientPool, Long> cacheEntry = CLIENT_POOL_CACHE.getIfPresent(metastoreUri);
+      HiveClientPool clientPool = cacheEntry == null ? new HiveClientPool(clientPoolSize, conf) : cacheEntry.first();
+      CLIENT_POOL_CACHE.put(metastoreUri, Pair.of(clientPool, System.currentTimeMillis() + evictionInterval));
+      return clientPool;
+    }
+  }
+
+  private void scheduleCacheCleaner() {

Review comment:
       I think it would be cleaner to always keep the pool around and never close it until the cache is cleaned up. Then the pool would always be shared and could be kept as an instance variable inside the catalog. It would also make cleanup less coarse: we could expire individual unused connections instead of expiring whole pools. That aligns better with usage patterns and keeps the complexity inside the pool rather than in the cache.
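       The idea of keeping one long-lived pool and expiring individual idle connections could be sketched roughly like this. This is a minimal illustration only, not Iceberg's actual `HiveClientPool` API; `ExpiringPool`, `borrow`, and `release` are hypothetical names, and a real implementation would close evicted clients and bound the pool size:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical sketch: a pool that survives indefinitely but drops
// individual connections that have been idle longer than a timeout,
// instead of evicting and closing the whole pool at once.
class ExpiringPool<C> {
  private static class Entry<C> {
    final C client;
    final long lastReturnedMs;
    Entry(C client, long lastReturnedMs) {
      this.client = client;
      this.lastReturnedMs = lastReturnedMs;
    }
  }

  // Head = most recently returned (LIFO keeps "hot" connections in use);
  // tail = least recently returned, so expiry scans from the tail.
  private final Deque<Entry<C>> idle = new ArrayDeque<>();
  private final Supplier<C> factory;
  private final long idleTimeoutMs;

  ExpiringPool(Supplier<C> factory, long idleTimeoutMs) {
    this.factory = factory;
    this.idleTimeoutMs = idleTimeoutMs;
  }

  synchronized C borrow(long nowMs) {
    // Expire connections idle past the timeout; the pool itself remains.
    while (!idle.isEmpty() && nowMs - idle.peekLast().lastReturnedMs > idleTimeoutMs) {
      idle.pollLast(); // a real pool would close the evicted client here
    }
    Entry<C> entry = idle.pollFirst();
    return entry != null ? entry.client : factory.get();
  }

  synchronized void release(C client, long nowMs) {
    idle.addFirst(new Entry<>(client, nowMs));
  }

  synchronized int idleCount() {
    return idle.size();
  }
}
```

       With this shape the catalog could hold the pool as a plain instance field, and cleanup happens per connection on each borrow rather than via a scheduled cache-wide eviction.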




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


