[ 
https://issues.apache.org/jira/browse/HIVE-26794?focusedWorklogId=837600&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-837600
 ]

ASF GitHub Bot logged work on HIVE-26794:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Jan/23 21:09
            Start Date: 06/Jan/23 21:09
    Worklog Time Spent: 10m 
      Work Description: cnauroth commented on code in PR #3817:
URL: https://github.com/apache/hive/pull/3817#discussion_r1063738486


##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/datasource/DbCPDataSourceProvider.java:
##########
@@ -112,9 +113,17 @@ public DataSource create(Configuration hdpConfig, int 
maxPoolSize) throws SQLExc
     
objectPool.setSoftMinEvictableIdleTimeMillis(softMinEvictableIdleTimeMillis);
     objectPool.setLifo(lifo);
 
+    if ("mutex".equalsIgnoreCase(poolName)) {

Review Comment:
   I suggest adding a comment to explain why we are limiting database 
connections related to mutexes, similar to the comment you already put in 
`HikariCPDataSourceProvider`.



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/datasource/DbCPDataSourceProvider.java:
##########
@@ -112,9 +113,17 @@ public DataSource create(Configuration hdpConfig, int 
maxPoolSize) throws SQLExc
     
objectPool.setSoftMinEvictableIdleTimeMillis(softMinEvictableIdleTimeMillis);
     objectPool.setLifo(lifo);
 
+    if ("mutex".equalsIgnoreCase(poolName)) {
+      if (timeBetweenEvictionRuns < 0) {
+        objectPool.setTimeBetweenEvictionRunsMillis(30 * 1000);
+      }
+      if (softMinEvictableIdleTimeMillis < 0) {
+        objectPool.setSoftMinEvictableIdleTimeMillis(600 * 1000);
+      }
+    }
     String stmt = dbProduct.getPrepareTxnStmt();
     if (stmt != null) {
-      poolableConnFactory.setValidationQuery(stmt);
+      poolableConnFactory.setConnectionInitSql(Arrays.asList(stmt));

Review Comment:
   Nitpick: consider `Collections#singletonList` for a single element.



##########
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/datasource/TestDataSourceProviderFactory.java:
##########
@@ -143,6 +148,62 @@ public void testCreateDbCpDataSource() throws SQLException 
{
     Assert.assertTrue(ds instanceof PoolingDataSource);
   }
 
+  @Test
+  public void testEvictIdleConnection() throws Exception {
+    String[] dataSourceType = {HikariCPDataSourceProvider.HIKARI, 
DbCPDataSourceProvider.DBCP};
+    try (DataSourceProvider.DataSourceNameConfigurator configurator =
+             new DataSourceProvider.DataSourceNameConfigurator(conf, "mutex")) 
{
+      for (final String type: dataSourceType) {
+        MetastoreConf.setVar(conf, ConfVars.CONNECTION_POOLING_TYPE, type);
+        boolean isHikari = HikariCPDataSourceProvider.HIKARI.equals(type);
+        if (isHikari) {
+          conf.unset("hikaricp.connectionInitSql");
+          // The minimum of idleTimeout is 10s
+          conf.set("hikaricp.idleTimeout", "10000");
+          System.setProperty("com.zaxxer.hikari.housekeeping.periodMs", 
"1000");
+        } else {
+          conf.set("dbcp.timeBetweenEvictionRunsMillis", "1000");
+          conf.set("dbcp.softMinEvictableIdleTimeMillis", "3000");
+          conf.set("dbcp.maxIdle", "0");
+        }
+        DataSourceProvider dsp = 
DataSourceProviderFactory.tryGetDataSourceProviderOrNull(conf);
+        DataSource ds = dsp.create(conf, 5);
+        List<Connection> connections = new ArrayList<>();
+        for (int i = 0; i < 5; i++) {
+          connections.add(ds.getConnection());
+        }
+        HikariPoolMXBean poolMXBean = null;
+        GenericObjectPool objectPool = null;
+        if (isHikari) {
+          poolMXBean = ((HikariDataSource) ds).getHikariPoolMXBean();
+          Assert.assertEquals(5, poolMXBean.getTotalConnections());
+          Assert.assertEquals(5, poolMXBean.getActiveConnections());
+        } else {
+          Method getPool = 
PoolingDataSource.class.getDeclaredMethod("getPool");

Review Comment:
   These 3 lines could be shortened to 1 using `invokeMethod` from Commons Lang 
[MethodUtils](https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/reflect/MethodUtils.html).



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/datasource/HikariCPDataSourceProvider.java:
##########
@@ -72,6 +72,17 @@ public DataSource create(Configuration hdpConfig, int 
maxPoolSize) throws SQLExc
       config.setPoolName(poolName);
     }
 
+    // It's kind of a waste to create a fixed size connection pool as same as 
the TxnHandler#connPool,
+    // TxnHandler#connPoolMutex is mostly used for MutexAPI that is primarily 
designed to
+    // provide coarse-grained mutex support to maintenance tasks running 
inside the Metastore,
+    // add minimumIdle=2 and idleTimeout=5min to the pool, so that the 
connection pool can retire
+    // the idle connection aggressively, this will make Metastore more 
scalable especially if
+    // there is a leader in the warehouse.
+    if ("mutex".equals(poolName)) {
+      config.setMinimumIdle(Math.min(maxPoolSize, 2));

Review Comment:
   It looks like the behavior is different between `DbCPDataSourceProvider` and 
`HikariCPDataSourceProvider` now:
   
   * **Hikari**: Set minimum idle connections to 2 (or smaller). Note that this 
will override any user configuration of `hikaricp.minimumIdle` in 
metastore-site.xml or hive-site.xml.
   * **DbCP**: Keep using the same user-configured minimum idle connections, 
and instead override the eviction timing.
   
   Can you please explain why they are different? FWIW, the Hikari logic is the 
one that looks right to me: decrease idle connections for mutexes.



##########
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/datasource/TestDataSourceProviderFactory.java:
##########
@@ -143,6 +148,62 @@ public void testCreateDbCpDataSource() throws SQLException 
{
     Assert.assertTrue(ds instanceof PoolingDataSource);
   }
 
+  @Test
+  public void testEvictIdleConnection() throws Exception {
+    String[] dataSourceType = {HikariCPDataSourceProvider.HIKARI, 
DbCPDataSourceProvider.DBCP};
+    try (DataSourceProvider.DataSourceNameConfigurator configurator =
+             new DataSourceProvider.DataSourceNameConfigurator(conf, "mutex")) 
{
+      for (final String type: dataSourceType) {

Review Comment:
   If a test fails, it will be hard to tell if it failed while testing Hikari 
vs. DbCP. I suggest using customized assert messages that include 
`dataSourceType`.



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/datasource/HikariCPDataSourceProvider.java:
##########
@@ -72,6 +72,17 @@ public DataSource create(Configuration hdpConfig, int 
maxPoolSize) throws SQLExc
       config.setPoolName(poolName);
     }
 
+    // It's kind of a waste to create a fixed size connection pool as same as 
the TxnHandler#connPool,
+    // TxnHandler#connPoolMutex is mostly used for MutexAPI that is primarily 
designed to
+    // provide coarse-grained mutex support to maintenance tasks running 
inside the Metastore,
+    // add minimumIdle=2 and idleTimeout=5min to the pool, so that the 
connection pool can retire
+    // the idle connection aggressively, this will make Metastore more 
scalable especially if
+    // there is a leader in the warehouse.
+    if ("mutex".equals(poolName)) {

Review Comment:
   `"mutex"` is repeated in a few places, and correctness of this logic depends 
on the `DataSourceProvider` code being in sync with the pool name chosen in 
`TxnHandler`. I guess ideally, all of the data source name strings should be 
moved to shared constants somewhere, maybe as a separate refactoring patch.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 837600)
    Time Spent: 4h 10m  (was: 4h)

> Explore retiring TxnHandler#connPoolMutex idle connections
> ----------------------------------------------------------
>
>                 Key: HIVE-26794
>                 URL: https://issues.apache.org/jira/browse/HIVE-26794
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>            Reporter: Zhihua Deng
>            Assignee: Zhihua Deng
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Instead of creating a fixed size connection pool for TxnHandler#MutexAPI, the 
> pool can be assigned to a more dynamic size pool due to: 
>  * TxnHandler#MutexAPI is primarily designed to provide coarse-grained mutex 
> support to maintenance tasks running inside the Metastore, these tasks are 
> not user faced;
>  * A fixed size connection pool as same as the pool used in ObjectStore is a 
> waste for other non leaders in the warehouse; 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to