github-actions[bot] commented on code in PR #61553:
URL: https://github.com/apache/doris/pull/61553#discussion_r2977922786


##########
fe/fe-core/src/main/java/org/apache/doris/datasource/hive/ThriftHMSCachedClient.java:
##########
@@ -715,17 +809,21 @@ public void updateTableStatistics(
             String tableName,
             Function<HivePartitionStatistics, HivePartitionStatistics> update) 
{
         try (ThriftHMSClient client = getClient()) {
-
-            Table originTable = getTable(dbName, tableName);
-            Map<String, String> originParams = originTable.getParameters();
-            HivePartitionStatistics updatedStats = 
update.apply(HiveUtil.toHivePartitionStatistics(originParams));
-
-            Table newTable = originTable.deepCopy();
-            Map<String, String> newParams =
-                    HiveUtil.updateStatisticsParameters(originParams, 
updatedStats.getCommonStatistics());
-            newParams.put("transient_lastDdlTime", 
String.valueOf(System.currentTimeMillis() / 1000));
-            newTable.setParameters(newParams);
-            client.client.alter_table(dbName, tableName, newTable);
+            try {
+                Table originTable = ugiDoAs(() -> 
client.client.getTable(dbName, tableName));
+                Map<String, String> originParams = originTable.getParameters();
+                HivePartitionStatistics updatedStats = 
update.apply(HiveUtil.toHivePartitionStatistics(originParams));
+
+                Table newTable = originTable.deepCopy();
+                Map<String, String> newParams =
+                        HiveUtil.updateStatisticsParameters(originParams, 
updatedStats.getCommonStatistics());
+                newParams.put("transient_lastDdlTime", 
String.valueOf(System.currentTimeMillis() / 1000));
+                newTable.setParameters(newParams);
+                client.client.alter_table(dbName, tableName, newTable);

Review Comment:
   **[Low] Inconsistent `ugiDoAs` wrapping**: `getTable` (line 813) is wrapped 
in `ugiDoAs()`, but `alter_table` here is called directly on `client.client` 
without `ugiDoAs`. If security delegation is needed for reads, it is likely 
also needed for writes.
   
   The same inconsistency exists in `updatePartitionStatistics` (lines 840-855: 
`getPartitionsByNames` and `alter_partition` both called without `ugiDoAs`), 
`addPartitions` (line 872), and `dropPartition` (line 886).
   
   This is largely a pre-existing issue from the base code, but since this PR 
rewrote `getTable` to use `ugiDoAs` directly, it would be good to be consistent 
and wrap `alter_table` too.



##########
fe/fe-core/src/main/java/org/apache/doris/datasource/hive/ThriftHMSCachedClient.java:
##########
@@ -636,53 +639,144 @@ private static LockComponent 
createLockComponentForRead(TableNameInfo tblName, O
         return builder.build();
     }
 
+    /**
+     * The Doris HMS pool only manages client object lifecycle in FE:
+     * 1. Create clients.
+     * 2. Borrow and return clients.
+     * 3. Invalidate borrowers that have already failed.
+     * 4. Destroy clients when the pool is closed.
+     *
+     * The pool does not manage Hive-side socket lifetime or reconnect:
+     * 1. RetryingMetaStoreClient handles 
hive.metastore.client.socket.lifetime itself.
+     * 2. The pool does not interpret that config.
+     * 3. The pool does not probe remote socket health.
+     */
+    private GenericObjectPoolConfig createPoolConfig(int poolSize) {
+        GenericObjectPoolConfig config = new GenericObjectPoolConfig();
+        config.setMaxTotal(poolSize);
+        config.setMaxIdle(poolSize);
+        config.setMinIdle(0);
+        config.setBlockWhenExhausted(true);
+        config.setMaxWaitMillis(-1L);
+        config.setTestOnBorrow(false);

Review Comment:
   **[Medium] Indefinite blocking risk**: `maxWaitMillis = -1` means 
`borrowObject()` will block forever when all pool objects are in use. The old 
implementation created clients on demand (no pool cap), so it never blocked. 
With a bounded pool, a stuck HMS call can now cascade and block all threads 
indefinitely.
   
   Consider setting a finite timeout (e.g., 30-60 seconds) and throwing a clear 
exception on timeout, or at least document this behavioral change. The PR 
description says behavior changed, but this specific aspect (callers can now 
block indefinitely) should be explicitly called out.



##########
fe/fe-core/src/main/java/org/apache/doris/datasource/hive/ThriftHMSCachedClient.java:
##########
@@ -636,53 +639,144 @@ private static LockComponent 
createLockComponentForRead(TableNameInfo tblName, O
         return builder.build();
     }
 
+    /**
+     * The Doris HMS pool only manages client object lifecycle in FE:
+     * 1. Create clients.
+     * 2. Borrow and return clients.
+     * 3. Invalidate borrowers that have already failed.
+     * 4. Destroy clients when the pool is closed.
+     *
+     * The pool does not manage Hive-side socket lifetime or reconnect:
+     * 1. RetryingMetaStoreClient handles 
hive.metastore.client.socket.lifetime itself.
+     * 2. The pool does not interpret that config.
+     * 3. The pool does not probe remote socket health.
+     */
+    private GenericObjectPoolConfig createPoolConfig(int poolSize) {

Review Comment:
   **[Medium] Raw type warning**: `GenericObjectPoolConfig` is used as a raw 
type here and on line 655. It should be parameterized:
   ```java
   private GenericObjectPoolConfig<ThriftHMSClient> createPoolConfig(int 
poolSize) {
       GenericObjectPoolConfig<ThriftHMSClient> config = new 
GenericObjectPoolConfig<>();
   ```
   This avoids unchecked assignment warnings and aligns with commons-pool2 best 
practices.



##########
fe/fe-core/src/main/java/org/apache/doris/datasource/hive/ThriftHMSCachedClient.java:
##########
@@ -89,31 +92,35 @@ public class ThriftHMSCachedClient implements 
HMSCachedClient {
     // -1 means no limit on the partitions returned.
     private static final short MAX_LIST_PARTITION_NUM = 
Config.max_hive_list_partition_num;
 
-    private Queue<ThriftHMSClient> clientPool = new LinkedList<>();
-    private boolean isClosed = false;
-    private final int poolSize;
+    private final GenericObjectPool<ThriftHMSClient> clientPool;
+    private volatile boolean isClosed = false;
     private final HiveConf hiveConf;
-    private ExecutionAuthenticator executionAuthenticator;
+    private final ExecutionAuthenticator executionAuthenticator;
+    private final MetaStoreClientProvider metaStoreClientProvider;
 
     public ThriftHMSCachedClient(HiveConf hiveConf, int poolSize, 
ExecutionAuthenticator executionAuthenticator) {
+        this(hiveConf, poolSize, executionAuthenticator, new 
DefaultMetaStoreClientProvider());
+    }
+
+    ThriftHMSCachedClient(HiveConf hiveConf, int poolSize, 
ExecutionAuthenticator executionAuthenticator,
+            MetaStoreClientProvider metaStoreClientProvider) {
         Preconditions.checkArgument(poolSize > 0, poolSize);
         this.hiveConf = hiveConf;
-        this.poolSize = poolSize;
-        this.isClosed = false;
         this.executionAuthenticator = executionAuthenticator;
+        this.metaStoreClientProvider = 
Preconditions.checkNotNull(metaStoreClientProvider, "metaStoreClientProvider");
+        this.clientPool = new GenericObjectPool<>(new 
ThriftHMSClientFactory(), createPoolConfig(poolSize));
     }
 
     @Override
     public void close() {
-        synchronized (clientPool) {
-            this.isClosed = true;
-            while (!clientPool.isEmpty()) {
-                try {
-                    clientPool.poll().close();
-                } catch (Exception e) {
-                    LOG.warn("failed to close thrift client", e);
-                }
-            }
+        if (isClosed) {
+            return;

Review Comment:
   **[Nit] Non-atomic close guard**: Two concurrent `close()` calls can both 
pass the `if (isClosed)` check. This is benign because 
`GenericObjectPool.close()` is idempotent, but if you want strict 
single-invocation semantics, consider using `AtomicBoolean.compareAndSet` 
instead of volatile + check-then-set.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to