[ 
https://issues.apache.org/jira/browse/HIVE-25048?focusedWorklogId=634243&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-634243
 ]

ASF GitHub Bot logged work on HIVE-25048:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Aug/21 11:29
            Start Date: 05/Aug/21 11:29
    Worklog Time Spent: 10m 
      Work Description: dengzhhu653 commented on a change in pull request #2441:
URL: https://github.com/apache/hive/pull/2441#discussion_r683067748



##########
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java
##########
@@ -202,6 +215,8 @@ public Result invokeInternal(final Object proxy, final Method method, final Obje
           LOG.error(ExceptionUtils.getStackTrace(e.getCause()));
           throw e.getCause();
         }
+      } finally {
+        endFunction(method, object, ex, args);

Review comment:
       The `startFunctions` take care of two things:
   1. API metrics, which duplicate what 
[PerfLogger](https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/metrics/PerfLogger.java#L188-L207)
 already records in RetryingHMSHandler, so this part could be eliminated from the `startFunctions`.
   2. Audit logs, which are difficult to move into a standalone function because 
the log content depends on the input of the method, and the input varies among 
different methods. For example, we have two `startPartitionFunction` overloads 
for the partition-related methods:
   ```
     private void startPartitionFunction(String function, String cat, String db, String tbl,
                                         List<String> partVals) {
       startFunction(function, " : tbl=" +
           TableName.getQualified(cat, db, tbl) + "[" + join(partVals, ",") + "]");
     }

     private void startPartitionFunction(String function, String catName, String db, String tbl,
                                         Map<String, String> partName) {
       startFunction(function, " : tbl=" +
           TableName.getQualified(catName, db, tbl) + "partition=" + partName);
     }
   ```
   
   In some cases we log only the table name, using `startTableFunction`, as in 
[get_partitions](https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java#L5453).
   These functions make the audit log inconsistent across the partition 
methods. Another concern is that whenever a method is added or removed, the 
audit log has to be changed elsewhere, which may be annoying.
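
To illustrate the point about audit logs, here is a minimal sketch of what a single generic audit function could produce when it only sees the method name and raw arguments, as a proxy would. Everything here is hypothetical: `GenericAuditSketch`, `auditLine`, and the log format are made up for illustration and are not part of Hive.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Hive code: a one-size-fits-all audit formatter.
class GenericAuditSketch {

  /**
   * Builds one uniform audit line from a method name and its raw arguments.
   * It cannot know which argument is the catalog, database, table, or
   * partition, so it can only render the arguments positionally.
   */
  static String auditLine(String methodName, Object[] args) {
    String rendered = (args == null) ? "[]" : Arrays.deepToString(args);
    return methodName + " : args=" + rendered;
  }

  public static void main(String[] unused) {
    Object[] args = {"hive", "default", "t", List.of("2021", "08")};
    // Prints: get_partition : args=[hive, default, t, [2021, 08]]
    System.out.println(auditLine("get_partition", args));
  }
}
```

The generic line keeps all the information but loses the tailored `tbl=...[...]` and `partition=...` shapes that the two `startPartitionFunction` overloads produce, which is why moving the audit log out of the individual methods is not a mechanical change.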
   

##########
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java
##########
@@ -61,6 +66,7 @@ public Result(Object result, int numRetries) {
 
   private final Configuration origConf;            // base configuration
   private final Configuration activeConf;  // active configuration
+  private final List<MetaStoreEndFunctionListener> endFunctionListeners; // the end function listener

Review comment:
       Thank you for the comments!
   The `startFunctions` could be eliminated for the reasons listed above, 
leaving only the `MetaStoreEndFunctionListener` in `endFunction` to be 
taken care of, but introducing a `MetaStoreFunctionListener` makes things much 
better and more straightforward. A user can monitor the input or output of a 
method, or implement their own audit logic.
   This change also introduces an incompatibility in 
[MetaStoreEndFunctionContext](https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreEndFunctionContext.java#L55-L57):
 `getInputTableName` would return null for the old implementation.
   

##########
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java
##########
@@ -61,6 +66,7 @@ public Result(Object result, int numRetries) {
 
   private final Configuration origConf;            // base configuration
   private final Configuration activeConf;  // active configuration
+  private final List<MetaStoreEndFunctionListener> endFunctionListeners; // the end function listener

Review comment:
       Thank you for the comments!
   The `startFunctions` could be eliminated for the reasons listed above, 
leaving only the `MetaStoreEndFunctionListener` in `endFunction` to be 
taken care of. Introducing a `MetaStoreFunctionListener` would make things more 
straightforward, but we can already get the context of the current method from 
`MetaStoreEndFunctionContext`, which makes an extra start function listener 
unnecessary.
   This change also introduces an incompatibility in 
[MetaStoreEndFunctionContext](https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreEndFunctionContext.java#L55-L57):
 `getInputTableName` would return null for the old implementation.
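
For readers who have not seen the pattern, the idea of notifying end-function listeners from one `finally` block in a retrying proxy, instead of from try-finally blocks in each handler method, can be sketched roughly as below. This is only an illustration under stated assumptions: `EndFunctionProxySketch`, `EndListener`, and `Handler` are hypothetical stand-ins, not the Hive `MetaStoreEndFunctionListener` or `IHMSHandler` types, and retry logic is omitted.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Hive code.
class EndFunctionProxySketch {

  /** Hypothetical stand-in for an end-function listener. */
  interface EndListener {
    void onEndFunction(String methodName, boolean success, Throwable error);
  }

  /** Hypothetical stand-in for the handler interface being proxied. */
  interface Handler {
    String getTable(String db, String tbl);
  }

  /** Wraps a handler so that every call notifies the listeners from a single finally block. */
  static Handler wrap(Handler target, List<EndListener> listeners) {
    InvocationHandler h = (proxy, method, args) -> {
      Throwable error = null;
      try {
        return method.invoke(target, args);
      } catch (InvocationTargetException e) {
        // Unwrap the reflective wrapper and rethrow the handler's own exception.
        error = e.getCause();
        throw error;
      } finally {
        // Runs on success and on failure, so individual methods need no try-finally.
        for (EndListener l : listeners) {
          l.onEndFunction(method.getName(), error == null, error);
        }
      }
    };
    return (Handler) Proxy.newProxyInstance(
        Handler.class.getClassLoader(), new Class<?>[] {Handler.class}, h);
  }

  public static void main(String[] unused) {
    List<String> events = new ArrayList<>();
    Handler real = (db, tbl) -> {
      if (tbl.isEmpty()) {
        throw new IllegalArgumentException("no table");
      }
      return db + "." + tbl;
    };
    EndListener recorder = (name, ok, err) -> events.add(name + " ok=" + ok);
    Handler proxied = wrap(real, List.of(recorder));

    proxied.getTable("default", "t1");
    try {
      proxied.getTable("default", "");
    } catch (IllegalArgumentException expected) {
      // The listener has already been notified of the failure.
    }
    System.out.println(events); // [getTable ok=true, getTable ok=false]
  }
}
```

Because the listener receives the method name and outcome at the end of the call, it has the same context a separate start listener would have seen, which is the argument made above against adding one.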
   





Issue Time Tracking
-------------------

    Worklog Id:     (was: 634243)
    Time Spent: 3h 40m  (was: 3.5h)

> Refine the start/end functions in HMSHandler
> --------------------------------------------
>
>                 Key: HIVE-25048
>                 URL: https://issues.apache.org/jira/browse/HIVE-25048
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>            Reporter: Zhihua Deng
>            Assignee: Zhihua Deng
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Some start/end functions in the HMSHandler are incomplete or wrong. These 
> functions audit actions, monitor performance, and notify the end function 
> listeners. PerfLogger already measures the performance of the HMSHandler 
> and covers more methods than these functions do, so we can remove the 
> monitoring from the start/end functions and move the end function listeners 
> to the RetryingHMSHandler, eliminating the try-finally blocks that are spread 
> across many different methods. After that, we can try to clean up the 
> functions to simplify HMSHandler.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
