[
https://issues.apache.org/jira/browse/HIVE-26789?focusedWorklogId=829875&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-829875
]
ASF GitHub Bot logged work on HIVE-26789:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 29/Nov/22 20:14
Start Date: 29/Nov/22 20:14
Worklog Time Spent: 10m
Work Description: cnauroth commented on code in PR #3813:
URL: https://github.com/apache/hive/pull/3813#discussion_r1035232269
##########
service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java:
##########
@@ -328,7 +328,8 @@ public Object run() throws HiveSQLException {
if (!embedded) {
LogUtils.registerLoggingContext(queryState.getConf());
}
- ShimLoader.getHadoopShims().setHadoopQueryContext(queryState.getQueryId());
+ ShimLoader.getHadoopShims()
+ .setHadoopQueryContext(queryState.getQueryId() + " User:" + parentSessionState.getUserName());
Review Comment:
For consistency with the other call sites setting query context, please add
a space after the colon:
```
... + " User: " + ...
```
(However, also see my other comment about whether or not we should use
spaces. Whatever is decided for the format should be consistent at all call
sites.)
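One way to keep every call site consistent is to build the string in a single place. The following is only a sketch: `QueryContextFormat`, `withUser`, and `USER_LABEL` are hypothetical names that do not exist in Hive, and the separator shown is simply the one proposed above, pending the format decision.

```java
// Hypothetical helper, not part of Hive: one place that defines the format,
// so the call sites cannot drift apart.
final class QueryContextFormat {
    // Assumed label; whichever format the review settles on would go here.
    static final String USER_LABEL = " User: ";

    private QueryContextFormat() {}

    // Builds the query context string from a query id and a user name.
    static String withUser(String queryId, String userName) {
        return queryId + USER_LABEL + userName;
    }

    public static void main(String[] args) {
        System.out.println(withUser("query-1", "alice"));
    }
}
```

Each call site would then pass `QueryContextFormat.withUser(...)` to `setHadoopQueryContext`, and a later change of delimiter would touch only `USER_LABEL`.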
##########
service/src/java/org/apache/hive/service/cli/operation/Operation.java:
##########
@@ -237,7 +238,9 @@ protected void createOperationLog() {
* Set up some preconditions, or configurations.
*/
protected void beforeRun() {
- ShimLoader.getHadoopShims().setHadoopQueryContext(queryState.getQueryId());
+ CallerContext.setCurrent(new CallerContext.Builder("Check").build());
Review Comment:
Should this line be removed? Unless I'm mistaken, the call to
`setHadoopQueryContext` on the next line will overwrite the value set by this
line.
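To illustrate the concern, here is a minimal stand-in, assuming the current caller context is a single per-thread value that each setter call replaces. `CallerContextOverwriteDemo` and its `ThreadLocal` are illustrative only, not the Hadoop or Hive classes.

```java
// Stand-in for a per-thread "current caller context" slot.
// This is NOT the Hadoop CallerContext class; it only models "last write wins".
final class CallerContextOverwriteDemo {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static void setCurrent(String context) {
        CURRENT.set(context);
    }

    static String getCurrent() {
        return CURRENT.get();
    }

    public static void main(String[] args) {
        setCurrent("Check");    // first value, as in the added line
        setCurrent("query-1");  // second call replaces it
        System.out.println(getCurrent());
    }
}
```

Under that assumption, the first value is never observable once the second call has run, so the first call has no effect.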
##########
cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java:
##########
@@ -250,7 +250,8 @@ CommandProcessorResponse processLocalCmd(String cmd, CommandProcessor proc, CliS
}
// Set HDFS CallerContext to queryId and reset back to sessionId after the query is done
- ShimLoader.getHadoopShims().setHadoopQueryContext(qp.getQueryState().getQueryId());
+ ShimLoader.getHadoopShims()
+ .setHadoopQueryContext(qp.getQueryState().getQueryId() + " User: "
+ ss.getUserName());
Review Comment:
I wonder if we should avoid embedding spaces in the format. Prior usage of
caller context that I've seen uses an underscore-delimited format. The Hadoop
compatibility guidelines state that the HDFS audit log format should be
considered public and stable:
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output
Embedding spaces could break existing scripts that perform positional
parsing using utilities like `cut` and `awk`.
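A small sketch of the risk: the method below mimics whitespace-based positional field selection, similar in spirit to `awk '{print $N}'`. The record layout and the underscore-delimited value are made up for illustration and are not the real HDFS audit log format; `PositionalParsingDemo` is a hypothetical name.

```java
final class PositionalParsingDemo {
    // Splits on runs of whitespace and returns the n-th field (1-based),
    // or an empty string if the line has fewer fields.
    static String field(String line, int n) {
        String[] parts = line.trim().split("\\s+");
        return (n >= 1 && n <= parts.length) ? parts[n - 1] : "";
    }

    public static void main(String[] args) {
        // Simplified, made-up records; NOT the actual audit log layout.
        String underscore = "cmd=open callerContext=q1_User_alice proto=rpc";
        String spaced = "cmd=open callerContext=q1 User: alice proto=rpc";

        System.out.println(field(underscore, 3)); // proto=rpc
        System.out.println(field(spaced, 3));     // User:
    }
}
```

With the underscore-delimited value the third field is still `proto=rpc`. With embedded spaces the same position yields `User:`, and every later field shifts by two.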
Issue Time Tracking
-------------------
Worklog Id: (was: 829875)
Time Spent: 0.5h (was: 20m)
> Add UserName in CallerContext for queries
> -----------------------------------------
>
> Key: HIVE-26789
> URL: https://issues.apache.org/jira/browse/HIVE-26789
> Project: Hive
> Issue Type: Improvement
> Reporter: Ayush Saxena
> Assignee: Ayush Saxena
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> When impersonation is disabled, the HDFS audit log records only the Hive
> user. We can pass the actual user as part of the CallerContext so that it
> is logged as well, for better tracking.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)