Mingmin Xu created HIVE-27677:
---------------------------------
Summary: print out HiveMetaStore.audit to json format
Key: HIVE-27677
URL: https://issues.apache.org/jira/browse/HIVE-27677
Project: Hive
Issue Type: Improvement
Components: Standalone Metastore
Reporter: Mingmin Xu
Assignee: Mingmin Xu
This task aims to print a new[1] line of the HiveMetaStore audit log in JSON
format, similar to [https://github.com/apache/hive/pull/1582] but extended to
cover the `cmd` details as well.
# The existing audit log
```
HiveMetaStore.audit: ugi=xxx ip=xx.xx.xx.xx cmd=source:xx.xx.xx.xx get_table :
db=xxx tbl=xxx
HiveMetaStore.audit: ugi=xxx ip=xx.xx.xx.xx cmd=source:xx.xx.xx.xx
get_partition_with_auth : db=xx tbl=xx[xxx]
```
# The new audit log
```
HiveMetaStore.audit: {"ugi": "xxx", "ip": "xx.xx.xx.xx", "cmd": {"source":
"xx.xx.xx.xx", "api": "get_table", "params": {"db": "xxx", "tbl": "xxx"}}}
HiveMetaStore.audit: {"ugi": "xxx", "ip": "xx.xx.xx.xx", "cmd": {"source":
"xx.xx.xx.xx", "api": "get_partition_with_auth", "params": {"db": "xxx", "tbl": "xxx",
"key": ["xxx"]}}}
```
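As a rough illustration of the proposed shape, the sketch below builds one audit entry as nested maps and serializes it to a single JSON line using only the JDK. The class `AuditJsonSketch` and its methods are hypothetical names for this example and are not part of Hive. It assumes every leaf value is logged as a string, and a real patch would more likely use a JSON library already on the metastore classpath.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper for illustration only; not an existing Hive class. */
final class AuditJsonSketch {

  /** Serializes maps, lists, and scalar values (as strings) into JSON text. */
  static String toJson(Object value) {
    if (value == null) {
      return "null";
    }
    if (value instanceof Map) {
      StringJoiner obj = new StringJoiner(", ", "{", "}");
      for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
        obj.add(quote(String.valueOf(e.getKey())) + ": " + toJson(e.getValue()));
      }
      return obj.toString();
    }
    if (value instanceof List) {
      StringJoiner arr = new StringJoiner(", ", "[", "]");
      for (Object item : (List<?>) value) {
        arr.add(toJson(item));
      }
      return arr.toString();
    }
    // Assumption: every other value is emitted as a JSON string.
    return quote(String.valueOf(value));
  }

  /** Wraps a string in double quotes and escapes characters JSON requires. */
  static String quote(String s) {
    StringBuilder sb = new StringBuilder("\"");
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      switch (c) {
        case '"':  sb.append("\\\""); break;
        case '\\': sb.append("\\\\"); break;
        case '\n': sb.append("\\n"); break;
        case '\r': sb.append("\\r"); break;
        case '\t': sb.append("\\t"); break;
        default:
          if (c < 0x20) {
            sb.append(String.format("\\u%04x", (int) c));
          } else {
            sb.append(c);
          }
      }
    }
    return sb.append('"').toString();
  }

  /** Builds the entry in the field order used by the proposed format. */
  static String auditLine(String ugi, String ip, String source, String api,
                          Map<String, Object> params) {
    Map<String, Object> cmd = new LinkedHashMap<>();
    cmd.put("source", source);
    cmd.put("api", api);
    cmd.put("params", params);

    Map<String, Object> entry = new LinkedHashMap<>();
    entry.put("ugi", ugi);
    entry.put("ip", ip);
    entry.put("cmd", cmd);
    return toJson(entry);
  }

  public static void main(String[] args) {
    Map<String, Object> params = new LinkedHashMap<>();
    params.put("db", "xxx");
    params.put("tbl", "xxx");
    params.put("key", List.of("xxx"));
    System.out.println("HiveMetaStore.audit: "
        + auditLine("xxx", "xx.xx.xx.xx", "xx.xx.xx.xx", "get_partition_with_auth", params));
  }
}
```

`LinkedHashMap` keeps the keys in insertion order, so the emitted line stays stable and easy to compare by eye.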
----------------
For some context, we're tracking the usage of the shared Hive Metastore
Service. The HiveMetaStore audit log is the raw data we rely on to understand
the traffic along different dimensions: source (IP), API, database, table, etc.
Currently the audit log is a raw string without a standard format, especially
for extraLogInfo (see the code
[here|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java#L182-L200]),
which makes it harder to analyze.
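To make the analysis cost concrete, here is a minimal sketch of what a consumer has to do with the current format. The regular expression is derived only from the two sample lines above, so it is an assumption about the layout rather than a complete grammar. `LegacyAuditParserSketch` is a hypothetical name, not an existing class.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for the existing free-form audit line; illustration only. */
final class LegacyAuditParserSketch {

  // Assumed layout: ugi=<u> ip=<ip> cmd=source:<src> <api> : <free-form details>
  private static final Pattern LINE = Pattern.compile(
      "ugi=(\\S+)\\s+ip=(\\S+)\\s+cmd=source:(\\S+)\\s+(\\S+)\\s*:\\s*(.*)");

  /** Returns the extracted fields, or an empty map if the line does not match. */
  static Map<String, String> parse(String message) {
    Map<String, String> out = new LinkedHashMap<>();
    Matcher m = LINE.matcher(message);
    if (!m.find()) {
      return out;
    }
    out.put("ugi", m.group(1));
    out.put("ip", m.group(2));
    out.put("source", m.group(3));
    out.put("api", m.group(4));
    // The details are whitespace-separated key=value tokens in the samples;
    // anything that does not fit that shape is silently skipped here.
    for (String token : m.group(5).trim().split("\\s+")) {
      int eq = token.indexOf('=');
      if (eq > 0) {
        out.put("param." + token.substring(0, eq), token.substring(eq + 1));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(parse(
        "HiveMetaStore.audit: ugi=xxx ip=xx.xx.xx.xx cmd=source:xx.xx.xx.xx "
            + "get_partition_with_auth : db=xx tbl=xx[xxx]"));
  }
}
```

For the `get_partition_with_auth` sample, the bracketed suffix stays fused to the table name (`tbl` becomes `xx[xxx]`), so each API needs its own special-casing. A structured format would remove that ambiguity.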
[1] Should we print another line instead of replacing the existing one, to
avoid a breaking change?