[
https://issues.apache.org/jira/browse/HIVE-24802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17289424#comment-17289424
]
Zhihua Deng edited comment on HIVE-24802 at 3/1/21, 8:43 AM:
-------------------------------------------------------------
Thanks for your feedback.
{quote}I worry about controlling the amount of data that gets stored here on
the local host. On a busy system, this could be hundreds (thousands?) of
operations stored.
{quote}
The operation log is deleted once its query is evicted from the
QueryInfoCache, whose size is controlled by the configuration
hive.server2.webui.max.historic.queries. In addition, a daemon periodically
removes stale operation log files (those whose queries can no longer be
reached from the web UI). Both settings (max historic queries and the scan
interval) can be tuned to control the amount of data on the local host.
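As a rough illustration, the cache bound could be set in hive-site.xml as sketched below. The value 50 is only an example, not a recommendation. The property for the daemon's scan interval is not named in this comment, so it is left out rather than guessed.
{code:xml}
<!-- Illustrative sketch only: caps how many finished queries the web UI
     keeps in the QueryInfoCache. Per the comment above, an operation log
     is deleted once its query is evicted from this cache. -->
<property>
  <name>hive.server2.webui.max.historic.queries</name>
  <!-- example value, chosen for illustration -->
  <value>50</value>
</property>
{code}
A smaller value should keep fewer operation logs on local disk, at the cost of a shorter query history on the web UI.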
{quote}Also, what if HS2 crashes? Will HS2 load the directory and look at the
existing operations or does it only act on new ones that come in?
{quote}
When HS2 restarts, the QueryInfoCache is empty (so nothing is shown on the
web UI either). HS2 therefore deletes all existing operation log files and
session log directories, and only acts on new operations that come in.
{quote}I worry about exposing too much data to one admin user. How are access
controls enforced on viewing these logs through the UI?
{quote}
The log is exposed to the user through the drilldown link, together with the
_Query Plan_ etc. We also fetch only the latest part of the log (the size is
configurable) for display, as shown in the attached picture.
was (Author: dengzh):
Thanks for your feedback.
{quote}I worry about controlling the amount of data that gets stored here on
the local host. On a busy system, this could be hundreds (thousands?) of
operations stored.
{quote}
The data is deleted once its query is evicted from the QueryInfoCache, whose
size is controlled by the configuration
hive.server2.webui.max.historic.queries. A daemon periodically scans the root
dir and removes stale operation log files (those whose queries can no longer
be reached from the web UI). Both settings (max historic queries and the scan
interval) can be tuned to control the amount of data.
{quote}Also, what if HS2 crashes? Will HS2 load the directory and look at the
existing operations or does it only act on new ones that come in?
{quote}
When HS2 restarts, the QueryInfoCache is empty (so nothing is shown on the
web UI either). HS2 therefore deletes all the files and only acts on new
operations that come in.
{quote}I worry about exposing too much data to one admin user. How are access
controls enforced on viewing these logs through the UI?
{quote}
The data is exposed to the user through the drilldown link, together with the
_Query Plan_ etc. We also fetch only the latest part of the log (the size is
configurable) for display, as shown in the attached picture.
> Show operation log at webui
> ---------------------------
>
> Key: HIVE-24802
> URL: https://issues.apache.org/jira/browse/HIVE-24802
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Minor
> Labels: pull-request-available
> Attachments: operationlog.png
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Currently we provide getQueryLog in HiveStatement to fetch the operation log,
> and the operation log is deleted when the operation closes (with a delay for
> canceled operations). This can make it hard for the user (JDBC) or
> administrators to dig into the details of a finished (failed) operation, so
> we present the operation log on the web UI and keep it for some time for
> later analysis.