[
https://issues.apache.org/jira/browse/HIVE-28770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Indhumathi Muthumurugesh updated HIVE-28770:
--------------------------------------------
Description:
When *HiveServer2 (HS2)* starts, it creates multiple {{SessionState}} objects
as part of its initialization and background processes. Specifically, the
following components instantiate {{SessionState}}:
# *{{CliService.applyAuthorizationConfigPolicy}}* – This method is responsible
for applying authorization policies during HS2 startup. It creates a new
{{SessionState}}, which in turn creates session directories for that
session.
# *{{HiveMaterializedViewsRegistry}}* – This component maintains metadata for
*materialized views* and runs in a *background thread*. It also creates a
{{SessionState}}, which in turn creates session directories for that
session.
h4. *The Problem*
Each {{SessionState}} dynamically generates temporary directories for
*{{hive.downloaded.resources.dir}}* and *{{hive.exec.local.scratchdir}}*,
typically located in {{/tmp}}, /tmp/\{user.dir}, or another location
configured in {{hive-site.xml}}.
However, *these directories are not automatically cleaned up when HiveServer2
shuts down*. This leads to the following issues:
* *Accumulation of orphaned directories* (e.g.,
{{/tmp/b0aa54f0-3ca7-4897-abd7-5fb79bcafd2b_resources}}), consuming disk
space unnecessarily.
* *Performance degradation* over time due to an excessive number of leftover
directories.
* *Potential security risks* if sensitive downloaded JARs or resources are
left behind.
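To gauge how many such directories have accumulated, the scan below is a minimal sketch. It assumes that leftover resource directories follow the {{<uuid>_resources}} naming seen in the example above. The class {{OrphanedResourceDirs}} and its {{find}} method are hypothetical names, not part of Hive. The scan only lists candidates; it cannot tell whether the owning session is still alive.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Stream;

/** Hypothetical helper (not a Hive class): lists directories named "<uuid>_resources". */
final class OrphanedResourceDirs {
    // Assumed naming scheme, e.g. b0aa54f0-3ca7-4897-abd7-5fb79bcafd2b_resources.
    private static final Pattern RESOURCE_DIR = Pattern.compile(
        "[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}_resources");

    /** Returns the direct child directories of root whose names match the pattern. */
    static List<Path> find(Path root) throws IOException {
        List<Path> matches = new ArrayList<>();
        // Files.list holds an open directory handle, so close it with try-with-resources.
        try (Stream<Path> children = Files.list(root)) {
            children
                .filter(Files::isDirectory)
                .filter(p -> RESOURCE_DIR.matcher(p.getFileName().toString()).matches())
                .sorted()
                .forEach(matches::add);
        }
        return matches;
    }
}
```

Calling {{OrphanedResourceDirs.find(Paths.get("/tmp"))}} after an HS2 shutdown would list the candidate leftovers under {{/tmp}}.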
h4. *Root Cause*
* The {{SessionState}} objects *do not trigger cleanup logic* for
{{hive.downloaded.resources.dir}} and {{hive.exec.local.scratchdir}} when they
are created in *HS2 startup methods or background threads*.
* Unlike interactive Hive CLI sessions, which naturally terminate and clean up
their resources, these startup sessions remain active as long as HS2 runs and
*do not remove their associated directories upon shutdown*.
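The missing cleanup step can be illustrated with plain JDK code. This is a sketch under stated assumptions, not the Hive fix: {{SessionDirTracker}} is a hypothetical class that records each per-session directory when it is created and deletes the tracked trees on {{close()}}, optionally from a JVM shutdown hook. It should only be given session-specific directories, since everything it tracks is deleted recursively.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.stream.Stream;

/** Hypothetical helper (not a Hive class): tracks session directories and removes them on close. */
final class SessionDirTracker implements AutoCloseable {
    private final List<Path> dirs = new CopyOnWriteArrayList<>();

    /** Creates dir (and any missing parents) and remembers dir itself for cleanup. */
    Path create(Path dir) throws IOException {
        Files.createDirectories(dir);
        dirs.add(dir);
        return dir;
    }

    /** Registers close() as a JVM shutdown hook so cleanup also runs on normal shutdown. */
    void registerShutdownHook() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::close, "session-dir-cleanup"));
    }

    /** Deletes every tracked directory tree; calling it again is a no-op. */
    @Override
    public void close() {
        for (Path dir : dirs) {
            deleteRecursively(dir);
        }
        dirs.clear();
    }

    private static void deleteRecursively(Path root) {
        if (Files.notExists(root)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order visits children before their parent directory.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.deleteIfExists(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A startup component would call {{create(...)}} for its resources and scratch directories and then either close the tracker when it finishes or call {{registerShutdownHook()}} once. Shutdown hooks do not run when the JVM is killed forcibly, so a periodic sweep of stale directories may still be needed.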
was:
When *HiveServer2 (HS2)* starts, it creates multiple {{SessionState}} objects
as part of its initialization and background processes. Specifically, the
following components instantiate {{SessionState}}:
# *{{CliService.applyAuthorizationConfigPolicy}}* – This method is responsible
for applying authorization policies during HS2 startup. It creates a new
{{SessionState}}, which in turn initializes a unique
*{{hive.downloaded.resources.dir}}* for that session.
# *{{HiveMaterializedViewsRegistry}}* – This component maintains metadata for
*materialized views* and runs in a *background thread*. It also creates a
{{SessionState}}, leading to another instance of
{{hive.downloaded.resources.dir}}.
h4. *The Problem*
Each {{SessionState}} dynamically generates a temporary directory for
*{{hive.downloaded.resources.dir}}*, typically located in {{/tmp}} or
another location configured in {{hive-site.xml}}.
However, *these directories are not automatically cleaned up when HiveServer2
shuts down*. This leads to the following issues:
* *Accumulation of orphaned directories* (e.g.,
{{/tmp/b0aa54f0-3ca7-4897-abd7-5fb79bcafd2b_resources}}), consuming disk
space unnecessarily.
* *Performance degradation* over time due to an excessive number of leftover
directories.
* *Potential security risks* if sensitive downloaded JARs or resources are
left behind.
h4. *Root Cause*
* The {{SessionState}} objects *do not trigger cleanup logic* for
{{hive.downloaded.resources.dir}} when they are created in *HS2 startup
methods or background threads*.
* Unlike interactive Hive CLI sessions, which naturally terminate and clean up
their resources, these startup sessions remain active as long as HS2 runs and
*do not remove their associated directories upon shutdown*.
> Unused Downloaded Resources Directory Persisting After HiveServer2 Shutdown
> ---------------------------------------------------------------------------
>
> Key: HIVE-28770
> URL: https://issues.apache.org/jira/browse/HIVE-28770
> Project: Hive
> Issue Type: Bug
> Reporter: Indhumathi Muthumurugesh
> Assignee: Indhumathi Muthumurugesh
> Priority: Major
> Labels: pull-request-available