[ 
https://issues.apache.org/jira/browse/HIVE-28770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Indhumathi Muthumurugesh updated HIVE-28770:
--------------------------------------------
    Description: 
When *HiveServer2 (HS2)* starts, it creates multiple {{SessionState}} objects 
as part of its initialization and background processes. Specifically, the 
following components instantiate {{SessionState}}:
 # *{{CliService.applyAuthorizationConfigPolicy}}* – This method is responsible 
for applying authorization policies during HS2 startup. It creates a new 
{{SessionState}}, which in turn creates session directories for that 
session.
 # *{{HiveMaterializedViewsRegistry}}* – This component maintains metadata for 
*materialized views* and runs in a *background thread*. It also creates a 
{{SessionState}}, which in turn creates session directories for that 
session.

h4. *The Problem*

Each {{SessionState}} dynamically generates temporary directories for 
*{{hive.downloaded.resources.dir}}* and *{{hive.exec.local.scratchdir}}*, 
typically located in {{/tmp}}, /tmp/\{user.dir}, or another location configured 
in {{hive-site.xml}}.
However, *these directories are not automatically cleaned up when HiveServer2 
shuts down*. This leads to the following issues:
 * *Accumulation of orphaned directories* (e.g., 
{{/tmp/b0aa54f0-3ca7-4897-abd7-5fb79bcafd2b_resources}}), consuming disk 
space unnecessarily.
 * *Performance degradation* over time due to an excessive number of leftover 
directories.
 * *Potential security risks* if sensitive downloaded JARs or resources are 
left behind.
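
To gauge how many such directories have accumulated on a host, a small standalone scan can help. The following is only a sketch and is not part of Hive: the class name {{OrphanedResourceDirs}} is hypothetical, and it assumes the directories sit directly under the scanned root and follow the {{<session-uuid>_resources}} naming seen in the example above.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical diagnostic helper; not part of Hive. */
final class OrphanedResourceDirs {

    // Matches names such as b0aa54f0-3ca7-4897-abd7-5fb79bcafd2b_resources.
    private static final Pattern SESSION_RESOURCES = Pattern.compile(
        "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}_resources");

    /** Returns the direct child directories of root whose names match the pattern. */
    static List<Path> find(Path root) throws IOException {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(root)) {
            for (Path child : children) {
                String name = child.getFileName().toString();
                if (Files.isDirectory(child) && SESSION_RESOURCES.matcher(name).matches()) {
                    matches.add(child);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the directories live under java.io.tmpdir unless a root is given.
        Path root = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        for (Path dir : find(root)) {
            System.out.println(dir);
        }
    }
}
```

The scan only lists candidates by name. It cannot tell whether a directory still belongs to a live session, so its output should be checked before anything is deleted.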

h4. *Root Cause*
 * The {{SessionState}} objects *do not trigger cleanup logic* for 
{{hive.downloaded.resources.dir}} and {{hive.exec.local.scratchdir}} when they 
are created in *HS2 startup methods or background threads*.
 * Unlike interactive Hive CLI sessions, which naturally terminate and clean up 
their resources, these startup sessions remain active as long as HS2 runs and 
*do not remove their associated directories upon shutdown*.
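
One general way to close this kind of gap is to have whatever creates the directories also own their removal. The sketch below illustrates that pattern with plain {{java.nio.file}} calls only. It is not the fix proposed for this issue and does not use Hive classes: {{StartupSessionDirs}} and {{createResourcesDir}} are hypothetical names standing in for a startup session and its directory creation.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Hypothetical stand-in for a startup session that owns temporary directories. */
final class StartupSessionDirs implements AutoCloseable {

    // Directories created by this instance; only these are removed on close().
    private final List<Path> owned = new ArrayList<>();

    /** Creates root/<random-uuid>_resources and remembers it for later removal. */
    Path createResourcesDir(Path root) throws IOException {
        Path dir = Files.createDirectories(root.resolve(UUID.randomUUID() + "_resources"));
        owned.add(dir);
        return dir;
    }

    /** Deletes every directory this instance created, including its contents. */
    @Override
    public void close() throws IOException {
        for (Path dir : owned) {
            deleteRecursively(dir);
        }
        owned.clear();
    }

    private static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path visited, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc;
                }
                // Children are already gone, so the directory itself can be removed.
                Files.delete(visited);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("hs2-sketch");
        Path resources;
        try (StartupSessionDirs session = new StartupSessionDirs()) {
            resources = session.createResourcesDir(root);
            Files.createFile(resources.resolve("udf.jar"));
            System.out.println("exists while open: " + Files.exists(resources));
        }
        System.out.println("exists after close: " + Files.exists(resources));
    }
}
```

In a long-running server the {{close()}} call would have to be reached from the shutdown path (for example a service stop method or a JVM shutdown hook), and the bookkeeping would need to be thread-safe if sessions are created from background threads. Both points are left out of this sketch.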

  was:
When *HiveServer2 (HS2)* starts, it creates multiple {{SessionState}} objects 
as part of its initialization and background processes. Specifically, the 
following components instantiate {{SessionState}}:
 # *{{CliService.applyAuthorizationConfigPolicy}}* – This method is responsible 
for applying authorization policies during HS2 startup. It creates a new 
{{SessionState}}, which in turn initializes a unique 
*{{hive.downloaded.resources.dir}}* for that session.
 # *{{HiveMaterializedViewsRegistry}}* – This component maintains metadata for 
*materialized views* and runs in a *background thread*. It also creates a 
{{SessionState}}, leading to another instance of 
{{hive.downloaded.resources.dir}}.

h4. *The Problem*

Each {{SessionState}} dynamically generates a temporary directory for 
*{{hive.downloaded.resources.dir}}*, typically located in {{/tmp}} or 
another location configured in {{hive-site.xml}}.
However, *these directories are not automatically cleaned up when HiveServer2 
shuts down*. This leads to the following issues:
 * *Accumulation of orphaned directories* (e.g., 
{{/tmp/b0aa54f0-3ca7-4897-abd7-5fb79bcafd2b_resources}}), consuming disk 
space unnecessarily.
 * *Performance degradation* over time due to an excessive number of leftover 
directories.
 * *Potential security risks* if sensitive downloaded JARs or resources are 
left behind.

h4. *Root Cause*
 * The {{SessionState}} objects *do not trigger cleanup logic* for 
{{hive.downloaded.resources.dir}} when they are created in *HS2 startup 
methods or background threads*.
 * Unlike interactive Hive CLI sessions, which naturally terminate and clean up 
their resources, these startup sessions remain active as long as HS2 runs and 
*do not remove their associated directories upon shutdown*.


> Unused Downloaded Resources Directory Persisting After HiveServer2 Shutdown
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-28770
>                 URL: https://issues.apache.org/jira/browse/HIVE-28770
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Indhumathi Muthumurugesh
>            Assignee: Indhumathi Muthumurugesh
>            Priority: Major
>              Labels: pull-request-available



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
