[
https://issues.apache.org/jira/browse/HIVE-28770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HIVE-28770:
----------------------------------
Labels: pull-request-available (was: )
> Unused Downloaded Resources Directory Persisting After HiveServer2 Shutdown
> ---------------------------------------------------------------------------
>
> Key: HIVE-28770
> URL: https://issues.apache.org/jira/browse/HIVE-28770
> Project: Hive
> Issue Type: Bug
> Reporter: Indhumathi Muthumurugesh
> Assignee: Indhumathi Muthumurugesh
> Priority: Major
> Labels: pull-request-available
>
> When *HiveServer2 (HS2)* starts, it creates multiple {{SessionState}} objects
> as part of its initialization and background processes. Specifically, the
> following components instantiate {{SessionState}}:
> # *{{CliService.applyAuthorizationConfigPolicy}}* – This method is
> responsible for applying authorization policies during HS2 startup. It
> creates a new {{SessionState}}, which in turn initializes a unique
> *{{hive.downloaded.resources.dir}}* for that session.
> # *{{HiveMaterializedViewsRegistry}}* – This component maintains metadata
> for *materialized views* and runs in a *background thread*. It also
> creates a {{SessionState}}, leading to another instance of
> {{hive.downloaded.resources.dir}}.
> h4. *The Problem*
> Each {{SessionState}} dynamically generates a temporary directory for
> *{{hive.downloaded.resources.dir}}*, typically located in {{/tmp}} or
> another location configured in {{hive-site.xml}}.
> However, *these directories are not automatically cleaned up when
> HiveServer2 shuts down*. This leads to the following issues:
> * *Accumulation of orphaned directories* (e.g.,
> {{/tmp/b0aa54f0-3ca7-4897-abd7-5fb79bcafd2b_resources}}), consuming disk
> space unnecessarily.
> * *Performance degradation* over time due to an excessive number of leftover
> directories.
> * *Potential security risks* if sensitive downloaded JARs or resources are
> left behind.
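
To gauge how many such directories have accumulated on a host, a scan along the following lines may help. This is a minimal sketch, not Hive code: the class name {{OrphanedResourceDirs}} is hypothetical, and the {{<UUID>_resources}} naming pattern is an assumption inferred from the single example path in this report.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive.
final class OrphanedResourceDirs {

    // Assumption: directories are named "<UUID>_resources", as in the
    // example path quoted in the report.
    private static final Pattern RESOURCE_DIR = Pattern.compile(
        "[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}_resources");

    private OrphanedResourceDirs() {
    }

    // Lists the immediate subdirectories of 'parent' whose names match the
    // assumed pattern. Nothing is deleted; this only reports candidates.
    static List<Path> find(Path parent) {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(parent)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (Files.isDirectory(entry) && RESOURCE_DIR.matcher(name).matches()) {
                    matches.add(entry);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return matches;
    }
}
```

Calling {{OrphanedResourceDirs.find(Paths.get("/tmp"))}} would return the candidate directories. A match is only a candidate: a directory may still belong to a live session, so anything found this way should be checked before removal.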
> h4. *Root Cause*
> * The {{SessionState}} objects *do not trigger cleanup logic* for
> {{hive.downloaded.resources.dir}} when they are created in *HS2 startup
> methods or background threads*.
> * Unlike interactive Hive CLI sessions, which naturally terminate and clean
> up their resources, these startup sessions remain active as long as HS2 runs
> and *do not remove their associated directories upon shutdown*.
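
The missing step described above is an explicit cleanup tied to the lifetime of the startup session. The sketch below illustrates that general pattern in plain Java. It is not the actual Hive change: {{StartupSessionResources}} is a hypothetical stand-in for whatever owns the directory, and the {{<UUID>_resources}} name is an assumption based on the example path in the report.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.UUID;
import java.util.stream.Stream;

// Hypothetical owner of one per-session resources directory (not Hive code).
final class StartupSessionResources implements AutoCloseable {

    private final Path resourcesDir;

    StartupSessionResources(Path parent) {
        this.resourcesDir = createDir(parent);
    }

    // Creates "<parent>/<random UUID>_resources" (assumed naming scheme).
    private static Path createDir(Path parent) {
        try {
            return Files.createDirectories(parent.resolve(UUID.randomUUID() + "_resources"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Path dir() {
        return resourcesDir;
    }

    // Recursively deletes the directory. Safe to call more than once.
    @Override
    public void close() {
        if (!Files.exists(resourcesDir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(resourcesDir)) {
            // Reverse order visits children before their parent directory.
            walk.sorted(Comparator.reverseOrder()).forEach(path -> {
                try {
                    Files.deleteIfExists(path);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

One way to use such an object is to call {{close()}} from the server's stop path, or to register it with {{Runtime.getRuntime().addShutdownHook(new Thread(resources::close))}}. Which of these fits HS2 best is left open here; a shutdown hook does not run if the JVM is killed forcibly, so it would not remove every leftover directory.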
--
This message was sent by Atlassian Jira
(v8.20.10#820010)