[
https://issues.apache.org/jira/browse/IMPALA-14605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yida Wu resolved IMPALA-14605.
------------------------------
Target Version: Impala 5.0.0
Resolution: Fixed
> Memory leak in global admissiond when dequeuing cancelled queries
> -----------------------------------------------------------------
>
> Key: IMPALA-14605
> URL: https://issues.apache.org/jira/browse/IMPALA-14605
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.5.0
> Reporter: Yida Wu
> Assignee: Yida Wu
> Priority: Major
> Labels: admission-control
>
> We have identified a memory leak in the global admissiond. It occurs when a
> query waiting in the admission queue is cancelled after an RPC fails due to
> backpressure, but its entry is not removed from the admission state map
> during the dequeue process.
> Sequence of Events:
> A GetQueryStatus() RPC from the coordinator fails because admissiond drops
> it under backpressure.
> {code:java}
> I20251203 05:01:47.795506 3938873 status.cc:129]
> c0476ba9e0acf5c3:012f334b00000000] GetQueryStatus rpc failed: Remote error:
> Service unavailable: GetQueryStatus request on impala.AdmissionControlService
> from 127.0.0.6:43351 dropped due to backpressure. The service queue contains
> 5 items out of a maximum of 2147483647; memory consumption is 68.54 MB.
> {code}
> Consequently, the coordinator sends a cancel request for the queued query.
> The CancelAdmission function sets the cancel flag in the admission state.
> Code reference:
> https://github.com/apache/impala/blob/master/be/src/scheduling/admission-control-service.cc#L282-L289
> {code:java}
> I20251203 05:11:47.975906 104 admission-control-service.cc:284]
> CancelAdmission: query_id=c0476ba9e0acf5c3:012f334b00000000
> {code}
> The admissiond tries to dequeue the query. It correctly identifies that the
> query has been cancelled.
> {code:java}
> I20251203 05:11:48.116552 117 admission-controller.cc:2650] Dequeued
> cancelled query=c0476ba9e0acf5c3:012f334b00000000
> {code}
> The memory leak is located in this dequeue logic. Although the admissiond
> recognizes that the query is cancelled, it does not remove the query's entry
> from the state map before moving on, so the entry is never freed.
> https://github.com/apache/impala/blob/master/be/src/scheduling/admission-controller.cc#L2655-L2658
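
The sequence described above can be modelled with a small, self-contained
sketch. It is not Impala code and not the actual patch: ToyAdmissiond,
AdmissionState, DequeueAll and the erase_cancelled switch are hypothetical
names used only for illustration, and the model is single-threaded. It only
shows why setting a cancel flag without erasing the map entry on dequeue
leaves one entry behind per cancelled query.

```cpp
#include <atomic>
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the per-query admission state (not Impala's type).
struct AdmissionState {
  std::atomic<bool> cancelled{false};
};

// Hypothetical, single-threaded model of the admissiond bookkeeping.
class ToyAdmissiond {
 public:
  // Registers the query in the state map and places it in the queue.
  void Enqueue(const std::string& query_id) {
    states_[query_id] = std::make_shared<AdmissionState>();
    queue_.push_back(query_id);
  }

  // Plays the role of CancelAdmission: it only sets the cancel flag.
  void Cancel(const std::string& query_id) {
    auto it = states_.find(query_id);
    if (it != states_.end()) it->second->cancelled.store(true);
  }

  // Drains the queue. With erase_cancelled == false this reproduces the leak:
  // the cancelled query is recognized and skipped, but its entry stays in the map.
  void DequeueAll(bool erase_cancelled) {
    while (!queue_.empty()) {
      const std::string query_id = queue_.front();
      queue_.pop_front();
      auto it = states_.find(query_id);
      if (it == states_.end()) continue;
      if (it->second->cancelled.load()) {
        // Assumed shape of a fix: drop the state entry for the cancelled query.
        if (erase_cancelled) states_.erase(it);
        continue;
      }
      // A non-cancelled query would be admitted here; its state is kept.
    }
  }

  std::size_t StateCount() const { return states_.size(); }

 private:
  std::unordered_map<std::string, std::shared_ptr<AdmissionState>> states_;
  std::deque<std::string> queue_;
};

// Returns how many state entries remain after one queued query is cancelled
// and then dequeued.
std::size_t EntriesLeftAfterCancelledDequeue(bool erase_cancelled) {
  const std::string query_id = "c0476ba9e0acf5c3:012f334b00000000";
  ToyAdmissiond admissiond;
  admissiond.Enqueue(query_id);
  admissiond.Cancel(query_id);
  admissiond.DequeueAll(erase_cancelled);
  return admissiond.StateCount();
}
```

Calling EntriesLeftAfterCancelledDequeue(false) leaves one entry in the map,
while EntriesLeftAfterCancelledDequeue(true) leaves none. In a long-running
service the first behaviour accumulates one stale entry for every query
cancelled while queued.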