[ https://issues.apache.org/jira/browse/IMPALA-14605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yida Wu resolved IMPALA-14605.
------------------------------
    Target Version: Impala 5.0.0
        Resolution: Fixed

> Memory leak in global admissiond when dequeuing cancelled queries
> -----------------------------------------------------------------
>
>                 Key: IMPALA-14605
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14605
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.5.0
>            Reporter: Yida Wu
>            Assignee: Yida Wu
>            Priority: Major
>              Labels: admission-control
>
> We have identified a memory leak scenario in the global admissiond. The issue 
> occurs when a query waiting in the admission queue is cancelled due to 
> backpressure failures but is not properly removed from the admission state 
> map during the dequeue process.
> Sequence of Events:
> A GetQueryStatus() call from the coordinator fails due to backpressure in 
> admissiond.
> {code:java}
> I20251203 05:01:47.795506 3938873 status.cc:129] 
> c0476ba9e0acf5c3:012f334b00000000] GetQueryStatus rpc failed: Remote error: 
> Service unavailable: GetQueryStatus request on impala.AdmissionControlService 
> from 127.0.0.6:43351 dropped due to backpressure. The service queue contains 
> 5 items out of a maximum of 2147483647; memory consumption is 68.54 MB.
> {code}
> Consequently, the coordinator sends a cancel request for the queued query. 
> The CancelAdmission function sets the cancel flag in the admission state 
> (code reference): 
> https://github.com/apache/impala/blob/master/be/src/scheduling/admission-control-service.cc#L282-L289
> {code:java}
> I20251203 05:11:47.975906  104 admission-control-service.cc:284] 
> CancelAdmission: query_id=c0476ba9e0acf5c3:012f334b00000000
> {code}
> The admissiond tries to dequeue the query. It correctly identifies that the 
> query has been cancelled.
> {code:java}
> I20251203 05:11:48.116552  117 admission-controller.cc:2650] Dequeued 
> cancelled query=c0476ba9e0acf5c3:012f334b00000000
> {code}
> The memory leak is located in this dequeue logic. While the admissiond 
> recognizes the query is cancelled, it fails to remove the query entry from 
> the state map before finishing the process.
> https://github.com/apache/impala/blob/master/be/src/scheduling/admission-controller.cc#L2655-L2658
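
The leak pattern described above can be sketched in isolation as follows. This is a minimal illustration, not Impala's actual code: AdmissionState, AdmissionStateMap, and the erase_on_cancel switch are hypothetical names introduced here, and erasing the entry in the cancelled branch is only one plausible shape for the fix, which this message does not describe.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the per-query admission state.
struct AdmissionState {
  bool cancelled = false;
};

// Hypothetical stand-in for the admission state map kept by admissiond.
class AdmissionStateMap {
 public:
  // Registers a queued query.
  void Add(const std::string& query_id) {
    std::lock_guard<std::mutex> l(lock_);
    states_[query_id] = std::make_shared<AdmissionState>();
  }

  // Plays the role of CancelAdmission: it only sets the cancel flag and
  // leaves the entry in the map. Returns false if the query is unknown.
  bool Cancel(const std::string& query_id) {
    std::lock_guard<std::mutex> l(lock_);
    auto it = states_.find(query_id);
    if (it == states_.end()) return false;
    it->second->cancelled = true;
    return true;
  }

  // Plays the role of the dequeue step. Returns true if the query may
  // proceed to admission, false if it was cancelled or is unknown.
  // With erase_on_cancel == false the cancelled entry stays in the map
  // forever (the leak); with true it is removed before returning.
  bool Dequeue(const std::string& query_id, bool erase_on_cancel) {
    std::lock_guard<std::mutex> l(lock_);
    auto it = states_.find(query_id);
    if (it == states_.end()) return false;
    if (it->second->cancelled) {
      if (erase_on_cancel) states_.erase(it);
      return false;
    }
    return true;
  }

  // Number of entries currently held in the map.
  std::size_t Size() const {
    std::lock_guard<std::mutex> l(lock_);
    return states_.size();
  }

 private:
  mutable std::mutex lock_;
  std::unordered_map<std::string, std::shared_ptr<AdmissionState>> states_;
};
```

In this sketch, calling Add, Cancel, and then Dequeue with erase_on_cancel set to false leaves Size() at 1 even though the query is gone, which matches the behaviour reported above. The same sequence with erase_on_cancel set to true brings Size() back to 0.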



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
