Bikramjeet Vig created IMPALA-10767:
---------------------------------------
Summary: Fix handling of queued queries for coordinator failure
modes and during cancellation
Key: IMPALA-10767
URL: https://issues.apache.org/jira/browse/IMPALA-10767
Project: IMPALA
Issue Type: Sub-task
Reporter: Bikramjeet Vig
Assignee: Bikramjeet Vig
IMPALA-10594 and IMPALA-10590 do not ensure that queued queries are removed
from the admission-controller and from admission_state_map_. If a coordinator
is killed before it gets a chance to call GetQueryStatus(), which calls
WaitOnQueued() for queued queries, that cleanup never runs. The result is a
memory leak: the queue_node in the admission-controller and the
admission_state in admission_state_map_ are never removed.
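
To make the leak concrete, here is a minimal toy model in Python. It is not Impala's code: ToyAdmissionService, submit(), get_query_status() and on_coordinator_failed() are hypothetical names, and the cleanup hook is only one possible shape for a fix, assuming the service learns somehow that a coordinator has failed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAdmissionService:
    """Toy stand-in for the admission service; not Impala's actual classes."""

    # query_id -> coordinator_id, loosely mirroring admission_state_map_
    admission_state_map: dict = field(default_factory=dict)
    # ids of queries waiting for admission, loosely mirroring the queue nodes
    queue: list = field(default_factory=list)

    def submit(self, query_id, coord_id):
        self.admission_state_map[query_id] = coord_id
        self.queue.append(query_id)

    def get_query_status(self, query_id):
        # Simplified stand-in for the GetQueryStatus() -> WaitOnQueued() path.
        # In this model it is the only place a queued query is cleaned up,
        # so a coordinator that dies before calling it leaks both entries.
        if query_id in self.queue:
            self.queue.remove(query_id)
        self.admission_state_map.pop(query_id, None)

    def on_coordinator_failed(self, coord_id):
        # Hypothetical cleanup hook: drop everything owned by the dead
        # coordinator, whether or not it ever polled for status.
        dead = [q for q, c in self.admission_state_map.items() if c == coord_id]
        for query_id in dead:
            if query_id in self.queue:
                self.queue.remove(query_id)
            del self.admission_state_map[query_id]
        return dead
```

Without a hook like on_coordinator_failed(), a query submitted by a coordinator that never calls get_query_status() stays in both structures for the lifetime of the service.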
Moreover, queued queries can get stuck. If the failed coordinator is not in
the cluster_membership, the query stays in the queue indefinitely, because it
keeps hitting the "unable to dequeue" condition that applies when the
coordinator is not yet registered in the cluster_membership.
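
The stuck-in-queue case can be sketched as a single dequeue pass over a toy queue. This is an illustration under assumptions, not the actual admission-controller logic: dequeue_pass() and the failed_coordinators set are hypothetical, and resource checks are ignored so that only the membership condition is visible.

```python
from collections import deque


def dequeue_pass(queue, cluster_membership, failed_coordinators):
    """One pass over a toy queue of (query_id, coord_id) pairs.

    Hypothetical helper: it separates "coordinator not registered yet"
    (keep waiting) from "coordinator known to have failed" (drop).
    """
    admitted, dropped = [], []
    remaining = deque()
    while queue:
        query_id, coord_id = queue.popleft()
        if coord_id in failed_coordinators:
            # Without this branch the query would fall through to the
            # "not registered yet" case below on every pass, forever.
            dropped.append(query_id)
        elif coord_id not in cluster_membership:
            # Coordinator may still be registering: leave the query queued.
            remaining.append((query_id, coord_id))
        else:
            admitted.append(query_id)
    queue.extend(remaining)
    return admitted, dropped
```

The point of the sketch is the first branch: a membership check alone cannot tell a coordinator that has not registered yet from one that will never come back.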
Another undesirable condition arises for queued queries that were canceled.
These are never removed from admission_state_map_, because entries in it are
removed only when a running query is released, when running queries are synced
via the admission heartbeat, or when all running queries are removed after the
coordinator goes down.
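
The cancellation gap can be shown with a toy state map in which the only removal path is tied to running queries. The class and method names below are hypothetical and cancel_queued() is just one assumed way to close the gap; the heartbeat sync and coordinator-down paths are omitted because they also act only on running queries.

```python
class ToyAdmissionStateMap:
    """Toy model of a query-state map; not Impala's admission_state_map_."""

    def __init__(self):
        self._states = {}  # query_id -> "queued" or "running"

    def queue_query(self, query_id):
        self._states[query_id] = "queued"

    def admit(self, query_id):
        self._states[query_id] = "running"

    def release_query(self, query_id):
        # Removal path for running queries only: a canceled query that was
        # never admitted is not touched here and stays in the map.
        if self._states.get(query_id) == "running":
            del self._states[query_id]

    def cancel_queued(self, query_id):
        # Hypothetical extra path: erase the entry when a query is
        # canceled while it is still queued.
        if self._states.get(query_id) == "queued":
            del self._states[query_id]
            return True
        return False

    def __contains__(self, query_id):
        return query_id in self._states
```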
--
This message was sent by Atlassian Jira
(v8.3.4#803005)