[ 
https://issues.apache.org/jira/browse/IMPALA-12602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875347#comment-17875347
 ] 

ASF subversion and git services commented on IMPALA-12602:
----------------------------------------------------------

Commit 4b500a55cbfcdd311a1c766e33849f7ae05a1a8e in impala's branch 
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=4b500a55c ]

IMPALA-13313: Fix ExpireQueries deadlock

IMPALA-12602 introduced registering idle queries with a session so that
we can expire queries while still making their status available, and
clean up the idle query status when sessions are closed. That happens in
ImpalaServer::ExpireQueries, where it needs to acquire the
query_expiration_lock_ then a session_state->lock.

However, that violated the lock order documented in impala-server.h and
led to a deadlock when a query is expired at the same time another query
is registering expiration timers (which follows the documented order).
When the deadlock occurs, SetQueryInFlight holds a session_state->lock
and tries to acquire query_expiration_lock_, while ExpireQueries holds
the query_expiration_lock_ and tries to acquire session_state->lock.

The prior order between query_expiration_lock_ and session_state->lock
was largely arbitrary: query_expiration_lock_ operations don't
inherently require holding the session_state->lock. However, expiration
operations work on a queue of ClientRequestStates that map to different
session states, so when we need to operate on a session state as part of
expiration we effectively have to take query_expiration_lock_ first.

Updates lock order to take query_expiration_lock_ before
session_state->lock, and modifies SetQueryInFlight to release the
session_state->lock before registering expiration timers. The expiration
timers aren't related to the session, and query lifetime is maintained
by the QueryHandle reference.
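
The revised ordering can be illustrated with a minimal, self-contained
sketch. This is not the Impala implementation: Server, SessionState, Entry,
the counters, and RunDemo are hypothetical stand-ins, and only the lock names
and the two function names come from the description above. It assumes
plain std::mutex locks and shows SetQueryInFlight dropping the session lock
before it touches the expiration queue, so that ExpireQueries can take
query_expiration_lock_ first and the session lock second.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a session; only the lock name mirrors the text.
struct SessionState {
  std::mutex lock;
  int inflight_queries = 0;
  int idle_queries = 0;
};

// Hypothetical stand-in for the server. Lock order assumed here:
// query_expiration_lock_ is always taken before any SessionState::lock.
class Server {
 public:
  void SetQueryInFlight(SessionState* session, int query_id) {
    {
      std::lock_guard<std::mutex> l(session->lock);
      ++session->inflight_queries;
    }  // session->lock is released before the expiration queue is touched.
    std::lock_guard<std::mutex> l(query_expiration_lock_);
    expiration_queue_.push_back({query_id, session});
  }

  // Expires every queued query and returns how many were expired.
  int ExpireQueries() {
    std::lock_guard<std::mutex> l(query_expiration_lock_);
    int expired = 0;
    for (const Entry& e : expiration_queue_) {
      // Second lock in the order: the session that owns this query.
      std::lock_guard<std::mutex> sl(e.session->lock);
      --e.session->inflight_queries;
      ++e.session->idle_queries;
      ++expired;
    }
    expiration_queue_.clear();
    return expired;
  }

 private:
  struct Entry {
    int query_id;
    SessionState* session;
  };
  std::mutex query_expiration_lock_;
  std::vector<Entry> expiration_queue_;
};

// Registers and expires queries concurrently; returns the total expired.
int RunDemo(int num_queries) {
  Server server;
  SessionState session;
  int expired = 0;
  std::thread registrar([&] {
    for (int i = 0; i < num_queries; ++i) server.SetQueryInFlight(&session, i);
  });
  std::thread expirer([&] {
    for (int i = 0; i < num_queries; ++i) expired += server.ExpireQueries();
  });
  registrar.join();
  expirer.join();
  expired += server.ExpireQueries();  // Drain anything registered late.
  assert(session.inflight_queries == 0);
  return expired;
}
```

Calling RunDemo(1000) from a main function should return once both threads
finish, because neither thread ever waits for query_expiration_lock_ while
holding a session lock. If SetQueryInFlight instead kept session->lock held
while taking query_expiration_lock_, the two threads could block each other
in the way the deadlock description above outlines.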

Adds a custom cluster test that uses debug actions to reproduce the
deadlock scenario.

Change-Id: I6fce4103f6eeb7e9a4320ba1da817cab81071ba3
Reviewed-on: http://gerrit.cloudera.org:8080/21699
Reviewed-by: Michael Smith <[email protected]>
Tested-by: Michael Smith <[email protected]>


> Timed out queries are not unregistered until session is closed
> --------------------------------------------------------------
>
>                 Key: IMPALA-12602
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12602
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.0.0
>            Reporter: Michael Smith
>            Assignee: Michael Smith
>            Priority: Major
>             Fix For: Impala 4.4.0
>
>
> When Impala triggers 
> [ExpireQuery|https://github.com/apache/impala/blob/master/be/src/service/impala-server.cc#L3062]
>  - via reaching resource limits, EXEC_TIME_LIMIT_S, or 
> idle_query_timeout/QUERY_TIMEOUT_S - it cancels the query, but does not 
> unregister it. It is only unregistered when the session is closed. That 
> means Impala continues to report the query as in flight until the session 
> ends; in some multi-user scenarios, a session may be in use for hours or days 
> and keep these queries registered for just as long.
> This can be confusing for admins, who see a list of queries waiting to be 
> closed - some of which have been cancelled by EXEC_TIME_LIMIT_S (for example) 
> - and are unclear why they're still there.
> One thing we could do is modify the behavior of {{idle_query_timeout}}. 
> {{idle_session_timeout}} causes the session to close. Queries that time out 
> due to {{idle_query_timeout}} should similarly be abandoned and unregistered. 
> Any other query that expires should still be checked for 
> {{idle_query_timeout}} and unregistered once it hits that timeout (as it is 
> clearly an idle query).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
