Michael Smith has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21699 )
Change subject: IMPALA-13313: Fix ExpireQueries deadlock
......................................................................

IMPALA-13313: Fix ExpireQueries deadlock

IMPALA-12602 introduced registering idle queries with a session so that we can expire queries while still making their status available, and clean up the idle query status when sessions are closed. That happens in ImpalaServer::ExpireQueries, where it needs to acquire the query_expiration_lock_ then a session_state->lock. However that violated the lock order documented in impala-server.h, and led to a deadlock when a query is expired at the same time another query is registering expiration timers (which follows the documented order).

When the deadlock occurs, SetQueryInFlight holds a session_state->lock and tries to acquire query_expiration_lock_, while ExpireQueries holds the query_expiration_lock_ and tries to acquire session_state->lock.

The prior order between query_expiration_lock_ and session_state->lock was largely arbitrary. query_expiration_lock_ operations don't inherently require holding the session_state->lock. However expiration operations work on a queue of ClientRequestStates that map to different session states, so when we need to operate on a session state as part of expiration we pretty much have to take query_expiration_lock_ first.

Updates lock order to take query_expiration_lock_ before session_state->lock, and modifies SetQueryInFlight to release the session_state->lock before registering expiration timers. The expiration timers aren't related to the session, and query lifetime is maintained by the QueryHandle reference.

Adds a custom cluster test that uses debug actions to reproduce the deadlock scenario.
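
For readers unfamiliar with the code, the lock-order change described above can be illustrated with a minimal standalone sketch. This is not the code from impala-server.cc: MiniServer, the simplified SessionState, the expiration_queue_ member, and RunDemo are hypothetical stand-ins, and only the names query_expiration_lock_, SetQueryInFlight, and ExpireQueries are borrowed from the commit message. The sketch assumes the expiration queue can be modelled as a plain vector of (query id, session) pairs.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a session's state.
struct SessionState {
  std::mutex lock;
  int inflight_queries = 0;
  int expired_queries = 0;
};

// Hypothetical miniature server. Lock order: query_expiration_lock_ is
// always taken before any SessionState::lock, never the other way round.
class MiniServer {
 public:
  void SetQueryInFlight(SessionState* session, int query_id) {
    {
      std::lock_guard<std::mutex> l(session->lock);
      ++session->inflight_queries;
    }  // session->lock is released here, before the expiration lock is taken.
    std::lock_guard<std::mutex> l(query_expiration_lock_);
    expiration_queue_.push_back(std::make_pair(query_id, session));
  }

  // Expires every queued query and returns how many were expired.
  int ExpireQueries() {
    std::lock_guard<std::mutex> l(query_expiration_lock_);
    int expired = 0;
    for (const auto& entry : expiration_queue_) {
      SessionState* session = entry.second;
      // Nested acquisition follows the order: expiration lock, then session.
      std::lock_guard<std::mutex> sl(session->lock);
      --session->inflight_queries;
      ++session->expired_queries;
      ++expired;
    }
    expiration_queue_.clear();
    return expired;
  }

 private:
  std::mutex query_expiration_lock_;
  std::vector<std::pair<int, SessionState*>> expiration_queue_;
};

// Registers and expires queries concurrently; returns the total expired.
int RunDemo(int num_queries) {
  MiniServer server;
  SessionState session;
  int total_expired = 0;  // Written only by the expirer thread.
  std::thread registrar([&] {
    for (int i = 0; i < num_queries; ++i) server.SetQueryInFlight(&session, i);
  });
  std::thread expirer([&] {
    while (total_expired < num_queries) {
      total_expired += server.ExpireQueries();
      std::this_thread::yield();
    }
  });
  registrar.join();
  expirer.join();
  return total_expired;
}
```

Calling RunDemo(1000) from a main function runs both threads to completion. If SetQueryInFlight instead kept session->lock held while acquiring query_expiration_lock_, the two threads could each end up waiting on the lock the other holds, which is the deadlock pattern the commit message describes.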

Change-Id: I6fce4103f6eeb7e9a4320ba1da817cab81071ba3
Reviewed-on: http://gerrit.cloudera.org:8080/21699
Reviewed-by: Michael Smith <[email protected]>
Tested-by: Michael Smith <[email protected]>
---
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M tests/custom_cluster/test_query_expiration.py
3 files changed, 58 insertions(+), 25 deletions(-)

Approvals:
  Michael Smith: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21699
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6fce4103f6eeb7e9a4320ba1da817cab81071ba3
Gerrit-Change-Number: 21699
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Smith <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
