Hello Marcel Kornacker, Matthew Jacobs, Sailesh Mukil,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/4896
to look at the new patch set (#5).
Change subject: IMPALA-4409: respect lock order in
QueryExecState::CancelInternal()
......................................................................
IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
The code previously violated the (partially documented) lock order
in ImpalaServer. An example of a possible cycle in the dependency
graph is:
* SetQueryInFlight() holds SessionState::lock_ and waits for
'query_expiration_lock_'
* ExpireQueries() holds 'query_expiration_lock_' and waits for
'query_exec_state_map_lock_'
* GetQueryExecState() holds 'query_exec_state_map_lock_' and
waits for QueryExecState::lock_
* QES::Cancel() holds QueryExecState::lock_
and waits for SessionState::lock
It's not clear how likely the above scenario is, but it's hard to rule
it out.
We have not seen this hang in the wild but have seen similar ones.
Testing:
Ran local stress test on 3-node minicluster with TPC-H 20 and 50%
of queries being cancelled.
Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
---
M be/src/runtime/coordinator.h
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
4 files changed, 66 insertions(+), 32 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/5
--
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>