Hello Quanlong Huang, Riza Suminto, Jason Fehr, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/20956

to look at the new patch set (#6).

Change subject: IMPALA-12747: Atomic update of execution state
......................................................................

IMPALA-12747: Atomic update of execution state

QueryStateRecord owns instances of ClientRequestState and TExecRequest.
The ClientRequestState is used to track execution state of the
client-facing side of a query. TExecRequest encapsulates context about
the query produced by the planner.

When a QueryDriver is created, it creates an instance of
ClientRequestState, but has not yet executed planning. It would create
an empty TExecRequest and pass a pointer to it to ClientRequestState,
then update the content of TExecRequest when RunFrontendPlanner is
called from ImpalaServer::ExecuteInternal.

Updating TExecRequest was not atomic, so it was possible other
operations - like producing a QueryStateRecord for /queries in the web
UI - would try to read the content of TExecRequest while updating. This
caused TSAN errors and occasional crashes in internal-server-test, which
runs concurrent requests and examines them through calls to /queries.

Changes ClientRequestState to
- Provide a static placeholder for TExecRequest during creation that
  represents an empty context for an UNKNOWN statement type (default
  initialized in Thrift).
- Make all references to TExecRequest const so its content cannot be
  updated in a non-thread-safe manner.
- ClientRequestState uses an AtomicPtr which is updated atomically when
  the filled TExecRequest is available.

QueryDriver does not publicly expose access to TExecRequest, so we can
ensure its use is thread-safe without atomics.

ClientRequestState::exec_request() will return either a reference to the
static placeholder or the value provided after - which is never changed
- so this reference will always be valid for the lifetime of the
ClientRequestState.

Updates user_has_profile_access to be AtomicBool for the same reason.

Reverts tsan-suppressions for IMPALA-12660 so we get TSAN coverage. Adds
suppression for a lock-order-inversion bug (IMPALA-12757) that was
uncovered after fixing this data race.

Testing:
- InternalServerTest.SimultaneousMultipleQueriesOneSession would fail
  after ~10 test runs. Ran 90 times without failure.
- Passed TSAN run of backend tests.

Change-Id: I9a967c5c84b6a401f8f5764373f6cd7ee807545f
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M bin/tsan-suppressions.txt
M common/thrift/Frontend.thrift
M common/thrift/Types.thrift
9 files changed, 208 insertions(+), 185 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/20956/6
--
To view, visit http://gerrit.cloudera.org:8080/20956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9a967c5c84b6a401f8f5764373f6cd7ee807545f
Gerrit-Change-Number: 20956
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Smith <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>

Reply via email to