Michael Smith has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/20956 )

Change subject: IMPALA-12747: Atomic update of execution state
......................................................................

IMPALA-12747: Atomic update of execution state

QueryDriver owns instances of ClientRequestState and TExecRequest. The
ClientRequestState is used to track execution state of the client-facing
side of a query. TExecRequest encapsulates context about the query
produced by the planner.

When a QueryDriver is created, it creates an instance of
ClientRequestState, but has not yet executed planning. It would create
an empty TExecRequest and pass a pointer to it to ClientRequestState,
then update the content of TExecRequest when RunFrontendPlanner is
called from ImpalaServer::ExecuteInternal.

Updating TExecRequest was not atomic, so it was possible other
operations - like producing a QueryStateRecord for /queries in the web
UI - would try to read the content of TExecRequest while updating. This
caused TSAN errors and occasional crashes in internal-server-test, which
runs concurrent requests and examines them through calls to /queries.

Changes ClientRequestState to
- Provide a static placeholder for TExecRequest during creation that
  represents an empty context for an UNKNOWN statement type (default
  initialized in Thrift).
- Make all references to TExecRequest const so its content cannot be
  updated in a non-thread-safe manner.
- ClientRequestState uses an AtomicPtr which is updated atomically when
  the filled TExecRequest is available.

QueryDriver does not publicly expose access to TExecRequest, so we can
ensure its use is thread-safe without atomics.

ClientRequestState::exec_request() will return either a reference to the
static placeholder or the value provided after - which is never changed
- so this reference will always be valid for the lifetime of the
ClientRequestState.

Updates user_has_profile_access to be AtomicBool for the same reason.

Reverts tsan-suppressions for IMPALA-12660 so we get TSAN coverage. Adds
suppression for a lock-order-inversion bug (IMPALA-12757) that was
uncovered after fixing this data race.

Testing:
- InternalServerTest.SimultaneousMultipleQueriesOneSession would fail
  after ~10 test runs. Ran 90 times without failure.
- Passed TSAN run of backend tests.

Change-Id: I9a967c5c84b6a401f8f5764373f6cd7ee807545f
Reviewed-on: http://gerrit.cloudera.org:8080/20956
Reviewed-by: Jason Fehr <[email protected]>
Reviewed-by: Riza Suminto <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M bin/tsan-suppressions.txt
M common/thrift/Frontend.thrift
M common/thrift/Types.thrift
9 files changed, 209 insertions(+), 185 deletions(-)

Approvals:
  Jason Fehr: Looks good to me, but someone else must approve
  Riza Suminto: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/20956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9a967c5c84b6a401f8f5764373f6cd7ee807545f
Gerrit-Change-Number: 20956
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Smith <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>

Reply via email to