Michael Smith has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20956
Change subject: IMPALA-12747: Atomic update of TExecRequest ...................................................................... IMPALA-12747: Atomic update of TExecRequest QueryStateRecord owns instances of ClientRequestState and TExecRequest. The ClientRequestState is used to track execution state of the client-facing side of a query. TExecRequest encapsulates context about the query produced by the planner. When a QueryDriver is created, it creates an instance of ClientRequestState, but has not yet executed planning. It would create an empty TExecRequest and pass a pointer to it to ClientRequestState, then update the content of TExecRequest when RunFrontendPlanner is called. Updating TExecRequest was not atomic, so it was possible other operations - like producing a QueryStateRecord for /queries in the web UI - would try to read the content of TExecRequest while updating. This caused TSAN errors and occasional crashes in internal-server-test, which runs concurrent requests and examines them through calls to /queries. Changes ClientRequestState to - Provide a static placeholder for TExecRequest during creation that represents an empty context for an UNKNOWN statement type (Thrift default initializes enums to their first value). - Make all references to TExecRequest const so its content cannot be updated in a non-thread-safe manner. - ClientRequestState uses an AtomicPtr which is updated atomically when the filled TExecRequest is available. QueryDriver does not publicly expose access to TExecRequest, so we can ensure its use is thread-safe without atomics. ClientRequestState::exec_request() will return either a reference to the static placeholder or the value provided after - which is never changed - so this reference will always be valid for the lifetime of the ClientRequestState. Updates user_has_profile_access to be AtomicBool for the same reason. Reverts tsan-suppressions for IMPALA-12660 so we get TSAN coverage. Adds suppression for a lock-order-inversion bug (IMPALA-12757) that was uncovered after fixing this data race. Testing: - InternalServerTest.SimultaneousMultipleQueriesOneSession would fail after ~10 test runs. Ran 90 times without failure. - Passed TSAN run of backend tests. Change-Id: I9a967c5c84b6a401f8f5764373f6cd7ee807545f --- M be/src/runtime/query-driver.cc M be/src/runtime/query-driver.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M bin/tsan-suppressions.txt M common/thrift/Types.thrift 8 files changed, 210 insertions(+), 190 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/20956/1 -- To view, visit http://gerrit.cloudera.org:8080/20956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I9a967c5c84b6a401f8f5764373f6cd7ee807545f Gerrit-Change-Number: 20956 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Smith <[email protected]>
