Michael Smith has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20956 )
Change subject: IMPALA-12747: Atomic update of execution state ...................................................................... IMPALA-12747: Atomic update of execution state QueryDriver owns instances of ClientRequestState and TExecRequest. The ClientRequestState is used to track execution state of the client-facing side of a query. TExecRequest encapsulates context about the query produced by the planner. When a QueryDriver is created, it creates an instance of ClientRequestState, but has not yet executed planning. It would create an empty TExecRequest and pass a pointer to it to ClientRequestState, then update the content of TExecRequest when RunFrontendPlanner is called from ImpalaServer::ExecuteInternal. Updating TExecRequest was not atomic, so it was possible other operations - like producing a QueryStateRecord for /queries in the web UI - would try to read the content of TExecRequest while updating. This caused TSAN errors and occasional crashes in internal-server-test, which runs concurrent requests and examines them through calls to /queries. Changes ClientRequestState to - Provide a static placeholder for TExecRequest during creation that represents an empty context for an UNKNOWN statement type (default initialized in Thrift). - Make all references to TExecRequest const so its content cannot be updated in a non-thread-safe manner. - ClientRequestState uses an AtomicPtr which is updated atomically when the filled TExecRequest is available. QueryDriver does not publicly expose access to TExecRequest, so we can ensure its use is thread-safe without atomics. ClientRequestState::exec_request() will return either a reference to the static placeholder or the value provided after - which is never changed - so this reference will always be valid for the lifetime of the ClientRequestState. Updates user_has_profile_access to be AtomicBool for the same reason. Reverts tsan-suppressions for IMPALA-12660 so we get TSAN coverage. Adds suppression for a lock-order-inversion bug (IMPALA-12757) that was uncovered after fixing this data race. Testing: - InternalServerTest.SimultaneousMultipleQueriesOneSession would fail after ~10 test runs. Ran 90 times without failure. - Passed TSAN run of backend tests. Change-Id: I9a967c5c84b6a401f8f5764373f6cd7ee807545f Reviewed-on: http://gerrit.cloudera.org:8080/20956 Reviewed-by: Jason Fehr <[email protected]> Reviewed-by: Riza Suminto <[email protected]> Tested-by: Impala Public Jenkins <[email protected]> --- M be/src/runtime/query-driver.cc M be/src/runtime/query-driver.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M bin/tsan-suppressions.txt M common/thrift/Frontend.thrift M common/thrift/Types.thrift 9 files changed, 209 insertions(+), 185 deletions(-) Approvals: Jason Fehr: Looks good to me, but someone else must approve Riza Suminto: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/20956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I9a967c5c84b6a401f8f5764373f6cd7ee807545f Gerrit-Change-Number: 20956 Gerrit-PatchSet: 8 Gerrit-Owner: Michael Smith <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jason Fehr <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]>
