[
https://issues.apache.org/jira/browse/IMPALA-12747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811417#comment-17811417
]
ASF subversion and git services commented on IMPALA-12747:
----------------------------------------------------------
Commit f3ac2ddbfef0d7cd359b7c9ae47d424791327c6d in impala's branch
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=f3ac2ddbf ]
IMPALA-12747: Atomic update of execution state
QueryDriver owns instances of ClientRequestState and TExecRequest. The
ClientRequestState is used to track execution state of the client-facing
side of a query. TExecRequest encapsulates context about the query
produced by the planner.
When a QueryDriver is created, it creates an instance of
ClientRequestState, but has not yet executed planning. It would create
an empty TExecRequest and pass a pointer to it to ClientRequestState,
then update the content of TExecRequest when RunFrontendPlanner is
called from ImpalaServer::ExecuteInternal.
Updating TExecRequest was not atomic, so other operations - like
producing a QueryStateRecord for /queries in the web UI - could read
the content of TExecRequest while it was being updated. This caused
TSAN errors and occasional crashes in internal-server-test, which runs
concurrent requests and examines them through calls to /queries.
Changes ClientRequestState to
- Provide a static placeholder for TExecRequest during creation that
represents an empty context for an UNKNOWN statement type (default
initialized in Thrift).
- Make all references to TExecRequest const so its content cannot be
updated in a non-thread-safe manner.
- Use an AtomicPtr, which is updated atomically when the filled
TExecRequest is available.
QueryDriver does not publicly expose access to TExecRequest, so we can
ensure its use is thread-safe without atomics.
ClientRequestState::exec_request() will return a reference to either
the static placeholder or the value provided later, which is never
changed, so this reference will always be valid for the lifetime of the
ClientRequestState.
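The publication pattern described above can be illustrated with a
minimal sketch. It is not the Impala code: ExecRequest, RequestState and
Driver are hypothetical stand-ins for TExecRequest, ClientRequestState
and QueryDriver, and std::atomic is assumed in place of Impala's
AtomicPtr.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for the Thrift-generated TExecRequest.
struct ExecRequest {
  std::string stmt_type = "UNKNOWN";
};

// Hypothetical stand-in for ClientRequestState.
class RequestState {
 public:
  // Start out pointing at the shared, immutable placeholder.
  RequestState() : exec_request_(&kPlaceholder) {}

  // Publish the fully built request. The pointee must not be modified
  // afterwards and must outlive this object.
  void SetExecRequest(const ExecRequest* req) {
    assert(req != nullptr);
    exec_request_.store(req, std::memory_order_release);
  }

  // Readers see either the placeholder or the complete request, never a
  // partially updated one.
  const ExecRequest& exec_request() const {
    return *exec_request_.load(std::memory_order_acquire);
  }

 private:
  static const ExecRequest kPlaceholder;
  std::atomic<const ExecRequest*> exec_request_;
};

const ExecRequest RequestState::kPlaceholder{};

// Hypothetical stand-in for QueryDriver: it owns both objects and fills
// in the request privately before publishing it.
class Driver {
 public:
  Driver() : state_(new RequestState()) {}

  // Assumed to be called at most once per Driver.
  void RunPlanner(const std::string& stmt_type) {
    assert(!exec_request_);
    auto req = std::make_unique<ExecRequest>();
    req->stmt_type = stmt_type;  // Built before any reader can see it.
    exec_request_ = std::move(req);
    state_->SetExecRequest(exec_request_.get());
  }

  const RequestState& state() const { return *state_; }

 private:
  // Declared before state_, so it is destroyed after state_.
  std::unique_ptr<ExecRequest> exec_request_;
  std::unique_ptr<RequestState> state_;
};
```

In this sketch a reader that calls state().exec_request() before
RunPlanner sees the "UNKNOWN" placeholder, and a reader that calls it
afterwards sees the planned request. Because the request is completed
before the pointer is stored, no reader can copy it mid-update.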
Updates user_has_profile_access to be an AtomicBool for the same reason.
Reverts tsan-suppressions for IMPALA-12660 so we get TSAN coverage. Adds
suppression for a lock-order-inversion bug (IMPALA-12757) that was
uncovered after fixing this data race.
Testing:
- InternalServerTest.SimultaneousMultipleQueriesOneSession previously
failed after ~10 test runs. With this change it ran 90 times without
failure.
- Passed TSAN run of backend tests.
Change-Id: I9a967c5c84b6a401f8f5764373f6cd7ee807545f
Reviewed-on: http://gerrit.cloudera.org:8080/20956
Reviewed-by: Jason Fehr <[email protected]>
Reviewed-by: Riza Suminto <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> internal-server-test crashes with SIGSEGV
> -----------------------------------------
>
> Key: IMPALA-12747
> URL: https://issues.apache.org/jira/browse/IMPALA-12747
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.4.0
> Reporter: Laszlo Gaal
> Assignee: Michael Smith
> Priority: Blocker
> Labels: arm64, broken-build
> Fix For: Impala 4.4.0
>
>
> internal-server-test crashed in
> InternalServerTest.SimultaneousMultipleQueriesOneSession with minidumps in a
> release-mode build on ARM.
> The minidump was resolved to the below call stack:
> {code}
> Operating system: Linux
> 0.0.0 Linux 4.18.0-477.15.1.el8_8.aarch64 #1 SMP Fri Jun 2
> 08:39:44 EDT 2023 aarch64
> CPU: arm64
> 16 CPUs
> GPU: UNKNOWN
> Crash reason: SIGSEGV /SEGV_MAPERR
> Crash address: 0x8
> Process uptime: not available
> Thread 538 (crashed)
> 0 internal-server-test!impala::TPlanNode::TPlanNode(impala::TPlanNode
> const&) [PlanNodes_types.cpp : 3124 + 0x0]
> x0 = 0x000000004d600020 x1 = 0x0000000000000020
> x2 = 0x0000000000000000 x3 = 0x0000000000000000
> x4 = 0x000000004d600900 x5 = 0x000000004d600a00
> x6 = 0x0000000000000000 x7 = 0x000000004d600918
> x8 = 0x000000004d600060 x9 = 0x000000004d600828
> x10 = 0x0000000004a3ab48 x11 = 0x000000004d600838
> x12 = 0x000000004d600868 x13 = 0x000000004d6007f8
> x14 = 0x000000004d600818 x15 = 0x000000004d6007d0
> x16 = 0x000000004d600848 x17 = 0x000000004d600878
> x18 = 0x0000000004a3ab98 x19 = 0x000000004d600000
> x20 = 0x0000000000000000 x21 = 0x000000004d600098
> x22 = 0x000000004d6001e0 x23 = 0x000000004d600038
> x24 = 0x000000004d600190 x25 = 0x000000004d6003d8
> x26 = 0x000000004d600560 x27 = 0x000000004d600080
> x28 = 0x000000004d6008b8 fp = 0x0000fffe4df61990
> lr = 0x000000004d600798 sp = 0x0000fffe4df61990
> pc = 0x0000000001023dc4
> Found by: given as instruction pointer in context
> 1 internal-server-test!std::vector<impala::TPlanNode,
> std::allocator<impala::TPlanNode> >::operator=(std::vector<impala::TPlanNode,
> std::allocator<impala::TPlanNode> > const&) [stl_construct.h : 109 + 0x8]
> x19 = 0x0000000000000000 x20 = 0x000000004d600000
> x21 = 0x000000004497b038 x22 = 0x000000004d600000
> x23 = 0x0000000045a9bb30 x24 = 0x0000000045a9bb30
> x25 = 0x000000004497b020 x26 = 0x00000000390da838
> x27 = 0x00000000461d7000 x28 = 0x0000fffe4df62018
> fp = 0x0000fffe4df61a50 sp = 0x0000fffe4df61a50
> pc = 0x00000000010242f4
> Found by: call frame info
> 2 internal-server-test!impala::TPlan::operator=(impala::TPlan const&)
> [PlanNodes_types.cpp : 3305 + 0x0]
> x19 = 0x000000004497b030 x20 = 0x00000000390da800
> x21 = 0x000000004497b010 x22 = 0x000000004497b400
> x23 = 0x000000004497b050 x24 = 0x000000004497b328
> x25 = 0x000000004497b020 x26 = 0x000000004497b030
> x27 = 0x00000000461d7000 x28 = 0x0000fffe4df62018
> fp = 0x0000fffe4df61aa0 sp = 0x0000fffe4df61aa0
> pc = 0x00000000010244ac
> Found by: call frame info
> 3
> internal-server-test!impala::TPlanFragment::TPlanFragment(impala::TPlanFragment
> const&) [Planner_types.cpp : 110 + 0x8]
> x19 = 0x000000004497b000 x20 = 0x00000000390da800
> x21 = 0x000000004497b010 x22 = 0x000000004497b400
> x23 = 0x000000004497b050 x24 = 0x000000004497b328
> x25 = 0x000000004497b020 x26 = 0x000000004497b030
> x27 = 0x00000000461d7000 x28 = 0x0000fffe4df62018
> fp = 0x0000fffe4df61ac0 sp = 0x0000fffe4df61ac0
> pc = 0x000000000102cd40
> Found by: call frame info
> 4
> internal-server-test!impala::ImpalaServer::QueryStateRecord::Init(impala::ClientRequestState
> const&) [stl_construct.h : 109 + 0x8]
> x19 = 0x00000000390da800 x20 = 0x000000004497b000
> x21 = 0x00000000390daf30 x22 = 0x0000000000000730
> x23 = 0x000000004497b000 x24 = 0x0000fffe4df61d50
> x25 = 0x0000000000000000 x26 = 0x00000000462ed320
> x27 = 0x00000000461d7000 x28 = 0x0000fffe4df62018
> fp = 0x0000fffe4df61b10 sp = 0x0000fffe4df61b10
> pc = 0x00000000014d80a4
> Found by: call frame info
> 5
> internal-server-test!impala::ImpalaServer::QueryStateRecord::QueryStateRecord(impala::ClientRequestState
> const&) [impala-server.cc : 2529 + 0x0]
> x19 = 0x0000fffe4df61d50 x20 = 0x0000fffe4df61eb8
> x21 = 0x0000fffe4df61df8 x22 = 0x0000fffe4df61e38
> x23 = 0x0000fffe4df61e48 x24 = 0x0000fffe4df61e98
> x25 = 0x0000fffe4df61ea8 x26 = 0x0000fffe4df62018
> x27 = 0x0000fffe4df62040 x28 = 0x0000fffe4df62050
> fp = 0x0000fffe4df61c10 sp = 0x0000fffe4df61c10
> pc = 0x00000000014d97ec
> Found by: call frame info
> 6 internal-server-test!<name omitted> [impala-http-handler.cc : 626 + 0x8]
> x19 = 0x000000004553c440 x20 = 0x0000fffe4df62420
> x21 = 0x0000fffe4df622b0 x22 = 0x0000000037e0a278
> x23 = 0x0000fffe4df61d50 x24 = 0x0000fffe4df62958
> x25 = 0x000000003909cca0 x26 = 0x0000fffe4df629b8
> x27 = 0x0000fffe4df62998 x28 = 0x0000fffe4df62c60
> fp = 0x0000fffe4df61cb0 sp = 0x0000fffe4df61cb0
> pc = 0x00000000014bb214
> Found by: call frame info
> 7
> internal-server-test!impala::ImpalaHttpHandler::QueryStateHandler(kudu::WebCallbackRegistry::WebRequest
> const&, rapidjson::GenericDocument<rapidjson::UTF8<char>,
> rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator>,
> rapidjson::CrtAllocator>*) [std_function.h : 622 + 0x8]
> x19 = 0x000000004553c440 x20 = 0x0000fffe4df62420
> x21 = 0x0000fffe4df62290 x22 = 0x0000000037e0a278
> x23 = 0x0000000037e0a378 x24 = 0x0000fffe4df62958
> x25 = 0x000000003909cca0 x26 = 0x0000fffe4df629b8
> x27 = 0x0000fffe4df62998 x28 = 0x0000fffe4df62c60
> fp = 0x0000fffe4df62070 sp = 0x0000fffe4df62070
> pc = 0x00000000014b9910
> Found by: call frame info
> 8
> internal-server-test!impala::Webserver::RenderUrlWithTemplate(sq_connection
> const*, kudu::WebCallbackRegistry::WebRequest const&,
> impala::Webserver::UrlHandler const&, std::__cxx11::basic_stringstream<char,
> std::char_traits<char>, std::allocator<char> >*, impala::ContentType*,
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>
> > const&) [function_template.hpp : 763 + 0xc]
> x19 = 0x0000000037ccfc00 x20 = 0x0000fffe4df629b8
> x21 = 0x0000fffe4df62420 x22 = 0x0000fffe4df62ac8
> x23 = 0x000000003bb25000 x24 = 0x0000fffe4df62958
> x25 = 0x000000003909cca0 x26 = 0x0000fffe4df629b8
> x27 = 0x0000fffe4df62998 x28 = 0x0000fffe4df62c60
> {code}