Joe McDonnell created IMPALA-14079:
--------------------------------------
Summary: Catalog hits TSAN error during dataload
Key: IMPALA-14079
URL: https://issues.apache.org/jira/browse/IMPALA-14079
Project: IMPALA
Issue Type: Bug
Components: Catalog
Affects Versions: Impala 5.0.0
Reporter: Joe McDonnell
The TSAN job is failing during dataload with this error in the catalogd ERROR
log:
{noformat}
WARNING: ThreadSanitizer: data race (pid=25180)
Read of size 1 at 0x7b10000bd0f0 by thread T84:
#0
impala::CatalogServiceThriftIf::AcceptRequest(impala::CatalogServiceVersion::type)
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/catalog/catalog-server.cc:566:28
(impalad+0x2107287)
#1 impala::CatalogServiceThriftIf::ExecDdl(impala::TDdlExecResponse&,
impala::TDdlExecRequest const&)
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/catalog/catalog-server.cc:348:21
(impalad+0x2104302)
#2
impala::CatalogServiceProcessorT<apache::thrift::protocol::TDummyProtocol>::process_ExecDdl(int,
apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*,
void*)
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/generated-sources/gen-cpp/CatalogService.tcc:3172:13
(impalad+0x2017d63)
#3
impala::CatalogServiceProcessorT<apache::thrift::protocol::TDummyProtocol>::dispatchCall(apache::thrift::protocol::TProtocol*,
apache::thrift::protocol::TProtocol*, std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> > const&, int, void*)
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/generated-sources/gen-cpp/CatalogService.tcc:3124:3
(impalad+0x2021822)
#4
apache::thrift::TDispatchProcessor::process(std::shared_ptr<apache::thrift::protocol::TProtocol>,
std::shared_ptr<apache::thrift::protocol::TProtocol>, void*)
/data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc10.4.0/thrift-0.16.0-p7/include/thrift/TDispatchProcessor.h:121:12
(impalad+0x202159e)
#5 apache::thrift::server::TAcceptQueueServer::Task::run()
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/rpc/TAcceptQueueServer.cpp:84:26
(impalad+0x265a5b4)
...
p1/libs/thread/src/pthread/thread.cpp:179:37 (impalad+0x3c80856)
Previous write of size 1 at 0x7b10000bd0f0 by thread T82:
#0
impala::CatalogServiceThriftIf::AcceptRequest(impala::CatalogServiceVersion::type)
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/catalog/catalog-server.cc:571:36
(impalad+0x2107309)
#1 impala::CatalogServiceThriftIf::ExecDdl(impala::TDdlExecResponse&,
impala::TDdlExecRequest const&)
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/catalog/catalog-server.cc:348:21
(impalad+0x2104302)
#2
impala::CatalogServiceProcessorT<apache::thrift::protocol::TDummyProtocol>::process_ExecDdl(int,
apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*,
void*)
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/generated-sources/gen-cpp/CatalogService.tcc:3172:13
(impalad+0x2017d63)
#3
impala::CatalogServiceProcessorT<apache::thrift::protocol::TDummyProtocol>::dispatchCall(apache::thrift::protocol::TProtocol*,
apache::thrift::protocol::TProtocol*, std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> > const&, int, void*)
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/generated-sources/gen-cpp/CatalogService.tcc:3124:3
(impalad+0x2021822)
#4
apache::thrift::TDispatchProcessor::process(std::shared_ptr<apache::thrift::protocol::TProtocol>,
std::shared_ptr<apache::thrift::protocol::TProtocol>, void*)
/data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc10.4.0/thrift-0.16.0-p7/include/thrift/TDispatchProcessor.h:121:12
(impalad+0x202159e)
#5 apache::thrift::server::TAcceptQueueServer::Task::run()
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/rpc/TAcceptQueueServer.cpp:84:26
(impalad+0x265a5b4){noformat}
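This looks like a data race on has_initiated_first_reset_ that came in with
"IMPALA-13850 (part 2): Fix bug found by test_restart_services.py".
Both threads reach it through AcceptRequest(): T84 reads it at
catalog-server.cc:566 while T82 writes it at catalog-server.cc:571.
One fix would be to make has_initiated_first_reset_ a std::atomic<bool>.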
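As a rough illustration of the std::atomic<bool> suggestion, here is a minimal standalone sketch. It is not the actual catalog-server.cc code: the class CatalogServiceSketch, the bool return value of AcceptRequest(), and the helper CountFirstResets() are hypothetical, and it assumes the flag is checked and then set on the request path, as the read at line 566 and the write at line 571 suggest.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for the catalogd Thrift handler; NOT Impala's real class.
class CatalogServiceSketch {
 public:
  // Returns true only for the single call that flips the flag from false to true.
  // (The return value is an assumption made for this sketch.)
  bool AcceptRequest() {
    // Atomic load: safe to run concurrently with the store below.
    if (has_initiated_first_reset_.load()) return false;
    // exchange() atomically sets the flag and returns the previous value,
    // so exactly one thread observes 'false' here.
    return !has_initiated_first_reset_.exchange(true);
  }

  bool has_initiated_first_reset() const {
    return has_initiated_first_reset_.load();
  }

 private:
  // Was a plain bool in the racy version; std::atomic<bool> removes the data race.
  std::atomic<bool> has_initiated_first_reset_{false};
};

// Hypothetical driver: calls AcceptRequest() from several threads and returns
// how many calls saw themselves as the "first" request.
int CountFirstResets(int num_threads) {
  CatalogServiceSketch service;
  std::atomic<int> first_count{0};
  std::vector<std::thread> threads;
  for (int i = 0; i < num_threads; ++i) {
    threads.emplace_back([&service, &first_count] {
      if (service.AcceptRequest()) first_count.fetch_add(1);
    });
  }
  for (auto& t : threads) t.join();
  // After all requests have run, the flag must be set.
  assert(service.has_initiated_first_reset());
  return first_count.load();
}
```

If the real code only needs the flag to be read and written safely, plain load() and store() on the atomic would be enough. exchange() is used here only so the sketch can show that a single thread wins the check-then-set.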