Adar Dembo created KUDU-2439:
--------------------------------
Summary: There's no way to safely clean up a KuduClient or
Messenger
Key: KUDU-2439
URL: https://issues.apache.org/jira/browse/KUDU-2439
Project: Kudu
Issue Type: Bug
Components: client, rpc
Affects Versions: 1.8.0
Reporter: Adar Dembo
KuduClient has shared ownership, and its only "shutdown" knob is to drop its
last ref. This drops the last ref on the underlying Messenger object, but
Messenger itself has a funky "internal" vs. "external" ref system, and
destroying the KuduClient only drops the last external ref. The Messenger is
only destroyed when the last internal ref is dropped, and that only happens
when outstanding reactor threads finish whatever processing they were busy
doing. So, there's no way for a user to "destroy this KuduClient and wait for
all outstanding resources to be cleaned up".
Why is this important? For one, there's a known data race with outstanding work
done by the KuduClient's DnsResolver. For two, there's a similar issue with
OpenSSL. OpenSSL 1.1 registers an atexit() handler that tears down its global
library state. For this to be safe, all allocated OpenSSL resources must have
been released by the time atexit() handlers run (which is to say, all OpenSSL
resources must be released by the time main() returns or exit() is called).
Because we can't wait on the KuduClient to destroy itself and its OpenSSL
resources, applications using the KuduClient may run the atexit() handler at an
unsafe time.
Here's the TSAN output for the OpenSSL data race:. It's trivial to reproduce
via reactor-test once the appropriate suppression is removed from
tsan-suppressions.txt:
{noformat}
WARNING: ThreadSanitizer: data race (pid=7914)
Write of size 1 at 0x7b1000004340 by main thread:
#0 pthread_rwlock_destroy
/home/adar/Source/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1313
(reactor-test+0x48205e)
#1 CRYPTO_THREAD_lock_free
/home/adar/openssl/openssl-1.1.0g/build_shared/../crypto/threads_pthread.c:95
(libcrypto.so.1.1+0x20d0e5)
Previous atomic read of size 1 at 0x7b1000004340 by thread T16:
#0 pthread_rwlock_wrlock
/home/adar/Source/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1352
(reactor-test+0x45ffe3)
#1 CRYPTO_THREAD_write_lock
/home/adar/openssl/openssl-1.1.0g/build_shared/../crypto/threads_pthread.c:66
(libcrypto.so.1.1+0x20d08a)
#2 std::__1::__function::__func<void (*)(ssl_ctx_st*), std::__1::allocator<void
(*)(ssl_ctx_st*)>, void (ssl_ctx_st*)>::operator()(ssl_ctx_st*&&)
/home/adar/Source/kudu/thirdparty/installed/tsan/include/c++/v1/functional:1562:12
(libsecurity.so+0x56b14)
#3 std::__1::function<void (ssl_ctx_st*)>::operator()(ssl_ctx_st*) const
/home/adar/Source/kudu/thirdparty/installed/tsan/include/c++/v1/functional:1916:12
(libkrpc.so+0xb139d)
#4 std::__1::unique_ptr<ssl_ctx_st, std::__1::function<void (ssl_ctx_st*)>
>::reset(ssl_ctx_st*)
/home/adar/Source/kudu/thirdparty/installed/tsan/include/c++/v1/memory:2598:7
(libkrpc.so+0xb0f9e)
#5 std::__1::unique_ptr<ssl_ctx_st, std::__1::function<void (ssl_ctx_st*)>
>::~unique_ptr()
/home/adar/Source/kudu/thirdparty/installed/tsan/include/c++/v1/memory:2552
(libkrpc.so+0xb0f9e)
#6 kudu::security::TlsContext::~TlsContext()
/home/adar/Source/kudu/build/tsan/../../src/kudu/security/tls_context.h:77
(libkrpc.so+0xb0f9e)
#7
std::__1::default_delete<kudu::security::TlsContext>::operator()(kudu::security::TlsContext*)
const
/home/adar/Source/kudu/thirdparty/installed/tsan/include/c++/v1/memory:2285:5
(libkrpc.so+0xab32c)
#8 std::__1::unique_ptr<kudu::security::TlsContext,
std::__1::default_delete<kudu::security::TlsContext>
>::reset(kudu::security::TlsContext*)
/home/adar/Source/kudu/thirdparty/installed/tsan/include/c++/v1/memory:2598
(libkrpc.so+0xab32c)
#9 std::__1::unique_ptr<kudu::security::TlsContext,
std::__1::default_delete<kudu::security::TlsContext> >::~unique_ptr()
/home/adar/Source/kudu/thirdparty/installed/tsan/include/c++/v1/memory:2552
(libkrpc.so+0xab32c)
#10 kudu::rpc::Messenger::~Messenger()
/home/adar/Source/kudu/build/tsan/../../src/kudu/rpc/messenger.cc:433
(libkrpc.so+0xab32c)
#11
std::__1::default_delete<kudu::rpc::Messenger>::operator()(kudu::rpc::Messenger*)
const
/home/adar/Source/kudu/thirdparty/installed/tsan/include/c++/v1/memory:2285:5
(libkrpc.so+0xb0ddb)
#12 std::__1::__shared_ptr_pointer<kudu::rpc::Messenger*,
std::__1::default_delete<kudu::rpc::Messenger>,
std::__1::allocator<kudu::rpc::Messenger> >::__on_zero_shared()
/home/adar/Source/kudu/thirdparty/installed/tsan/include/c++/v1/memory:3586
(libkrpc.so+0xb0ddb)
#13 std::__1::__shared_count::__release_shared()
/home/adar/Source/kudu/thirdparty/installed/tsan/include/c++/v1/memory:3490:9
(reactor-test+0x4d0a2e)
#14 std::__1::__shared_weak_count::__release_shared()
/home/adar/Source/kudu/thirdparty/installed/tsan/include/c++/v1/memory:3532
(reactor-test+0x4d0a2e)
#15 std::__1::shared_ptr<kudu::rpc::Messenger>::~shared_ptr()
/home/adar/Source/kudu/thirdparty/installed/tsan/include/c++/v1/memory:4468
(reactor-test+0x4d0a2e)
#16 std::__1::shared_ptr<kudu::rpc::Messenger>::reset()
/home/adar/Source/kudu/thirdparty/installed/tsan/include/c++/v1/memory:4603:5
(libkrpc.so+0xbefd1)
#17 kudu::rpc::ReactorThread::RunThread()
/home/adar/Source/kudu/build/tsan/../../src/kudu/rpc/reactor.cc:482
(libkrpc.so+0xbefd1)
#18 boost::_mfi::mf0<void,
kudu::rpc::ReactorThread>::operator()(kudu::rpc::ReactorThread*) const
/home/adar/Source/kudu/build/tsan/../../thirdparty/installed/tsan/include/boost/bind/mem_fn_template.hpp:49:29
(libkrpc.so+0xc8c69)
#19 void boost::_bi::list1<boost::_bi::value<kudu::rpc::ReactorThread*>
>::operator()<boost::_mfi::mf0<void, kudu::rpc::ReactorThread>,
boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf0<void,
kudu::rpc::ReactorThread>&, boost::_bi::list0&, int)
/home/adar/Source/kudu/build/tsan/../../thirdparty/installed/tsan/include/boost/bind/bind.hpp:259:9
(libkrpc.so+0xc8bba)
#20 boost::_bi::bind_t<void, boost::_mfi::mf0<void, kudu::rpc::ReactorThread>,
boost::_bi::list1<boost::_bi::value<kudu::rpc::ReactorThread*> >
>::operator()()
/home/adar/Source/kudu/build/tsan/../../thirdparty/installed/tsan/include/boost/bind/bind.hpp:1222:16
(libkrpc.so+0xc8b43)
#21
boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void,
boost::_mfi::mf0<void, kudu::rpc::ReactorThread>,
boost::_bi::list1<boost::_bi::value<kudu::rpc::ReactorThread*> > >,
void>::invoke(boost::detail::function::function_buffer&)
/home/adar/Source/kudu/build/tsan/../../thirdparty/installed/tsan/include/boost/function/function_template.hpp:159:11
(libkrpc.so+0xc8939)
#22 boost::function0<void>::operator()() const
/home/adar/Source/kudu/build/tsan/../../thirdparty/installed/tsan/include/boost/function/function_template.hpp:770:14
(libkrpc.so+0xb8a21)
#23 kudu::Thread::SuperviseThread(void*)
/home/adar/Source/kudu/build/tsan/../../src/kudu/util/thread.cc:603:3
(libkudu_util.so+0x1c1264)
Location is heap block of size 56 at 0x7b1000004340 allocated by main thread:
#0 malloc
/home/adar/Source/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:666
(reactor-test+0x47cc9c)
#1 CRYPTO_malloc
/home/adar/openssl/openssl-1.1.0g/build_shared/../crypto/mem.c:92
(libcrypto.so.1.1+0x1a6327)
#2 CRYPTO_THREAD_run_once
/home/adar/openssl/openssl-1.1.0g/build_shared/../crypto/threads_pthread.c:106
(libcrypto.so.1.1+0x20d126)
#3 CRYPTO_THREAD_run_once
/home/adar/openssl/openssl-1.1.0g/build_shared/../crypto/threads_pthread.c:106
(libcrypto.so.1.1+0x20d126)
#4 kudu::rpc::Messenger::Init()
/home/adar/Source/kudu/build/tsan/../../src/kudu/rpc/messenger.cc:444:3
(libkrpc.so+0xa9589)
#5
kudu::rpc::MessengerBuilder::Build(std::__1::shared_ptr<kudu::rpc::Messenger>*)
/home/adar/Source/kudu/build/tsan/../../src/kudu/rpc/messenger.cc:205:3
(libkrpc.so+0xa903d)
#6 kudu::rpc::RpcTestBase::CreateMessenger(std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> > const&,
std::__1::shared_ptr<kudu::rpc::Messenger>*, int, bool,
std::__1::basic_string<char, std::__1::char_traits<char>,
std::__1::allocator<char> > const&, std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> > const&,
std::__1::basic_string<char, std::__1::char_traits<char>,
std::__1::allocator<char> > const&, std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> > const&)
/home/adar/Source/kudu/build/tsan/../../src/kudu/rpc/rpc-test-base.h:462:16
(reactor-test+0x4d3412)
#7 kudu::rpc::ReactorTest::SetUp()
/home/adar/Source/kudu/build/tsan/../../src/kudu/rpc/reactor-test.cc:46:5
(reactor-test+0x4d02ae)
#8 void
testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test,
void>(testing::Test*, void (testing::Test::*)(), char const*)
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2402:10
(libgmock.so+0x551df)
#9 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test,
void>(testing::Test*, void (testing::Test::*)(), char const*)
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2438
(libgmock.so+0x551df)
#10 testing::Test::Run()
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2470:3
(libgmock.so+0x342b1)
#11 testing::TestInfo::Run()
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2656:11
(libgmock.so+0x3563c)
#12 testing::TestCase::Run()
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2774:28
(libgmock.so+0x36116)
#13 testing::internal::UnitTestImpl::RunAllTests()
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:4649:43
(libgmock.so+0x424ea)
#14 bool
testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,
bool>(testing::internal::UnitTestImpl*, bool
(testing::internal::UnitTestImpl::*)(), char const*)
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2402:10
(libgmock.so+0x5614f)
#15 bool
testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,
bool>(testing::internal::UnitTestImpl*, bool
(testing::internal::UnitTestImpl::*)(), char const*)
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2438
(libgmock.so+0x5614f)
#16 testing::UnitTest::Run()
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:4257:10
(libgmock.so+0x41dd2)
#17 RUN_ALL_TESTS()
/home/adar/Source/kudu/build/tsan/../../thirdparty/installed/tsan/include/gtest/gtest.h:2233:46
(libkudu_test_main.so+0x33fb)
#18 main
/home/adar/Source/kudu/build/tsan/../../src/kudu/util/test_main.cc:106:13
(libkudu_test_main.so+0x2bb6)
Thread T16 'rpc reactor-793' (tid=7934, finished) created by main thread at:
#0 pthread_create
/home/adar/Source/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:992
(reactor-test+0x45f7d6)
#1 kudu::Thread::StartThread(std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> > const&,
std::__1::basic_string<char, std::__1::char_traits<char>,
std::__1::allocator<char> > const&, boost::function<void ()> const&, unsigned
long, scoped_refptr<kudu::Thread>*)
/home/adar/Source/kudu/build/tsan/../../src/kudu/util/thread.cc:556:15
(libkudu_util.so+0x1c0c8f)
#2 kudu::Status kudu::Thread::Create<void (kudu::rpc::ReactorThread::*)(),
kudu::rpc::ReactorThread*>(std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> > const&,
std::__1::basic_string<char, std::__1::char_traits<char>,
std::__1::allocator<char> > const&, void (kudu::rpc::ReactorThread::*
const&)(), kudu::rpc::ReactorThread* const&, scoped_refptr<kudu::Thread>*)
/home/adar/Source/kudu/build/tsan/../../src/kudu/util/thread.h:164:12
(libkrpc.so+0xc4095)
#3 kudu::rpc::ReactorThread::Init()
/home/adar/Source/kudu/build/tsan/../../src/kudu/rpc/reactor.cc:168:10
(libkrpc.so+0xbeace)
#4 kudu::rpc::Reactor::Init()
/home/adar/Source/kudu/build/tsan/../../src/kudu/rpc/reactor.cc:728:18
(libkrpc.so+0xc2f81)
#5 kudu::rpc::Messenger::Init()
/home/adar/Source/kudu/build/tsan/../../src/kudu/rpc/messenger.cc:446:5
(libkrpc.so+0xa95e2)
#6
kudu::rpc::MessengerBuilder::Build(std::__1::shared_ptr<kudu::rpc::Messenger>*)
/home/adar/Source/kudu/build/tsan/../../src/kudu/rpc/messenger.cc:205:3
(libkrpc.so+0xa903d)
#7 kudu::rpc::RpcTestBase::CreateMessenger(std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> > const&,
std::__1::shared_ptr<kudu::rpc::Messenger>*, int, bool,
std::__1::basic_string<char, std::__1::char_traits<char>,
std::__1::allocator<char> > const&, std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> > const&,
std::__1::basic_string<char, std::__1::char_traits<char>,
std::__1::allocator<char> > const&, std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> > const&)
/home/adar/Source/kudu/build/tsan/../../src/kudu/rpc/rpc-test-base.h:462:16
(reactor-test+0x4d3412)
#8 kudu::rpc::ReactorTest::SetUp()
/home/adar/Source/kudu/build/tsan/../../src/kudu/rpc/reactor-test.cc:46:5
(reactor-test+0x4d02ae)
#9 void
testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test,
void>(testing::Test*, void (testing::Test::*)(), char const*)
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2402:10
(libgmock.so+0x551df)
#10 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test,
void>(testing::Test*, void (testing::Test::*)(), char const*)
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2438
(libgmock.so+0x551df)
#11 testing::Test::Run()
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2470:3
(libgmock.so+0x342b1)
#12 testing::TestInfo::Run()
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2656:11
(libgmock.so+0x3563c)
#13 testing::TestCase::Run()
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2774:28
(libgmock.so+0x36116)
#14 testing::internal::UnitTestImpl::RunAllTests()
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:4649:43
(libgmock.so+0x424ea)
#15 bool
testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,
bool>(testing::internal::UnitTestImpl*, bool
(testing::internal::UnitTestImpl::*)(), char const*)
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2402:10
(libgmock.so+0x5614f)
#16 bool
testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,
bool>(testing::internal::UnitTestImpl*, bool
(testing::internal::UnitTestImpl::*)(), char const*)
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2438
(libgmock.so+0x5614f)
#17 testing::UnitTest::Run()
/home/adar/Source/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:4257:10
(libgmock.so+0x41dd2)
#18 RUN_ALL_TESTS()
/home/adar/Source/kudu/build/tsan/../../thirdparty/installed/tsan/include/gtest/gtest.h:2233:46
(libkudu_test_main.so+0x33fb)
#19 main
/home/adar/Source/kudu/build/tsan/../../src/kudu/util/test_main.cc:106:13
(libkudu_test_main.so+0x2bb6)
SUMMARY: ThreadSanitizer: data race
/home/adar/openssl/openssl-1.1.0g/build_shared/../crypto/threads_pthread.c:95
in CRYPTO_THREAD_lock_free
{noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)