Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21417 )
Change subject: [master] fix race in auto leader rebalancing
......................................................................
[master] fix race in auto leader rebalancing
It turned out that the auto leader rebalancing task wasn't explicitly
shut down when the catalog manager was shut down. That led to race
conditions reported by TSAN, at least in test scenarios (see below).
This patch addresses the issue.
  WARNING: ThreadSanitizer: data race (pid=23827)
    Write of size 1 at 0x7b4000008208 by main thread:
      #0 AnnotateRWLockDestroy thirdparty/src/llvm-11.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp:264 (auto_rebalancer-test+0x33575e)
      #1 kudu::rw_spinlock::~rw_spinlock() src/kudu/util/locks.h:89:5 (libmaster.so+0x359376)
      #2 kudu::master::TSManager::~TSManager() src/kudu/master/ts_manager.cc:108:1 (libmaster.so+0x4ad201)
      #3 kudu::master::TSManager::~TSManager() src/kudu/master/ts_manager.cc:107:25 (libmaster.so+0x4ad229)
      #4 std::__1::default_delete<kudu::master::TSManager>::operator()(kudu::master::TSManager*) const thirdparty/installed/tsan/include/c++/v1/memory:2262:5 (libmaster.so+0x407ce7)
      #5 std::__1::unique_ptr<kudu::master::TSManager, std::__1::default_delete<kudu::master::TSManager> >::reset(kudu::master::TSManager*) thirdparty/installed/tsan/include/c++/v1/memory:2517:7 (libmaster.so+0x40157d)
      #6 std::__1::unique_ptr<kudu::master::TSManager, std::__1::default_delete<kudu::master::TSManager> >::~unique_ptr() thirdparty/installed/tsan/include/c++/v1/memory:2471:19 (libmaster.so+0x4015eb)
      #7 kudu::master::Master::~Master() src/kudu/master/master.cc:263:1 (libmaster.so+0x3f7a4a)
      #8 kudu::master::Master::~Master() src/kudu/master/master.cc:261:19 (libmaster.so+0x3f7dc9)
      #9 std::__1::default_delete<kudu::master::Master>::operator()(kudu::master::Master*) const thirdparty/installed/tsan/include/c++/v1/memory:2262:5 (libmaster.so+0x435627)
      #10 std::__1::unique_ptr<kudu::master::Master, std::__1::default_delete<kudu::master::Master> >::reset(kudu::master::Master*) thirdparty/installed/tsan/include/c++/v1/memory:2517:7 (libmaster.so+0x42e6ed)
      #11 kudu::master::MiniMaster::Shutdown() src/kudu/master/mini_master.cc:120:13 (libmaster.so+0x4c2612)
      ...

    Previous atomic write of size 4 at 0x7b4000008208 by thread T439 (mutexes: write M1141235379631443968):
      #0 __tsan_atomic32_compare_exchange_strong thirdparty/src/llvm-11.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp:780 (auto_rebalancer-test+0x33eb60)
      #1 base::subtle::Release_CompareAndSwap(int volatile*, int, int) /src/kudu/gutil/atomicops-internals-tsan.h:88:3 (libmaster.so+0x2e2b34)
      #2 kudu::rw_semaphore::unlock_shared() src/kudu/util/rw_semaphore.h:91:19 (libmaster.so+0x2e29c8)
      #3 kudu::rw_spinlock::unlock_shared() src/kudu/util/locks.h:99:10 (libmaster.so+0x2e28ef)
      #4 std::__1::shared_lock<kudu::rw_spinlock>::~shared_lock() /thirdparty/installed/tsan/include/c++/v1/shared_mutex:369:19 (libmaster.so+0x2e23e0)
      #5 kudu::master::TSManager::GetAllDescriptors(std::__1::vector<std::__1::shared_ptr<kudu::master::TSDescriptor>, std::__1::allocator<std::__1::shared_ptr<kudu::master::TSDescriptor> > >*) const src/kudu/master/ts_manager.cc:206:1 (libmaster.so+0x4adeb6)
      #6 kudu::master::AutoLeaderRebalancerTask::RunLeaderRebalancer() src/kudu/master/auto_leader_rebalancer.cc:405:16 (libmaster.so+0x2fb51b)
      #7 kudu::master::AutoLeaderRebalancerTask::RunLoop() src/kudu/master/auto_leader_rebalancer.cc:445:7 (libmaster.so+0x2fbaa9)
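
The report above shows a background loop still taking a shared lock inside
an object that the main thread is already destroying. As a minimal sketch
of the general remedy (stop and join the background thread before the data
it reads goes away), the standalone example below may help. It is not the
code of this patch: Registry, PeriodicTask and RunDemo are hypothetical
names that only stand in for TSManager, the rebalancer task and a test
scenario, and it uses plain standard-library primitives rather than Kudu's
own threading utilities.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a component that a background task reads from.
class Registry {
 public:
  void Add(int id) {
    std::unique_lock<std::shared_mutex> l(lock_);
    ids_.push_back(id);
  }
  std::vector<int> GetAll() const {
    std::shared_lock<std::shared_mutex> l(lock_);
    return ids_;
  }

 private:
  mutable std::shared_mutex lock_;
  std::vector<int> ids_;
};

// Hypothetical periodic task that reads the registry on its own thread.
class PeriodicTask {
 public:
  explicit PeriodicTask(const Registry* registry) : registry_(registry) {}
  ~PeriodicTask() { Shutdown(); }

  void Start() { thread_ = std::thread([this] { RunLoop(); }); }

  // Signals the loop to stop and joins the thread. After this returns, the
  // task no longer touches the registry. Safe to call again from the same
  // thread: the second call finds nothing left to join.
  void Shutdown() {
    {
      std::lock_guard<std::mutex> l(mutex_);
      stop_ = true;
    }
    cv_.notify_all();
    if (thread_.joinable()) thread_.join();
  }

  int runs() const { return runs_.load(); }

 private:
  void RunLoop() {
    std::unique_lock<std::mutex> l(mutex_);
    while (!stop_) {
      l.unlock();
      (void)registry_->GetAll();  // the access that must not outlive Registry
      ++runs_;
      l.lock();
      // Sleep until the next period or until Shutdown() wakes us up.
      cv_.wait_for(l, std::chrono::milliseconds(1), [this] { return stop_; });
    }
  }

  const Registry* registry_;
  std::mutex mutex_;
  std::condition_variable cv_;
  bool stop_ = false;  // guarded by mutex_
  std::atomic<int> runs_{0};
  std::thread thread_;
};

// Runs the task at least once, shuts it down explicitly, and only then
// destroys the registry. Returns the number of completed iterations.
int RunDemo() {
  auto registry = std::make_unique<Registry>();
  registry->Add(1);
  PeriodicTask task(registry.get());
  task.Start();
  while (task.runs() == 0) std::this_thread::yield();
  task.Shutdown();  // explicit shutdown first ...
  const int runs_at_shutdown = task.runs();
  registry.reset();  // ... then it is safe to destroy what the task reads
  assert(task.runs() == runs_at_shutdown);  // no iterations after shutdown
  return runs_at_shutdown;
}
```

Calling RunDemo() from a test or from main() exercises the ordering. In
this sketch, swapping the task.Shutdown() and registry.reset() lines would
reintroduce the same kind of destroy-while-in-use race that the report
describes.
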
This is a follow-up to 10efaf2c77dfe5e4474505e0267c583c011703be.
Change-Id: Iccd66d00280d22b37386230874937e5260f07f3b
---
M src/kudu/master/auto_leader_rebalancer.cc
M src/kudu/master/catalog_manager.cc
2 files changed, 9 insertions(+), 1 deletion(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/21417/1
--
To view, visit http://gerrit.cloudera.org:8080/21417
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iccd66d00280d22b37386230874937e5260f07f3b
Gerrit-Change-Number: 21417
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <[email protected]>