Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16648 )
Change subject: (wip) KUDU-2612: restrict TxnStatusManager calls to be made by the leader only ...................................................................... Patch Set 2: (7 comments) http://gerrit.cloudera.org:8080/#/c/16648/2/src/kudu/integration-tests/txn_status_table-itest.cc File src/kudu/integration-tests/txn_status_table-itest.cc: http://gerrit.cloudera.org:8080/#/c/16648/2/src/kudu/integration-tests/txn_status_table-itest.cc@756 PS2, Line 756: ASSERT_OK(s); : ASSERT_TRUE(txn_status.has_user()); : ASSERT_STREQ(kUser, txn_status.user().c_str()); : ASSERT_TRUE(txn_status.has_state()); : ASSERT_EQ(TxnStatePB::OPEN, txn_status.state()); Does this work as expected when an assertion triggers? I remember there were issues with that, so in this sort of construction we tended to use CHECK_ equivalents. http://gerrit.cloudera.org:8080/#/c/16648/2/src/kudu/integration-tests/txn_status_table-itest.cc@764 PS2, Line 764: std::for_each(threads.begin(), threads.end(), [&] (thread& t) { t.join(); }); If leave those ASSERTs above (i.e. not converting them into CHECKs), this should be put into a scoped cleanup, right? Otherwise, if an assertion above triggers, the threads would not be joined? http://gerrit.cloudera.org:8080/#/c/16648/2/src/kudu/master/txn_manager-test.cc File src/kudu/master/txn_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/16648/2/src/kudu/master/txn_manager-test.cc@150 PS2, Line 150: nit: remove the extra space http://gerrit.cloudera.org:8080/#/c/16648/2/src/kudu/master/txn_manager-test.cc@151 PS2, Line 151: cluster_->mini_tablet_server(0)->server()->tablet_manager()->GetTabletReplicas(&replicas) I'm not sure this waits or assumes any leadership status, saw GetTabletReplicas() simply calls AppendValuesFromMap() under the hood, no? http://gerrit.cloudera.org:8080/#/c/16648/2/src/kudu/master/txn_manager-test.cc@161 PS2, Line 161: tablet_replica_ What tablet replica is it and where is it used besides TearDown() and Start()? http://gerrit.cloudera.org:8080/#/c/16648/2/src/kudu/transactions/txn_status_manager.h File src/kudu/transactions/txn_status_manager.h: http://gerrit.cloudera.org:8080/#/c/16648/2/src/kudu/transactions/txn_status_manager.h@95 PS2, Line 95: txn_status nit: txn_status_manager http://gerrit.cloudera.org:8080/#/c/16648/2/src/kudu/transactions/txn_status_manager.cc File src/kudu/transactions/txn_status_manager.cc: http://gerrit.cloudera.org:8080/#/c/16648/2/src/kudu/transactions/txn_status_manager.cc@202 PS2, Line 202: const string& uuid = txn_status_manager_->status_tablet_.tablet_replica_->permanent_uuid(); What happens if an instance of this lock is being constructed and the tablet replica is reset upon calling https://gerrit.cloudera.org/#/c/16648/2/src/kudu/tablet/tablet_replica.cc@211 TSTabletManager::CreateNewTablet() ? This has triggered the following TSAN warnings: WARNING: ThreadSanitizer: data race (pid=972) Write of size 8 at 0x7b5400056548 by thread T152 (mutexes: write M634861673190352280, write M58672 #0 std::__1::enable_if<(is_move_constructible<kudu::tablet::Tablet*>::value) && (is_move_assigna #1 std::__1::shared_ptr<kudu::tablet::Tablet>::swap(std::__1::shared_ptr<kudu::tablet::Tablet>&) #2 std::__1::shared_ptr<kudu::tablet::Tablet>::operator=(std::__1::shared_ptr<kudu::tablet::Tabl #3 kudu::tablet::TabletReplica::Start(kudu::consensus::ConsensusBootstrapInfo const&, std::__1:: #4 kudu::tserver::TSTabletManager::OpenTablet(scoped_refptr<kudu::tablet::TabletReplica> const&, #5 kudu::tserver::TSTabletManager::CreateNewTablet(std::__1::basic_string<char, std::__1::char_t #6 decltype(std::__1::forward<kudu::tserver::TSTabletManager::CreateNewTablet(std::__1::basic_st #7 void std::__1::__invoke_void_return_wrapper<void>::__call<kudu::tserver::TSTabletManager::Cre #8 std::__1::__function::__alloc_func<kudu::tserver::TSTabletManager::CreateNewTablet(std::__1:: #9 std::__1::__function::__func<kudu::tserver::TSTabletManager::CreateNewTablet(std::__1::basic_ #10 std::__1::__function::__value_func<void ()>::operator()() const /data/1/hao/Repositories/kud #11 std::__1::function<void ()>::operator()() const /data/1/hao/Repositories/kudu/thirdparty/ins #12 kudu::ThreadPool::DispatchThread() /data/1/hao/Repositories/kudu/src/kudu/util/threadpool.cc #13 kudu::ThreadPool::CreateThread()::$_1::operator()() const /data/1/hao/Repositories/kudu/src/ #14 decltype(std::__1::forward<kudu::ThreadPool::CreateThread()::$_1&>(fp)()) std::__1::__invoke #15 void std::__1::__invoke_void_return_wrapper<void>::__call<kudu::ThreadPool::CreateThread():: #16 std::__1::__function::__alloc_func<kudu::ThreadPool::CreateThread()::$_1, std::__1::allocato #17 std::__1::__function::__func<kudu::ThreadPool::CreateThread()::$_1, std::__1::allocator<kudu #18 std::__1::__function::__value_func<void ()>::operator()() const /data/1/hao/Repositories/kud #19 std::__1::function<void ()>::operator()() const /data/1/hao/Repositories/kudu/thirdparty/ins #20 kudu::Thread::SuperviseThread(void*) /data/1/hao/Repositories/kudu/src/kudu/util/thread.cc:6 Previous read of size 8 at 0x7b5400056548 by thread T150 (mutexes: read M586447839757432840): #0 std::__1::shared_ptr<kudu::tablet::Tablet>::operator->() const /data/1/hao/Repositories/kudu/ #1 kudu::tablet::TabletReplica::permanent_uuid() const /data/1/hao/Repositories/kudu/src/kudu/ta #2 kudu::transactions::TxnStatusManager::ScopedLeaderSharedLock::ScopedLeaderSharedLock(kudu::tr #3 kudu::tserver::TSTabletManager::TxnStalenessTrackerTask() /data/1/hao/Repositories/kudu/src/k #4 kudu::tserver::TSTabletManager::Init()::$_8::operator()() const /data/1/hao/Repositories/kudu #5 decltype(std::__1::forward<kudu::tserver::TSTabletManager::Init()::$_8&>(fp)()) std::__1::__i #6 void std::__1::__invoke_void_return_wrapper<void>::__call<kudu::tserver::TSTabletManager::Ini #7 std::__1::__function::__alloc_func<kudu::tserver::TSTabletManager::Init()::$_8, std::__1::all #8 std::__1::__function::__func<kudu::tserver::TSTabletManager::Init()::$_8, std::__1::allocator #9 std::__1::__function::__value_func<void ()>::operator()() const /data/1/hao/Repositories/kudu #10 std::__1::function<void ()>::operator()() const /data/1/hao/Repositories/kudu/thirdparty/ins #11 kudu::ThreadPool::DispatchThread() /data/1/hao/Repositories/kudu/src/kudu/util/threadpool.cc #12 kudu::ThreadPool::CreateThread()::$_1::operator()() const /data/1/hao/Repositories/kudu/src/ #13 decltype(std::__1::forward<kudu::ThreadPool::CreateThread()::$_1&>(fp)()) std::__1::__invoke #14 void std::__1::__invoke_void_return_wrapper<void>::__call<kudu::ThreadPool::CreateThread():: #15 std::__1::__function::__alloc_func<kudu::ThreadPool::CreateThread()::$_1, std::__1::allocato #16 std::__1::__function::__func<kudu::ThreadPool::CreateThread()::$_1, std::__1::allocator<kudu #17 std::__1::__function::__value_func<void ()>::operator()() const /data/1/hao/Repositories/kud #18 std::__1::function<void ()>::operator()() const /data/1/hao/Repositories/kudu/thirdparty/ins #19 kudu::Thread::SuperviseThread(void*) /data/1/hao/Repositories/kudu/src/kudu/util/thread.cc:6 -- To view, visit http://gerrit.cloudera.org:8080/16648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a Gerrit-Change-Number: 16648 Gerrit-PatchSet: 2 Gerrit-Owner: Hao Hao <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Hao Hao <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Wed, 23 Dec 2020 21:45:28 +0000 Gerrit-HasComments: Yes
