[ 
https://issues.apache.org/jira/browse/KUDU-1500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15415934#comment-15415934
 ] 

Dinesh Bhat commented on KUDU-1500:
-----------------------------------

[~danburkert], I had an offline discussion with [~mpercy] yesterday, and 
summarizing the details here:
With the current solution, we intend to keep partition/schema as immutable 
objects after they are created. That means, remote bootstrapping(mode: 
REPLACING_PEER) does not overwrite partition/schema if the tablet metadata was 
already initialized(state_ == initialized) as part of initial 
bootstrap(mode:NEW_PEER), and then we do not ever replace contents of 
partition/schema after that. 

This looks cleaner/simpler than locking fixes earlier and avoids 
synchronizations too for these immutable objects. However, it comes with an 
assumption that these fields are never mucked up at run time. Essentially, we 
are ignoring the contents of the superblock given by a healthy remote peer, and 
instead keeping what we had previously which may or may not have been corrupted 
in theory. I think we have room for improvisation here, but until I get a 
better grasp on the overall workflow of ConsensusMeta/WAL/TabletPeer/OnDisk 
layout, I am skeptical of introducing an invasive-change/regression.

Looking further, this line under TSTabletManager::StartTabletCopy() is the one 
which is the root-cause of this race. The TabletPeer()->meta_ takes the pointer 
to old metadata object which is tombstoned and scratches the new PB contents in 
place. 

{noformat}
    if (LookupTabletUnlocked(tablet_id, &old_tablet_peer)) {
      meta = old_tablet_peer->tablet_metadata();
      replacing_tablet = true;
    }
{noformat}

A more cleaner approach could be to keep an atomically swappable pointer inside 
TabletPeer() towards TabletMetadata object and when remote bootstrap occurs, 
copy the existing metadata onto a new object, overlay everything arrived via PB 
from remote peer onto new object, and then make the TabletPeer::meta_ pointer 
point to new metadata object and delete old one(or keep a copy on-disk for 
corruption debugging). We have enough validations along the way to check what 
has come in from wire is a healthy one so contents of PB are trustable. 
Rollback story: If we don’t succeed in resurrecting a new metadata, we go back 
to tombstoned because we would be swapping the pointer only at the final stage. 

> TSAN race in ListTablets vs tablet metadata loading
> ---------------------------------------------------
>
>                 Key: KUDU-1500
>                 URL: https://issues.apache.org/jira/browse/KUDU-1500
>             Project: Kudu
>          Issue Type: Bug
>          Components: tablet, tserver
>    Affects Versions: 0.9.0
>            Reporter: Todd Lipcon
>            Assignee: Dinesh Bhat
>              Labels: newbie
>
> {code}
> WARNING: ThreadSanitizer: data race (pid=20066)
>   Write of size 8 at 0x7d4c0000cd90 by thread T71 (mutexes: write M4355):
>     #0 std::vector<kudu::PartitionSchema::HashBucketSchema, 
> std::allocator<kudu::PartitionSchema::HashBucketSchema> 
> >::_M_erase_at_end(kudu::PartitionSchema::HashBucketSchema*) 
> /data/1/todd/kudu/thirdparty/installed-deps-tsan/gcc/include/c++/4.9.3/bits/stl_vector.h:1439:26
>  (libkudu_common.so+0x000000158983)
>     #1 std::vector<kudu::PartitionSchema::HashBucketSchema, 
> std::allocator<kudu::PartitionSchema::HashBucketSchema> >::clear() 
> /data/1/todd/kudu/thirdparty/installed-deps-tsan/gcc/include/c++/4.9.3/bits/stl_vector.h:1212:9
>  (libkudu_common.so+0x00000013d721)
>     #2 kudu::PartitionSchema::Clear() 
> /data/1/todd/kudu/build/tsan/../../src/kudu/common/partition.cc:876:3 
> (libkudu_common.so+0x00000012f8fc)
>     #3 kudu::PartitionSchema::FromPB(kudu::PartitionSchemaPB const&, 
> kudu::Schema const&, kudu::PartitionSchema*) 
> /data/1/todd/kudu/build/tsan/../../src/kudu/common/partition.cc:129:3 
> (libkudu_common.so+0x00000012f4e6)
>     #4 
> kudu::tablet::TabletMetadata::LoadFromSuperBlock(kudu::tablet::TabletSuperBlockPB
>  const&) 
> /data/1/todd/kudu/build/tsan/../../src/kudu/tablet/tablet_metadata.cc:298:7 
> (libtablet.so+0x0000003a4561)
>     #5 
> kudu::tablet::TabletMetadata::ReplaceSuperBlock(kudu::tablet::TabletSuperBlockPB
>  const&) 
> /data/1/todd/kudu/build/tsan/../../src/kudu/tablet/tablet_metadata.cc:494:3 
> (libtablet.so+0x0000003a7805)
>     #6 kudu::tserver::RemoteBootstrapClient::Start(std::string const&, 
> kudu::HostPort const&, scoped_refptr<kudu::tablet::TabletMetadata>*) 
> /data/1/todd/kudu/build/tsan/../../src/kudu/tserver/remote_bootstrap_client.cc:222:5
>  (libtserver.so+0x0000001148ca)
>     #7 
> kudu::tserver::TSTabletManager::StartRemoteBootstrap(kudu::consensus::StartRemoteBootstrapRequestPB
>  const&, boost::optional<kudu::tserver::TabletServerErrorPB_Code>*) 
> /data/1/todd/kudu/build/tsan/../../src/kudu/tserver/ts_tablet_manager.cc:423:3
>  (libtserver.so+0x0000001aa273)
>     #8 
> kudu::tserver::ConsensusServiceImpl::StartRemoteBootstrap(kudu::consensus::StartRemoteBootstrapRequestPB
>  const*, kudu::consensus::StartRemoteBootstrapResponsePB*, 
> kudu::rpc::RpcContext*) 
> /data/1/todd/kudu/build/tsan/../../src/kudu/tserver/tablet_service.cc:982:14 
> (libtserver.so+0x000000175d03)
>     #9 
> kudu::consensus::ConsensusServiceIf::ConsensusServiceIf(scoped_refptr<kudu::MetricEntity>
>  const&)::$_8::operator()(google::protobuf::Message const*, 
> google::protobuf::Message*, kudu::rpc::RpcContext*) const 
> /data/1/todd/kudu/build/tsan/src/kudu/consensus/consensus.service.cc:188:7 
> (libconsensus_proto.so+0x00000009e457)
>     #10 std::_Function_handler<void (google::protobuf::Message const*, 
> google::protobuf::Message*, kudu::rpc::RpcContext*), 
> kudu::consensus::ConsensusServiceIf::ConsensusServiceIf(scoped_refptr<kudu::MetricEntity>
>  const&)::$_8>::_M_invoke(std::_Any_data const&, google::protobuf::Message 
> const*, google::protobuf::Message*, kudu::rpc::RpcContext*) 
> /data/1/todd/kudu/thirdparty/installed-deps-tsan/gcc/include/c++/4.9.3/functional:2039:2
>  (libconsensus_proto.so+0x00000009e17a)
>     #11 std::function<void (google::protobuf::Message const*, 
> google::protobuf::Message*, 
> kudu::rpc::RpcContext*)>::operator()(google::protobuf::Message const*, 
> google::protobuf::Message*, kudu::rpc::RpcContext*) const 
> /data/1/todd/kudu/thirdparty/installed-deps-tsan/gcc/include/c++/4.9.3/functional:2439:14
>  (libkrpc.so+0x000000177997)
>     #12 kudu::rpc::GeneratedServiceIf::Handle(kudu::rpc::InboundCall*) 
> /data/1/todd/kudu/build/tsan/../../src/kudu/rpc/service_if.cc:94:3 
> (libkrpc.so+0x00000017744c)
>     #13 kudu::rpc::ServicePool::RunThread() 
> /data/1/todd/kudu/build/tsan/../../src/kudu/rpc/service_pool.cc:206:5 
> (libkrpc.so+0x00000017a650)
>     #14 boost::_mfi::mf0<void, 
> kudu::rpc::ServicePool>::operator()(kudu::rpc::ServicePool*) const 
> /usr/include/boost/bind/mem_fn_template.hpp:49:29 (libkrpc.so+0x00000017d55b)
>     #15 void boost::_bi::list1<boost::_bi::value<kudu::rpc::ServicePool*> 
> >::operator()<boost::_mfi::mf0<void, kudu::rpc::ServicePool>, 
> boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf0<void, 
> kudu::rpc::ServicePool>&, boost::_bi::list0&, int) 
> /usr/include/boost/bind/bind.hpp:246:9 (libkrpc.so+0x00000017d438)
>     #16 boost::_bi::bind_t<void, boost::_mfi::mf0<void, 
> kudu::rpc::ServicePool>, 
> boost::_bi::list1<boost::_bi::value<kudu::rpc::ServicePool*> > 
> >::operator()() /usr/include/boost/bind/bind_template.hpp:20:27 
> (libkrpc.so+0x00000017d39a)
>     #17 
> boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, 
> boost::_mfi::mf0<void, kudu::rpc::ServicePool>, 
> boost::_bi::list1<boost::_bi::value<kudu::rpc::ServicePool*> > >, 
> void>::invoke(boost::detail::function::function_buffer&) 
> /usr/include/boost/function/function_template.hpp:153:11 
> (libkrpc.so+0x00000017d050)
>     #18 boost::function0<void>::operator()() const 
> /usr/include/boost/function/function_template.hpp:1012:12 
> (libkrpc.so+0x0000000de7ca)
>     #19 kudu::Thread::SuperviseThread(void*) 
> /data/1/todd/kudu/build/tsan/../../src/kudu/util/thread.cc:586:3 
> (libkudu_util.so+0x0000002c1549)
>   Previous read of size 8 at 0x7d4c0000cd90 by thread T31:
>     #0 std::vector<kudu::PartitionSchema::HashBucketSchema, 
> std::allocator<kudu::PartitionSchema::HashBucketSchema> >::size() const 
> /data/1/todd/kudu/thirdparty/installed-deps-tsan/gcc/include/c++/4.9.3/bits/stl_vector.h:655:40
>  (libtserver.so+0x00000010d296)
>     #1 kudu::PartitionSchema::ToPB(kudu::PartitionSchemaPB*) const 
> /data/1/todd/kudu/build/tsan/../../src/kudu/common/partition.cc:164:46 
> (libkudu_common.so+0x00000013050c)
>     #2 
> kudu::tserver::TabletServiceImpl::ListTablets(kudu::tserver::ListTabletsRequestPB
>  const*, kudu::tserver::ListTabletsResponsePB*, kudu::rpc::RpcContext*) 
> /data/1/todd/kudu/build/tsan/../../src/kudu/tserver/tablet_service.cc:1110:5 
> (libtserver.so+0x00000017e82f)
>     #3 
> kudu::tserver::TabletServerServiceIf::TabletServerServiceIf(scoped_refptr<kudu::MetricEntity>
>  const&)::$_4::operator()(google::protobuf::Message const*, 
> google::protobuf::Message*, kudu::rpc::RpcContext*) const 
> /data/1/todd/kudu/build/tsan/src/kudu/tserver/tserver_service.service.cc:118:7
>  (libtserver_service_proto.so+0x000000027997)
>     #4 std::_Function_handler<void (google::protobuf::Message const*, 
> google::protobuf::Message*, kudu::rpc::RpcContext*), 
> kudu::tserver::TabletServerServiceIf::TabletServerServiceIf(scoped_refptr<kudu::MetricEntity>
>  const&)::$_4>::_M_invoke(std::_Any_data const&, google::protobuf::Message 
> const*, google::protobuf::Message*, kudu::rpc::RpcContext*) 
> /data/1/todd/kudu/thirdparty/installed-deps-tsan/gcc/include/c++/4.9.3/functional:2039:2
>  (libtserver_service_proto.so+0x0000000276ba)
>     #5 std::function<void (google::protobuf::Message const*, 
> google::protobuf::Message*, 
> kudu::rpc::RpcContext*)>::operator()(google::protobuf::Message const*, 
> google::protobuf::Message*, kudu::rpc::RpcContext*) const 
> /data/1/todd/kudu/thirdparty/installed-deps-tsan/gcc/include/c++/4.9.3/functional:2439:14
>  (libkrpc.so+0x000000177997)
>     #6 kudu::rpc::GeneratedServiceIf::Handle(kudu::rpc::InboundCall*) 
> /data/1/todd/kudu/build/tsan/../../src/kudu/rpc/service_if.cc:94:3 
> (libkrpc.so+0x00000017744c)
>     #7 kudu::rpc::ServicePool::RunThread() 
> /data/1/todd/kudu/build/tsan/../../src/kudu/rpc/service_pool.cc:206:5 
> (libkrpc.so+0x00000017a650)
>     #8 boost::_mfi::mf0<void, 
> kudu::rpc::ServicePool>::operator()(kudu::rpc::ServicePool*) const 
> /usr/include/boost/bind/mem_fn_template.hpp:49:29 (libkrpc.so+0x00000017d55b)
>     #9 void boost::_bi::list1<boost::_bi::value<kudu::rpc::ServicePool*> 
> >::operator()<boost::_mfi::mf0<void, kudu::rpc::ServicePool>, 
> boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf0<void, 
> kudu::rpc::ServicePool>&, boost::_bi::list0&, int) 
> /usr/include/boost/bind/bind.hpp:246:9 (libkrpc.so+0x00000017d438)
>     #10 boost::_bi::bind_t<void, boost::_mfi::mf0<void, 
> kudu::rpc::ServicePool>, 
> boost::_bi::list1<boost::_bi::value<kudu::rpc::ServicePool*> > 
> >::operator()() /usr/include/boost/bind/bind_template.hpp:20:27 
> (libkrpc.so+0x00000017d39a)
>     #11 
> boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, 
> boost::_mfi::mf0<void, kudu::rpc::ServicePool>, 
> boost::_bi::list1<boost::_bi::value<kudu::rpc::ServicePool*> > >, 
> void>::invoke(boost::detail::function::function_buffer&) 
> /usr/include/boost/function/function_template.hpp:153:11 
> (libkrpc.so+0x00000017d050)
>     #12 boost::function0<void>::operator()() const 
> /usr/include/boost/function/function_template.hpp:1012:12 
> (libkrpc.so+0x0000000de7ca)
>     #13 kudu::Thread::SuperviseThread(void*) 
> /data/1/todd/kudu/build/tsan/../../src/kudu/util/thread.cc:586:3 
> (libkudu_util.so+0x0000002c1549)
> {code}
> triggered by looping RaftConsensusITest.TestCorruptReplicaMetadata under TSAN



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to