Will Berkeley created KUDU-2727:
-----------------------------------

             Summary: Contention on the Raft consensus lock can cause tablet service queue overflows
                 Key: KUDU-2727
                 URL: https://issues.apache.org/jira/browse/KUDU-2727
             Project: Kudu
          Issue Type: Improvement
            Reporter: Will Berkeley


Here are stacks illustrating the phenomenon:

{noformat}
  tids=[2201]
        0x379ba0f710 <unknown>
           0x1fb951a base::internal::SpinLockDelay()
           0x1fb93b7 base::SpinLock::SlowLock()
            0xb4e68e kudu::consensus::Peer::SignalRequest()
            0xb9c0df kudu::consensus::PeerManager::SignalRequest()
            0xb8c178 kudu::consensus::RaftConsensus::Replicate()
            0xaab816 kudu::tablet::TransactionDriver::Prepare()
            0xaac0ed kudu::tablet::TransactionDriver::PrepareTask()
           0x1fa37ed kudu::ThreadPool::DispatchThread()
           0x1f9c2a1 kudu::Thread::SuperviseThread()
        0x379ba079d1 start_thread
        0x379b6e88fd clone
  tids=[4515]
        0x379ba0f710 <unknown>
           0x1fb951a base::internal::SpinLockDelay()
           0x1fb93b7 base::SpinLock::SlowLock()
            0xb74c60 kudu::consensus::RaftConsensus::NotifyCommitIndex()
            0xb59307 kudu::consensus::PeerMessageQueue::NotifyObserversTask()
            0xb54058 _ZN4kudu8internal7InvokerILi2ENS0_9BindStateINS0_15RunnableAdapterIMNS_9consensus16PeerMessageQueueEFvRKSt8functionIFvPNS4_24PeerMessageQueueObserverEEEEEEFvPS5_SC_EFvNS0_17UnretainedWrapperIS5_EEZNS5_34NotifyObserversOfCommitIndexChangeElEUlS8_E_EEESH_E3RunEPNS0_13BindStateBaseE
           0x1fa37ed kudu::ThreadPool::DispatchThread()
           0x1f9c2a1 kudu::Thread::SuperviseThread()
        0x379ba079d1 start_thread
        0x379b6e88fd clone
  tids=[22185,22194,22193,22188,22187,22186]
        0x379ba0f710 <unknown>
           0x1fb951a base::internal::SpinLockDelay()
           0x1fb93b7 base::SpinLock::SlowLock()
            0xb8bff8 kudu::consensus::RaftConsensus::CheckLeadershipAndBindTerm()
            0xaaaef9 kudu::tablet::TransactionDriver::ExecuteAsync()
            0xaa3742 kudu::tablet::TabletReplica::SubmitWrite()
            0x92812d kudu::tserver::TabletServiceImpl::Write()
           0x1e28f3c kudu::rpc::GeneratedServiceIf::Handle()
           0x1e2986a kudu::rpc::ServicePool::RunThread()
           0x1f9c2a1 kudu::Thread::SuperviseThread()
        0x379ba079d1 start_thread
        0x379b6e88fd clone
  tids=[22192,22191]
        0x379ba0f710 <unknown>
           0x1fb951a base::internal::SpinLockDelay()
           0x1fb93b7 base::SpinLock::SlowLock()
           0x1e13dec kudu::rpc::ResultTracker::TrackRpc()
           0x1e28ef5 kudu::rpc::GeneratedServiceIf::Handle()
           0x1e2986a kudu::rpc::ServicePool::RunThread()
           0x1f9c2a1 kudu::Thread::SuperviseThread()
        0x379ba079d1 start_thread
        0x379b6e88fd clone
  tids=[4426]
        0x379ba0f710 <unknown>
           0x206d3d0 <unknown>
           0x212fd25 google::protobuf::Message::SpaceUsedLong()
           0x211dee4 google::protobuf::internal::GeneratedMessageReflection::SpaceUsedLong()
            0xb6658e kudu::consensus::LogCache::AppendOperations()
            0xb5c539 kudu::consensus::PeerMessageQueue::AppendOperations()
            0xb5c7c7 kudu::consensus::PeerMessageQueue::AppendOperation()
            0xb7c675 kudu::consensus::RaftConsensus::AppendNewRoundToQueueUnlocked()
            0xb8c147 kudu::consensus::RaftConsensus::Replicate()
            0xaab816 kudu::tablet::TransactionDriver::Prepare()
            0xaac0ed kudu::tablet::TransactionDriver::PrepareTask()
           0x1fa37ed kudu::ThreadPool::DispatchThread()
           0x1f9c2a1 kudu::Thread::SuperviseThread()
        0x379ba079d1 start_thread
        0x379b6e88fd clone
{noformat}

{{kudu::consensus::RaftConsensus::CheckLeadershipAndBindTerm()}} needs to take 
the Raft consensus lock to check the term and the Raft role. In the stacks 
above, six service threads are spinning on the lock in that call. When many 
RPCs come in for the same tablet, this contention ties up service threads and 
can cause service queue overflows on busy systems.

Yugabyte switched their equivalent lock to be an atomic that allows them to 
read the term and role wait-free.
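
For illustration, here is a minimal sketch of that idea. It is not Yugabyte's or Kudu's actual code: the {{Role}} enum, {{TermAndRole}}, and {{CachedTermAndRole}} are hypothetical names, and the layout (role in the low 8 bits, term in the upper 56) is an assumption. The writer would publish a new value whenever the term or role changes, presumably while still holding the consensus lock, and readers would take a single atomic load instead of the lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical role enum for this sketch; not Kudu's actual role type.
enum class Role : uint8_t { kFollower = 0, kLeader = 1, kLearner = 2 };

// Snapshot handed back to readers.
struct TermAndRole {
  int64_t term;
  Role role;
};

// Packs the term and role into one 64-bit word so both can be read
// together with a single atomic load. Assumed layout: role in the low
// 8 bits, term in the upper 56 bits.
class CachedTermAndRole {
 public:
  // Writer side: call whenever the term or role changes (in this sketch,
  // the caller is assumed to serialize writers, e.g. by holding the
  // consensus lock).
  void Store(int64_t term, Role role) {
    assert(term >= 0 && term < (int64_t{1} << 56));
    const uint64_t packed =
        (static_cast<uint64_t>(term) << 8) | static_cast<uint64_t>(role);
    packed_.store(packed, std::memory_order_release);
  }

  // Reader side: one atomic load, no lock taken.
  TermAndRole Load() const {
    const uint64_t packed = packed_.load(std::memory_order_acquire);
    return {static_cast<int64_t>(packed >> 8),
            static_cast<Role>(packed & 0xFF)};
  }

 private:
  std::atomic<uint64_t> packed_{0};  // term 0, follower
};
```

Because both fields live in one word, a reader never sees a term from one update paired with a role from another. The snapshot can still be stale by the time it is used, so any later step that depends on the term would need to re-validate it. A 64-bit atomic load is lock-free on mainstream 64-bit platforms, but that is a property of the platform rather than a guarantee of {{std::atomic}}.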



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)