[ 
https://issues.apache.org/jira/browse/KUDU-2708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774539#comment-16774539
 ] 

Todd Lipcon commented on KUDU-2708:
-----------------------------------

KUDU-2204 has some earlier explorations of this issue.

> Possible contention creating temporary files while flushing cmeta during an 
> election storm
> ------------------------------------------------------------------------------------------
>
>                 Key: KUDU-2708
>                 URL: https://issues.apache.org/jira/browse/KUDU-2708
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: Will Berkeley
>            Priority: Major
>
> While investigating consensus queue overflows that happen under heavy 
> write load, I noticed that 6 of 10 service threads at the time of overflow 
> had stacks like
> {noformat}
> 0x3b6720f710 <unknown>
>            0x1fb900a base::internal::SpinLockDelay()
>            0x1fb8ea7 base::SpinLock::SlowLock()
>             0xb82e25 kudu::consensus::RaftConsensus::RequestVote()
>             0x931555 kudu::tserver::ConsensusServiceImpl::RequestConsensusVote()
>            0x1e28a2c kudu::rpc::GeneratedServiceIf::Handle()
>            0x1e2935a kudu::rpc::ServicePool::RunThread()
>            0x1f9bd91 kudu::Thread::SuperviseThread()
>         0x3b672079d1 start_thread
>         0x3b66ee88fd clone
> {noformat}
> They are waiting on some tablet's Raft consensus instance's {{lock_}} in 
> order to vote. Looking into what might be holding that lock, I see stacks like
> {noformat}
> 0x3b6720f710 <unknown>
>         0x3b66edb2ed __GI_open64
>         0x3b66e63caa __gen_tempname
>            0x1f1cf35 kudu::(anonymous namespace)::PosixEnv::MkTmpFile()
>            0x1f1f662 kudu::(anonymous namespace)::PosixEnv::NewTempRWFile()
>            0x1f8305e kudu::pb_util::WritePBContainerToPath()
>             0xb47932 kudu::consensus::ConsensusMetadata::Flush()
>             0xb74164 kudu::consensus::RaftConsensus::SetVotedForCurrentTermUnlocked()
>             0xb783aa kudu::consensus::RaftConsensus::RequestVoteRespondVoteGranted()
>             0xb836a1 kudu::consensus::RaftConsensus::RequestVote()
>             0x931555 kudu::tserver::ConsensusServiceImpl::RequestConsensusVote()
>            0x1e28a2c kudu::rpc::GeneratedServiceIf::Handle()
>            0x1e2935a kudu::rpc::ServicePool::RunThread()
>            0x1f9bd91 kudu::Thread::SuperviseThread()
>         0x3b672079d1 start_thread
>         0x3b66ee88fd clone
> {noformat}
> Doing some junior spelunking into glibc code, one hypothesis is that we are 
> generating lots of collisions of proposed temporary file names in the cmeta 
> folder because many threads are attempting to flush cmeta at once. The glibc 
> code looks like
> Maybe we could put the thread id into the temporary file name when a thread 
> does a cmeta flush.
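
A minimal sketch of the thread-id suggestion above, for illustration only. It is not Kudu's code: PerThreadTmpTemplate and CreatePerThreadTmpFile are hypothetical helpers, the ".tmp." naming is an assumption, and a hash of std::this_thread::get_id() stands in for whatever thread identifier Kudu would actually use. It relies only on POSIX mkstemp(), which replaces the trailing XXXXXX and opens the file exclusively. Whether name collisions really are the cause of the contention is still only the hypothesis stated in the description.

```cpp
#include <stdlib.h>    // mkstemp (POSIX)
#include <unistd.h>    // close, unlink

#include <functional>  // std::hash
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper, not part of Kudu: builds a mkstemp() template that
// embeds a token derived from the calling thread's id, so that concurrent
// flushes from different threads start from different name prefixes.
std::string PerThreadTmpTemplate(const std::string& base_path) {
  size_t tid_token =
      std::hash<std::thread::id>()(std::this_thread::get_id());
  return base_path + ".tmp." + std::to_string(tid_token) + ".XXXXXX";
}

// Hypothetical helper: creates the temporary file and returns its file
// descriptor, or -1 on failure. On success, *created_path holds the name
// that mkstemp() actually chose.
int CreatePerThreadTmpFile(const std::string& base_path,
                           std::string* created_path) {
  std::string tmpl = PerThreadTmpTemplate(base_path);
  // mkstemp() rewrites its argument in place, so it needs a mutable,
  // NUL-terminated buffer.
  std::vector<char> buf(tmpl.begin(), tmpl.end());
  buf.push_back('\0');
  int fd = mkstemp(buf.data());
  if (fd >= 0) {
    created_path->assign(buf.data());
  }
  return fd;
}
```

A caller would write the cmeta contents to the returned descriptor, close it, and rename the file over the final path. The per-thread token only narrows the space in which two threads can pick the same candidate name; mkstemp() still retries internally if a collision occurs.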


