[ https://issues.apache.org/jira/browse/KUDU-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Henke updated KUDU-2244:
------------------------------
Labels: performance (was: )
> spinlock contention in raft_consensus
> -------------------------------------
>
> Key: KUDU-2244
> URL: https://issues.apache.org/jira/browse/KUDU-2244
> Project: Kudu
> Issue Type: Improvement
> Components: consensus
> Reporter: Andrew Wong
> Priority: Major
> Labels: performance
>
> I was going through the logs of a cluster that was seeing a bunch of
> kernel_stack_watchdog traces, and the slowness seemed to be caused by a lot
> of activity in consensus requests. E.g.
> W1214 18:57:29.514219 36138 kernel_stack_watchdog.cc:145] Thread 36317 stuck at /data/jenkins/workspace/generic-package-centos64-7-0-impala/topdir/BUILD/kudu-1.3.0-cdh5.11.0/src/kudu/rpc/outbound_call.cc:192 for 123ms:
> Kernel stack:
> [<ffffffff810b7ef5>] sys_sched_yield+0x65/0xd0
> [<ffffffff81645909>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
> User stack:
> @ 0x7f72fab92057 __GI___sched_yield
> @ 0x19498bf kudu::Thread::StartThread()
> @ 0x1952e7d kudu::ThreadPool::CreateThreadUnlocked()
> @ 0x19534d3 kudu::ThreadPool::Submit()
> @ 0x1953a27 kudu::ThreadPool::SubmitFunc()
> @ 0x1953ecb kudu::ThreadPool::SubmitClosure()
> @ 0x9c94ec kudu::consensus::RaftConsensus::ElectionCallback()
> @ 0x9e6032 kudu::consensus::LeaderElection::CheckForDecision()
> @ 0x9e78c3 kudu::consensus::LeaderElection::VoteResponseRpcCallback()
> @ 0xa8b137 kudu::rpc::OutboundCall::CallCallback()
> @ 0xa8c2bc kudu::rpc::OutboundCall::SetResponse()
> @ 0xa822c0 kudu::rpc::Connection::HandleCallResponse()
> @ 0xa83ffc ev::base<>::method_thunk<>()
> @ 0x198a07f ev_invoke_pending
> @ 0x198af71 ev_run
> @ 0xa5e049 kudu::rpc::ReactorThread::RunThread()
> So it seemed to be caused by some slowness in getting threads. Upon perusing
> the logs a bit more, there were a sizable number of spinlock profiling traces:
> W1214 18:54:27.897955 36379 rpcz_store.cc:238] Trace:
> 1214 18:54:26.766922 (+ 0us) service_pool.cc:143] Inserting onto call
> queue
> 1214 18:54:26.771135 (+ 4213us) service_pool.cc:202] Handling call
> 1214 18:54:26.771138 (+ 3us) raft_consensus.cc:1126] Updating replica for
> 0 ops
> 1214 18:54:27.897699 (+1126561us) raft_consensus.cc:1165] Early marking
> committed up to index 0
> 1214 18:54:27.897700 (+ 1us) raft_consensus.cc:1170] Triggering prepare
> for 0 ops
> 1214 18:54:27.897701 (+ 1us) raft_consensus.cc:1282] Marking committed up
> to 1766
> 1214 18:54:27.897702 (+ 1us) raft_consensus.cc:1332] Filling consensus
> response to leader.
> 1214 18:54:27.897736 (+ 34us) spinlock_profiling.cc:255] Waited 991 ms on
> lock 0x120b3540. stack: 00000000019406c5 00000000009c60d7 00000000009c75f7
> 00000000007dc628 0000000000a7adfc 0000000000a7b9cd 000000000194d059
> 00007f72fbcc2dc4 00007f72fabad1cc ffffffffffffffff
> 1214 18:54:27.897737 (+ 1us) raft_consensus.cc:1327] UpdateReplicas()
> finished
> 1214 18:54:27.897741 (+ 4us) inbound_call.cc:130] Queueing success
> response
> Metrics: {"spinlock_wait_cycles":2478395136}
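> (For reference: if spinlock_wait_cycles is counted in raw CPU cycles, and
> assuming a roughly 2.5 GHz clock on this host (both of these are assumptions,
> not something stated in the logs), the metric lines up with the 991 ms wait
> reported in the trace:
> 2478395136 cycles / ~2.5e9 cycles per second ≈ 0.99 s ≈ 991 ms.)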
> Each of the traces showed on the order of 500-1000 ms spent waiting on
> spinlocks. Looking at raft_consensus.cc, it seems we're holding a spinlock
> (update_lock_) while we call RaftConsensus::UpdateReplica(), which, according
> to its header, "won't return until all operations have been stored in the log
> and all Prepares() have been completed". While some locking here may be
> necessary, it's worth considering using a different kind of lock.
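> As a minimal, self-contained sketch of the pattern described above (not
> Kudu's actual code; the class and function names below are made up for
> illustration), the problem is a busy-waiting lock held across slow work, and
> one possible direction, offered here only as an assumption rather than a
> decision from this ticket, is a blocking lock so that waiters are descheduled
> instead of spinning or yielding:
>
> #include <atomic>
> #include <chrono>
> #include <mutex>
> #include <thread>
>
> // Stand-in for the body of UpdateReplica(): log appends and Prepare()s that
> // can take hundreds of milliseconds under load.
> void SlowUpdateReplica() {
>   std::this_thread::sleep_for(std::chrono::milliseconds(500));
> }
>
> // Busy-waiting lock held across the slow call, resembling the reported
> // pattern: every other caller spins/yields for the whole duration, which is
> // the kind of waiting the spinlock profiling above is measuring.
> class SpinLock {
>  public:
>   void lock() {
>     while (flag_.test_and_set(std::memory_order_acquire)) {
>       std::this_thread::yield();  // keep retrying instead of sleeping
>     }
>   }
>   void unlock() { flag_.clear(std::memory_order_release); }
>  private:
>   std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
> };
>
> SpinLock update_spinlock;
> void UpdateWithSpinLock() {
>   std::lock_guard<SpinLock> l(update_spinlock);
>   SlowUpdateReplica();  // waiters spin/yield for ~500 ms
> }
>
> // One possible alternative: a blocking mutex, so waiting threads are
> // descheduled by the kernel until the lock is released.
> std::mutex update_mutex;
> void UpdateWithMutex() {
>   std::lock_guard<std::mutex> l(update_mutex);
>   SlowUpdateReplica();  // waiters sleep instead of spinning
> }
>
> int main() {
>   std::thread a(UpdateWithSpinLock), b(UpdateWithSpinLock);
>   a.join(); b.join();
>   std::thread c(UpdateWithMutex), d(UpdateWithMutex);
>   c.join(); d.join();
>   return 0;
> }
>
> The trade-off in this sketch is CPU burned while waiting versus a somewhat
> heavier lock/unlock on the uncontended path; which lock type actually fits
> update_lock_ is left to this ticket.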