Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/24176 )


Change subject: WIP [rpc] in-bulk memory recycling for Connection::ProcessOutboundTransfers()
......................................................................

WIP [rpc] in-bulk memory recycling for Connection::ProcessOutboundTransfers()

WIP:
  * collect the initial feedback
  * add a test/perf scenario if not covered by rpc-bench and similar?
  * provide perf report from rpc-bench (or from the newly added test)
    before and after the update

While troubleshooting RPC performance issues in a highly concurrent
workload, I noticed a pattern of lock contention in tcmalloc.  For more
context, the majority of RPCs had large side-cars and a single RPC
connection often had a multitude of outgoing in-flight transfers.

Among the captured stack traces, multiple reactor threads in a single
pstack snapshot often had stack traces similar to the one at the end of
this description.

It seems the problem stems from the fact that allocation/deallocation
of large (size > 256 KByte) memory chunks in tcmalloc often goes through
the central free list, which is guarded by a lock.

This patch attempts to reduce the lock contention by first performing
the socket I/O for all the outgoing transfers with pending data, and
only then deallocating the memory in bulk.  In addition, it clarifies
the memory ownership rules for the OutboundTransfer::callbacks_ field
and modernizes the signatures of the related methods to return
std::unique_ptr instead of raw pointers.
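
The two-phase idea described above can be sketched roughly as follows.
This is a minimal, self-contained illustration under stated assumptions,
not the code in this patch: Transfer, LoggingCallbacks, g_events, and the
simplified ProcessOutboundTransfers() signature are hypothetical stand-ins,
and the real Kudu classes differ.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log, used only to make the ordering observable.
std::vector<std::string> g_events;

// Hypothetical stand-in for the transfer callbacks interface.
struct TransferCallbacks {
  virtual ~TransferCallbacks() = default;
  virtual void NotifyTransferFinished() = 0;
};

struct LoggingCallbacks : public TransferCallbacks {
  void NotifyTransferFinished() override { g_events.push_back("notify"); }
};

// Hypothetical stand-in for an outbound transfer owning a large payload.
class Transfer {
 public:
  Transfer(std::size_t payload_size, std::unique_ptr<TransferCallbacks> cb)
      : payload_(payload_size), callbacks_(std::move(cb)) {}
  ~Transfer() { g_events.push_back("free"); }

  // Pretends the whole payload is written to the socket in one call.
  void SendBuffer() {
    g_events.push_back("send");
    finished_ = true;
  }
  bool TransferFinished() const { return finished_; }

  // Ownership of the callbacks is handed over explicitly via unique_ptr.
  std::unique_ptr<TransferCallbacks> ReleaseCallbacks() {
    return std::move(callbacks_);
  }

 private:
  std::vector<char> payload_;
  std::unique_ptr<TransferCallbacks> callbacks_;
  bool finished_ = false;
};

// Phase 1: perform the I/O for every queued transfer, deferring destruction.
// Phase 2: run the callbacks and release all finished transfers in bulk.
std::size_t ProcessOutboundTransfers(
    std::deque<std::unique_ptr<Transfer>>& queue) {
  std::vector<std::unique_ptr<Transfer>> done;
  while (!queue.empty()) {
    assert(queue.front() != nullptr);
    queue.front()->SendBuffer();
    if (!queue.front()->TransferFinished()) {
      break;  // the socket would block: stop and retry later
    }
    done.push_back(std::move(queue.front()));
    queue.pop_front();
  }
  const std::size_t count = done.size();
  for (auto& t : done) {
    if (auto cb = t->ReleaseCallbacks()) {
      cb->NotifyTransferFinished();
    }
  }
  done.clear();  // all payloads are deallocated here, after the I/O
  return count;
}
```

In this sketch every "send" event is logged before any "free" event, so
the allocator is not touched between consecutive socket writes.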

  #0  sys_futex (... <tcmalloc::Static::pageheap_lock_>)
  #1  base::internal::SpinLockDelay (...)
  ...
  #7  0x00000000038c7ee1 in tcmalloc::ThreadCache::ReleaseToCentralCache(...) ()
  #8  0x00000000038c8505 in tcmalloc::ThreadCache::ListTooLong(...) ()
  #9  0x00000000038d9103 in google::protobuf::internal::ArenaImpl::~ArenaImpl() ()
  #10 0x0000000002043eec in google::protobuf::Arena::~Arena ()
  #11 kudu::rpc::InboundCall::~InboundCall (...)
  ...
  #16 kudu::rpc::ResponseTransferCallbacks::NotifyTransferFinished (...)
  #17 0x00000000020748be in kudu::rpc::OutboundTransfer::SendBuffer (...)
  #18 0x0000000002077de9 in kudu::rpc::Connection::ProcessOutboundTransfers (...)
  #19 0x0000000002078440 in kudu::rpc::Connection::QueueOutbound (...)
  #20 0x000000000207ade1 in kudu::rpc::QueueTransferTask::Run (...)
  #21 0x0000000002052b40 in kudu::rpc::ReactorThread::AsyncHandler (...)
  #22 0x000000000376e063 in ev_invoke_pending ()
  #23 0x000000000204f5a1 in kudu::rpc::ReactorThread::InvokePendingCb (...)
  #24 0x00000000037713b5 in ev_run ()
  #25 0x00000000020508eb in ev::loop_ref::run (...)
  #26 kudu::rpc::ReactorThread::RunThread (...)

Change-Id: Idf7ab105a851ef4d583efc2d1b33d57607810df0
---
M src/kudu/rpc/connection.cc
M src/kudu/rpc/transfer.cc
M src/kudu/rpc/transfer.h
3 files changed, 89 insertions(+), 45 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/76/24176/1
--
To view, visit http://gerrit.cloudera.org:8080/24176
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Idf7ab105a851ef4d583efc2d1b33d57607810df0
Gerrit-Change-Number: 24176
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <[email protected]>
