Hello Kurt Deschler, Kudu Jenkins, Abhishek Chennaka, Joe McDonnell, Csaba 
Ringhofer, Michael Smith,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/24176

to look at the new patch set (#4).

Change subject: WIP [rpc] in-bulk memory recycling for 
Connection::ProcessOutboundTransfers()
......................................................................

WIP [rpc] in-bulk memory recycling for Connection::ProcessOutboundTransfers()

WIP:
  * add a test/perf scenario if not covered by rpc-bench and similar?
  * provide perf report from rpc-bench (or from the newly added test)
    before and after the update

While troubleshooting RPC performance issues in a highly concurrent
workload, I noticed a pattern of lock contention in tcmalloc.  Among
the captured stack traces, multiple reactor threads in a single pstack
snapshot often had stack traces similar to the one below.

For more context, the majority of RPCs had relatively large side-cars
and a single RPC connection often had a multitude of outgoing in-flight
transfers.  The socket buffer size was around 128MB at the OS level.

It seems the issue manifests itself when many concurrent allocations
and deallocations go through the tcmalloc's central free list, while
the latter is guarded by a lock.  There might be multiple reasons why
it happens: see [1].  Regardless of the underlying reasons, a reactor
thread is more efficient if performing as much I/O as possible at once
without the risk of waiting on a synchronization primitive and then
being de-scheduled off the CPU while there is still data ready to be
written into a socket whose buffer isn't full yet.

This patch is an attempt to reduce the described lock contention by
performing pending socket I/O for all the outgoing transfers first, and
deallocating the memory in-bulk after completing all the pending write
activity.  In addition, it straightens memory ownership rules for the
OutboundTransfer::callbacks_ field and modernizes signatures of related
methods to use std::unique_ptr instead of raw pointers.

[1] https://gperftools.github.io/gperftools/tcmalloc.html

  #0  sys_futex (... <tcmalloc::Static::pageheap_lock_>)
  #1  base::internal::SpinLockDelay (...)
  #2  base::internal::SpinLockDelay (...)
  #3  SpinLock::SlowLock() ()
  #4  tcmalloc::CentralFreeList::ReleaseToSpans(...) ()
  #5  tcmalloc::CentralFreeList::ReleaseListToSpans(...) ()
  #6  tcmalloc::CentralFreeList::InsertRange(...) ()
  #7  tcmalloc::ThreadCache::ReleaseToCentralCache(...) ()
  #8  tcmalloc::ThreadCache::ListTooLong(...) ()
  #9  google::protobuf::internal::ArenaImpl::~ArenaImpl() ()
  #10 google::protobuf::Arena::~Arena ()
  #11 kudu::rpc::InboundCall::~InboundCall (...)
  #12 std::default_delete<kudu::rpc::InboundCall>::operator()
  #13 std::unique_ptr<kudu::rpc::InboundCall, ...>::~unique_ptr (...)
  #14 kudu::rpc::ResponseTransferCallbacks::~ResponseTransferCallbacks
  #15 kudu::rpc::ResponseTransferCallbacks::~ResponseTransferCallbacks (...)
  #16 kudu::rpc::ResponseTransferCallbacks::NotifyTransferFinished (...)
  #17 kudu::rpc::OutboundTransfer::SendBuffer (...)
  #18 kudu::rpc::Connection::ProcessOutboundTransfers (...)
  #19 kudu::rpc::Connection::QueueOutbound (...)
  #20 kudu::rpc::QueueTransferTask::Run (...)
  #21 kudu::rpc::ReactorThread::AsyncHandler (...)
  #22 ev_invoke_pending ()
  #23 kudu::rpc::ReactorThread::InvokePendingCb (...)
  #24 ev_run ()
  #25 ev::loop_ref::run (...)
  #26 kudu::rpc::ReactorThread::RunThread (...)

Change-Id: Idf7ab105a851ef4d583efc2d1b33d57607810df0
---
M src/kudu/rpc/connection.cc
M src/kudu/rpc/transfer.cc
M src/kudu/rpc/transfer.h
3 files changed, 100 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/76/24176/4
--
To view, visit http://gerrit.cloudera.org:8080/24176
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idf7ab105a851ef4d583efc2d1b33d57607810df0
Gerrit-Change-Number: 24176
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>

Reply via email to