Alexey Serbin has uploaded a new patch set (#8) to the change originally 
created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/5905 )

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
......................................................................

KUDU-1865 (part 1): reduce some cross-thread allocations

Per the analysis in the JIRA, each RPC caused two "cross-thread"
allocations of ReactorTasks (one in the client, one in the server).
These cross-thread allocations harm tcmalloc caching.

ReactorTasks don't actually need to be heap-allocated -- that was only
an easy mechanism to use a normal-looking "interface" paradigm. Instead,
if we use a struct with some std::functions in it, and std::move() it
to/from the pending tasks container, we avoid the heap allocation. More
importantly, we avoid the worst kind of heap allocation: one that is
allocated on one thread and freed on another.
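
As a rough illustration of the idea (not the actual code in reactor.h),
the sketch below models a task as a plain struct of std::functions that
is moved by value into a pending-task container. The names Task,
MiniReactor, Schedule, RunPending and RunExample are hypothetical and
exist only for this example. Note that std::function may still allocate
when a callback captures a lot of state, and the container's own buffer
is still heap storage; the sketch reuses both buffers so they are not
freed by the reactor thread.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a reactor task: a plain value type holding
// callbacks instead of a heap-allocated object with virtual methods.
struct Task {
  std::function<void()> run;    // invoked on the reactor thread
  std::function<void()> abort;  // would be invoked on reactor shutdown
};

// Hypothetical, heavily simplified reactor: only the pending-task
// queue is modeled here.
class MiniReactor {
 public:
  // May be called from any thread. The task is moved into the queue
  // by value; no per-task 'new' is needed.
  void Schedule(Task task) {
    std::lock_guard<std::mutex> l(lock_);
    pending_.push_back(std::move(task));
  }

  // Called on the reactor thread. Swaps the queue out under the lock,
  // then runs the tasks without holding it. Returns the number run.
  int RunPending() {
    {
      std::lock_guard<std::mutex> l(lock_);
      running_.swap(pending_);
    }
    for (Task& t : running_) {
      t.run();
    }
    const int num_run = static_cast<int>(running_.size());
    running_.clear();  // keeps capacity, so the buffer is reused
    return num_run;
  }

 private:
  std::mutex lock_;
  std::vector<Task> pending_;  // guarded by lock_
  std::vector<Task> running_;  // touched only by the reactor thread
};

// Small usage example: schedule three tasks, then drain them.
int RunExample() {
  MiniReactor reactor;
  int counter = 0;
  for (int i = 0; i < 3; ++i) {
    reactor.Schedule(Task{[&counter] { ++counter; }, [] {}});
  }
  const int num_run = reactor.RunPending();
  assert(num_run == 3);
  return counter;
}
```

Small captures like the one above typically fit in std::function's
inline storage on common standard library implementations, which is
what makes the by-value approach avoid a per-task allocation; that
behavior is implementation-specific rather than guaranteed.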

Below are results from various RPC benchmark tests. The improvement
is minuscule (less than 1%) for synthetic test scenarios simulating
real-world concurrent workloads.

========================================================================
Using boost::function in ReactorTask
========================================================================

CentOS 7.5, RELEASE configuration built with gcc (GCC) 4.8.5

Without patch:
  GetTableLocations RPC:  29902.1 req/sec
  GetTableSchema RPC   : 207100.0 req/sec

With patch:
  GetTableLocations RPC:  30202.9 req/sec
  GetTableSchema RPC   : 207674.0 req/sec

------------------------------------------------------------------------

Results from rpc-bench, 60 seconds runtime
  Mode:            Sync
  Client threads:   16
  Worker threads:   1
  Server reactors:  4
  Encryption:       0

Without patch:
  Reqs/sec:         155254
  User CPU per req: 31.7217us
  Sys CPU per req:  51.3626us
  Ctx Sw. per req:  3.46345

With patch:
  Reqs/sec:         155015
  User CPU per req: 32.9851us
  Sys CPU per req:  51.1958us
  Ctx Sw. per req:  3.48085

========================================================================
Using std::function in ReactorTask
========================================================================

GetTableLocations RPC (RELEASE build with third-party CLANG, 60 seconds):
  Without patch: 15528.3 req/sec
  With    patch: 15535.4 req/sec

------------------------------------------------------------------------

GetTableSchema RPC (RELEASE build with third-party CLANG, 60 seconds):
  Without patch: 103892 req/sec
  With    patch: 102711 req/sec

------------------------------------------------------------------------

Results from rpc-bench (RELEASE build with third-party CLANG, 60 seconds):
  Mode:            Sync
  Client threads:   16
  Worker threads:   1
  Server reactors:  4
  Encryption:       0

Without patch:
  Reqs/sec:         68191.4
  User CPU per req: 34.8228us
  Sys CPU per req:  172.426us
  Ctx Sw. per req:  3.58111
  Server reactor load histogram
  Count: 11980
  Mean: 36.4902
  Percentiles:
     0%  (min) = 16
    25%        = 26
    50%  (med) = 28
    75%        = 31
    95%        = 78
    99%        = 84
    99.9%      = 88
    99.99%     = 90
    100% (max) = 91

  Server reactor latency histogram
  Count: 13517690
  Mean: 29.8539
  Percentiles:
     0%  (min) = 0
    25%        = 18
    50%  (med) = 28
    75%        = 36
    95%        = 58
    99%        = 86
    99.9%      = 135
    99.99%     = 191
    100% (max) = 28239

------------------------------------------------------------------------

With patch:
  Reqs/sec:         67295.2
  User CPU per req: 37.7613us
  Sys CPU per req:  171.141us
  Ctx Sw. per req:  3.59101
  Server reactor load histogram
  Count: 11979
  Mean: 36.1685
  Percentiles:
     0%  (min) = 16
    25%        = 26
    50%  (med) = 28
    75%        = 32
    95%        = 80
    99%        = 87
    99.9%      = 91
    99.99%     = 93
    100% (max) = 93

  Server reactor latency histogram
  Count: 13141249
  Mean: 30.3945
  Percentiles:
     0%  (min) = 0
    25%        = 18
    50%  (med) = 29
    75%        = 37
    95%        = 58
    99%        = 86
    99.9%      = 141
    99.99%     = 204
    100% (max) = 26220

Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
---
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/messenger.cc
M src/kudu/rpc/messenger.h
M src/kudu/rpc/negotiation.cc
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/reactor.h
7 files changed, 184 insertions(+), 283 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/05/5905/8
--
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-Change-Number: 5905
Gerrit-PatchSet: 8
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon <[email protected]>
