Hello Alexey Serbin, Dan Burkert, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9753

to look at the new patch set (#3).

Change subject: KUDU-1977. Avoid extra reference counting for fast-path MetaCache lookups
......................................................................

KUDU-1977. Avoid extra reference counting for fast-path MetaCache lookups

This does a bit of refactoring of MetaCache so that fast-path lookups
can occur without allocating a LookupRpc object. This avoids an
allocation/deallocation pair as well as reference counting overhead on
various objects. The reference counting was seen as a bottleneck in a
workload like:

$ kudu perf loadgen -num-threads 16 \
  -num-rows-per-thread $[10*1000*1000] -table-num-replicas 3 \
  -table-num-buckets=100 <master>

I ran before/after on the above command, writing into a large (100+-node)
cluster that I had handy. A few key stats:

Before:

  time total  : 90676.3 ms
    1458961.567442      task-clock (msec)         #   15.954 CPUs utilized

  Top lines of profile:

    27.21%  kudu             kudu                 [.] kudu::client::internal::MetaCache::LookupTabletByKeyFastPath
    14.95%  kudu             kudu                 [.] kudu::subtle::RefCountedThreadSafeBase::AddRef
    12.52%  kudu             kudu                 [.] kudu::subtle::RefCountedThreadSafeBase::Release
     4.75%  kudu             kudu                 [.] kudu::client::internal::MetaCache::LookupTabletByKey
     3.04%  kudu             kudu                 [.] operator new[]
     1.78%  rpc reactor-387  kudu                 [.] std::unordered_set<InFlightOp*>::erase

After:

  time total  : 60796.6 ms
     693975.355549      task-clock (msec)         #   11.154 CPUs utilized

  Top lines of profile:

    14.05%  kudu             kudu                   [.] kudu::client::internal::MetaCache::LookupTabletByKeyFastPath
     7.89%  kudu             kudu                   [.] operator new[]
     4.96%  rpc reactor-397  kudu                   [.] std::unordered_set<InFlightOp*>::erase
     4.47%  kudu             libc-2.17.so           [.] __memcpy_ssse3_back
     3.42%  rpc reactor-397  kudu                   [.] kudu::client::KuduInsert::~KuduInsert

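In other words, this reduced CPU time by more than 2x (task-clock fell from
about 1458961 ms to about 693975 ms) and improved throughput by about 50%
(total time fell from 90676.3 ms to 60796.6 ms for the same number of rows).
Results on the above were relatively stable across multiple runs, with
wall-clock time showing more variance than CPU time.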

The new bottleneck seems to be the rw_spinlock in MetaCache.
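
For readers who have not looked at the patch, the fast-path idea described
above can be sketched roughly as follows. This is an illustration only, not
the code in meta_cache.cc: SketchMetaCache, TabletEntry, LookupRequest and
Resolver are hypothetical names, std::shared_mutex (C++17) stands in for
rw_spinlock, and std::shared_ptr stands in for Kudu's own reference-counted
types. The point it tries to show is that a cache hit only takes the read
lock and bumps the count on the returned entry, while the heap-allocated,
reference-counted request object is created only on a miss.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a cached tablet location.
struct TabletEntry {
  std::string start_key;  // inclusive lower bound of the key range
  std::string end_key;    // exclusive upper bound; empty means unbounded
  std::string tablet_id;
};

// Hypothetical stand-in for the per-lookup RPC object used on the slow path.
struct LookupRequest {
  explicit LookupRequest(std::string k) : key(std::move(k)) {}
  std::string key;
};

class SketchMetaCache {
 public:
  // Assumption: the "master" is modeled as a plain callback.
  using Resolver = std::function<TabletEntry(const LookupRequest&)>;

  explicit SketchMetaCache(Resolver resolver) : resolver_(std::move(resolver)) {
    assert(resolver_);  // a non-empty callback is required
  }

  // Fast path: a read-locked map probe. No request object is allocated;
  // the only reference count touched is the one on the returned entry.
  std::shared_ptr<const TabletEntry> LookupFastPath(const std::string& key) const {
    std::shared_lock<std::shared_mutex> l(lock_);
    auto it = by_start_key_.upper_bound(key);
    if (it == by_start_key_.begin()) return nullptr;  // no range starts at or before key
    --it;
    const auto& entry = it->second;
    if (!entry->end_key.empty() && key >= entry->end_key) return nullptr;
    return entry;
  }

  // Full lookup: try the fast path first, and allocate the reference-counted
  // request object only when the cache misses.
  std::shared_ptr<const TabletEntry> Lookup(const std::string& key) {
    if (auto hit = LookupFastPath(key)) return hit;
    auto req = std::make_shared<LookupRequest>(key);
    std::shared_ptr<const TabletEntry> entry =
        std::make_shared<TabletEntry>(resolver_(*req));
    std::unique_lock<std::shared_mutex> l(lock_);
    ++slow_path_requests_;
    by_start_key_[entry->start_key] = entry;
    return entry;
  }

  // Number of lookups that had to take the slow path.
  int slow_path_requests() const {
    std::shared_lock<std::shared_mutex> l(lock_);
    return slow_path_requests_;
  }

 private:
  mutable std::shared_mutex lock_;
  std::map<std::string, std::shared_ptr<const TabletEntry>> by_start_key_;
  Resolver resolver_;
  int slow_path_requests_ = 0;
};
```

In this sketch every hit still takes the shared lock, which is consistent
with the read-write lock becoming the next hot spot once the allocation and
the extra reference counting are gone.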

Change-Id: I5f17ffa88289c766b5b168b22da9781bf78f5592
---
M src/kudu/client/batcher.cc
M src/kudu/client/client-test.cc
M src/kudu/client/meta_cache.cc
M src/kudu/client/meta_cache.h
M src/kudu/client/partitioner-internal.cc
M src/kudu/client/scan_token-internal.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/client/schema.h
8 files changed, 140 insertions(+), 98 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/9753/3
--
To view, visit http://gerrit.cloudera.org:8080/9753
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f17ffa88289c766b5b168b22da9781bf78f5592
Gerrit-Change-Number: 9753
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
