Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/2794

to look at the new patch set (#3).

Change subject: KUDU-1410 (part 1). Add per-Trace counters
......................................................................

KUDU-1410 (part 1). Add per-Trace counters

This adds a map of counters to each Trace object and provides a nice macro to
increment a given counter on the current trace.

The goal here is to make it easy to collect per-request metrics without plumbing
structs through the call stack or new sets of thread-locals. This mechanism
also makes it easy to hook up generic pieces of code such as threadpools and
locking primitives where plumbing a metrics structure through the call stack
is quite impractical.

To keep things fast, the map is keyed by const char* pointers which use pointer
comparison rather than string comparison - this avoids having to do any hashing
on the increment path. We could probably improve this further (eg using a
different more-optimized map type) if we start seeing it on a profile.

One wrinkle about using 'const char*' pointers is that it means the keys have
to live forever. The class provides a 'string interning' implementation which
allows strings generated at runtime to be used.  This is used in particular for
threadpool-specific counters, which we can bound as a relatively small set.

I also sprinkled various counters around the code that I thought might be
useful, including:

- counters for spinlock and mutex wait times
- counters for cpu-time, wall-time, and queue-time for each threadpool
- counters for disk reads, so it's easy to see if a request was
  blocked on seeky disk accesses
- counters for various pieces of the tablet code which have been
  responsible for slow RPCs in the past

The RPC layer is augmented slightly to log an INFO level message for
any request which takes longer than a second. This should be conservative
enough that we don't get a lot of spew, but still interesting to look
at when we have slowness.

A typical trace looks something like:

I0411 23:47:48.600158 23021 inbound_call.cc:230] Call
  kudu.tserver.TabletServerService.Write from 127.0.0.1:41490 (request call id 
2029711) took 1125ms.
  Request Metrics:
{ "child_traces" : [ { "apply.queue_time_us" : 390,
        "cfile_cache_hit" : 5,
        "cfile_cache_hit_bytes" : 20510,
        "num_ops" : 1,
        "prepare.queue_time_us" : 8,
        "prepare.run_cpu_time_us" : 20,
        "prepare.run_wall_time_us" : 29
      } ] }

(pretty-printed using jsonlint -f -- the actual log is compact)

Change-Id: I86001778976f01b62342f63f98b62962bc74212f
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus_state.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/rpc/inbound_call.cc
M src/kudu/rpc/negotiation.cc
M src/kudu/rpc/rpc_context.cc
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet_peer.cc
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/util/CMakeLists.txt
M src/kudu/util/mutex.cc
M src/kudu/util/spinlock_profiling-test.cc
M src/kudu/util/spinlock_profiling.cc
M src/kudu/util/stopwatch.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
M src/kudu/util/trace-test.cc
M src/kudu/util/trace.cc
M src/kudu/util/trace.h
A src/kudu/util/trace_metrics.cc
A src/kudu/util/trace_metrics.h
26 files changed, 373 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/2794/3
-- 
To view, visit http://gerrit.cloudera.org:8080/2794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I86001778976f01b62342f63f98b62962bc74212f
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>

Reply via email to