[Impala-ASF-CR] IMPALA-4187: Switch RPC latency metrics to histograms

2016-09-26 Thread Henry Robinson (Code Review)
Hello Juan Yu, Matthew Jacobs, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4516

to look at the new patch set (#4).

Change subject: IMPALA-4187: Switch RPC latency metrics to histograms
..

IMPALA-4187: Switch RPC latency metrics to histograms

It's usually better to measure latency distributions with histograms,
not avg / min / max. This change switches out the metrics that measure
the latency of RPC processing time to histograms.

Add HistogramMetric::Reset() to allow histogram to remove all
entries. Added spinlock around histogram access.

On a 8-core i7 @ 3.4GHz, the following throughputs were observed:

1 thread -> 25M updates/sec
4 threads -> 7M updates/sec
16 threads -> 5M updates/sec

Each histogram takes ~108KB of storage for its buckets. This can be
reduced by reducing the maximum value, currently 60 minutes.

The new metrics have the following text representation:

Count: 148, 25th %-ile: 0, 50th %-ile: 0, 75th %-ile: 0, 90th %-ile: 0,
95th %-ile: 0, 99.9th %-ile: 1ms

Which changes from the old:

count: 345, last: 6ms, min: 0, max: 12ms, mean: 1ms, stddev: 1ms

Change-Id: I9ba6d4270dd5676eeeff35ad8d9dc5dcddd95e34
---
M be/src/rpc/rpc-trace.cc
M be/src/rpc/rpc-trace.h
M be/src/util/histogram-metric.h
M common/thrift/metrics.json
4 files changed, 64 insertions(+), 37 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/4516/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4516
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ba6d4270dd5676eeeff35ad8d9dc5dcddd95e34
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4187: Switch RPC latency metrics to histograms

2016-09-26 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4187: Switch RPC latency metrics to histograms
..


Patch Set 3: Code-Review+2

(1 comment)

Can you give an example of what the old and new metries look like?  Does CM 
need to be updated at all?

http://gerrit.cloudera.org:8080/#/c/4516/3/be/src/util/histogram-metric.h
File be/src/util/histogram-metric.h:

Line 107:   // Protects histogram_
how about: Protects histogram_ pointer.
to make it clear that it's only the pointer, not the implementation that's 
being protected.


-- 
To view, visit http://gerrit.cloudera.org:8080/4516
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9ba6d4270dd5676eeeff35ad8d9dc5dcddd95e34
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4187: Switch RPC latency metrics to histograms

2016-09-24 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4187: Switch RPC latency metrics to histograms
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4516/3/be/src/rpc/rpc-trace.cc
File be/src/rpc/rpc-trace.cc:

PS3, Line 179:   constexpr int32_t SIXTY_MINUTES_IN_MS = 60 * 1000 * 60;
> We'll see users with long running queries go over this (not uncommon) for h
I checked - any value > 60 minutes will be clipped to 60 minutes. That seems ok 
- beyond 60 minutes I'm not sure there's much use in knowing the distribution. 
60 minutes might be too much since most RPCs will never get anywhere close, but 
the total data structure cost is 100KB and there are only a handful of them for 
all the RPCs we support.


-- 
To view, visit http://gerrit.cloudera.org:8080/4516
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9ba6d4270dd5676eeeff35ad8d9dc5dcddd95e34
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4187: Switch RPC latency metrics to histograms

2016-09-23 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4187: Switch RPC latency metrics to histograms
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4516/3/be/src/rpc/rpc-trace.cc
File be/src/rpc/rpc-trace.cc:

PS3, Line 179:   constexpr int32_t SIXTY_MINUTES_IN_MS = 60 * 1000 * 60;
We'll see users with long running queries go over this (not uncommon) for 
hs2::ExecuteStatement/beeswax::query

I'm not sure it's worth doing anything about, but do you know how it will be 
handled?


-- 
To view, visit http://gerrit.cloudera.org:8080/4516
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9ba6d4270dd5676eeeff35ad8d9dc5dcddd95e34
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4187: Switch RPC latency metrics to histograms

2016-09-23 Thread Juan Yu (Code Review)
Juan Yu has posted comments on this change.

Change subject: IMPALA-4187: Switch RPC latency metrics to histograms
..


Patch Set 3: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4516
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9ba6d4270dd5676eeeff35ad8d9dc5dcddd95e34
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4187: Switch RPC latency metrics to histograms

2016-09-22 Thread Juan Yu (Code Review)
Juan Yu has posted comments on this change.

Change subject: IMPALA-4187: Switch RPC latency metrics to histograms
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4516/2/be/src/rpc/rpc-trace.cc
File be/src/rpc/rpc-trace.cc:

PS2, Line 181: SIXTY_MINUTES_IN_MS
If I understand correctly, this means the longest RPC call could be 60min. 
Isn't it too long? most of RPC should be in ms range or even shorter, 1 min is 
already abnormal, right?


http://gerrit.cloudera.org:8080/#/c/4516/2/be/src/util/histogram-metric.h
File be/src/util/histogram-metric.h:

PS2, Line 37: highest_trackable_value, num_significant_digits
would it be worth to store those in HistogramMetric?


-- 
To view, visit http://gerrit.cloudera.org:8080/4516
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9ba6d4270dd5676eeeff35ad8d9dc5dcddd95e34
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Juan Yu 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4187: Switch RPC latency metrics to histograms

2016-09-22 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4187: Switch RPC latency metrics to histograms
..


Patch Set 2:

The core change to HistogramMetric (adding a Reset() method) is needed for our 
proposed solution to https://gerrit.cloudera.org/#/c/4500/.

Since I had this patch all ready to go, might as well get the RPC metric 
changes reviewed at the same time.

-- 
To view, visit http://gerrit.cloudera.org:8080/4516
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9ba6d4270dd5676eeeff35ad8d9dc5dcddd95e34
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4187: Switch RPC latency metrics to histograms

2016-09-22 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4516

Change subject: IMPALA-4187: Switch RPC latency metrics to histograms
..

IMPALA-4187: Switch RPC latency metrics to histograms

It's usually better to measure latency distributions with histograms,
not avg / min / max. This change switches out the metrics that measure
the latency of RPC processing time to histograms.

Add HistogramMetric::Reset() to allow histogram to remove all
entries. Added spinlock around histogram access.

Change-Id: I9ba6d4270dd5676eeeff35ad8d9dc5dcddd95e34
---
M be/src/rpc/rpc-trace.cc
M be/src/rpc/rpc-trace.h
M be/src/util/histogram-metric.h
3 files changed, 58 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/4516/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4516
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9ba6d4270dd5676eeeff35ad8d9dc5dcddd95e34
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson