Re: Review Request 53965: track percentile values for timed sliding stats

David McLaughlin Tue, 22 Nov 2016 10:09:11 -0800

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53965/#review156595
-----------------------------------------------------------




Thanks for the patch! I think percentiles are crucial for analyzing scheduling 
performance and I'm pretty excited about having them for our latencies. 
However, I'm not sure if it makes sense to patch our local fork of 
twitter.commons. It might be better to either:

a) Upstream these changes to twitter.commons (where we might get better 
feedback on the approach here) where the tests for the code actually runs, and 
then copy them over into our fork 

OR

b) Move off twitter-commons stats and use something like util-stats 
(https://github.com/twitter/util/tree/develop/util-stats) instead. 


The key advantage of (b) is that we'd be using a library that is very 
lightweight but has both percentiles and can support multiple granularities 
(e.g. for per second metrics for graphview, but minutely metrics for long term 
storage) and has a migration strategy (I've done this migration internally at 
Twitter myself). 

I'm not sure what the implications of using (b) would be in terms of our 
dependency graph, though. AFAIK we forked twitter-commons originally because of 
that.

- David McLaughlin


On Nov. 22, 2016, 4:35 p.m., Reza Motamedi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53965/
> -----------------------------------------------------------
> 
> (Updated Nov. 22, 2016, 4:35 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, Mehrdad 
> Nurolahzade, and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> The timing of many methods are sampled with @Timed annotation. Currently 
> SlidingSatas is being used to track total-msec-in-method-x and 
> times-invoked-of-method-x and then compute the 
> mean-msec-in-method-x-per-invocation. However mean values are known not to 
> represent distrubtions with high skewness. 
> 
> This is request for review feedback for changes on computing stats, esp. 
> timing stats by adding percentile.
> 
> This patch introduces `SlidingTimedStats`, which uses `SlidingStats` to track 
> the mean mean-msec-in-method-x-per-invocation for events in the last K 
> sampling windows and `Percentiles` to track the p-values for msec-in-method-x 
> for all invocations in the last K windows.
> 
> __changes overview__:
> - Introduced `SlidingTimedStats` to measure percentiles on all @Timed method 
> calls
> - Comments added for a better way to calculate ratio based on two rates (num 
> and denom) in `SlidingStats`.
> - Tests added
> 
> _sample new stat:_
> attribute_store_fetch_one_10_0_percentile 0.0
> attribute_store_fetch_one_50_0_percentile 0.0
> attribute_store_fetch_one_90_0_percentile 0.0
> attribute_store_fetch_one_99_0_percentile 0.0
> attribute_store_fetch_one_99_9_percentile 0.0
> attribute_store_fetch_one_99_99_percentile 0.0
> 
> 
> Diffs
> -----
> 
>   commons/src/main/java/org/apache/aurora/common/inject/TimedInterceptor.java 
> a4c4240b67cafb108ca587ad86d9fb9d9bb1553d 
>   commons/src/main/java/org/apache/aurora/common/stats/SlidingStats.java 
> f7a5ae41e307627fc55157758e9b7cdd861c3268 
>   commons/src/main/java/org/apache/aurora/common/stats/SlidingTimedStats.java 
> PRE-CREATION 
>   
> commons/src/test/java/org/apache/aurora/common/stats/SlidingTimedStatsTest.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/53965/diff/
> 
> 
> Testing
> -------
> 
> SlidingTimedStatsTest
> > passed
> 
> Seems like gradle is not set to run tests on `common`. Tests passed in IDEA.
> 
> 
> Thanks,
> 
> Reza Motamedi
> 
>

Re: Review Request 53965: track percentile values for timed sliding stats

Reply via email to