> On Nov. 22, 2016, 6:07 p.m., David McLaughlin wrote: > > Thanks for the patch! I think percentiles are crucial for analyzing > > scheduling performance and I'm pretty excited about having them for our > > latencies. However, I'm not sure if it makes sense to patch our local fork > > of twitter.commons. It might be better to either: > > > > a) Upstream these changes to twitter.commons (where we might get better > > feedback on the approach here) where the tests for the code actually runs, > > and then copy them over into our fork > > > > OR > > > > b) Move off twitter-commons stats and use something like util-stats > > (https://github.com/twitter/util/tree/develop/util-stats) instead. > > > > > > The key advantage of (b) is that we'd be using a library that is very > > lightweight but has both percentiles and can support multiple granularities > > (e.g. for per second metrics for graphview, but minutely metrics for long > > term storage) and has a migration strategy (I've done this migration > > internally at Twitter myself). > > > > I'm not sure what the implications of using (b) would be in terms of our > > dependency graph, though. AFAIK we forked twitter-commons originally > > because of that. > > Joshua Cohen wrote: > I know Zameer had some intentions of modernizing our stats suite and had > been doing some early work that was laying the groundwork for replacing > Twitter Commons[1][2]. I'm definitely in favor of reducing our usage of > Twitter Commons, replacing it with something that's actively maintained > wherever possible (e.g. the recent Curator work to replace our ZK singleton > service). So given that, I definitely don't want to contribute this patch > back upstream to Twitter Commons; I think that ship sailed when we forked it. > I'm ok with a stop-gap approach where we update Percentile in our fork with a > long term goal of moving to some other stats library, whatever that may be. > > (And for what it's worth, the tests for Twitter Commons definitely run in > our fork, I verified this locally by adding a `fail()` to `PercentileTest`, > running `gradlew build` then failed as expected). > > > [1] > https://github.com/apache/aurora/commit/059b08621ae4892d98954a6cd0e88f274cdd1bef > [2] > https://github.com/apache/aurora/commit/9b34a4036ab84f560ca80dfce9bdcd831efdeb30 > > Zameer Manji wrote: > I was hoping for us to migrate to http://metrics.dropwizard.io/3.1.0/ > which is a very widely used and popular metrics library. On par on how > Curator is the ZK library for Java. > > I believe it is possible for us to hide metrics behind the > `StatsProvider` interface.
I'd rather discuss libraries in terms of feature sets than by how popular they are. If metrics has the same features that twitter/util-stats does, then I don't care which one we use. - David ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/53965/#review156595 ----------------------------------------------------------- On Nov. 22, 2016, 4:35 p.m., Reza Motamedi wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/53965/ > ----------------------------------------------------------- > > (Updated Nov. 22, 2016, 4:35 p.m.) > > > Review request for Aurora, David McLaughlin, Joshua Cohen, Mehrdad > Nurolahzade, and Santhosh Kumar Shanmugham. > > > Repository: aurora > > > Description > ------- > > The timing of many methods are sampled with @Timed annotation. Currently > SlidingSatas is being used to track total-msec-in-method-x and > times-invoked-of-method-x and then compute the > mean-msec-in-method-x-per-invocation. However mean values are known not to > represent distrubtions with high skewness. > > This is request for review feedback for changes on computing stats, esp. > timing stats by adding percentile. > > This patch introduces `SlidingTimedStats`, which uses `SlidingStats` to track > the mean mean-msec-in-method-x-per-invocation for events in the last K > sampling windows and `Percentiles` to track the p-values for msec-in-method-x > for all invocations in the last K windows. > > __changes overview__: > - Introduced `SlidingTimedStats` to measure percentiles on all @Timed method > calls > - Comments added for a better way to calculate ratio based on two rates (num > and denom) in `SlidingStats`. > - Tests added > > _sample new stat:_ > attribute_store_fetch_one_10_0_percentile 0.0 > attribute_store_fetch_one_50_0_percentile 0.0 > attribute_store_fetch_one_90_0_percentile 0.0 > attribute_store_fetch_one_99_0_percentile 0.0 > attribute_store_fetch_one_99_9_percentile 0.0 > attribute_store_fetch_one_99_99_percentile 0.0 > > > Diffs > ----- > > commons/src/main/java/org/apache/aurora/common/inject/TimedInterceptor.java > a4c4240b67cafb108ca587ad86d9fb9d9bb1553d > commons/src/main/java/org/apache/aurora/common/stats/SlidingStats.java > f7a5ae41e307627fc55157758e9b7cdd861c3268 > commons/src/main/java/org/apache/aurora/common/stats/SlidingTimedStats.java > PRE-CREATION > > commons/src/test/java/org/apache/aurora/common/stats/SlidingTimedStatsTest.java > PRE-CREATION > > Diff: https://reviews.apache.org/r/53965/diff/ > > > Testing > ------- > > SlidingTimedStatsTest > > passed > > Seems like gradle is not set to run tests on `common`. Tests passed in IDEA. > > > Thanks, > > Reza Motamedi > >
