Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-18 Thread Matthias J. Sax
Thanks for the background John. I am happy with `Windowed` and `Cumulative`, too. -Matthias On 7/17/19 11:22 AM, Ryanne Dolan wrote: > John, makes sense to me! Thanks. > > Ryanne > > On Wed, Jul 17, 2019, 1:16 PM John Roesler wrote: > >> Agreed. I think the names are actually not ambiguous

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-17 Thread John Roesler
Cool. I'll update the KIP, then, and we can re-start the vote. (looks like you've already cast a vote :) ) -John On Wed, Jul 17, 2019 at 1:23 PM Ryanne Dolan wrote: > > John, makes sense to me! Thanks. > > Ryanne > > On Wed, Jul 17, 2019, 1:16 PM John Roesler wrote: > > > Agreed. I think the

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-17 Thread Ryanne Dolan
John, makes sense to me! Thanks. Ryanne On Wed, Jul 17, 2019, 1:16 PM John Roesler wrote: > Agreed. I think the names are actually not ambiguous once you recall > that the stats summarize measurements and each measurement is a > floating point number, but there's enough overlap that I also was

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-17 Thread John Roesler
Agreed. I think the names are actually not ambiguous once you recall that the stats summarize measurements and each measurement is a floating point number, but there's enough overlap that I also was initially confused as well. I do plan to make this super clear in the documentation. Thanks, -John

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-17 Thread Sophie Blee-Goldman
Sounds good to me By the way, while I agree that we can't really do better than Sum and Count I will say I also found the distinction a bit unclear at first glance. We should at least document clearly that "Sum" is a "sum of values" whereas "Count" is a "number of things" -- but that doesn't need

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-17 Thread John Roesler
Thanks for the replies. I guess that if we did add (e.g.) ExponentiallyWeightedWindowedX or something, it should still be pretty obvious that WindowedX is the unweighted version? In that case, I buy the argument that we don't need "Simple" and we can just go with: WindowedSum, WindowedCount

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-17 Thread Sophie Blee-Goldman
Thanks for the crash course in statistical terms :) In light of this I definitely support Cumulative{Sum,Count}, but I'm really not crazy about SimpleWindowed{Sum,Count} (vs just Windowed). Not so much because of its unfortunate length (although that is unfortunate it shouldn't be a deciding

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-17 Thread Bill Bejeck
I'm fine with the new names. While I was previously in the "brevity" camp when it came to the Cumulative name, I'll take clarity over brevity. Thanks for the updates, John. -Bill On Wed, Jul 17, 2019 at 11:18 AM John Roesler wrote: > Thanks for the discussion, all. > > I've done a little

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-17 Thread John Roesler
Thanks for the discussion, all. I've done a little more research into the statistical terminology. Matthias is correct, "running" and "moving" appear to be synonyms. Unfortunately, both can be computed either over a window of the last N measurements or over all prior measurements. "Moving" just

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-16 Thread Matthias J. Sax
It's a fair point that Ryanne raises. However, "running sum" is the same as "moving sum" from my understanding. The issue is still, that `Sum` and `Count` which seem to be the cleanest names cannot be used. While I agree that `TotalSum` and `TotalCount` is somewhat redundant, I still think it the

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-16 Thread Sophie Blee-Goldman
I'm +1 on Windowed, was about to suggest that as I was catching up on the discussion but Bill beat me to it :) On Tue, Jul 16, 2019 at 2:23 PM Bill Bejeck wrote: > Hi John, > > Thanks for the updates. > > I like RunningCount and RunningSum. > > What about WindowedCount, WindowedSum instead of

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-16 Thread Bill Bejeck
Hi John, Thanks for the updates. I like RunningCount and RunningSum. What about WindowedCount, WindowedSum instead of Moving? I'm just throwing this out there as Windowed seems more intuitive to me, but I'm not married to the idea. -Bill On Tue, Jul 16, 2019 at 5:09 PM John Roesler wrote: >

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-16 Thread John Roesler
No worries! Choosing good public API names is a high-impact design activity. Matthias, Bruno, Bill, and Stanislav, You've all contributed to this discussion or the vote so far... How do you feel about the proposed name change: MovingCount, MovingSum (instead of Sampled) RunningCount,

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-16 Thread Ryanne Dolan
John, that makes sense to me. Sorry for the bikeshedding. Ryanne On Tue, Jul 16, 2019 at 12:49 PM John Roesler wrote: > Thanks for the explanation and the suggestion, Ryanne, > > I went with "sampled" just because these are instances of SampledStat, > which in the Kafka Metrics ecosystem are

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-16 Thread John Roesler
Thanks for the explanation and the suggestion, Ryanne, I went with "sampled" just because these are instances of SampledStat, which in the Kafka Metrics ecosystem are computed from a window of recent samples. Thinking more about it, the fact that they are sampled and the fact that they are

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-16 Thread Ryanne Dolan
> measurements, which decay/expire over time Thanks John for the clarification. This was my (re-)reading of the code, but this is not what I think of when I hear "sampled". In fact, you'll notice that the Wikipedia pages for "Sample (statistics)" and "Sample (signal processing)" do not contain

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-16 Thread John Roesler
Thanks for raising this concern, Ryanne, "Sampled" indicates that the metrics is sampled, namely that we maintain a set of samples from recent value measurements, which decay/expire over time. So, the metric value is only representative of the recent past. "Total" indicates that the metric value

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-16 Thread Ryanne Dolan
John, I mentioned on the VOTE thread that the proposed names are a bit confusing, > given that "sum", "total", and "count" are roughly synonymous... In particular, TotalSum is, I think, a "running total", though the naming doesn't really capture that. I think to avoid confusion, we should

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-15 Thread John Roesler
Hey Stanislav, Thanks for the feedback! I do think we should try to keep the scope of this KIP small. The idea you proposed is orthogonal, though, so if you wanted to write up a quick KIP concurrently, I'd be happy to review it. Regarding deprecation: yes, the metrics are in the public

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-13 Thread Stanislav Kozlovski
Hello, Thanks for the KIP, I'm happy we're converging on implementations. I was wondering whether we need to deprecate the old metrics - are those classes public interfaces? I see we don't build a JavaDoc for them ( https://kafka.apache.org/22/javadoc/allclasses-noframe.html) and, as far as I

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-12 Thread John Roesler
Hey, thanks Matthias and Bruno, I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to me. Also, yes, I would have liked to not have any modifier for "non-sampled", but there is a name conflict with Sum. I'll update the KIP to reflect "TotalX" and then start the vote thread. Thanks

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-12 Thread Bruno Cadonna
OK, makes sense. Then, I am in favour of TotalCount and TotalSum. Best, Bruno On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax wrote: > > `Sum` is an existing name, for the "sampled sum" metric, that gets > deprecated. Hence, we cannot use it. > > If we cannot use `Sum` and use `TotalSum`, we

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-11 Thread Matthias J. Sax
`Sum` is an existing name, for the "sampled sum" metric, that gets deprecated. Hence, we cannot use it. If we cannot use `Sum` and use `TotalSum`, we should also not use `Count` but `TotalCount` for consistency. -Matthias On 7/11/19 12:58 PM, Bruno Cadonna wrote: > Hi John, > > Thank you

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-11 Thread Bruno Cadonna
Hi John, Thank you for the KIP. LGTM I also do not like CumulativeSum/Count so much. I propose to just call it Sum and Count. I understand that you want to unequivocally distinguish the two metric functions by their names, but I have the feeling the names become artificially complex. The exact

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-11 Thread Matthias J. Sax
Thanks for the KIP. Overall LGTM. The only though I have is, if we may want to use `TotalSum` and `TotalCount` instead of `CumulativeSum/Count` as names? -Matthias On 7/11/19 9:31 AM, John Roesler wrote: > Hi Kafka devs, > > I'd like to propose KIP-488 as a minor cleanup of some of our

[DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-11 Thread John Roesler
Hi Kafka devs, I'd like to propose KIP-488 as a minor cleanup of some of our metric implementations. KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw Over time, iterative updates to these metrics has resulted in a pretty confusing little collection of classes, and I've personally been