Thanks for the background John.
I am happy with `Windowed` and `Cumulative`, too.
-Matthias
On 7/17/19 11:22 AM, Ryanne Dolan wrote:
> John, makes sense to me! Thanks.
>
> Ryanne
>
> On Wed, Jul 17, 2019, 1:16 PM John Roesler wrote:
>
>> Agreed. I think the names are actually not ambiguous
Cool. I'll update the KIP, then, and we can re-start the vote. (looks
like you've already cast a vote :) )
-John
On Wed, Jul 17, 2019 at 1:23 PM Ryanne Dolan wrote:
>
> John, makes sense to me! Thanks.
>
> Ryanne
>
> On Wed, Jul 17, 2019, 1:16 PM John Roesler wrote:
>
> > Agreed. I think the
John, makes sense to me! Thanks.
Ryanne
On Wed, Jul 17, 2019, 1:16 PM John Roesler wrote:
> Agreed. I think the names are actually not ambiguous once you recall
> that the stats summarize measurements and each measurement is a
> floating point number, but there's enough overlap that I also was
Agreed. I think the names are actually not ambiguous once you recall
that the stats summarize measurements and each measurement is a
floating point number, but there's enough overlap that I also was
initially confused as well. I do plan to make this super clear in the
documentation.
Thanks,
-John
Sounds good to me
By the way, while I agree that we can't really do better than Sum and Count
I will say I also found the distinction a bit unclear at first glance. We
should at least document clearly that "Sum" is a "sum of values" whereas
"Count" is a "number of things" -- but that doesn't need
Thanks for the replies.
I guess that if we did add (e.g.) ExponentiallyWeightedWindowedX or
something, it should still be pretty obvious that WindowedX is the
unweighted version? In that case, I buy the argument that we don't
need "Simple" and we can just go with:
WindowedSum, WindowedCount
Thanks for the crash course in statistical terms :)
In light of this I definitely support Cumulative{Sum,Count}, but I'm really
not crazy about SimpleWindowed{Sum,Count} (vs just Windowed). Not so much
because of its unfortunate length (although that is unfortunate it
shouldn't be a deciding
I'm fine with the new names.
While I was previously in the "brevity" camp when it came to the Cumulative
name, I'll take clarity over brevity.
Thanks for the updates, John.
-Bill
On Wed, Jul 17, 2019 at 11:18 AM John Roesler wrote:
> Thanks for the discussion, all.
>
> I've done a little
Thanks for the discussion, all.
I've done a little more research into the statistical terminology.
Matthias is correct, "running" and "moving" appear to be synonyms.
Unfortunately, both can be computed either over a window of the last N
measurements or over all prior measurements. "Moving" just
It's a fair point that Ryanne raises. However, "running sum" is the same
as "moving sum" from my understanding.
The issue is still, that `Sum` and `Count` which seem to be the cleanest
names cannot be used. While I agree that `TotalSum` and `TotalCount` is
somewhat redundant, I still think it the
I'm +1 on Windowed, was about to suggest that as I was catching up on the
discussion but Bill beat me to it :)
On Tue, Jul 16, 2019 at 2:23 PM Bill Bejeck wrote:
> Hi John,
>
> Thanks for the updates.
>
> I like RunningCount and RunningSum.
>
> What about WindowedCount, WindowedSum instead of
Hi John,
Thanks for the updates.
I like RunningCount and RunningSum.
What about WindowedCount, WindowedSum instead of Moving?
I'm just throwing this out there as Windowed seems more intuitive to me,
but I'm not married to the idea.
-Bill
On Tue, Jul 16, 2019 at 5:09 PM John Roesler wrote:
>
No worries! Choosing good public API names is a high-impact design activity.
Matthias, Bruno, Bill, and Stanislav,
You've all contributed to this discussion or the vote so far... How do
you feel about the proposed name change:
MovingCount, MovingSum (instead of Sampled)
RunningCount,
John, that makes sense to me. Sorry for the bikeshedding.
Ryanne
On Tue, Jul 16, 2019 at 12:49 PM John Roesler wrote:
> Thanks for the explanation and the suggestion, Ryanne,
>
> I went with "sampled" just because these are instances of SampledStat,
> which in the Kafka Metrics ecosystem are
Thanks for the explanation and the suggestion, Ryanne,
I went with "sampled" just because these are instances of SampledStat,
which in the Kafka Metrics ecosystem are computed from a window of
recent samples. Thinking more about it, the fact that they are sampled
and the fact that they are
> measurements, which decay/expire over time
Thanks John for the clarification. This was my (re-)reading of the code,
but this is not what I think of when I hear "sampled". In fact, you'll
notice that the Wikipedia pages for "Sample (statistics)" and "Sample
(signal processing)" do not contain
Thanks for raising this concern, Ryanne,
"Sampled" indicates that the metrics is sampled, namely that we
maintain a set of samples from recent value measurements, which
decay/expire over time. So, the metric value is only representative of
the recent past.
"Total" indicates that the metric value
John, I mentioned on the VOTE thread that the proposed names are a bit
confusing,
> given that "sum", "total", and "count" are roughly synonymous...
In particular, TotalSum is, I think, a "running total", though the naming
doesn't really capture that.
I think to avoid confusion, we should
Hey Stanislav,
Thanks for the feedback! I do think we should try to keep the scope of
this KIP small. The idea you proposed is orthogonal, though, so if you
wanted to write up a quick KIP concurrently, I'd be happy to review
it.
Regarding deprecation: yes, the metrics are in the public
Hello,
Thanks for the KIP, I'm happy we're converging on implementations.
I was wondering whether we need to deprecate the old metrics - are those
classes public interfaces? I see we don't build a JavaDoc for them (
https://kafka.apache.org/22/javadoc/allclasses-noframe.html) and, as far as
I
Hey, thanks Matthias and Bruno,
I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to me.
Also, yes, I would have liked to not have any modifier for
"non-sampled", but there is a name conflict with Sum.
I'll update the KIP to reflect "TotalX" and then start the vote thread.
Thanks
OK, makes sense. Then, I am in favour of TotalCount and TotalSum.
Best,
Bruno
On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax wrote:
>
> `Sum` is an existing name, for the "sampled sum" metric, that gets
> deprecated. Hence, we cannot use it.
>
> If we cannot use `Sum` and use `TotalSum`, we
`Sum` is an existing name, for the "sampled sum" metric, that gets
deprecated. Hence, we cannot use it.
If we cannot use `Sum` and use `TotalSum`, we should also not use
`Count` but `TotalCount` for consistency.
-Matthias
On 7/11/19 12:58 PM, Bruno Cadonna wrote:
> Hi John,
>
> Thank you
Hi John,
Thank you for the KIP.
LGTM
I also do not like CumulativeSum/Count so much. I propose to just call
it Sum and Count.
I understand that you want to unequivocally distinguish the two metric
functions by their names, but I have the feeling the names become
artificially complex. The exact
Thanks for the KIP. Overall LGTM.
The only though I have is, if we may want to use `TotalSum` and
`TotalCount` instead of `CumulativeSum/Count` as names?
-Matthias
On 7/11/19 9:31 AM, John Roesler wrote:
> Hi Kafka devs,
>
> I'd like to propose KIP-488 as a minor cleanup of some of our
Hi Kafka devs,
I'd like to propose KIP-488 as a minor cleanup of some of our metric
implementations.
KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw
Over time, iterative updates to these metrics has resulted in a pretty
confusing little collection of classes, and I've personally been
26 matches
Mail list logo