Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

Ryanne Dolan Tue, 16 Jul 2019 10:32:14 -0700

> measurements, which decay/expire over time

Thanks John for the clarification. This was my (re-)reading of the code,
but this is not what I think of when I hear "sampled". In fact, you'll
notice that the Wikipedia pages for "Sample (statistics)" and "Sample
(signal processing)" do not contain the words decay, expire, recent,
history, or anything similar.


Similar to "running", I'd suggest the more correct "moving", as in "moving
average" and "moving sum", which involve looking back N samples, applying a
sliding window, decaying over time etc.

Ryanne

On Tue, Jul 16, 2019, 11:58 AM John Roesler <[email protected]> wrote:

> Thanks for raising this concern, Ryanne,
>
> "Sampled" indicates that the metrics is sampled, namely that we
> maintain a set of samples from recent value measurements, which
> decay/expire over time. So, the metric value is only representative of
> the recent past.
>
> "Total" indicates that the metric value contains all the information
> from the creation of the metric. For example., the total sum would
> include all measurements since the app started up.
>
> It seems like your concern is that the word "total" doesn't really
> pinpoint this meaning, which is true. It's especially confusing that
> another meaning of "total" is synonymous with "sum", rendering the
> name "TotalSum" sort of absurd.
>
> We previously considered "cumulative", which was rejected as a
> mouthful (it's four syllables) .
>
> You mentioned "running", which might be a more appropriate modifier
> (RunningSum and RunningCount).
>
> What would everyone think about that?
>
> Thanks,
> -John
>
> On Tue, Jul 16, 2019 at 11:27 AM Ryanne Dolan <[email protected]>
> wrote:
> >
> > John, I mentioned on the VOTE thread that the proposed names are a bit
> > confusing,
> >
> > > given that "sum", "total", and "count" are roughly synonymous...
> >
> > In particular, TotalSum is, I think, a "running total", though the naming
> > doesn't really capture that.
> >
> > I think to avoid confusion, we should define exactly what "total" and
> > "sampled" are supposed to indicate, and perhaps pick appropriate naming
> > from there.
> >
> > Ryanne
> >
> >
> > On Fri, Jul 12, 2019 at 1:42 PM John Roesler <[email protected]> wrote:
> >
> > > Hey, thanks Matthias and Bruno,
> > >
> > > I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to me.
> > >
> > > Also, yes, I would have liked to not have any modifier for
> > > "non-sampled", but there is a name conflict with Sum.
> > >
> > > I'll update the KIP to reflect "TotalX" and then start the vote thread.
> > >
> > > Thanks again,
> > > -John
> > >
> > > On Fri, Jul 12, 2019 at 11:27 AM Bruno Cadonna <[email protected]>
> wrote:
> > > >
> > > > OK, makes sense. Then, I am in favour of TotalCount and TotalSum.
> > > >
> > > > Best,
> > > > Bruno
> > > >
> > > > On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax <
> [email protected]>
> > > wrote:
> > > > >
> > > > > `Sum` is an existing name, for the "sampled sum" metric, that gets
> > > > > deprecated. Hence, we cannot use it.
> > > > >
> > > > > If we cannot use `Sum` and use `TotalSum`, we should also not use
> > > > > `Count` but `TotalCount` for consistency.
> > > > >
> > > > >
> > > > > -Matthias
> > > > >
> > > > >
> > > > >
> > > > > On 7/11/19 12:58 PM, Bruno Cadonna wrote:
> > > > > > Hi John,
> > > > > >
> > > > > > Thank you for the KIP.
> > > > > >
> > > > > > LGTM
> > > > > >
> > > > > > I also do not like CumulativeSum/Count so much. I propose to just
> > > call
> > > > > > it Sum and Count.
> > > > > >
> > > > > > I understand that you want to unequivocally distinguish the two
> > > metric
> > > > > > functions by their names, but I have the feeling the names become
> > > > > > artificially complex. The exact semantics can also be documented
> in
> > > > > > the javadocs, which btw could also be improved in those classes.
> > > > > >
> > > > > > Best,
> > > > > > Bruno
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax <
> > > [email protected]> wrote:
> > > > > >>
> > > > > >> Thanks for the KIP. Overall LGTM.
> > > > > >>
> > > > > >> The only though I have is, if we may want to use `TotalSum` and
> > > > > >> `TotalCount` instead of `CumulativeSum/Count` as names?
> > > > > >>
> > > > > >>
> > > > > >> -Matthias
> > > > > >>
> > > > > >>
> > > > > >> On 7/11/19 9:31 AM, John Roesler wrote:
> > > > > >>> Hi Kafka devs,
> > > > > >>>
> > > > > >>> I'd like to propose KIP-488 as a minor cleanup of some of our
> > > metric
> > > > > >>> implementations.
> > > > > >>>
> > > > > >>> KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw
> > > > > >>>
> > > > > >>> Over time, iterative updates to these metrics has resulted in a
> > > pretty
> > > > > >>> confusing little collection of classes, and I've personally
> been
> > > > > >>> involved in three separate moderately time-consuming
> iterations of
> > > me
> > > > > >>> or someone else trying to work out which metrics are
> available, and
> > > > > >>> which ones are desired for a given use case. One of these was
> > > actually
> > > > > >>> a long-running bug in Kafka Streams' metrics, so not only has
> this
> > > > > >>> confusion been a time sink, but it has also led to bugs.
> > > > > >>>
> > > > > >>> I'm hoping this change won't be too controversial.
> > > > > >>>
> > > > > >>> Thanks,
> > > > > >>> -John
> > > > > >>>
> > > > > >>
> > > > >
> > >
>
>

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

Reply via email to