Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Thanks for the background John. I am happy with `Windowed` and `Cumulative`, too. -Matthias On 7/17/19 11:22 AM, Ryanne Dolan wrote: > John, makes sense to me! Thanks. > > Ryanne > > On Wed, Jul 17, 2019, 1:16 PM John Roesler wrote: > >> Agreed. I think the names are actually not ambiguous once you recall >> that the stats summarize measurements and each measurement is a >> floating point number, but there's enough overlap that I also was >> initially confused as well. I do plan to make this super clear in the >> documentation. >> >> Thanks, >> -John >> >> On Wed, Jul 17, 2019 at 1:08 PM Sophie Blee-Goldman >> wrote: >>> >>> Sounds good to me >>> >>> By the way, while I agree that we can't really do better than Sum and >> Count >>> I will say I also found the distinction a bit unclear at first glance. We >>> should at least document clearly that "Sum" is a "sum of values" whereas >>> "Count" is a "number of things" -- but that doesn't need to be part of >> the >>> KIP >>> >>> On Wed, Jul 17, 2019 at 11:00 AM John Roesler wrote: >>> Thanks for the replies. I guess that if we did add (e.g.) ExponentiallyWeightedWindowedX or something, it should still be pretty obvious that WindowedX is the unweighted version? In that case, I buy the argument that we don't need "Simple" and we can just go with: WindowedSum, WindowedCount CumulativeSum, CumulativeCount Sound good? Thanks, -John On Wed, Jul 17, 2019 at 11:53 AM Sophie Blee-Goldman wrote: > > Thanks for the crash course in statistical terms :) > > In light of this I definitely support Cumulative{Sum,Count}, but I'm really > not crazy about SimpleWindowed{Sum,Count} (vs just Windowed). Not so >> much > because of its unfortunate length (although that is unfortunate it > shouldn't be a deciding factor) but because it seems to have the potential > to confuse further. I'm not sure what we gain by adding "Simple" >> since to > me at least, the unweighted-ness is obvious and the definition of >> simple is > not. To those who haven't been exposed to the finer details of statistical > definitions, I think they are more likely to read "SimpleXX" and >> wonder "is > there an 'advanced' or non-simple kind of Windowed?" than they are to > wonder what is the weighting behind these metrics. > > Sophie > > On Wed, Jul 17, 2019 at 8:18 AM John Roesler >> wrote: > >> Thanks for the discussion, all. >> >> I've done a little more research into the statistical terminology. >> Matthias is correct, "running" and "moving" appear to be synonyms. >> Unfortunately, both can be computed either over a window of the >> last N >> measurements or over all prior measurements. "Moving" just >> signifies >> that the statistic is computed over a "live" data set, i.e., a >> continuous stream of measurements, and the expectation is that the >> stat would be updated in response to new measurements. >> >> I found https://en.wikipedia.org/wiki/Moving_average to have a >> pretty >> good overview of the whole picture. >> >> After considering the discussion so far and some light reading, it >> seems like "Cumulative" is truly the correct term for the all-time >> metrics: >> >>> In a cumulative moving average, the data arrive in an >>> ordered datum stream, and the user would like to get >>> the average of all of the data up until the current datum >>> point. >> >> I know that we previously felt that "cumulative" was too much of a >> mouthful, but it seems like our quest for a terser term led us >> into a >> briar patch. Also, now there is an independent source (the wiki >> page) >> indicating that this is indeed the correct term, and it doesn't >> offer >> any synonyms to choose from. Maybe we can take comfort in the fact >> that we'll rarely be saying the name of the classes out loud. >> >> As far as moving stats that operate over a window of the last N >> measurements, there are multiple options, including Simple >> (unweighted), Weighted, and Exponentially Weighted, and presumably >> infinite variations with other weighting functions. In our domain, >> there is only one weighting function available, but it's still more >> self-documenting and future-proof to specify the type of windowed >> statistic. Therefore, I'm proposing "Simple" as the term for the >> windowed (aka sampled) stats, while keeping Windowed in the name to >> distinguish it from the all-time metrics. >> >>> In financial applications a simple moving average (SMA) >>> is the unweighted mean of the previous n data. >> >> Therefore, we would have the proposed matrix: >> >> SimpleWindowedSum, SimpleWindowedCount >> CumulativeSum, CumulativeCount >> >> Again, all
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Cool. I'll update the KIP, then, and we can re-start the vote. (looks like you've already cast a vote :) ) -John On Wed, Jul 17, 2019 at 1:23 PM Ryanne Dolan wrote: > > John, makes sense to me! Thanks. > > Ryanne > > On Wed, Jul 17, 2019, 1:16 PM John Roesler wrote: > > > Agreed. I think the names are actually not ambiguous once you recall > > that the stats summarize measurements and each measurement is a > > floating point number, but there's enough overlap that I also was > > initially confused as well. I do plan to make this super clear in the > > documentation. > > > > Thanks, > > -John > > > > On Wed, Jul 17, 2019 at 1:08 PM Sophie Blee-Goldman > > wrote: > > > > > > Sounds good to me > > > > > > By the way, while I agree that we can't really do better than Sum and > > Count > > > I will say I also found the distinction a bit unclear at first glance. We > > > should at least document clearly that "Sum" is a "sum of values" whereas > > > "Count" is a "number of things" -- but that doesn't need to be part of > > the > > > KIP > > > > > > On Wed, Jul 17, 2019 at 11:00 AM John Roesler wrote: > > > > > > > Thanks for the replies. > > > > > > > > I guess that if we did add (e.g.) ExponentiallyWeightedWindowedX or > > > > something, it should still be pretty obvious that WindowedX is the > > > > unweighted version? In that case, I buy the argument that we don't > > > > need "Simple" and we can just go with: > > > > > > > > WindowedSum, WindowedCount > > > > CumulativeSum, CumulativeCount > > > > > > > > Sound good? > > > > Thanks, > > > > -John > > > > > > > > On Wed, Jul 17, 2019 at 11:53 AM Sophie Blee-Goldman > > > > wrote: > > > > > > > > > > Thanks for the crash course in statistical terms :) > > > > > > > > > > In light of this I definitely support Cumulative{Sum,Count}, but I'm > > > > really > > > > > not crazy about SimpleWindowed{Sum,Count} (vs just Windowed). Not so > > much > > > > > because of its unfortunate length (although that is unfortunate it > > > > > shouldn't be a deciding factor) but because it seems to have the > > > > potential > > > > > to confuse further. I'm not sure what we gain by adding "Simple" > > since to > > > > > me at least, the unweighted-ness is obvious and the definition of > > simple > > > > is > > > > > not. To those who haven't been exposed to the finer details of > > > > statistical > > > > > definitions, I think they are more likely to read "SimpleXX" and > > wonder > > > > "is > > > > > there an 'advanced' or non-simple kind of Windowed?" than they are to > > > > > wonder what is the weighting behind these metrics. > > > > > > > > > > Sophie > > > > > > > > > > On Wed, Jul 17, 2019 at 8:18 AM John Roesler > > wrote: > > > > > > > > > > > Thanks for the discussion, all. > > > > > > > > > > > > I've done a little more research into the statistical terminology. > > > > > > Matthias is correct, "running" and "moving" appear to be synonyms. > > > > > > Unfortunately, both can be computed either over a window of the > > last N > > > > > > measurements or over all prior measurements. "Moving" just > > signifies > > > > > > that the statistic is computed over a "live" data set, i.e., a > > > > > > continuous stream of measurements, and the expectation is that the > > > > > > stat would be updated in response to new measurements. > > > > > > > > > > > > I found https://en.wikipedia.org/wiki/Moving_average to have a > > pretty > > > > > > good overview of the whole picture. > > > > > > > > > > > > After considering the discussion so far and some light reading, it > > > > > > seems like "Cumulative" is truly the correct term for the all-time > > > > > > metrics: > > > > > > > > > > > > > In a cumulative moving average, the data arrive in an > > > > > > > ordered datum stream, and the user would like to get > > > > > > > the average of all of the data up until the current datum > > > > > > > point. > > > > > > > > > > > > I know that we previously felt that "cumulative" was too much of a > > > > > > mouthful, but it seems like our quest for a terser term led us > > into a > > > > > > briar patch. Also, now there is an independent source (the wiki > > page) > > > > > > indicating that this is indeed the correct term, and it doesn't > > offer > > > > > > any synonyms to choose from. Maybe we can take comfort in the fact > > > > > > that we'll rarely be saying the name of the classes out loud. > > > > > > > > > > > > As far as moving stats that operate over a window of the last N > > > > > > measurements, there are multiple options, including Simple > > > > > > (unweighted), Weighted, and Exponentially Weighted, and presumably > > > > > > infinite variations with other weighting functions. In our domain, > > > > > > there is only one weighting function available, but it's still more > > > > > > self-documenting and future-proof to specify the type of windowed > > > > > > statistic. Therefore, I'm proposing "Simple" as the term for the > > > > > > windowed (aka
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
John, makes sense to me! Thanks. Ryanne On Wed, Jul 17, 2019, 1:16 PM John Roesler wrote: > Agreed. I think the names are actually not ambiguous once you recall > that the stats summarize measurements and each measurement is a > floating point number, but there's enough overlap that I also was > initially confused as well. I do plan to make this super clear in the > documentation. > > Thanks, > -John > > On Wed, Jul 17, 2019 at 1:08 PM Sophie Blee-Goldman > wrote: > > > > Sounds good to me > > > > By the way, while I agree that we can't really do better than Sum and > Count > > I will say I also found the distinction a bit unclear at first glance. We > > should at least document clearly that "Sum" is a "sum of values" whereas > > "Count" is a "number of things" -- but that doesn't need to be part of > the > > KIP > > > > On Wed, Jul 17, 2019 at 11:00 AM John Roesler wrote: > > > > > Thanks for the replies. > > > > > > I guess that if we did add (e.g.) ExponentiallyWeightedWindowedX or > > > something, it should still be pretty obvious that WindowedX is the > > > unweighted version? In that case, I buy the argument that we don't > > > need "Simple" and we can just go with: > > > > > > WindowedSum, WindowedCount > > > CumulativeSum, CumulativeCount > > > > > > Sound good? > > > Thanks, > > > -John > > > > > > On Wed, Jul 17, 2019 at 11:53 AM Sophie Blee-Goldman > > > wrote: > > > > > > > > Thanks for the crash course in statistical terms :) > > > > > > > > In light of this I definitely support Cumulative{Sum,Count}, but I'm > > > really > > > > not crazy about SimpleWindowed{Sum,Count} (vs just Windowed). Not so > much > > > > because of its unfortunate length (although that is unfortunate it > > > > shouldn't be a deciding factor) but because it seems to have the > > > potential > > > > to confuse further. I'm not sure what we gain by adding "Simple" > since to > > > > me at least, the unweighted-ness is obvious and the definition of > simple > > > is > > > > not. To those who haven't been exposed to the finer details of > > > statistical > > > > definitions, I think they are more likely to read "SimpleXX" and > wonder > > > "is > > > > there an 'advanced' or non-simple kind of Windowed?" than they are to > > > > wonder what is the weighting behind these metrics. > > > > > > > > Sophie > > > > > > > > On Wed, Jul 17, 2019 at 8:18 AM John Roesler > wrote: > > > > > > > > > Thanks for the discussion, all. > > > > > > > > > > I've done a little more research into the statistical terminology. > > > > > Matthias is correct, "running" and "moving" appear to be synonyms. > > > > > Unfortunately, both can be computed either over a window of the > last N > > > > > measurements or over all prior measurements. "Moving" just > signifies > > > > > that the statistic is computed over a "live" data set, i.e., a > > > > > continuous stream of measurements, and the expectation is that the > > > > > stat would be updated in response to new measurements. > > > > > > > > > > I found https://en.wikipedia.org/wiki/Moving_average to have a > pretty > > > > > good overview of the whole picture. > > > > > > > > > > After considering the discussion so far and some light reading, it > > > > > seems like "Cumulative" is truly the correct term for the all-time > > > > > metrics: > > > > > > > > > > > In a cumulative moving average, the data arrive in an > > > > > > ordered datum stream, and the user would like to get > > > > > > the average of all of the data up until the current datum > > > > > > point. > > > > > > > > > > I know that we previously felt that "cumulative" was too much of a > > > > > mouthful, but it seems like our quest for a terser term led us > into a > > > > > briar patch. Also, now there is an independent source (the wiki > page) > > > > > indicating that this is indeed the correct term, and it doesn't > offer > > > > > any synonyms to choose from. Maybe we can take comfort in the fact > > > > > that we'll rarely be saying the name of the classes out loud. > > > > > > > > > > As far as moving stats that operate over a window of the last N > > > > > measurements, there are multiple options, including Simple > > > > > (unweighted), Weighted, and Exponentially Weighted, and presumably > > > > > infinite variations with other weighting functions. In our domain, > > > > > there is only one weighting function available, but it's still more > > > > > self-documenting and future-proof to specify the type of windowed > > > > > statistic. Therefore, I'm proposing "Simple" as the term for the > > > > > windowed (aka sampled) stats, while keeping Windowed in the name to > > > > > distinguish it from the all-time metrics. > > > > > > > > > > > In financial applications a simple moving average (SMA) > > > > > > is the unweighted mean of the previous n data. > > > > > > > > > > Therefore, we would have the proposed matrix: > > > > > > > > > > SimpleWindowedSum, SimpleWindowedCount > > > > > CumulativeSum, CumulativeCount
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Agreed. I think the names are actually not ambiguous once you recall that the stats summarize measurements and each measurement is a floating point number, but there's enough overlap that I also was initially confused as well. I do plan to make this super clear in the documentation. Thanks, -John On Wed, Jul 17, 2019 at 1:08 PM Sophie Blee-Goldman wrote: > > Sounds good to me > > By the way, while I agree that we can't really do better than Sum and Count > I will say I also found the distinction a bit unclear at first glance. We > should at least document clearly that "Sum" is a "sum of values" whereas > "Count" is a "number of things" -- but that doesn't need to be part of the > KIP > > On Wed, Jul 17, 2019 at 11:00 AM John Roesler wrote: > > > Thanks for the replies. > > > > I guess that if we did add (e.g.) ExponentiallyWeightedWindowedX or > > something, it should still be pretty obvious that WindowedX is the > > unweighted version? In that case, I buy the argument that we don't > > need "Simple" and we can just go with: > > > > WindowedSum, WindowedCount > > CumulativeSum, CumulativeCount > > > > Sound good? > > Thanks, > > -John > > > > On Wed, Jul 17, 2019 at 11:53 AM Sophie Blee-Goldman > > wrote: > > > > > > Thanks for the crash course in statistical terms :) > > > > > > In light of this I definitely support Cumulative{Sum,Count}, but I'm > > really > > > not crazy about SimpleWindowed{Sum,Count} (vs just Windowed). Not so much > > > because of its unfortunate length (although that is unfortunate it > > > shouldn't be a deciding factor) but because it seems to have the > > potential > > > to confuse further. I'm not sure what we gain by adding "Simple" since to > > > me at least, the unweighted-ness is obvious and the definition of simple > > is > > > not. To those who haven't been exposed to the finer details of > > statistical > > > definitions, I think they are more likely to read "SimpleXX" and wonder > > "is > > > there an 'advanced' or non-simple kind of Windowed?" than they are to > > > wonder what is the weighting behind these metrics. > > > > > > Sophie > > > > > > On Wed, Jul 17, 2019 at 8:18 AM John Roesler wrote: > > > > > > > Thanks for the discussion, all. > > > > > > > > I've done a little more research into the statistical terminology. > > > > Matthias is correct, "running" and "moving" appear to be synonyms. > > > > Unfortunately, both can be computed either over a window of the last N > > > > measurements or over all prior measurements. "Moving" just signifies > > > > that the statistic is computed over a "live" data set, i.e., a > > > > continuous stream of measurements, and the expectation is that the > > > > stat would be updated in response to new measurements. > > > > > > > > I found https://en.wikipedia.org/wiki/Moving_average to have a pretty > > > > good overview of the whole picture. > > > > > > > > After considering the discussion so far and some light reading, it > > > > seems like "Cumulative" is truly the correct term for the all-time > > > > metrics: > > > > > > > > > In a cumulative moving average, the data arrive in an > > > > > ordered datum stream, and the user would like to get > > > > > the average of all of the data up until the current datum > > > > > point. > > > > > > > > I know that we previously felt that "cumulative" was too much of a > > > > mouthful, but it seems like our quest for a terser term led us into a > > > > briar patch. Also, now there is an independent source (the wiki page) > > > > indicating that this is indeed the correct term, and it doesn't offer > > > > any synonyms to choose from. Maybe we can take comfort in the fact > > > > that we'll rarely be saying the name of the classes out loud. > > > > > > > > As far as moving stats that operate over a window of the last N > > > > measurements, there are multiple options, including Simple > > > > (unweighted), Weighted, and Exponentially Weighted, and presumably > > > > infinite variations with other weighting functions. In our domain, > > > > there is only one weighting function available, but it's still more > > > > self-documenting and future-proof to specify the type of windowed > > > > statistic. Therefore, I'm proposing "Simple" as the term for the > > > > windowed (aka sampled) stats, while keeping Windowed in the name to > > > > distinguish it from the all-time metrics. > > > > > > > > > In financial applications a simple moving average (SMA) > > > > > is the unweighted mean of the previous n data. > > > > > > > > Therefore, we would have the proposed matrix: > > > > > > > > SimpleWindowedSum, SimpleWindowedCount > > > > CumulativeSum, CumulativeCount > > > > > > > > Again, all these proposed names are less pithy than we might wish, but > > > > the whole point of this exercise is to demystify and disambiguate > > > > them. It seems like the discussion so far illustrates the futility of > > > > trying to find names that are both short and descriptive. > > > > > > > > How does
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Sounds good to me By the way, while I agree that we can't really do better than Sum and Count I will say I also found the distinction a bit unclear at first glance. We should at least document clearly that "Sum" is a "sum of values" whereas "Count" is a "number of things" -- but that doesn't need to be part of the KIP On Wed, Jul 17, 2019 at 11:00 AM John Roesler wrote: > Thanks for the replies. > > I guess that if we did add (e.g.) ExponentiallyWeightedWindowedX or > something, it should still be pretty obvious that WindowedX is the > unweighted version? In that case, I buy the argument that we don't > need "Simple" and we can just go with: > > WindowedSum, WindowedCount > CumulativeSum, CumulativeCount > > Sound good? > Thanks, > -John > > On Wed, Jul 17, 2019 at 11:53 AM Sophie Blee-Goldman > wrote: > > > > Thanks for the crash course in statistical terms :) > > > > In light of this I definitely support Cumulative{Sum,Count}, but I'm > really > > not crazy about SimpleWindowed{Sum,Count} (vs just Windowed). Not so much > > because of its unfortunate length (although that is unfortunate it > > shouldn't be a deciding factor) but because it seems to have the > potential > > to confuse further. I'm not sure what we gain by adding "Simple" since to > > me at least, the unweighted-ness is obvious and the definition of simple > is > > not. To those who haven't been exposed to the finer details of > statistical > > definitions, I think they are more likely to read "SimpleXX" and wonder > "is > > there an 'advanced' or non-simple kind of Windowed?" than they are to > > wonder what is the weighting behind these metrics. > > > > Sophie > > > > On Wed, Jul 17, 2019 at 8:18 AM John Roesler wrote: > > > > > Thanks for the discussion, all. > > > > > > I've done a little more research into the statistical terminology. > > > Matthias is correct, "running" and "moving" appear to be synonyms. > > > Unfortunately, both can be computed either over a window of the last N > > > measurements or over all prior measurements. "Moving" just signifies > > > that the statistic is computed over a "live" data set, i.e., a > > > continuous stream of measurements, and the expectation is that the > > > stat would be updated in response to new measurements. > > > > > > I found https://en.wikipedia.org/wiki/Moving_average to have a pretty > > > good overview of the whole picture. > > > > > > After considering the discussion so far and some light reading, it > > > seems like "Cumulative" is truly the correct term for the all-time > > > metrics: > > > > > > > In a cumulative moving average, the data arrive in an > > > > ordered datum stream, and the user would like to get > > > > the average of all of the data up until the current datum > > > > point. > > > > > > I know that we previously felt that "cumulative" was too much of a > > > mouthful, but it seems like our quest for a terser term led us into a > > > briar patch. Also, now there is an independent source (the wiki page) > > > indicating that this is indeed the correct term, and it doesn't offer > > > any synonyms to choose from. Maybe we can take comfort in the fact > > > that we'll rarely be saying the name of the classes out loud. > > > > > > As far as moving stats that operate over a window of the last N > > > measurements, there are multiple options, including Simple > > > (unweighted), Weighted, and Exponentially Weighted, and presumably > > > infinite variations with other weighting functions. In our domain, > > > there is only one weighting function available, but it's still more > > > self-documenting and future-proof to specify the type of windowed > > > statistic. Therefore, I'm proposing "Simple" as the term for the > > > windowed (aka sampled) stats, while keeping Windowed in the name to > > > distinguish it from the all-time metrics. > > > > > > > In financial applications a simple moving average (SMA) > > > > is the unweighted mean of the previous n data. > > > > > > Therefore, we would have the proposed matrix: > > > > > > SimpleWindowedSum, SimpleWindowedCount > > > CumulativeSum, CumulativeCount > > > > > > Again, all these proposed names are less pithy than we might wish, but > > > the whole point of this exercise is to demystify and disambiguate > > > them. It seems like the discussion so far illustrates the futility of > > > trying to find names that are both short and descriptive. > > > > > > How does that sound? > > > -John > > > > > > On Tue, Jul 16, 2019 at 4:43 PM Matthias J. Sax > > > > wrote: > > > > > > > > It's a fair point that Ryanne raises. However, "running sum" is the > same > > > > as "moving sum" from my understanding. > > > > > > > > The issue is still, that `Sum` and `Count` which seem to be the > cleanest > > > > names cannot be used. While I agree that `TotalSum` and `TotalCount` > is > > > > somewhat redundant, I still think it the best suggestion so far. > > > > > > > > For the "sampled" version, I am personally fine with either >
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Thanks for the replies. I guess that if we did add (e.g.) ExponentiallyWeightedWindowedX or something, it should still be pretty obvious that WindowedX is the unweighted version? In that case, I buy the argument that we don't need "Simple" and we can just go with: WindowedSum, WindowedCount CumulativeSum, CumulativeCount Sound good? Thanks, -John On Wed, Jul 17, 2019 at 11:53 AM Sophie Blee-Goldman wrote: > > Thanks for the crash course in statistical terms :) > > In light of this I definitely support Cumulative{Sum,Count}, but I'm really > not crazy about SimpleWindowed{Sum,Count} (vs just Windowed). Not so much > because of its unfortunate length (although that is unfortunate it > shouldn't be a deciding factor) but because it seems to have the potential > to confuse further. I'm not sure what we gain by adding "Simple" since to > me at least, the unweighted-ness is obvious and the definition of simple is > not. To those who haven't been exposed to the finer details of statistical > definitions, I think they are more likely to read "SimpleXX" and wonder "is > there an 'advanced' or non-simple kind of Windowed?" than they are to > wonder what is the weighting behind these metrics. > > Sophie > > On Wed, Jul 17, 2019 at 8:18 AM John Roesler wrote: > > > Thanks for the discussion, all. > > > > I've done a little more research into the statistical terminology. > > Matthias is correct, "running" and "moving" appear to be synonyms. > > Unfortunately, both can be computed either over a window of the last N > > measurements or over all prior measurements. "Moving" just signifies > > that the statistic is computed over a "live" data set, i.e., a > > continuous stream of measurements, and the expectation is that the > > stat would be updated in response to new measurements. > > > > I found https://en.wikipedia.org/wiki/Moving_average to have a pretty > > good overview of the whole picture. > > > > After considering the discussion so far and some light reading, it > > seems like "Cumulative" is truly the correct term for the all-time > > metrics: > > > > > In a cumulative moving average, the data arrive in an > > > ordered datum stream, and the user would like to get > > > the average of all of the data up until the current datum > > > point. > > > > I know that we previously felt that "cumulative" was too much of a > > mouthful, but it seems like our quest for a terser term led us into a > > briar patch. Also, now there is an independent source (the wiki page) > > indicating that this is indeed the correct term, and it doesn't offer > > any synonyms to choose from. Maybe we can take comfort in the fact > > that we'll rarely be saying the name of the classes out loud. > > > > As far as moving stats that operate over a window of the last N > > measurements, there are multiple options, including Simple > > (unweighted), Weighted, and Exponentially Weighted, and presumably > > infinite variations with other weighting functions. In our domain, > > there is only one weighting function available, but it's still more > > self-documenting and future-proof to specify the type of windowed > > statistic. Therefore, I'm proposing "Simple" as the term for the > > windowed (aka sampled) stats, while keeping Windowed in the name to > > distinguish it from the all-time metrics. > > > > > In financial applications a simple moving average (SMA) > > > is the unweighted mean of the previous n data. > > > > Therefore, we would have the proposed matrix: > > > > SimpleWindowedSum, SimpleWindowedCount > > CumulativeSum, CumulativeCount > > > > Again, all these proposed names are less pithy than we might wish, but > > the whole point of this exercise is to demystify and disambiguate > > them. It seems like the discussion so far illustrates the futility of > > trying to find names that are both short and descriptive. > > > > How does that sound? > > -John > > > > On Tue, Jul 16, 2019 at 4:43 PM Matthias J. Sax > > wrote: > > > > > > It's a fair point that Ryanne raises. However, "running sum" is the same > > > as "moving sum" from my understanding. > > > > > > The issue is still, that `Sum` and `Count` which seem to be the cleanest > > > names cannot be used. While I agree that `TotalSum` and `TotalCount` is > > > somewhat redundant, I still think it the best suggestion so far. > > > > > > For the "sampled" version, I am personally fine with either `MovingXxx`, > > > `WindowedXxx`, or `RunningXxx` -- to me, that are all equally good to > > > describe the semantics. > > > > > > > > > > > > -Matthias > > > > > > > > > > > > On 7/16/19 2:25 PM, Sophie Blee-Goldman wrote: > > > > I'm +1 on Windowed, was about to suggest that as I was catching up on > > the > > > > discussion but Bill beat me to it :) > > > > > > > > On Tue, Jul 16, 2019 at 2:23 PM Bill Bejeck wrote: > > > > > > > >> Hi John, > > > >> > > > >> Thanks for the updates. > > > >> > > > >> I like RunningCount and RunningSum. > > > >> > > > >> What about WindowedCount,
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Thanks for the crash course in statistical terms :) In light of this I definitely support Cumulative{Sum,Count}, but I'm really not crazy about SimpleWindowed{Sum,Count} (vs just Windowed). Not so much because of its unfortunate length (although that is unfortunate it shouldn't be a deciding factor) but because it seems to have the potential to confuse further. I'm not sure what we gain by adding "Simple" since to me at least, the unweighted-ness is obvious and the definition of simple is not. To those who haven't been exposed to the finer details of statistical definitions, I think they are more likely to read "SimpleXX" and wonder "is there an 'advanced' or non-simple kind of Windowed?" than they are to wonder what is the weighting behind these metrics. Sophie On Wed, Jul 17, 2019 at 8:18 AM John Roesler wrote: > Thanks for the discussion, all. > > I've done a little more research into the statistical terminology. > Matthias is correct, "running" and "moving" appear to be synonyms. > Unfortunately, both can be computed either over a window of the last N > measurements or over all prior measurements. "Moving" just signifies > that the statistic is computed over a "live" data set, i.e., a > continuous stream of measurements, and the expectation is that the > stat would be updated in response to new measurements. > > I found https://en.wikipedia.org/wiki/Moving_average to have a pretty > good overview of the whole picture. > > After considering the discussion so far and some light reading, it > seems like "Cumulative" is truly the correct term for the all-time > metrics: > > > In a cumulative moving average, the data arrive in an > > ordered datum stream, and the user would like to get > > the average of all of the data up until the current datum > > point. > > I know that we previously felt that "cumulative" was too much of a > mouthful, but it seems like our quest for a terser term led us into a > briar patch. Also, now there is an independent source (the wiki page) > indicating that this is indeed the correct term, and it doesn't offer > any synonyms to choose from. Maybe we can take comfort in the fact > that we'll rarely be saying the name of the classes out loud. > > As far as moving stats that operate over a window of the last N > measurements, there are multiple options, including Simple > (unweighted), Weighted, and Exponentially Weighted, and presumably > infinite variations with other weighting functions. In our domain, > there is only one weighting function available, but it's still more > self-documenting and future-proof to specify the type of windowed > statistic. Therefore, I'm proposing "Simple" as the term for the > windowed (aka sampled) stats, while keeping Windowed in the name to > distinguish it from the all-time metrics. > > > In financial applications a simple moving average (SMA) > > is the unweighted mean of the previous n data. > > Therefore, we would have the proposed matrix: > > SimpleWindowedSum, SimpleWindowedCount > CumulativeSum, CumulativeCount > > Again, all these proposed names are less pithy than we might wish, but > the whole point of this exercise is to demystify and disambiguate > them. It seems like the discussion so far illustrates the futility of > trying to find names that are both short and descriptive. > > How does that sound? > -John > > On Tue, Jul 16, 2019 at 4:43 PM Matthias J. Sax > wrote: > > > > It's a fair point that Ryanne raises. However, "running sum" is the same > > as "moving sum" from my understanding. > > > > The issue is still, that `Sum` and `Count` which seem to be the cleanest > > names cannot be used. While I agree that `TotalSum` and `TotalCount` is > > somewhat redundant, I still think it the best suggestion so far. > > > > For the "sampled" version, I am personally fine with either `MovingXxx`, > > `WindowedXxx`, or `RunningXxx` -- to me, that are all equally good to > > describe the semantics. > > > > > > > > -Matthias > > > > > > > > On 7/16/19 2:25 PM, Sophie Blee-Goldman wrote: > > > I'm +1 on Windowed, was about to suggest that as I was catching up on > the > > > discussion but Bill beat me to it :) > > > > > > On Tue, Jul 16, 2019 at 2:23 PM Bill Bejeck wrote: > > > > > >> Hi John, > > >> > > >> Thanks for the updates. > > >> > > >> I like RunningCount and RunningSum. > > >> > > >> What about WindowedCount, WindowedSum instead of Moving? > > >> I'm just throwing this out there as Windowed seems more intuitive to > me, > > >> but I'm not married to the idea. > > >> > > >> -Bill > > >> > > >> On Tue, Jul 16, 2019 at 5:09 PM John Roesler > wrote: > > >> > > >>> No worries! Choosing good public API names is a high-impact design > > >>> activity. > > >>> > > >>> Matthias, Bruno, Bill, and Stanislav, > > >>> > > >>> You've all contributed to this discussion or the vote so far... How > do > > >>> you feel about the proposed name change: > > >>> > > >>> MovingCount, MovingSum (instead of Sampled) > > >>> RunningCount, RunningSum
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
I'm fine with the new names. While I was previously in the "brevity" camp when it came to the Cumulative name, I'll take clarity over brevity. Thanks for the updates, John. -Bill On Wed, Jul 17, 2019 at 11:18 AM John Roesler wrote: > Thanks for the discussion, all. > > I've done a little more research into the statistical terminology. > Matthias is correct, "running" and "moving" appear to be synonyms. > Unfortunately, both can be computed either over a window of the last N > measurements or over all prior measurements. "Moving" just signifies > that the statistic is computed over a "live" data set, i.e., a > continuous stream of measurements, and the expectation is that the > stat would be updated in response to new measurements. > > I found https://en.wikipedia.org/wiki/Moving_average to have a pretty > good overview of the whole picture. > > After considering the discussion so far and some light reading, it > seems like "Cumulative" is truly the correct term for the all-time > metrics: > > > In a cumulative moving average, the data arrive in an > > ordered datum stream, and the user would like to get > > the average of all of the data up until the current datum > > point. > > I know that we previously felt that "cumulative" was too much of a > mouthful, but it seems like our quest for a terser term led us into a > briar patch. Also, now there is an independent source (the wiki page) > indicating that this is indeed the correct term, and it doesn't offer > any synonyms to choose from. Maybe we can take comfort in the fact > that we'll rarely be saying the name of the classes out loud. > > As far as moving stats that operate over a window of the last N > measurements, there are multiple options, including Simple > (unweighted), Weighted, and Exponentially Weighted, and presumably > infinite variations with other weighting functions. In our domain, > there is only one weighting function available, but it's still more > self-documenting and future-proof to specify the type of windowed > statistic. Therefore, I'm proposing "Simple" as the term for the > windowed (aka sampled) stats, while keeping Windowed in the name to > distinguish it from the all-time metrics. > > > In financial applications a simple moving average (SMA) > > is the unweighted mean of the previous n data. > > Therefore, we would have the proposed matrix: > > SimpleWindowedSum, SimpleWindowedCount > CumulativeSum, CumulativeCount > > Again, all these proposed names are less pithy than we might wish, but > the whole point of this exercise is to demystify and disambiguate > them. It seems like the discussion so far illustrates the futility of > trying to find names that are both short and descriptive. > > How does that sound? > -John > > On Tue, Jul 16, 2019 at 4:43 PM Matthias J. Sax > wrote: > > > > It's a fair point that Ryanne raises. However, "running sum" is the same > > as "moving sum" from my understanding. > > > > The issue is still, that `Sum` and `Count` which seem to be the cleanest > > names cannot be used. While I agree that `TotalSum` and `TotalCount` is > > somewhat redundant, I still think it the best suggestion so far. > > > > For the "sampled" version, I am personally fine with either `MovingXxx`, > > `WindowedXxx`, or `RunningXxx` -- to me, that are all equally good to > > describe the semantics. > > > > > > > > -Matthias > > > > > > > > On 7/16/19 2:25 PM, Sophie Blee-Goldman wrote: > > > I'm +1 on Windowed, was about to suggest that as I was catching up on > the > > > discussion but Bill beat me to it :) > > > > > > On Tue, Jul 16, 2019 at 2:23 PM Bill Bejeck wrote: > > > > > >> Hi John, > > >> > > >> Thanks for the updates. > > >> > > >> I like RunningCount and RunningSum. > > >> > > >> What about WindowedCount, WindowedSum instead of Moving? > > >> I'm just throwing this out there as Windowed seems more intuitive to > me, > > >> but I'm not married to the idea. > > >> > > >> -Bill > > >> > > >> On Tue, Jul 16, 2019 at 5:09 PM John Roesler > wrote: > > >> > > >>> No worries! Choosing good public API names is a high-impact design > > >>> activity. > > >>> > > >>> Matthias, Bruno, Bill, and Stanislav, > > >>> > > >>> You've all contributed to this discussion or the vote so far... How > do > > >>> you feel about the proposed name change: > > >>> > > >>> MovingCount, MovingSum (instead of Sampled) > > >>> RunningCount, RunningSum (Instead of Total) > > >>> > > >>> Thanks, > > >>> -John > > >>> > > >>> On Tue, Jul 16, 2019 at 3:04 PM Ryanne Dolan > > >>> wrote: > > > > John, that makes sense to me. Sorry for the bikeshedding. > > > > Ryanne > > > > On Tue, Jul 16, 2019 at 12:49 PM John Roesler > > >> wrote: > > > > > Thanks for the explanation and the suggestion, Ryanne, > > > > > > I went with "sampled" just because these are instances of > > >> SampledStat, > > > which in the Kafka Metrics ecosystem are computed from a window of > > > recent samples.
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Thanks for the discussion, all. I've done a little more research into the statistical terminology. Matthias is correct, "running" and "moving" appear to be synonyms. Unfortunately, both can be computed either over a window of the last N measurements or over all prior measurements. "Moving" just signifies that the statistic is computed over a "live" data set, i.e., a continuous stream of measurements, and the expectation is that the stat would be updated in response to new measurements. I found https://en.wikipedia.org/wiki/Moving_average to have a pretty good overview of the whole picture. After considering the discussion so far and some light reading, it seems like "Cumulative" is truly the correct term for the all-time metrics: > In a cumulative moving average, the data arrive in an > ordered datum stream, and the user would like to get > the average of all of the data up until the current datum > point. I know that we previously felt that "cumulative" was too much of a mouthful, but it seems like our quest for a terser term led us into a briar patch. Also, now there is an independent source (the wiki page) indicating that this is indeed the correct term, and it doesn't offer any synonyms to choose from. Maybe we can take comfort in the fact that we'll rarely be saying the name of the classes out loud. As far as moving stats that operate over a window of the last N measurements, there are multiple options, including Simple (unweighted), Weighted, and Exponentially Weighted, and presumably infinite variations with other weighting functions. In our domain, there is only one weighting function available, but it's still more self-documenting and future-proof to specify the type of windowed statistic. Therefore, I'm proposing "Simple" as the term for the windowed (aka sampled) stats, while keeping Windowed in the name to distinguish it from the all-time metrics. > In financial applications a simple moving average (SMA) > is the unweighted mean of the previous n data. Therefore, we would have the proposed matrix: SimpleWindowedSum, SimpleWindowedCount CumulativeSum, CumulativeCount Again, all these proposed names are less pithy than we might wish, but the whole point of this exercise is to demystify and disambiguate them. It seems like the discussion so far illustrates the futility of trying to find names that are both short and descriptive. How does that sound? -John On Tue, Jul 16, 2019 at 4:43 PM Matthias J. Sax wrote: > > It's a fair point that Ryanne raises. However, "running sum" is the same > as "moving sum" from my understanding. > > The issue is still, that `Sum` and `Count` which seem to be the cleanest > names cannot be used. While I agree that `TotalSum` and `TotalCount` is > somewhat redundant, I still think it the best suggestion so far. > > For the "sampled" version, I am personally fine with either `MovingXxx`, > `WindowedXxx`, or `RunningXxx` -- to me, that are all equally good to > describe the semantics. > > > > -Matthias > > > > On 7/16/19 2:25 PM, Sophie Blee-Goldman wrote: > > I'm +1 on Windowed, was about to suggest that as I was catching up on the > > discussion but Bill beat me to it :) > > > > On Tue, Jul 16, 2019 at 2:23 PM Bill Bejeck wrote: > > > >> Hi John, > >> > >> Thanks for the updates. > >> > >> I like RunningCount and RunningSum. > >> > >> What about WindowedCount, WindowedSum instead of Moving? > >> I'm just throwing this out there as Windowed seems more intuitive to me, > >> but I'm not married to the idea. > >> > >> -Bill > >> > >> On Tue, Jul 16, 2019 at 5:09 PM John Roesler wrote: > >> > >>> No worries! Choosing good public API names is a high-impact design > >>> activity. > >>> > >>> Matthias, Bruno, Bill, and Stanislav, > >>> > >>> You've all contributed to this discussion or the vote so far... How do > >>> you feel about the proposed name change: > >>> > >>> MovingCount, MovingSum (instead of Sampled) > >>> RunningCount, RunningSum (Instead of Total) > >>> > >>> Thanks, > >>> -John > >>> > >>> On Tue, Jul 16, 2019 at 3:04 PM Ryanne Dolan > >>> wrote: > > John, that makes sense to me. Sorry for the bikeshedding. > > Ryanne > > On Tue, Jul 16, 2019 at 12:49 PM John Roesler > >> wrote: > > > Thanks for the explanation and the suggestion, Ryanne, > > > > I went with "sampled" just because these are instances of > >> SampledStat, > > which in the Kafka Metrics ecosystem are computed from a window of > > recent samples. Thinking more about it, the fact that they are > >> sampled > > and the fact that they are windowed are orthogonal, which is what > > you're pointing out... sampling by itself doesn't indicate that it's > >> a > > moving average. > > > > Since there is no way in Kafka Metrics for a metric to be sampled and > > not windowed/moving/decaying, calling them Sampled would never be > > incorrect. But to someone unfamiliar with the code, it wouldn't > >
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
It's a fair point that Ryanne raises. However, "running sum" is the same as "moving sum" from my understanding. The issue is still, that `Sum` and `Count` which seem to be the cleanest names cannot be used. While I agree that `TotalSum` and `TotalCount` is somewhat redundant, I still think it the best suggestion so far. For the "sampled" version, I am personally fine with either `MovingXxx`, `WindowedXxx`, or `RunningXxx` -- to me, that are all equally good to describe the semantics. -Matthias On 7/16/19 2:25 PM, Sophie Blee-Goldman wrote: > I'm +1 on Windowed, was about to suggest that as I was catching up on the > discussion but Bill beat me to it :) > > On Tue, Jul 16, 2019 at 2:23 PM Bill Bejeck wrote: > >> Hi John, >> >> Thanks for the updates. >> >> I like RunningCount and RunningSum. >> >> What about WindowedCount, WindowedSum instead of Moving? >> I'm just throwing this out there as Windowed seems more intuitive to me, >> but I'm not married to the idea. >> >> -Bill >> >> On Tue, Jul 16, 2019 at 5:09 PM John Roesler wrote: >> >>> No worries! Choosing good public API names is a high-impact design >>> activity. >>> >>> Matthias, Bruno, Bill, and Stanislav, >>> >>> You've all contributed to this discussion or the vote so far... How do >>> you feel about the proposed name change: >>> >>> MovingCount, MovingSum (instead of Sampled) >>> RunningCount, RunningSum (Instead of Total) >>> >>> Thanks, >>> -John >>> >>> On Tue, Jul 16, 2019 at 3:04 PM Ryanne Dolan >>> wrote: John, that makes sense to me. Sorry for the bikeshedding. Ryanne On Tue, Jul 16, 2019 at 12:49 PM John Roesler >> wrote: > Thanks for the explanation and the suggestion, Ryanne, > > I went with "sampled" just because these are instances of >> SampledStat, > which in the Kafka Metrics ecosystem are computed from a window of > recent samples. Thinking more about it, the fact that they are >> sampled > and the fact that they are windowed are orthogonal, which is what > you're pointing out... sampling by itself doesn't indicate that it's >> a > moving average. > > Since there is no way in Kafka Metrics for a metric to be sampled and > not windowed/moving/decaying, calling them Sampled would never be > incorrect. But to someone unfamiliar with the code, it wouldn't > immediately suggest the behavior of the metric that actually matters. > That is, the behavior that distinguishes the two classes of metrics >> we > want to disambiguate here. > > It sounds like you'd suggest a new matrix of names: > MovingCount, MovingSum > RunningCount, RunningSum > > Are these names unambiguous and self explanatory? > > Thanks, > -John > > On Tue, Jul 16, 2019 at 12:32 PM Ryanne Dolan >> > wrote: >> >>> measurements, which decay/expire over time >> >> Thanks John for the clarification. This was my (re-)reading of the >>> code, >> but this is not what I think of when I hear "sampled". In fact, >>> you'll >> notice that the Wikipedia pages for "Sample (statistics)" and >> "Sample >> (signal processing)" do not contain the words decay, expire, >> recent, >> history, or anything similar. >> >> Similar to "running", I'd suggest the more correct "moving", as in > "moving >> average" and "moving sum", which involve looking back N samples, > applying a >> sliding window, decaying over time etc. >> >> Ryanne >> >> On Tue, Jul 16, 2019, 11:58 AM John Roesler >>> wrote: >> >>> Thanks for raising this concern, Ryanne, >>> >>> "Sampled" indicates that the metrics is sampled, namely that we >>> maintain a set of samples from recent value measurements, which >>> decay/expire over time. So, the metric value is only >>> representative of >>> the recent past. >>> >>> "Total" indicates that the metric value contains all the >>> information >>> from the creation of the metric. For example., the total sum >> would >>> include all measurements since the app started up. >>> >>> It seems like your concern is that the word "total" doesn't >> really >>> pinpoint this meaning, which is true. It's especially confusing >>> that >>> another meaning of "total" is synonymous with "sum", rendering >> the >>> name "TotalSum" sort of absurd. >>> >>> We previously considered "cumulative", which was rejected as a >>> mouthful (it's four syllables) . >>> >>> You mentioned "running", which might be a more appropriate >> modifier >>> (RunningSum and RunningCount). >>> >>> What would everyone think about that? >>> >>> Thanks, >>> -John >>> >>> On Tue, Jul 16, 2019 at 11:27 AM Ryanne Dolan < >>> ryannedo...@gmail.com> >>> wrote: John, I mentioned on the VOTE thread that the proposed names >> are >>> a > bit confusing,
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
I'm +1 on Windowed, was about to suggest that as I was catching up on the discussion but Bill beat me to it :) On Tue, Jul 16, 2019 at 2:23 PM Bill Bejeck wrote: > Hi John, > > Thanks for the updates. > > I like RunningCount and RunningSum. > > What about WindowedCount, WindowedSum instead of Moving? > I'm just throwing this out there as Windowed seems more intuitive to me, > but I'm not married to the idea. > > -Bill > > On Tue, Jul 16, 2019 at 5:09 PM John Roesler wrote: > > > No worries! Choosing good public API names is a high-impact design > > activity. > > > > Matthias, Bruno, Bill, and Stanislav, > > > > You've all contributed to this discussion or the vote so far... How do > > you feel about the proposed name change: > > > > MovingCount, MovingSum (instead of Sampled) > > RunningCount, RunningSum (Instead of Total) > > > > Thanks, > > -John > > > > On Tue, Jul 16, 2019 at 3:04 PM Ryanne Dolan > > wrote: > > > > > > John, that makes sense to me. Sorry for the bikeshedding. > > > > > > Ryanne > > > > > > On Tue, Jul 16, 2019 at 12:49 PM John Roesler > wrote: > > > > > > > Thanks for the explanation and the suggestion, Ryanne, > > > > > > > > I went with "sampled" just because these are instances of > SampledStat, > > > > which in the Kafka Metrics ecosystem are computed from a window of > > > > recent samples. Thinking more about it, the fact that they are > sampled > > > > and the fact that they are windowed are orthogonal, which is what > > > > you're pointing out... sampling by itself doesn't indicate that it's > a > > > > moving average. > > > > > > > > Since there is no way in Kafka Metrics for a metric to be sampled and > > > > not windowed/moving/decaying, calling them Sampled would never be > > > > incorrect. But to someone unfamiliar with the code, it wouldn't > > > > immediately suggest the behavior of the metric that actually matters. > > > > That is, the behavior that distinguishes the two classes of metrics > we > > > > want to disambiguate here. > > > > > > > > It sounds like you'd suggest a new matrix of names: > > > > MovingCount, MovingSum > > > > RunningCount, RunningSum > > > > > > > > Are these names unambiguous and self explanatory? > > > > > > > > Thanks, > > > > -John > > > > > > > > On Tue, Jul 16, 2019 at 12:32 PM Ryanne Dolan > > > > > wrote: > > > > > > > > > > > measurements, which decay/expire over time > > > > > > > > > > Thanks John for the clarification. This was my (re-)reading of the > > code, > > > > > but this is not what I think of when I hear "sampled". In fact, > > you'll > > > > > notice that the Wikipedia pages for "Sample (statistics)" and > "Sample > > > > > (signal processing)" do not contain the words decay, expire, > recent, > > > > > history, or anything similar. > > > > > > > > > > Similar to "running", I'd suggest the more correct "moving", as in > > > > "moving > > > > > average" and "moving sum", which involve looking back N samples, > > > > applying a > > > > > sliding window, decaying over time etc. > > > > > > > > > > Ryanne > > > > > > > > > > On Tue, Jul 16, 2019, 11:58 AM John Roesler > > wrote: > > > > > > > > > > > Thanks for raising this concern, Ryanne, > > > > > > > > > > > > "Sampled" indicates that the metrics is sampled, namely that we > > > > > > maintain a set of samples from recent value measurements, which > > > > > > decay/expire over time. So, the metric value is only > > representative of > > > > > > the recent past. > > > > > > > > > > > > "Total" indicates that the metric value contains all the > > information > > > > > > from the creation of the metric. For example., the total sum > would > > > > > > include all measurements since the app started up. > > > > > > > > > > > > It seems like your concern is that the word "total" doesn't > really > > > > > > pinpoint this meaning, which is true. It's especially confusing > > that > > > > > > another meaning of "total" is synonymous with "sum", rendering > the > > > > > > name "TotalSum" sort of absurd. > > > > > > > > > > > > We previously considered "cumulative", which was rejected as a > > > > > > mouthful (it's four syllables) . > > > > > > > > > > > > You mentioned "running", which might be a more appropriate > modifier > > > > > > (RunningSum and RunningCount). > > > > > > > > > > > > What would everyone think about that? > > > > > > > > > > > > Thanks, > > > > > > -John > > > > > > > > > > > > On Tue, Jul 16, 2019 at 11:27 AM Ryanne Dolan < > > ryannedo...@gmail.com> > > > > > > wrote: > > > > > > > > > > > > > > John, I mentioned on the VOTE thread that the proposed names > are > > a > > > > bit > > > > > > > confusing, > > > > > > > > > > > > > > > given that "sum", "total", and "count" are roughly > > synonymous... > > > > > > > > > > > > > > In particular, TotalSum is, I think, a "running total", though > > the > > > > naming > > > > > > > doesn't really capture that. > > > > > > > > > > > > > > I think to avoid confusion, we should define exactly what > >
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Hi John, Thanks for the updates. I like RunningCount and RunningSum. What about WindowedCount, WindowedSum instead of Moving? I'm just throwing this out there as Windowed seems more intuitive to me, but I'm not married to the idea. -Bill On Tue, Jul 16, 2019 at 5:09 PM John Roesler wrote: > No worries! Choosing good public API names is a high-impact design > activity. > > Matthias, Bruno, Bill, and Stanislav, > > You've all contributed to this discussion or the vote so far... How do > you feel about the proposed name change: > > MovingCount, MovingSum (instead of Sampled) > RunningCount, RunningSum (Instead of Total) > > Thanks, > -John > > On Tue, Jul 16, 2019 at 3:04 PM Ryanne Dolan > wrote: > > > > John, that makes sense to me. Sorry for the bikeshedding. > > > > Ryanne > > > > On Tue, Jul 16, 2019 at 12:49 PM John Roesler wrote: > > > > > Thanks for the explanation and the suggestion, Ryanne, > > > > > > I went with "sampled" just because these are instances of SampledStat, > > > which in the Kafka Metrics ecosystem are computed from a window of > > > recent samples. Thinking more about it, the fact that they are sampled > > > and the fact that they are windowed are orthogonal, which is what > > > you're pointing out... sampling by itself doesn't indicate that it's a > > > moving average. > > > > > > Since there is no way in Kafka Metrics for a metric to be sampled and > > > not windowed/moving/decaying, calling them Sampled would never be > > > incorrect. But to someone unfamiliar with the code, it wouldn't > > > immediately suggest the behavior of the metric that actually matters. > > > That is, the behavior that distinguishes the two classes of metrics we > > > want to disambiguate here. > > > > > > It sounds like you'd suggest a new matrix of names: > > > MovingCount, MovingSum > > > RunningCount, RunningSum > > > > > > Are these names unambiguous and self explanatory? > > > > > > Thanks, > > > -John > > > > > > On Tue, Jul 16, 2019 at 12:32 PM Ryanne Dolan > > > wrote: > > > > > > > > > measurements, which decay/expire over time > > > > > > > > Thanks John for the clarification. This was my (re-)reading of the > code, > > > > but this is not what I think of when I hear "sampled". In fact, > you'll > > > > notice that the Wikipedia pages for "Sample (statistics)" and "Sample > > > > (signal processing)" do not contain the words decay, expire, recent, > > > > history, or anything similar. > > > > > > > > Similar to "running", I'd suggest the more correct "moving", as in > > > "moving > > > > average" and "moving sum", which involve looking back N samples, > > > applying a > > > > sliding window, decaying over time etc. > > > > > > > > Ryanne > > > > > > > > On Tue, Jul 16, 2019, 11:58 AM John Roesler > wrote: > > > > > > > > > Thanks for raising this concern, Ryanne, > > > > > > > > > > "Sampled" indicates that the metrics is sampled, namely that we > > > > > maintain a set of samples from recent value measurements, which > > > > > decay/expire over time. So, the metric value is only > representative of > > > > > the recent past. > > > > > > > > > > "Total" indicates that the metric value contains all the > information > > > > > from the creation of the metric. For example., the total sum would > > > > > include all measurements since the app started up. > > > > > > > > > > It seems like your concern is that the word "total" doesn't really > > > > > pinpoint this meaning, which is true. It's especially confusing > that > > > > > another meaning of "total" is synonymous with "sum", rendering the > > > > > name "TotalSum" sort of absurd. > > > > > > > > > > We previously considered "cumulative", which was rejected as a > > > > > mouthful (it's four syllables) . > > > > > > > > > > You mentioned "running", which might be a more appropriate modifier > > > > > (RunningSum and RunningCount). > > > > > > > > > > What would everyone think about that? > > > > > > > > > > Thanks, > > > > > -John > > > > > > > > > > On Tue, Jul 16, 2019 at 11:27 AM Ryanne Dolan < > ryannedo...@gmail.com> > > > > > wrote: > > > > > > > > > > > > John, I mentioned on the VOTE thread that the proposed names are > a > > > bit > > > > > > confusing, > > > > > > > > > > > > > given that "sum", "total", and "count" are roughly > synonymous... > > > > > > > > > > > > In particular, TotalSum is, I think, a "running total", though > the > > > naming > > > > > > doesn't really capture that. > > > > > > > > > > > > I think to avoid confusion, we should define exactly what > "total" and > > > > > > "sampled" are supposed to indicate, and perhaps pick appropriate > > > naming > > > > > > from there. > > > > > > > > > > > > Ryanne > > > > > > > > > > > > > > > > > > On Fri, Jul 12, 2019 at 1:42 PM John Roesler > > > wrote: > > > > > > > > > > > > > Hey, thanks Matthias and Bruno, > > > > > > > > > > > > > > I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to > me. > > > > > > > > > > > > > > Also, yes, I would have
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
No worries! Choosing good public API names is a high-impact design activity. Matthias, Bruno, Bill, and Stanislav, You've all contributed to this discussion or the vote so far... How do you feel about the proposed name change: MovingCount, MovingSum (instead of Sampled) RunningCount, RunningSum (Instead of Total) Thanks, -John On Tue, Jul 16, 2019 at 3:04 PM Ryanne Dolan wrote: > > John, that makes sense to me. Sorry for the bikeshedding. > > Ryanne > > On Tue, Jul 16, 2019 at 12:49 PM John Roesler wrote: > > > Thanks for the explanation and the suggestion, Ryanne, > > > > I went with "sampled" just because these are instances of SampledStat, > > which in the Kafka Metrics ecosystem are computed from a window of > > recent samples. Thinking more about it, the fact that they are sampled > > and the fact that they are windowed are orthogonal, which is what > > you're pointing out... sampling by itself doesn't indicate that it's a > > moving average. > > > > Since there is no way in Kafka Metrics for a metric to be sampled and > > not windowed/moving/decaying, calling them Sampled would never be > > incorrect. But to someone unfamiliar with the code, it wouldn't > > immediately suggest the behavior of the metric that actually matters. > > That is, the behavior that distinguishes the two classes of metrics we > > want to disambiguate here. > > > > It sounds like you'd suggest a new matrix of names: > > MovingCount, MovingSum > > RunningCount, RunningSum > > > > Are these names unambiguous and self explanatory? > > > > Thanks, > > -John > > > > On Tue, Jul 16, 2019 at 12:32 PM Ryanne Dolan > > wrote: > > > > > > > measurements, which decay/expire over time > > > > > > Thanks John for the clarification. This was my (re-)reading of the code, > > > but this is not what I think of when I hear "sampled". In fact, you'll > > > notice that the Wikipedia pages for "Sample (statistics)" and "Sample > > > (signal processing)" do not contain the words decay, expire, recent, > > > history, or anything similar. > > > > > > Similar to "running", I'd suggest the more correct "moving", as in > > "moving > > > average" and "moving sum", which involve looking back N samples, > > applying a > > > sliding window, decaying over time etc. > > > > > > Ryanne > > > > > > On Tue, Jul 16, 2019, 11:58 AM John Roesler wrote: > > > > > > > Thanks for raising this concern, Ryanne, > > > > > > > > "Sampled" indicates that the metrics is sampled, namely that we > > > > maintain a set of samples from recent value measurements, which > > > > decay/expire over time. So, the metric value is only representative of > > > > the recent past. > > > > > > > > "Total" indicates that the metric value contains all the information > > > > from the creation of the metric. For example., the total sum would > > > > include all measurements since the app started up. > > > > > > > > It seems like your concern is that the word "total" doesn't really > > > > pinpoint this meaning, which is true. It's especially confusing that > > > > another meaning of "total" is synonymous with "sum", rendering the > > > > name "TotalSum" sort of absurd. > > > > > > > > We previously considered "cumulative", which was rejected as a > > > > mouthful (it's four syllables) . > > > > > > > > You mentioned "running", which might be a more appropriate modifier > > > > (RunningSum and RunningCount). > > > > > > > > What would everyone think about that? > > > > > > > > Thanks, > > > > -John > > > > > > > > On Tue, Jul 16, 2019 at 11:27 AM Ryanne Dolan > > > > wrote: > > > > > > > > > > John, I mentioned on the VOTE thread that the proposed names are a > > bit > > > > > confusing, > > > > > > > > > > > given that "sum", "total", and "count" are roughly synonymous... > > > > > > > > > > In particular, TotalSum is, I think, a "running total", though the > > naming > > > > > doesn't really capture that. > > > > > > > > > > I think to avoid confusion, we should define exactly what "total" and > > > > > "sampled" are supposed to indicate, and perhaps pick appropriate > > naming > > > > > from there. > > > > > > > > > > Ryanne > > > > > > > > > > > > > > > On Fri, Jul 12, 2019 at 1:42 PM John Roesler > > wrote: > > > > > > > > > > > Hey, thanks Matthias and Bruno, > > > > > > > > > > > > I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to me. > > > > > > > > > > > > Also, yes, I would have liked to not have any modifier for > > > > > > "non-sampled", but there is a name conflict with Sum. > > > > > > > > > > > > I'll update the KIP to reflect "TotalX" and then start the vote > > thread. > > > > > > > > > > > > Thanks again, > > > > > > -John > > > > > > > > > > > > On Fri, Jul 12, 2019 at 11:27 AM Bruno Cadonna > > > > > > wrote: > > > > > > > > > > > > > > OK, makes sense. Then, I am in favour of TotalCount and TotalSum. > > > > > > > > > > > > > > Best, > > > > > > > Bruno > > > > > > > > > > > > > > On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax < > > > >
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
John, that makes sense to me. Sorry for the bikeshedding. Ryanne On Tue, Jul 16, 2019 at 12:49 PM John Roesler wrote: > Thanks for the explanation and the suggestion, Ryanne, > > I went with "sampled" just because these are instances of SampledStat, > which in the Kafka Metrics ecosystem are computed from a window of > recent samples. Thinking more about it, the fact that they are sampled > and the fact that they are windowed are orthogonal, which is what > you're pointing out... sampling by itself doesn't indicate that it's a > moving average. > > Since there is no way in Kafka Metrics for a metric to be sampled and > not windowed/moving/decaying, calling them Sampled would never be > incorrect. But to someone unfamiliar with the code, it wouldn't > immediately suggest the behavior of the metric that actually matters. > That is, the behavior that distinguishes the two classes of metrics we > want to disambiguate here. > > It sounds like you'd suggest a new matrix of names: > MovingCount, MovingSum > RunningCount, RunningSum > > Are these names unambiguous and self explanatory? > > Thanks, > -John > > On Tue, Jul 16, 2019 at 12:32 PM Ryanne Dolan > wrote: > > > > > measurements, which decay/expire over time > > > > Thanks John for the clarification. This was my (re-)reading of the code, > > but this is not what I think of when I hear "sampled". In fact, you'll > > notice that the Wikipedia pages for "Sample (statistics)" and "Sample > > (signal processing)" do not contain the words decay, expire, recent, > > history, or anything similar. > > > > Similar to "running", I'd suggest the more correct "moving", as in > "moving > > average" and "moving sum", which involve looking back N samples, > applying a > > sliding window, decaying over time etc. > > > > Ryanne > > > > On Tue, Jul 16, 2019, 11:58 AM John Roesler wrote: > > > > > Thanks for raising this concern, Ryanne, > > > > > > "Sampled" indicates that the metrics is sampled, namely that we > > > maintain a set of samples from recent value measurements, which > > > decay/expire over time. So, the metric value is only representative of > > > the recent past. > > > > > > "Total" indicates that the metric value contains all the information > > > from the creation of the metric. For example., the total sum would > > > include all measurements since the app started up. > > > > > > It seems like your concern is that the word "total" doesn't really > > > pinpoint this meaning, which is true. It's especially confusing that > > > another meaning of "total" is synonymous with "sum", rendering the > > > name "TotalSum" sort of absurd. > > > > > > We previously considered "cumulative", which was rejected as a > > > mouthful (it's four syllables) . > > > > > > You mentioned "running", which might be a more appropriate modifier > > > (RunningSum and RunningCount). > > > > > > What would everyone think about that? > > > > > > Thanks, > > > -John > > > > > > On Tue, Jul 16, 2019 at 11:27 AM Ryanne Dolan > > > wrote: > > > > > > > > John, I mentioned on the VOTE thread that the proposed names are a > bit > > > > confusing, > > > > > > > > > given that "sum", "total", and "count" are roughly synonymous... > > > > > > > > In particular, TotalSum is, I think, a "running total", though the > naming > > > > doesn't really capture that. > > > > > > > > I think to avoid confusion, we should define exactly what "total" and > > > > "sampled" are supposed to indicate, and perhaps pick appropriate > naming > > > > from there. > > > > > > > > Ryanne > > > > > > > > > > > > On Fri, Jul 12, 2019 at 1:42 PM John Roesler > wrote: > > > > > > > > > Hey, thanks Matthias and Bruno, > > > > > > > > > > I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to me. > > > > > > > > > > Also, yes, I would have liked to not have any modifier for > > > > > "non-sampled", but there is a name conflict with Sum. > > > > > > > > > > I'll update the KIP to reflect "TotalX" and then start the vote > thread. > > > > > > > > > > Thanks again, > > > > > -John > > > > > > > > > > On Fri, Jul 12, 2019 at 11:27 AM Bruno Cadonna > > > > wrote: > > > > > > > > > > > > OK, makes sense. Then, I am in favour of TotalCount and TotalSum. > > > > > > > > > > > > Best, > > > > > > Bruno > > > > > > > > > > > > On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax < > > > matth...@confluent.io> > > > > > wrote: > > > > > > > > > > > > > > `Sum` is an existing name, for the "sampled sum" metric, that > gets > > > > > > > deprecated. Hence, we cannot use it. > > > > > > > > > > > > > > If we cannot use `Sum` and use `TotalSum`, we should also not > use > > > > > > > `Count` but `TotalCount` for consistency. > > > > > > > > > > > > > > > > > > > > > -Matthias > > > > > > > > > > > > > > > > > > > > > > > > > > > > On 7/11/19 12:58 PM, Bruno Cadonna wrote: > > > > > > > > Hi John, > > > > > > > > > > > > > > > > Thank you for the KIP. > > > > > > > > > > > > > > > > LGTM > > > > > > > > > > > > > > > > I also do
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Thanks for the explanation and the suggestion, Ryanne, I went with "sampled" just because these are instances of SampledStat, which in the Kafka Metrics ecosystem are computed from a window of recent samples. Thinking more about it, the fact that they are sampled and the fact that they are windowed are orthogonal, which is what you're pointing out... sampling by itself doesn't indicate that it's a moving average. Since there is no way in Kafka Metrics for a metric to be sampled and not windowed/moving/decaying, calling them Sampled would never be incorrect. But to someone unfamiliar with the code, it wouldn't immediately suggest the behavior of the metric that actually matters. That is, the behavior that distinguishes the two classes of metrics we want to disambiguate here. It sounds like you'd suggest a new matrix of names: MovingCount, MovingSum RunningCount, RunningSum Are these names unambiguous and self explanatory? Thanks, -John On Tue, Jul 16, 2019 at 12:32 PM Ryanne Dolan wrote: > > > measurements, which decay/expire over time > > Thanks John for the clarification. This was my (re-)reading of the code, > but this is not what I think of when I hear "sampled". In fact, you'll > notice that the Wikipedia pages for "Sample (statistics)" and "Sample > (signal processing)" do not contain the words decay, expire, recent, > history, or anything similar. > > Similar to "running", I'd suggest the more correct "moving", as in "moving > average" and "moving sum", which involve looking back N samples, applying a > sliding window, decaying over time etc. > > Ryanne > > On Tue, Jul 16, 2019, 11:58 AM John Roesler wrote: > > > Thanks for raising this concern, Ryanne, > > > > "Sampled" indicates that the metrics is sampled, namely that we > > maintain a set of samples from recent value measurements, which > > decay/expire over time. So, the metric value is only representative of > > the recent past. > > > > "Total" indicates that the metric value contains all the information > > from the creation of the metric. For example., the total sum would > > include all measurements since the app started up. > > > > It seems like your concern is that the word "total" doesn't really > > pinpoint this meaning, which is true. It's especially confusing that > > another meaning of "total" is synonymous with "sum", rendering the > > name "TotalSum" sort of absurd. > > > > We previously considered "cumulative", which was rejected as a > > mouthful (it's four syllables) . > > > > You mentioned "running", which might be a more appropriate modifier > > (RunningSum and RunningCount). > > > > What would everyone think about that? > > > > Thanks, > > -John > > > > On Tue, Jul 16, 2019 at 11:27 AM Ryanne Dolan > > wrote: > > > > > > John, I mentioned on the VOTE thread that the proposed names are a bit > > > confusing, > > > > > > > given that "sum", "total", and "count" are roughly synonymous... > > > > > > In particular, TotalSum is, I think, a "running total", though the naming > > > doesn't really capture that. > > > > > > I think to avoid confusion, we should define exactly what "total" and > > > "sampled" are supposed to indicate, and perhaps pick appropriate naming > > > from there. > > > > > > Ryanne > > > > > > > > > On Fri, Jul 12, 2019 at 1:42 PM John Roesler wrote: > > > > > > > Hey, thanks Matthias and Bruno, > > > > > > > > I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to me. > > > > > > > > Also, yes, I would have liked to not have any modifier for > > > > "non-sampled", but there is a name conflict with Sum. > > > > > > > > I'll update the KIP to reflect "TotalX" and then start the vote thread. > > > > > > > > Thanks again, > > > > -John > > > > > > > > On Fri, Jul 12, 2019 at 11:27 AM Bruno Cadonna > > wrote: > > > > > > > > > > OK, makes sense. Then, I am in favour of TotalCount and TotalSum. > > > > > > > > > > Best, > > > > > Bruno > > > > > > > > > > On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax < > > matth...@confluent.io> > > > > wrote: > > > > > > > > > > > > `Sum` is an existing name, for the "sampled sum" metric, that gets > > > > > > deprecated. Hence, we cannot use it. > > > > > > > > > > > > If we cannot use `Sum` and use `TotalSum`, we should also not use > > > > > > `Count` but `TotalCount` for consistency. > > > > > > > > > > > > > > > > > > -Matthias > > > > > > > > > > > > > > > > > > > > > > > > On 7/11/19 12:58 PM, Bruno Cadonna wrote: > > > > > > > Hi John, > > > > > > > > > > > > > > Thank you for the KIP. > > > > > > > > > > > > > > LGTM > > > > > > > > > > > > > > I also do not like CumulativeSum/Count so much. I propose to just > > > > call > > > > > > > it Sum and Count. > > > > > > > > > > > > > > I understand that you want to unequivocally distinguish the two > > > > metric > > > > > > > functions by their names, but I have the feeling the names become > > > > > > > artificially complex. The exact semantics can also be documented > > in > > > > > > > the javadocs,
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
> measurements, which decay/expire over time Thanks John for the clarification. This was my (re-)reading of the code, but this is not what I think of when I hear "sampled". In fact, you'll notice that the Wikipedia pages for "Sample (statistics)" and "Sample (signal processing)" do not contain the words decay, expire, recent, history, or anything similar. Similar to "running", I'd suggest the more correct "moving", as in "moving average" and "moving sum", which involve looking back N samples, applying a sliding window, decaying over time etc. Ryanne On Tue, Jul 16, 2019, 11:58 AM John Roesler wrote: > Thanks for raising this concern, Ryanne, > > "Sampled" indicates that the metrics is sampled, namely that we > maintain a set of samples from recent value measurements, which > decay/expire over time. So, the metric value is only representative of > the recent past. > > "Total" indicates that the metric value contains all the information > from the creation of the metric. For example., the total sum would > include all measurements since the app started up. > > It seems like your concern is that the word "total" doesn't really > pinpoint this meaning, which is true. It's especially confusing that > another meaning of "total" is synonymous with "sum", rendering the > name "TotalSum" sort of absurd. > > We previously considered "cumulative", which was rejected as a > mouthful (it's four syllables) . > > You mentioned "running", which might be a more appropriate modifier > (RunningSum and RunningCount). > > What would everyone think about that? > > Thanks, > -John > > On Tue, Jul 16, 2019 at 11:27 AM Ryanne Dolan > wrote: > > > > John, I mentioned on the VOTE thread that the proposed names are a bit > > confusing, > > > > > given that "sum", "total", and "count" are roughly synonymous... > > > > In particular, TotalSum is, I think, a "running total", though the naming > > doesn't really capture that. > > > > I think to avoid confusion, we should define exactly what "total" and > > "sampled" are supposed to indicate, and perhaps pick appropriate naming > > from there. > > > > Ryanne > > > > > > On Fri, Jul 12, 2019 at 1:42 PM John Roesler wrote: > > > > > Hey, thanks Matthias and Bruno, > > > > > > I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to me. > > > > > > Also, yes, I would have liked to not have any modifier for > > > "non-sampled", but there is a name conflict with Sum. > > > > > > I'll update the KIP to reflect "TotalX" and then start the vote thread. > > > > > > Thanks again, > > > -John > > > > > > On Fri, Jul 12, 2019 at 11:27 AM Bruno Cadonna > wrote: > > > > > > > > OK, makes sense. Then, I am in favour of TotalCount and TotalSum. > > > > > > > > Best, > > > > Bruno > > > > > > > > On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax < > matth...@confluent.io> > > > wrote: > > > > > > > > > > `Sum` is an existing name, for the "sampled sum" metric, that gets > > > > > deprecated. Hence, we cannot use it. > > > > > > > > > > If we cannot use `Sum` and use `TotalSum`, we should also not use > > > > > `Count` but `TotalCount` for consistency. > > > > > > > > > > > > > > > -Matthias > > > > > > > > > > > > > > > > > > > > On 7/11/19 12:58 PM, Bruno Cadonna wrote: > > > > > > Hi John, > > > > > > > > > > > > Thank you for the KIP. > > > > > > > > > > > > LGTM > > > > > > > > > > > > I also do not like CumulativeSum/Count so much. I propose to just > > > call > > > > > > it Sum and Count. > > > > > > > > > > > > I understand that you want to unequivocally distinguish the two > > > metric > > > > > > functions by their names, but I have the feeling the names become > > > > > > artificially complex. The exact semantics can also be documented > in > > > > > > the javadocs, which btw could also be improved in those classes. > > > > > > > > > > > > Best, > > > > > > Bruno > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax < > > > matth...@confluent.io> wrote: > > > > > >> > > > > > >> Thanks for the KIP. Overall LGTM. > > > > > >> > > > > > >> The only though I have is, if we may want to use `TotalSum` and > > > > > >> `TotalCount` instead of `CumulativeSum/Count` as names? > > > > > >> > > > > > >> > > > > > >> -Matthias > > > > > >> > > > > > >> > > > > > >> On 7/11/19 9:31 AM, John Roesler wrote: > > > > > >>> Hi Kafka devs, > > > > > >>> > > > > > >>> I'd like to propose KIP-488 as a minor cleanup of some of our > > > metric > > > > > >>> implementations. > > > > > >>> > > > > > >>> KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw > > > > > >>> > > > > > >>> Over time, iterative updates to these metrics has resulted in a > > > pretty > > > > > >>> confusing little collection of classes, and I've personally > been > > > > > >>> involved in three separate moderately time-consuming > iterations of > > > me > > > > > >>> or someone else trying to work out which metrics are > available, and > > > > > >>> which ones are desired for a given use
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Thanks for raising this concern, Ryanne, "Sampled" indicates that the metrics is sampled, namely that we maintain a set of samples from recent value measurements, which decay/expire over time. So, the metric value is only representative of the recent past. "Total" indicates that the metric value contains all the information from the creation of the metric. For example., the total sum would include all measurements since the app started up. It seems like your concern is that the word "total" doesn't really pinpoint this meaning, which is true. It's especially confusing that another meaning of "total" is synonymous with "sum", rendering the name "TotalSum" sort of absurd. We previously considered "cumulative", which was rejected as a mouthful (it's four syllables) . You mentioned "running", which might be a more appropriate modifier (RunningSum and RunningCount). What would everyone think about that? Thanks, -John On Tue, Jul 16, 2019 at 11:27 AM Ryanne Dolan wrote: > > John, I mentioned on the VOTE thread that the proposed names are a bit > confusing, > > > given that "sum", "total", and "count" are roughly synonymous... > > In particular, TotalSum is, I think, a "running total", though the naming > doesn't really capture that. > > I think to avoid confusion, we should define exactly what "total" and > "sampled" are supposed to indicate, and perhaps pick appropriate naming > from there. > > Ryanne > > > On Fri, Jul 12, 2019 at 1:42 PM John Roesler wrote: > > > Hey, thanks Matthias and Bruno, > > > > I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to me. > > > > Also, yes, I would have liked to not have any modifier for > > "non-sampled", but there is a name conflict with Sum. > > > > I'll update the KIP to reflect "TotalX" and then start the vote thread. > > > > Thanks again, > > -John > > > > On Fri, Jul 12, 2019 at 11:27 AM Bruno Cadonna wrote: > > > > > > OK, makes sense. Then, I am in favour of TotalCount and TotalSum. > > > > > > Best, > > > Bruno > > > > > > On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax > > wrote: > > > > > > > > `Sum` is an existing name, for the "sampled sum" metric, that gets > > > > deprecated. Hence, we cannot use it. > > > > > > > > If we cannot use `Sum` and use `TotalSum`, we should also not use > > > > `Count` but `TotalCount` for consistency. > > > > > > > > > > > > -Matthias > > > > > > > > > > > > > > > > On 7/11/19 12:58 PM, Bruno Cadonna wrote: > > > > > Hi John, > > > > > > > > > > Thank you for the KIP. > > > > > > > > > > LGTM > > > > > > > > > > I also do not like CumulativeSum/Count so much. I propose to just > > call > > > > > it Sum and Count. > > > > > > > > > > I understand that you want to unequivocally distinguish the two > > metric > > > > > functions by their names, but I have the feeling the names become > > > > > artificially complex. The exact semantics can also be documented in > > > > > the javadocs, which btw could also be improved in those classes. > > > > > > > > > > Best, > > > > > Bruno > > > > > > > > > > > > > > > > > > > > On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax < > > matth...@confluent.io> wrote: > > > > >> > > > > >> Thanks for the KIP. Overall LGTM. > > > > >> > > > > >> The only though I have is, if we may want to use `TotalSum` and > > > > >> `TotalCount` instead of `CumulativeSum/Count` as names? > > > > >> > > > > >> > > > > >> -Matthias > > > > >> > > > > >> > > > > >> On 7/11/19 9:31 AM, John Roesler wrote: > > > > >>> Hi Kafka devs, > > > > >>> > > > > >>> I'd like to propose KIP-488 as a minor cleanup of some of our > > metric > > > > >>> implementations. > > > > >>> > > > > >>> KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw > > > > >>> > > > > >>> Over time, iterative updates to these metrics has resulted in a > > pretty > > > > >>> confusing little collection of classes, and I've personally been > > > > >>> involved in three separate moderately time-consuming iterations of > > me > > > > >>> or someone else trying to work out which metrics are available, and > > > > >>> which ones are desired for a given use case. One of these was > > actually > > > > >>> a long-running bug in Kafka Streams' metrics, so not only has this > > > > >>> confusion been a time sink, but it has also led to bugs. > > > > >>> > > > > >>> I'm hoping this change won't be too controversial. > > > > >>> > > > > >>> Thanks, > > > > >>> -John > > > > >>> > > > > >> > > > > > >
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
John, I mentioned on the VOTE thread that the proposed names are a bit confusing, > given that "sum", "total", and "count" are roughly synonymous... In particular, TotalSum is, I think, a "running total", though the naming doesn't really capture that. I think to avoid confusion, we should define exactly what "total" and "sampled" are supposed to indicate, and perhaps pick appropriate naming from there. Ryanne On Fri, Jul 12, 2019 at 1:42 PM John Roesler wrote: > Hey, thanks Matthias and Bruno, > > I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to me. > > Also, yes, I would have liked to not have any modifier for > "non-sampled", but there is a name conflict with Sum. > > I'll update the KIP to reflect "TotalX" and then start the vote thread. > > Thanks again, > -John > > On Fri, Jul 12, 2019 at 11:27 AM Bruno Cadonna wrote: > > > > OK, makes sense. Then, I am in favour of TotalCount and TotalSum. > > > > Best, > > Bruno > > > > On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax > wrote: > > > > > > `Sum` is an existing name, for the "sampled sum" metric, that gets > > > deprecated. Hence, we cannot use it. > > > > > > If we cannot use `Sum` and use `TotalSum`, we should also not use > > > `Count` but `TotalCount` for consistency. > > > > > > > > > -Matthias > > > > > > > > > > > > On 7/11/19 12:58 PM, Bruno Cadonna wrote: > > > > Hi John, > > > > > > > > Thank you for the KIP. > > > > > > > > LGTM > > > > > > > > I also do not like CumulativeSum/Count so much. I propose to just > call > > > > it Sum and Count. > > > > > > > > I understand that you want to unequivocally distinguish the two > metric > > > > functions by their names, but I have the feeling the names become > > > > artificially complex. The exact semantics can also be documented in > > > > the javadocs, which btw could also be improved in those classes. > > > > > > > > Best, > > > > Bruno > > > > > > > > > > > > > > > > On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax < > matth...@confluent.io> wrote: > > > >> > > > >> Thanks for the KIP. Overall LGTM. > > > >> > > > >> The only though I have is, if we may want to use `TotalSum` and > > > >> `TotalCount` instead of `CumulativeSum/Count` as names? > > > >> > > > >> > > > >> -Matthias > > > >> > > > >> > > > >> On 7/11/19 9:31 AM, John Roesler wrote: > > > >>> Hi Kafka devs, > > > >>> > > > >>> I'd like to propose KIP-488 as a minor cleanup of some of our > metric > > > >>> implementations. > > > >>> > > > >>> KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw > > > >>> > > > >>> Over time, iterative updates to these metrics has resulted in a > pretty > > > >>> confusing little collection of classes, and I've personally been > > > >>> involved in three separate moderately time-consuming iterations of > me > > > >>> or someone else trying to work out which metrics are available, and > > > >>> which ones are desired for a given use case. One of these was > actually > > > >>> a long-running bug in Kafka Streams' metrics, so not only has this > > > >>> confusion been a time sink, but it has also led to bugs. > > > >>> > > > >>> I'm hoping this change won't be too controversial. > > > >>> > > > >>> Thanks, > > > >>> -John > > > >>> > > > >> > > > >
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Hey Stanislav, Thanks for the feedback! I do think we should try to keep the scope of this KIP small. The idea you proposed is orthogonal, though, so if you wanted to write up a quick KIP concurrently, I'd be happy to review it. Regarding deprecation: yes, the metrics are in the public interface, because they are not inside a package named `internal`. Therefore, we do need to do deprecations and migrations when considering changes to them. I wish we didn't, since it could be a cleaner transition without them, but we shouldn't risk breaking source compatibility. Thanks, -John On Sat, Jul 13, 2019 at 1:30 PM Stanislav Kozlovski wrote: > > Hello, > > Thanks for the KIP, I'm happy we're converging on implementations. > > I was wondering whether we need to deprecate the old metrics - are those > classes public interfaces? I see we don't build a JavaDoc for them ( > https://kafka.apache.org/22/javadoc/allclasses-noframe.html) and, as far as > I know, that means we don't necessarily need to go with the deprecation > route. > > I think we also have the Measurable and Gauge classes which seem to do the > same thing. From the names, I intuitively thought that Measurable indicates > something that might require some amount of processing to get the value of > and Gauge to be something that is instantly read. Reading their usages, > though, I see that is not the case. Measurable is inherited by > MeasurableStat which itself is inherited by most of the classes you > mentioned in the KIP. Is it worth it converging on those as well (perhaps > deprecating/removing Measurable and opting for Gauge) or do we prefer to > keep the scope of this KIP small? > > Thanks, > Stanislav > > On Fri, Jul 12, 2019 at 7:42 PM John Roesler wrote: > > > Hey, thanks Matthias and Bruno, > > > > I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to me. > > > > Also, yes, I would have liked to not have any modifier for > > "non-sampled", but there is a name conflict with Sum. > > > > I'll update the KIP to reflect "TotalX" and then start the vote thread. > > > > Thanks again, > > -John > > > > On Fri, Jul 12, 2019 at 11:27 AM Bruno Cadonna wrote: > > > > > > OK, makes sense. Then, I am in favour of TotalCount and TotalSum. > > > > > > Best, > > > Bruno > > > > > > On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax > > wrote: > > > > > > > > `Sum` is an existing name, for the "sampled sum" metric, that gets > > > > deprecated. Hence, we cannot use it. > > > > > > > > If we cannot use `Sum` and use `TotalSum`, we should also not use > > > > `Count` but `TotalCount` for consistency. > > > > > > > > > > > > -Matthias > > > > > > > > > > > > > > > > On 7/11/19 12:58 PM, Bruno Cadonna wrote: > > > > > Hi John, > > > > > > > > > > Thank you for the KIP. > > > > > > > > > > LGTM > > > > > > > > > > I also do not like CumulativeSum/Count so much. I propose to just > > call > > > > > it Sum and Count. > > > > > > > > > > I understand that you want to unequivocally distinguish the two > > metric > > > > > functions by their names, but I have the feeling the names become > > > > > artificially complex. The exact semantics can also be documented in > > > > > the javadocs, which btw could also be improved in those classes. > > > > > > > > > > Best, > > > > > Bruno > > > > > > > > > > > > > > > > > > > > On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax < > > matth...@confluent.io> wrote: > > > > >> > > > > >> Thanks for the KIP. Overall LGTM. > > > > >> > > > > >> The only though I have is, if we may want to use `TotalSum` and > > > > >> `TotalCount` instead of `CumulativeSum/Count` as names? > > > > >> > > > > >> > > > > >> -Matthias > > > > >> > > > > >> > > > > >> On 7/11/19 9:31 AM, John Roesler wrote: > > > > >>> Hi Kafka devs, > > > > >>> > > > > >>> I'd like to propose KIP-488 as a minor cleanup of some of our > > metric > > > > >>> implementations. > > > > >>> > > > > >>> KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw > > > > >>> > > > > >>> Over time, iterative updates to these metrics has resulted in a > > pretty > > > > >>> confusing little collection of classes, and I've personally been > > > > >>> involved in three separate moderately time-consuming iterations of > > me > > > > >>> or someone else trying to work out which metrics are available, and > > > > >>> which ones are desired for a given use case. One of these was > > actually > > > > >>> a long-running bug in Kafka Streams' metrics, so not only has this > > > > >>> confusion been a time sink, but it has also led to bugs. > > > > >>> > > > > >>> I'm hoping this change won't be too controversial. > > > > >>> > > > > >>> Thanks, > > > > >>> -John > > > > >>> > > > > >> > > > > > >
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Hello, Thanks for the KIP, I'm happy we're converging on implementations. I was wondering whether we need to deprecate the old metrics - are those classes public interfaces? I see we don't build a JavaDoc for them ( https://kafka.apache.org/22/javadoc/allclasses-noframe.html) and, as far as I know, that means we don't necessarily need to go with the deprecation route. I think we also have the Measurable and Gauge classes which seem to do the same thing. From the names, I intuitively thought that Measurable indicates something that might require some amount of processing to get the value of and Gauge to be something that is instantly read. Reading their usages, though, I see that is not the case. Measurable is inherited by MeasurableStat which itself is inherited by most of the classes you mentioned in the KIP. Is it worth it converging on those as well (perhaps deprecating/removing Measurable and opting for Gauge) or do we prefer to keep the scope of this KIP small? Thanks, Stanislav On Fri, Jul 12, 2019 at 7:42 PM John Roesler wrote: > Hey, thanks Matthias and Bruno, > > I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to me. > > Also, yes, I would have liked to not have any modifier for > "non-sampled", but there is a name conflict with Sum. > > I'll update the KIP to reflect "TotalX" and then start the vote thread. > > Thanks again, > -John > > On Fri, Jul 12, 2019 at 11:27 AM Bruno Cadonna wrote: > > > > OK, makes sense. Then, I am in favour of TotalCount and TotalSum. > > > > Best, > > Bruno > > > > On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax > wrote: > > > > > > `Sum` is an existing name, for the "sampled sum" metric, that gets > > > deprecated. Hence, we cannot use it. > > > > > > If we cannot use `Sum` and use `TotalSum`, we should also not use > > > `Count` but `TotalCount` for consistency. > > > > > > > > > -Matthias > > > > > > > > > > > > On 7/11/19 12:58 PM, Bruno Cadonna wrote: > > > > Hi John, > > > > > > > > Thank you for the KIP. > > > > > > > > LGTM > > > > > > > > I also do not like CumulativeSum/Count so much. I propose to just > call > > > > it Sum and Count. > > > > > > > > I understand that you want to unequivocally distinguish the two > metric > > > > functions by their names, but I have the feeling the names become > > > > artificially complex. The exact semantics can also be documented in > > > > the javadocs, which btw could also be improved in those classes. > > > > > > > > Best, > > > > Bruno > > > > > > > > > > > > > > > > On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax < > matth...@confluent.io> wrote: > > > >> > > > >> Thanks for the KIP. Overall LGTM. > > > >> > > > >> The only though I have is, if we may want to use `TotalSum` and > > > >> `TotalCount` instead of `CumulativeSum/Count` as names? > > > >> > > > >> > > > >> -Matthias > > > >> > > > >> > > > >> On 7/11/19 9:31 AM, John Roesler wrote: > > > >>> Hi Kafka devs, > > > >>> > > > >>> I'd like to propose KIP-488 as a minor cleanup of some of our > metric > > > >>> implementations. > > > >>> > > > >>> KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw > > > >>> > > > >>> Over time, iterative updates to these metrics has resulted in a > pretty > > > >>> confusing little collection of classes, and I've personally been > > > >>> involved in three separate moderately time-consuming iterations of > me > > > >>> or someone else trying to work out which metrics are available, and > > > >>> which ones are desired for a given use case. One of these was > actually > > > >>> a long-running bug in Kafka Streams' metrics, so not only has this > > > >>> confusion been a time sink, but it has also led to bugs. > > > >>> > > > >>> I'm hoping this change won't be too controversial. > > > >>> > > > >>> Thanks, > > > >>> -John > > > >>> > > > >> > > > >
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Hey, thanks Matthias and Bruno, I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to me. Also, yes, I would have liked to not have any modifier for "non-sampled", but there is a name conflict with Sum. I'll update the KIP to reflect "TotalX" and then start the vote thread. Thanks again, -John On Fri, Jul 12, 2019 at 11:27 AM Bruno Cadonna wrote: > > OK, makes sense. Then, I am in favour of TotalCount and TotalSum. > > Best, > Bruno > > On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax > wrote: > > > > `Sum` is an existing name, for the "sampled sum" metric, that gets > > deprecated. Hence, we cannot use it. > > > > If we cannot use `Sum` and use `TotalSum`, we should also not use > > `Count` but `TotalCount` for consistency. > > > > > > -Matthias > > > > > > > > On 7/11/19 12:58 PM, Bruno Cadonna wrote: > > > Hi John, > > > > > > Thank you for the KIP. > > > > > > LGTM > > > > > > I also do not like CumulativeSum/Count so much. I propose to just call > > > it Sum and Count. > > > > > > I understand that you want to unequivocally distinguish the two metric > > > functions by their names, but I have the feeling the names become > > > artificially complex. The exact semantics can also be documented in > > > the javadocs, which btw could also be improved in those classes. > > > > > > Best, > > > Bruno > > > > > > > > > > > > On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax > > > wrote: > > >> > > >> Thanks for the KIP. Overall LGTM. > > >> > > >> The only though I have is, if we may want to use `TotalSum` and > > >> `TotalCount` instead of `CumulativeSum/Count` as names? > > >> > > >> > > >> -Matthias > > >> > > >> > > >> On 7/11/19 9:31 AM, John Roesler wrote: > > >>> Hi Kafka devs, > > >>> > > >>> I'd like to propose KIP-488 as a minor cleanup of some of our metric > > >>> implementations. > > >>> > > >>> KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw > > >>> > > >>> Over time, iterative updates to these metrics has resulted in a pretty > > >>> confusing little collection of classes, and I've personally been > > >>> involved in three separate moderately time-consuming iterations of me > > >>> or someone else trying to work out which metrics are available, and > > >>> which ones are desired for a given use case. One of these was actually > > >>> a long-running bug in Kafka Streams' metrics, so not only has this > > >>> confusion been a time sink, but it has also led to bugs. > > >>> > > >>> I'm hoping this change won't be too controversial. > > >>> > > >>> Thanks, > > >>> -John > > >>> > > >> > >
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
OK, makes sense. Then, I am in favour of TotalCount and TotalSum. Best, Bruno On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax wrote: > > `Sum` is an existing name, for the "sampled sum" metric, that gets > deprecated. Hence, we cannot use it. > > If we cannot use `Sum` and use `TotalSum`, we should also not use > `Count` but `TotalCount` for consistency. > > > -Matthias > > > > On 7/11/19 12:58 PM, Bruno Cadonna wrote: > > Hi John, > > > > Thank you for the KIP. > > > > LGTM > > > > I also do not like CumulativeSum/Count so much. I propose to just call > > it Sum and Count. > > > > I understand that you want to unequivocally distinguish the two metric > > functions by their names, but I have the feeling the names become > > artificially complex. The exact semantics can also be documented in > > the javadocs, which btw could also be improved in those classes. > > > > Best, > > Bruno > > > > > > > > On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax > > wrote: > >> > >> Thanks for the KIP. Overall LGTM. > >> > >> The only though I have is, if we may want to use `TotalSum` and > >> `TotalCount` instead of `CumulativeSum/Count` as names? > >> > >> > >> -Matthias > >> > >> > >> On 7/11/19 9:31 AM, John Roesler wrote: > >>> Hi Kafka devs, > >>> > >>> I'd like to propose KIP-488 as a minor cleanup of some of our metric > >>> implementations. > >>> > >>> KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw > >>> > >>> Over time, iterative updates to these metrics has resulted in a pretty > >>> confusing little collection of classes, and I've personally been > >>> involved in three separate moderately time-consuming iterations of me > >>> or someone else trying to work out which metrics are available, and > >>> which ones are desired for a given use case. One of these was actually > >>> a long-running bug in Kafka Streams' metrics, so not only has this > >>> confusion been a time sink, but it has also led to bugs. > >>> > >>> I'm hoping this change won't be too controversial. > >>> > >>> Thanks, > >>> -John > >>> > >> >
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
`Sum` is an existing name, for the "sampled sum" metric, that gets deprecated. Hence, we cannot use it. If we cannot use `Sum` and use `TotalSum`, we should also not use `Count` but `TotalCount` for consistency. -Matthias On 7/11/19 12:58 PM, Bruno Cadonna wrote: > Hi John, > > Thank you for the KIP. > > LGTM > > I also do not like CumulativeSum/Count so much. I propose to just call > it Sum and Count. > > I understand that you want to unequivocally distinguish the two metric > functions by their names, but I have the feeling the names become > artificially complex. The exact semantics can also be documented in > the javadocs, which btw could also be improved in those classes. > > Best, > Bruno > > > > On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax wrote: >> >> Thanks for the KIP. Overall LGTM. >> >> The only though I have is, if we may want to use `TotalSum` and >> `TotalCount` instead of `CumulativeSum/Count` as names? >> >> >> -Matthias >> >> >> On 7/11/19 9:31 AM, John Roesler wrote: >>> Hi Kafka devs, >>> >>> I'd like to propose KIP-488 as a minor cleanup of some of our metric >>> implementations. >>> >>> KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw >>> >>> Over time, iterative updates to these metrics has resulted in a pretty >>> confusing little collection of classes, and I've personally been >>> involved in three separate moderately time-consuming iterations of me >>> or someone else trying to work out which metrics are available, and >>> which ones are desired for a given use case. One of these was actually >>> a long-running bug in Kafka Streams' metrics, so not only has this >>> confusion been a time sink, but it has also led to bugs. >>> >>> I'm hoping this change won't be too controversial. >>> >>> Thanks, >>> -John >>> >> signature.asc Description: OpenPGP digital signature
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Hi John, Thank you for the KIP. LGTM I also do not like CumulativeSum/Count so much. I propose to just call it Sum and Count. I understand that you want to unequivocally distinguish the two metric functions by their names, but I have the feeling the names become artificially complex. The exact semantics can also be documented in the javadocs, which btw could also be improved in those classes. Best, Bruno On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax wrote: > > Thanks for the KIP. Overall LGTM. > > The only though I have is, if we may want to use `TotalSum` and > `TotalCount` instead of `CumulativeSum/Count` as names? > > > -Matthias > > > On 7/11/19 9:31 AM, John Roesler wrote: > > Hi Kafka devs, > > > > I'd like to propose KIP-488 as a minor cleanup of some of our metric > > implementations. > > > > KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw > > > > Over time, iterative updates to these metrics has resulted in a pretty > > confusing little collection of classes, and I've personally been > > involved in three separate moderately time-consuming iterations of me > > or someone else trying to work out which metrics are available, and > > which ones are desired for a given use case. One of these was actually > > a long-running bug in Kafka Streams' metrics, so not only has this > > confusion been a time sink, but it has also led to bugs. > > > > I'm hoping this change won't be too controversial. > > > > Thanks, > > -John > > >
Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Thanks for the KIP. Overall LGTM. The only though I have is, if we may want to use `TotalSum` and `TotalCount` instead of `CumulativeSum/Count` as names? -Matthias On 7/11/19 9:31 AM, John Roesler wrote: > Hi Kafka devs, > > I'd like to propose KIP-488 as a minor cleanup of some of our metric > implementations. > > KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw > > Over time, iterative updates to these metrics has resulted in a pretty > confusing little collection of classes, and I've personally been > involved in three separate moderately time-consuming iterations of me > or someone else trying to work out which metrics are available, and > which ones are desired for a given use case. One of these was actually > a long-running bug in Kafka Streams' metrics, so not only has this > confusion been a time sink, but it has also led to bugs. > > I'm hoping this change won't be too controversial. > > Thanks, > -John > signature.asc Description: OpenPGP digital signature
[DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Hi Kafka devs, I'd like to propose KIP-488 as a minor cleanup of some of our metric implementations. KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw Over time, iterative updates to these metrics has resulted in a pretty confusing little collection of classes, and I've personally been involved in three separate moderately time-consuming iterations of me or someone else trying to work out which metrics are available, and which ones are desired for a given use case. One of these was actually a long-running bug in Kafka Streams' metrics, so not only has this confusion been a time sink, but it has also led to bugs. I'm hoping this change won't be too controversial. Thanks, -John