ableegoldman commented on a change in pull request #8697:
URL: https://github.com/apache/kafka/pull/8697#discussion_r429505029
##########
File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/metrics/StreamsMetricsImpl.java
##########

@@ -149,6 +154,10 @@ public int hashCode() {
     public static final String RATE_DESCRIPTION_PREFIX = "The average number of ";
     public static final String RATE_DESCRIPTION_SUFFIX = " per second";
+    public static final int PERCENTILES_SIZE_IN_BYTES = 1000 * 1000; // 1 MB
+    public static long MAXIMUM_E2E_LATENCY = 10 * 24 * 60 * 60 * 1000L; // maximum latency is 10 days

Review comment:
10 days was just rounding up from the 7-day default retention limit. The maximum is needed by the percentiles calculation, which is based on incrementally sized buckets, so it's a tradeoff with accuracy. For example, if I increase the maximum by a factor of 1000, the `StreamTask` percentiles test is off by almost 20% (p99 comes out as 82.9 instead of 99). That test uses values between 0 and 100, which is probably considerably lower than most e2e latencies will be. If you look at the `MetricsTest` percentiles test I added, it uses random values up to the max value and can maintain the 10% accuracy up to a higher maximum. Of course we don't know what the distribution will be, but it seems likely to be somewhere in the middle (not in the 100s of ms, and not in the 10s or 1000s of days), so for reasonable accuracy we need to pick a reasonable maximum. We can definitely go higher than 10 days, but I reasoned that if you have records older than 10 days you're probably processing historical data, and in that case the e2e latency isn't that meaningful.
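For context, here is a minimal sketch of the size/max/accuracy tradeoff being discussed, using the public `org.apache.kafka.common.metrics.stats.Percentiles` stat (which uses fixed-size, incrementally sized buckets). This is not the actual PR code: the class name, metric names, and group name below are placeholders, and the constants simply mirror the values shown in the diff above. With the bucket memory (`sizeInBytes`) held constant, raising the maximum widens each bucket and degrades percentile accuracy.

```java
import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.stats.Percentile;
import org.apache.kafka.common.metrics.stats.Percentiles;
import org.apache.kafka.common.metrics.stats.Percentiles.BucketSizing;

public class E2eLatencyPercentilesSketch {

    // Values mirroring the constants in the diff above
    static final int PERCENTILES_SIZE_IN_BYTES = 1000 * 1000;          // 1 MB backing the histogram buckets
    static final long MAXIMUM_E2E_LATENCY = 10 * 24 * 60 * 60 * 1000L; // 10 days, in milliseconds

    public static void main(final String[] args) {
        final Metrics metrics = new Metrics();
        final Sensor sensor = metrics.sensor("e2e-latency");

        // The bucket layout is fixed up front by (sizeInBytes, min, max, bucketing);
        // a larger max with the same sizeInBytes means coarser buckets and less accurate percentiles.
        sensor.add(new Percentiles(
            PERCENTILES_SIZE_IN_BYTES,
            0,
            MAXIMUM_E2E_LATENCY,
            BucketSizing.LINEAR,
            new Percentile(metrics.metricName("e2e-latency-p90", "example-group"), 90),
            new Percentile(metrics.metricName("e2e-latency-p99", "example-group"), 99)
        ));

        // Record some sample latencies (ms) and print the percentile estimates
        for (long latencyMs = 1; latencyMs <= 100; latencyMs++) {
            sensor.record(latencyMs);
        }
        metrics.metrics().forEach((name, metric) -> {
            if (name.name().startsWith("e2e-latency-p")) {
                System.out.println(name.name() + " = " + metric.metricValue());
            }
        });
    }
}
```

As the comment notes, with a small-valued sample like the 0–100 range above, multiplying `MAXIMUM_E2E_LATENCY` by 1000 (everything else unchanged) pushes the reported p99 well away from the true value, which is why the maximum is capped at a "reasonable" 10 days rather than something far larger.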