bbeaudreault commented on PR #4719: URL: https://github.com/apache/hbase/pull/4719#issuecomment-1343364658
Yea, I think there are a couple problems with this approach:

1. The histogram is initially created with some generic buckets that are used to create the distribution, which is then used to calculate the percentiles. Those generic buckets get more accurate over time, because when you `snapshotAndReset`, the boundaries of the old bins are used to adjust the new bins. Take a look at the `Bins` constructor, which is called in the `snapshotAndReset` method. I would expect that if we never call `snapshotAndReset`, we'll have less accurate percentiles, especially for outliers. This seems problematic given the typical use case is looking at 99th and 99.9th percentiles.
2. The `FastLongHistogram` has a few usages outside of JMX metrics. I haven't audited them, but we should be sure that any change here will not adversely affect the expectations of those usages.

We'll need to address those issues in a way that still achieves the goal.
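A rough sketch of the effect described in point 1 above. This is not the actual `FastLongHistogram` code: the class `AdaptiveBinsSketch` and its methods are hypothetical, the bins are plain equal-width buckets, and out-of-range values are simply clamped into the edge bins. It only shows why a histogram that starts with a generic range estimates p99 poorly until a reset re-derives the range from the previous generation's quantiles.

```java
// Hypothetical, simplified model; names and behavior are assumptions,
// not the HBase implementation.
final class AdaptiveBinsSketch {
  final long lo;        // inclusive lower bound of the binned range
  final long hi;        // exclusive upper bound of the binned range
  final long[] counts;  // equal-width bins covering [lo, hi)
  long total;

  AdaptiveBinsSketch(long lo, long hi, int numBins) {
    this.lo = lo;
    this.hi = Math.max(hi, lo + numBins); // keep every bin at least 1 wide
    this.counts = new long[numBins];
  }

  void add(long value) {
    // Simplification: clamp out-of-range values into the first or last bin.
    long clamped = Math.min(Math.max(value, lo), hi - 1);
    int idx = (int) ((clamped - lo) * counts.length / (hi - lo));
    counts[idx]++;
    total++;
  }

  // Estimate a quantile by linear interpolation inside the matching bin.
  long quantile(double q) {
    if (total == 0) {
      return lo;
    }
    double target = q * total;
    double width = (double) (hi - lo) / counts.length;
    long seen = 0;
    for (int i = 0; i < counts.length; i++) {
      if (counts[i] > 0 && seen + counts[i] >= target) {
        double frac = (target - seen) / counts[i];
        return lo + (long) ((i + frac) * width);
      }
      seen += counts[i];
    }
    return hi;
  }

  // Start a new generation whose range comes from the old one's quantiles.
  AdaptiveBinsSketch resetFrom(double lowQ, double highQ) {
    return new AdaptiveBinsSketch(quantile(lowQ), quantile(highQ) + 1, counts.length);
  }

  // Feed the values 1..1000 each generation and report the p99 estimate
  // of the requested generation (1 = never reset).
  static long p99AfterGenerations(int generations) {
    AdaptiveBinsSketch h = new AdaptiveBinsSketch(0, 1_000_000, 10);
    for (int g = 1; ; g++) {
      for (long v = 1; v <= 1000; v++) {
        h.add(v);
      }
      if (g == generations) {
        return h.quantile(0.99);
      }
      h = h.resetFrom(0.0, 1.0);
    }
  }

  public static void main(String[] args) {
    for (int g = 1; g <= 4; g++) {
      System.out.println("generation " + g + " p99 estimate: " + p99AfterGenerations(g));
    }
  }
}
```

With the generic starting range, every sample lands in the first bin, so the first generation's p99 estimate is far from the true value of about 990. Each reset narrows the range, and the estimate closes in on the true value. A histogram that is never reset stays at the first generation's accuracy.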
