bbeaudreault commented on PR #4719:
URL: https://github.com/apache/hbase/pull/4719#issuecomment-1343364658

   Yeah, I think there are a couple of problems with this approach:
   
   1. The histogram is initially created with generic bucket boundaries. The 
bucket counts form the distribution from which the percentiles are calculated. 
The boundaries get more accurate over time, because each call to 
`snapshotAndReset` uses the boundaries of the old bins to size the new bins. 
Take a look at the `Bins` constructor, which is called from `snapshotAndReset`. 
I would expect that if we never call `snapshotAndReset`, we'll get less 
accurate percentiles, especially for outliers. That seems problematic given 
that the typical use case is looking at the 99th and 99.9th percentiles.
   2. `FastLongHistogram` has a few usages outside of JMX metrics. I haven't 
audited them, but we should make sure that any change here does not break the 
expectations of those usages.
   
   We'll need to address those issues in a way that still achieves the goal. 
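
To illustrate point 1, here is a minimal sketch of why re-deriving bin boundaries from the previous bins matters for tail percentiles. It is not the `FastLongHistogram` code: the class `BinnedHistogramSketch`, its equal-width bins, and the method `resetUsingObservedRange` are all hypothetical simplifications, and the real `Bins` constructor picks its new boundaries differently.

```java
// Hypothetical, simplified stand-in for a binned histogram.
// This is NOT FastLongHistogram; it only illustrates the re-binning idea.
class BinnedHistogramSketch {
  private final long min;      // inclusive lower bound of the binned range
  private final long max;      // exclusive upper bound of the binned range
  private final long[] counts; // one counter per equal-width bin
  private long total;
  private long observedMin = Long.MAX_VALUE;
  private long observedMax = Long.MIN_VALUE;

  BinnedHistogramSketch(long min, long max, int numBins) {
    if (max <= min || numBins <= 0) {
      throw new IllegalArgumentException("need max > min and numBins > 0");
    }
    this.min = min;
    this.max = max;
    this.counts = new long[numBins];
  }

  // Record a value; out-of-range values are clamped into the first or last bin.
  void add(long value) {
    long raw = (value - min) * counts.length / (max - min);
    int idx = (int) Math.max(0, Math.min(counts.length - 1, raw));
    counts[idx]++;
    total++;
    observedMin = Math.min(observedMin, value);
    observedMax = Math.max(observedMax, value);
  }

  // Estimate the q-quantile as the midpoint of the bin that contains it.
  // The estimate can be off by up to half a bin width.
  double quantile(double q) {
    if (total == 0) {
      return Double.NaN;
    }
    long target = Math.max(1, (long) Math.ceil(q * total));
    double width = (double) (max - min) / counts.length;
    long cumulative = 0;
    for (int i = 0; i < counts.length; i++) {
      cumulative += counts[i];
      if (cumulative >= target) {
        return min + (i + 0.5) * width;
      }
    }
    return max; // not reached for q <= 1
  }

  // Assumption: the new bins simply span the range observed so far.
  // Returns a fresh, empty histogram with the tightened boundaries.
  BinnedHistogramSketch resetUsingObservedRange() {
    if (total == 0) {
      return new BinnedHistogramSketch(min, max, counts.length);
    }
    return new BinnedHistogramSketch(observedMin, observedMax + 1, counts.length);
  }

  public static void main(String[] args) {
    // Generic initial range [0, 1000) with 10 bins, so each bin is 100 wide.
    BinnedHistogramSketch initial = new BinnedHistogramSketch(0, 1000, 10);
    for (long v = 1; v <= 100; v++) {
      initial.add(v);
    }
    System.out.println("p99 with generic bins: " + initial.quantile(0.99));

    // After a reset, the bins cover only [1, 101), so each bin is 10 wide.
    BinnedHistogramSketch refined = initial.resetUsingObservedRange();
    for (long v = 1; v <= 100; v++) {
      refined.add(v);
    }
    System.out.println("p99 with refined bins: " + refined.quantile(0.99));
  }
}
```

With the values 1 through 100, the generic bins put the 99th percentile at 50.0, while the refined bins put it at 96.0, much closer to the true value of 99. A histogram that is never reset stays in the first regime, which is the concern for 99th and 99.9th percentiles.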


