[ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402429#comment-13402429
 ] 

Andrew Wang commented on HBASE-6261:
------------------------------------

@Elliot: Moving averages can be cheaply computed on the existing reservoir 
sample; this issue is more about percentiles. I'm not sure how OpenTSDB factors 
into this, since you'd have to feed the latency stream to OpenTSDB to compute 
percentiles, which seems expensive. Depending on how tight your speed and 
memory constraints are, I think we could do this in HBase at an acceptably low 
cost, or make it configurable.
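
To illustrate the point about the reservoir sample, here is a minimal sketch 
of a fixed-size uniform reservoir (Algorithm R) with a mean and a nearest-rank 
percentile computed from it. This is not HBase's metrics code; the class and 
method names (ReservoirSample, update, mean, percentile) are hypothetical and 
chosen only for this example.

```python
import math
import random


class ReservoirSample:
    """Fixed-size uniform sample of a stream (Algorithm R). Hypothetical name."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.count = 0      # total number of values seen so far
        self.values = []    # the retained sample
        self._rng = random.Random(seed)

    def update(self, value):
        self.count += 1
        if len(self.values) < self.capacity:
            # Reservoir not full yet: keep everything.
            self.values.append(value)
        else:
            # Keep the new value with probability capacity / count,
            # overwriting a uniformly chosen slot.
            j = self._rng.randrange(self.count)
            if j < self.capacity:
                self.values[j] = value

    def mean(self):
        # Cheap: a single pass over the sample.
        return sum(self.values) / len(self.values)

    def percentile(self, p):
        """Nearest-rank percentile of the sample, for p in (0, 100]."""
        ordered = sorted(self.values)
        rank = max(1, math.ceil(p * len(ordered) / 100))
        return ordered[rank - 1]
```

The mean uses every retained value, but a high percentile depends only on the 
few values in the tail: in a 1,000-entry reservoir, roughly 10 samples lie 
above the 99th percentile, so that estimate is much noisier.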

@Ted: The additional cost of sliding windows is somewhat significant (I 
think tens of MB more memory). Both the sliding and non-sliding methods allow 
for arbitrary percentiles. Anyway, I think reporting the 50th, 90th, 95th, and 
99th should satisfy most users. Mixing and matching algorithms is possible and 
probably even advisable, since the more expensive methods are only worth using 
for high-rate streams where accuracy is important. Implementations of the 
cheaper, less accurate algorithms are already available.
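
As a rough sketch of why sliding windows cost more memory, the example below 
approximates a time-based window with rotating buckets and reports the 50th, 
90th, 95th, and 99th percentiles over the live buckets. It is a naive 
stand-in, not the algorithm from either referenced paper: each bucket stores 
its raw values, whereas a real implementation would presumably keep a bounded 
summary per bucket. The names (SlidingWindowPercentiles, update, percentiles) 
are hypothetical, and timestamps are assumed to be non-decreasing.

```python
import math
from collections import deque


class SlidingWindowPercentiles:
    """Naive bucketed sliding window. Hypothetical name, illustration only."""

    def __init__(self, window_seconds, num_buckets):
        self.bucket_seconds = window_seconds / num_buckets
        self.num_buckets = num_buckets
        self.buckets = deque()  # (bucket_index, [values]), oldest first

    def _expire(self, now):
        # Drop buckets that have slid out of the window; return current index.
        current = int(now // self.bucket_seconds)
        oldest_live = current - self.num_buckets + 1
        while self.buckets and self.buckets[0][0] < oldest_live:
            self.buckets.popleft()
        return current

    def update(self, value, now):
        current = self._expire(now)
        if not self.buckets or self.buckets[-1][0] != current:
            self.buckets.append((current, []))
        self.buckets[-1][1].append(value)

    def percentiles(self, now, ps=(50, 90, 95, 99)):
        self._expire(now)
        ordered = sorted(v for _, vals in self.buckets for v in vals)
        if not ordered:
            return {}
        n = len(ordered)
        # Nearest-rank percentile over all values still inside the window.
        return {p: ordered[max(1, math.ceil(p * n / 100)) - 1] for p in ps}
```

Memory here grows with the number of buckets times the per-bucket state, 
which is the extra cost relative to a single non-sliding summary.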
                
> Better approximate high-percentile percentile latency metrics
> -------------------------------------------------------------
>
>                 Key: HBASE-6261
>                 URL: https://issues.apache.org/jira/browse/HBASE-6261
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Andrew Wang
>              Labels: metrics
>         Attachments: Latencyestimation.pdf
>
>
> The existing reservoir-sampling based latency metrics in HBase are not 
> well-suited for providing accurate estimates of high-percentile (e.g. 90th, 
> 95th, or 99th) latency. This is a well-studied problem in the literature (see 
> [1] and [2]); the question is determining which methods best suit our needs 
> and then implementing them.
> Ideally, we should be able to estimate these high percentiles with minimal 
> memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% 
> on 99th). It's also desirable to provide this over different time-based 
> sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour.
> I'll note that this would also be useful in HDFS, or really anywhere latency 
> metrics are kept.
> [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
> [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf


        
