[ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402429#comment-13402429 ]
Andrew Wang commented on HBASE-6261:
------------------------------------
@Elliot: Moving averages can be computed cheaply on the existing reservoir
sample; this issue is about percentiles. I'm not sure how OpenTSDB factors into
this, since you'd have to feed the latency stream to OpenTSDB to compute
percentiles, which seems expensive. Depending on how tight your speed and
memory constraints are, I think we could do this in HBase at an acceptably low
cost, or make it configurable.
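
For illustration, here is a minimal sketch of how a mean and a nearest-rank
percentile can both be read off a fixed-size uniform reservoir sample. It is
not HBase's metrics code: the class name ReservoirSketch, its methods, and the
capacity and seed parameters are all hypothetical, and Algorithm R is assumed
as the sampling scheme.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch (not HBase code): uniform reservoir sample, Algorithm R. */
class ReservoirSketch {
    private final long[] sample;
    private final Random random;
    private long count = 0;

    ReservoirSketch(int capacity, long seed) {
        this.sample = new long[capacity];
        this.random = new Random(seed);
    }

    /** Offer one latency observation to the reservoir. */
    void update(long value) {
        count++;
        if (count <= sample.length) {
            // Still filling the reservoir: keep every observation.
            sample[(int) (count - 1)] = value;
        } else {
            // Keep the new value with probability capacity / count.
            long slot = (long) (random.nextDouble() * count);
            if (slot < sample.length) {
                sample[(int) slot] = value;
            }
        }
    }

    private int size() {
        return (int) Math.min(count, sample.length);
    }

    /** Mean of the sampled values: one pass, no sorting needed. */
    double mean() {
        int n = size();
        if (n == 0) {
            return 0.0;
        }
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += sample[i];
        }
        return sum / n;
    }

    /** Nearest-rank percentile of the sampled values, for p in (0, 1]. */
    long percentile(double p) {
        int n = size();
        if (n == 0) {
            throw new IllegalStateException("no observations");
        }
        long[] sorted = Arrays.copyOf(sample, n);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * n);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

With a reservoir of about a thousand entries, the 99th percentile is decided by
only the ten or so largest sampled values, which is why the estimate gets noisy
at the high end.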
@Ted: The additional cost of sliding windows is somewhat significant (I
think tens of MB more memory). Both the sliding and non-sliding methods allow
for arbitrary percentiles. Anyway, I think reporting the 50th, 90th, 95th, and
99th should satisfy anyone. Mixing and matching algorithms is possible and
probably even advisable, since the more accurate methods are only worth their
cost for high-rate streams where accuracy is important. Implementations of the
cheaper, less accurate algorithms are already available.
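
To make the memory trade-off concrete, here is a naive exact baseline for a
time-based sliding window. It is only a sketch under stated assumptions: the
class SlidingWindowLatency and its methods are hypothetical, timestamps are
passed in explicitly and assumed to be non-decreasing, and it keeps every
observation in the window rather than using the approximate summaries from the
papers cited in the issue.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

/** Hypothetical exact baseline: retains every observation inside the window. */
class SlidingWindowLatency {
    private final long windowMillis;
    // Each entry is {timestampMillis, latency}, oldest first.
    private final ArrayDeque<long[]> entries = new ArrayDeque<>();

    SlidingWindowLatency(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Record one latency observed at nowMillis (assumed non-decreasing). */
    void update(long nowMillis, long latency) {
        entries.addLast(new long[] {nowMillis, latency});
        evict(nowMillis);
    }

    /** Drop observations at or before nowMillis - windowMillis. */
    private void evict(long nowMillis) {
        while (!entries.isEmpty()
                && entries.peekFirst()[0] <= nowMillis - windowMillis) {
            entries.removeFirst();
        }
    }

    /** Nearest-rank percentiles over the current window, each p in (0, 1]. */
    long[] percentiles(long nowMillis, double... ps) {
        evict(nowMillis);
        if (entries.isEmpty()) {
            throw new IllegalStateException("no observations in window");
        }
        long[] sorted = new long[entries.size()];
        int i = 0;
        for (long[] e : entries) {
            sorted[i++] = e[1];
        }
        Arrays.sort(sorted);
        long[] out = new long[ps.length];
        for (int j = 0; j < ps.length; j++) {
            int rank = (int) Math.ceil(ps[j] * sorted.length);
            out[j] = sorted[Math.max(rank, 1) - 1];
        }
        return out;
    }
}
```

A caller would ask for percentiles(now, 0.50, 0.90, 0.95, 0.99) to get the four
values mentioned above. Memory grows with the request rate times the window
length, which is the cost the approximate sliding-window algorithms are meant
to bound.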
> Better approximate high-percentile percentile latency metrics
> -------------------------------------------------------------
>
> Key: HBASE-6261
> URL: https://issues.apache.org/jira/browse/HBASE-6261
> Project: HBase
> Issue Type: New Feature
> Reporter: Andrew Wang
> Labels: metrics
> Attachments: Latencyestimation.pdf
>
>
> The existing reservoir-sampling based latency metrics in HBase are not
> well-suited for providing accurate estimates of high-percentile (e.g. 90th,
> 95th, or 99th) latency. This is a well-studied problem in the literature (see
> [1] and [2]); the question is determining which methods best suit our needs
> and then implementing them.
> Ideally, we should be able to estimate these high percentiles with minimal
> memory and CPU usage as well as minimal error (e.g. 1% error on the 90th, or
> 0.1% on the 99th). It's also desirable to provide this over different time-based
> sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour.
> I'll note that this would also be useful in HDFS, or really anywhere latency
> metrics are kept.
> [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
> [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf