[ https://issues.apache.org/jira/browse/HBASE-14869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021615#comment-15021615 ]
Yu Li commented on HBASE-14869:
-------------------------------
FWIW, 0-10ms/10-100ms/100-500ms/500-1000ms/>1000ms might be enough?
And at what granularity do you plan to add the latency bands, sir? Puts/gets or
calls? Currently it seems we record the latency of each get within a multi
invocation, but for puts we count the multi as a whole (check
RSRpcServices#doBatchOp). Maybe we need uniform semantics here before computing
bands at the fine granularity?
Another concern about SLAs is that clients doing batch ops may care more about
the service time (latency) of the whole batch than that of each single
put/get. Shall we support both fine (single put/get) and coarse (call)
granularity?
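
To make the two granularities concrete, here is a minimal sketch of recording
one multi call into both per-operation and per-call bands. The class and method
names (LatencyBandCounter, BatchLatencyRecorder, recordBatch) are hypothetical
and not existing HBase code. The band edges are the ones suggested above, and
treating each upper bound as exclusive is an assumption.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch of a latency band counter; not an existing HBase class. */
final class LatencyBandCounter {
  // Exclusive upper bounds (ms) of every band except the last, open-ended one.
  private final long[] upperBoundsMs;
  private final LongAdder[] counts;

  LatencyBandCounter(long... upperBoundsMs) {
    this.upperBoundsMs = upperBoundsMs.clone();
    this.counts = new LongAdder[upperBoundsMs.length + 1];
    for (int i = 0; i < counts.length; i++) {
      counts[i] = new LongAdder();
    }
  }

  /** Increments the first band whose upper bound is above the latency. */
  void record(long latencyMs) {
    int band = 0;
    while (band < upperBoundsMs.length && latencyMs >= upperBoundsMs[band]) {
      band++;
    }
    counts[band].increment();
  }

  long count(int band) {
    return counts[band].sum();
  }
}

/** Hypothetical recorder that keeps fine and coarse bands side by side. */
final class BatchLatencyRecorder {
  // Bands: 0-10ms / 10-100ms / 100-500ms / 500-1000ms / >1000ms.
  final LatencyBandCounter perOperation = new LatencyBandCounter(10, 100, 500, 1000);
  final LatencyBandCounter perCall = new LatencyBandCounter(10, 100, 500, 1000);

  /** Records each single put/get latency, then the whole call's latency once. */
  void recordBatch(long[] operationLatenciesMs, long callLatencyMs) {
    for (long latencyMs : operationLatenciesMs) {
      perOperation.record(latencyMs);
    }
    perCall.record(callLatencyMs);
  }
}
```

Keeping two separate counters means neither view changes the meaning of the
other: a batch of 100 fast puts adds 100 to the fine bands but only 1 to the
coarse ones.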
> Better request latency histograms
> ---------------------------------
>
> Key: HBASE-14869
> URL: https://issues.apache.org/jira/browse/HBASE-14869
> Project: HBase
> Issue Type: Brainstorming
> Reporter: Lars Hofhansl
>
> I just discussed this with a colleague.
> The get, put, etc, histograms that each region server keeps are somewhat
> useless (depending on what you want to achieve of course), as they are
> aggregated and calculated by each region server.
> It would be better to record the number of requests in certain latency
> bands in addition to what we do now.
> For example the number of gets that took 0-5ms, 5-10ms, 10-20ms, 20-50ms,
> 50-100ms, 100-1000ms, > 1000ms, etc. (just as an example, should be
> configurable).
> That way we can do further calculations after the fact, and answer questions
> like: How often did we miss our SLA? Percentage of requests that missed an
> SLA, etc.
> Comments?
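
A minimal sketch of the "after the fact" calculation described in the issue,
assuming each region server exports one request count per latency band. Unlike
per-server percentiles, band counts can simply be summed across servers. The
class and method names (SlaReport, mergeCounts, percentMissed) are hypothetical,
and the sketch assumes the SLA threshold coincides with a band boundary.

```java
import java.util.List;

/** Hypothetical helper for post-hoc SLA math; the names are not HBase API. */
final class SlaReport {
  private SlaReport() {}

  /** Sums per-band request counts reported by several region servers. */
  static long[] mergeCounts(List<long[]> perServerCounts) {
    if (perServerCounts.isEmpty()) {
      return new long[0];
    }
    long[] total = new long[perServerCounts.get(0).length];
    for (long[] counts : perServerCounts) {
      for (int i = 0; i < total.length; i++) {
        total[i] += counts[i];
      }
    }
    return total;
  }

  /**
   * Percentage of requests that fell into firstMissedBand or any slower band.
   * Assumes the SLA threshold is the lower edge of firstMissedBand.
   */
  static double percentMissed(long[] counts, int firstMissedBand) {
    long total = 0;
    long missed = 0;
    for (int i = 0; i < counts.length; i++) {
      total += counts[i];
      if (i >= firstMissedBand) {
        missed += counts[i];
      }
    }
    return total == 0 ? 0.0 : 100.0 * missed / total;
  }
}
```

With the seven example bands from the description, an SLA of 100ms would map to
firstMissedBand = 5 (the 100-1000ms band). A threshold that falls inside a band
cannot be answered exactly, which is one reason to make the bands configurable.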