[
https://issues.apache.org/jira/browse/HBASE-14869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lars Hofhansl updated HBASE-14869:
----------------------------------
Attachment: 14869-v1-0.98.txt
Here's a patch that I tested a bit; it reports the right values.
Changes:
* Renamed "bands" to "ranges".
* Does not report values for ranges in which it hasn't seen a value. That
allows us to use this for operations that take a very short (few ms) or very
long (many minutes) time, without reporting time ranges that make no sense
for the operation.
Question: How should I name these metrics? Presumably they'd be processed
mostly by software, and I have to give them some name.
I chose: "metricname"_start-end, e.g. "Get_0-1", "Get_1-3", "Get_10-30", etc.,
and "Get_>600000" for the last, open-ended range.
Any better ideas?
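
To make the naming question concrete, here is a rough sketch of how such range counters could produce those names. It is not the code in 14869-v1-0.98.txt: the class RangeCounts, its methods, and the bounds used in main are made up for illustration, and it assumes each range includes its upper bound so that the last name ("Get_>30" here) really means "greater than".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of per-range latency counters; not the attached patch.
class RangeCounts {
  private final String name;         // metric prefix, e.g. "Get"
  private final long[] bounds;       // ascending upper bounds in ms (assumed inclusive)
  private final AtomicLong[] counts; // one extra slot for the open-ended last range

  RangeCounts(String name, long... bounds) {
    this.name = name;
    this.bounds = bounds.clone();
    this.counts = new AtomicLong[bounds.length + 1];
    for (int i = 0; i < counts.length; i++) {
      counts[i] = new AtomicLong();
    }
  }

  // Count one observed latency in the first range whose upper bound is >= the value.
  void add(long latencyMs) {
    int i = 0;
    while (i < bounds.length && latencyMs > bounds[i]) {
      i++;
    }
    counts[i].incrementAndGet();
  }

  // Build "name_start-end" -> count, skipping ranges that have never seen a value.
  Map<String, Long> snapshot() {
    Map<String, Long> out = new LinkedHashMap<>();
    long start = 0;
    for (int i = 0; i < counts.length; i++) {
      String key = i < bounds.length
          ? name + "_" + start + "-" + bounds[i]
          : name + "_>" + start;
      long c = counts[i].get();
      if (c > 0) {
        out.put(key, c);
      }
      if (i < bounds.length) {
        start = bounds[i];
      }
    }
    return out;
  }

  public static void main(String[] args) {
    RangeCounts gets = new RangeCounts("Get", 1, 3, 10, 30); // illustrative bounds
    gets.add(0);
    gets.add(2);
    gets.add(2);
    gets.add(500);
    System.out.println(gets.snapshot()); // {Get_0-1=1, Get_1-3=2, Get_>30=1}
  }
}
```

Note that "Get_3-10" and "Get_10-30" never show up in the output because nothing landed in them, which is the "don't report empty ranges" behavior described above.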
> Better request latency histograms
> ---------------------------------
>
> Key: HBASE-14869
> URL: https://issues.apache.org/jira/browse/HBASE-14869
> Project: HBase
> Issue Type: Brainstorming
> Reporter: Lars Hofhansl
> Attachments: 14869-test-0.98.txt, 14869-v1-0.98.txt
>
>
> I just discussed this with a colleague.
> The get, put, etc, histograms that each region server keeps are somewhat
> useless (depending on what you want to achieve of course), as they are
> aggregated and calculated by each region server.
> It would be better to record the number of requests in certain latency
> bands in addition to what we do now.
> For example the number of gets that took 0-5ms, 6-10ms, 10-20ms, 20-50ms,
> 50-100ms, 100-1000ms, > 1000ms, etc. (just as an example, should be
> configurable).
> That way we can do further calculations after the fact, and answer questions
> like: How often did we miss our SLA? Percentage of requests that missed an
> SLA, etc.
> Comments?
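
As a sketch of the "after the fact" calculation the description asks for: given per-band counts (which, unlike percentiles, can simply be summed across region servers), the fraction of requests that missed an SLA falls out directly. The class SlaMath, the method missedFraction, and the numbers in main are hypothetical; the sketch assumes inclusive upper bounds and is exact only when the SLA threshold coincides with a band boundary.

```java
// Hypothetical sketch: fraction of requests slower than an SLA, from band counts.
class SlaMath {
  // bounds[i] is the (assumed inclusive) upper bound in ms of band i;
  // counts has one more entry than bounds, for the open-ended last band.
  static double missedFraction(long[] bounds, long[] counts, long slaMs) {
    long total = 0;
    long missed = 0;
    for (int i = 0; i < counts.length; i++) {
      total += counts[i];
      long lower = (i == 0) ? 0 : bounds[i - 1];
      // A band counts as a miss only when every value in it exceeds the SLA.
      if (lower >= slaMs) {
        missed += counts[i];
      }
    }
    return total == 0 ? 0.0 : (double) missed / total;
  }

  public static void main(String[] args) {
    long[] bounds = {5, 10, 20, 50, 100, 1000};   // bands from the example above
    long[] counts = {900, 50, 20, 10, 10, 8, 2};  // made-up counts, 1000 requests
    System.out.println(missedFraction(bounds, counts, 100)); // 0.01
  }
}
```

If the SLA falls inside a band, this undercounts misses, which is one reason the band boundaries should be configurable.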