[ https://issues.apache.org/jira/browse/HBASE-19285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16257189#comment-16257189 ]
Josh Elser commented on HBASE-19285:
------------------------------------
bq. Wouldn't these be as expensive as the per region latency histogram metrics
were? Since we have to track tables region by region and roll up?
Thanks for the questions! What I currently have staged (working through
connecting all the dots through the hadoop* indirection :P) would be tracking
these by table instead of rolling up Regions. My plan is to re-run the perf
testing that Enis mentioned over on HBASE-17017.
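To make the by-table idea concrete, here is a minimal sketch (not the staged patch). It assumes one histogram per table name that every operation updates directly, so nothing has to be rolled up from Regions. The class name TableLatencySketch, the fixed bucket bounds, and the method names are all hypothetical; none of this is HBase's actual metrics API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical sketch: Get latencies keyed by table name instead of by Region.
class TableLatencySketch {
    // Assumed bucket upper bounds in milliseconds; not HBase's real bucketing.
    static final long[] BOUNDS_MS = {1, 2, 5, 10, 50, 100, 500, 1000, Long.MAX_VALUE};

    // A fixed-bucket histogram shared by all Regions of one table.
    static final class Histo {
        final AtomicLongArray counts = new AtomicLongArray(BOUNDS_MS.length);

        void record(long latencyMs) {
            for (int i = 0; i < BOUNDS_MS.length; i++) {
                if (latencyMs <= BOUNDS_MS[i]) {
                    counts.incrementAndGet(i);
                    return;
                }
            }
        }

        // Upper bound of the bucket that contains the requested percentile.
        long percentileUpperBound(double pct) {
            long total = 0;
            for (int i = 0; i < counts.length(); i++) {
                total += counts.get(i);
            }
            if (total == 0) {
                return 0;
            }
            long target = (long) Math.ceil(total * pct / 100.0);
            long seen = 0;
            for (int i = 0; i < counts.length(); i++) {
                seen += counts.get(i);
                if (seen >= target) {
                    return BOUNDS_MS[i];
                }
            }
            return BOUNDS_MS[BOUNDS_MS.length - 1];
        }
    }

    private final Map<String, Histo> getLatencies = new ConcurrentHashMap<>();

    // Called once per Get; the Region is irrelevant, only the table name is used.
    void updateGet(String table, long latencyMs) {
        getLatencies.computeIfAbsent(table, t -> new Histo()).record(latencyMs);
    }

    long getPercentile(String table, double pct) {
        Histo h = getLatencies.get(table);
        return h == null ? 0 : h.percentileUpperBound(pct);
    }

    // Number of histograms kept: grows with tables, not with Regions.
    int histogramCount() {
        return getLatencies.size();
    }
}
```

In this sketch the number of histograms grows with the number of tables rather than the number of Regions, but each operation still pays for a map lookup and a histogram update, which is the cost the perf testing would need to measure.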
My _hope_ is that by having fewer histograms, we'll see some amortization of
the cost but, admittedly, I'm not optimistic. My expectation is that, with
something in front of me that can show the bottleneck, a better solution will
eventually present itself. In some of those older issues, Enis did mention how
doing the aggregations in a separate thread would be better, but I'm not sure
how we would do that for op latencies. Taking the advice to try out
HdrHistogram (or some other library) is also rolling around in my head. I
assume that this will be a tricky one to work out :)
At the end of the day, and echoing some comments offline by Clay, operators
lost some really nice information by the removal of the per-region histos. For
multi-tenant installations, there's no longer a mechanism to track "typical"
performance characteristics for a table which makes it impossible to know when
performance is "atypical" (and something that should be alerted on).
> Add per-table latency histograms
> --------------------------------
>
> Key: HBASE-19285
> URL: https://issues.apache.org/jira/browse/HBASE-19285
> Project: HBase
> Issue Type: Bug
> Components: metrics
> Reporter: Clay B.
> Assignee: Josh Elser
> Priority: Critical
> Fix For: 2.0.0, 1.4.0, 1.3.3
>
>
> HBASE-17017 removed the per-region latency histograms (e.g. Get, Put, Scan at
> p75, p85, etc)
> HBASE-15518 added some per-table metrics, but not the latency histograms.
> Given the previous conversations, it seems like these per-table
> aggregations weren't intentionally omitted, just never re-implemented after
> the per-region removal. They're some really nice out-of-the-box metrics we
> can provide to our users/admins as long as it's not detrimental.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)