[ 
https://issues.apache.org/jira/browse/HBASE-19285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16257189#comment-16257189
 ] 

Josh Elser commented on HBASE-19285:
------------------------------------

bq. Wouldn't these be as expensive as the per region latency histogram metrics 
were? Since we have to track tables region by region and roll up?

Thanks for the questions! What I currently have staged (working through 
connecting all the dots through the hadoop* indirection :P) would be tracking 
these by table instead of rolling up Regions. My plan is to re-run the perf 
testing that Enis mentioned over on HBASE-17017.
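
As a rough illustration of tracking by table rather than rolling up Regions, here is a minimal sketch that keeps one fixed-bucket histogram per table name. Everything in it is an assumption made for illustration: the class name, the method names, and the bucket boundaries are hypothetical, and this is not the staged patch or HBase's actual metrics API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch: one latency histogram per table, not per region. */
public class PerTableLatencySketch {
    // Assumed bucket upper bounds in ms; the last bucket catches everything else.
    private static final long[] BOUNDS_MS = {1, 5, 10, 50, 100, 500, 1000, Long.MAX_VALUE};

    // Keyed by table name, so the histogram count grows with tables, not regions.
    private final Map<String, LongAdder[]> histograms = new ConcurrentHashMap<>();

    private static LongAdder[] newBuckets() {
        LongAdder[] buckets = new LongAdder[BOUNDS_MS.length];
        for (int i = 0; i < buckets.length; i++) {
            buckets[i] = new LongAdder();
        }
        return buckets;
    }

    /** Record one operation latency against the table's histogram. */
    public void update(String table, long latencyMs) {
        LongAdder[] buckets = histograms.computeIfAbsent(table, t -> newBuckets());
        for (int i = 0; i < BOUNDS_MS.length; i++) {
            if (latencyMs <= BOUNDS_MS[i]) {
                buckets[i].increment();
                return;
            }
        }
    }

    /** Upper bound (ms) of the bucket holding quantile q in (0, 1]; -1 if no data. */
    public long percentileUpperBound(String table, double q) {
        LongAdder[] buckets = histograms.get(table);
        if (buckets == null) {
            return -1;
        }
        long total = 0;
        for (LongAdder bucket : buckets) {
            total += bucket.sum();
        }
        if (total == 0) {
            return -1;
        }
        long rank = (long) Math.ceil(q * total);
        long seen = 0;
        for (int i = 0; i < buckets.length; i++) {
            seen += buckets[i].sum();
            if (seen >= rank) {
                return BOUNDS_MS[i];
            }
        }
        return BOUNDS_MS[BOUNDS_MS.length - 1];
    }
}
```

A caller would invoke `update(tableName, latencyMs)` once per operation and read `percentileUpperBound(tableName, 0.75)` at reporting time. The sketch only shows the bookkeeping shape; it says nothing about whether the per-update cost is acceptable, which is what the perf testing is meant to answer.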

My _hope_ is that by having fewer histograms, we'll see some amortization of 
the cost but, admittedly, I'm not optimistic. My expectation is that, once I 
have something in front of me that can show the bottleneck, a better solution 
will eventually present itself. In some of those older issues, Enis did mention 
how doing the aggregations in a separate thread would be better, but I'm not 
sure how we would do that for op latencies. Taking the advice to try out 
HdrHistogram (or some other library) is also rolling around in my head. I 
assume that this will be a tricky one to work out :)

At the end of the day, and echoing some comments offline by Clay, operators 
lost some really nice information by the removal of the per-region histos. For 
multi-tenant installations, there's no longer a mechanism to track "typical" 
performance characteristics for a table which makes it impossible to know when 
performance is "atypical" (and something that should be alerted on).

> Add per-table latency histograms
> --------------------------------
>
>                 Key: HBASE-19285
>                 URL: https://issues.apache.org/jira/browse/HBASE-19285
>             Project: HBase
>          Issue Type: Bug
>          Components: metrics
>            Reporter: Clay B.
>            Assignee: Josh Elser
>            Priority: Critical
>             Fix For: 2.0.0, 1.4.0, 1.3.3
>
>
> HBASE-17017 removed the per-region latency histograms (e.g. Get, Put, Scan at 
> p75, p85, etc).
> HBASE-15518 added some per-table metrics, but not the latency histograms.
> Given the previous conversations, it seems like these per-table 
> aggregations weren't intentionally omitted, just never re-implemented after 
> the per-region removal. They're some really nice out-of-the-box metrics we 
> can provide to our users/admins as long as it's not detrimental.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
