[ 
https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234121#comment-13234121
 ] 

stack commented on HBASE-5533:
------------------------------

I'm +1 on committing to trunk.  Would like to see how it does there first 
before bringing it back.  This is a nice addition.

Shaneal, do you think there'll be an issue because you are measuring here with 
nanotime whereas elsewhere we use System.currentTimeMillis for measurements?  
Do you think this will ever make for a discrepancy, or do you think that in the 
wash a slow call is a slow call and, between the jigs and reels, the nanotime 
counts will be about the same as the millisecond counts?
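
To make the question concrete, here is a rough sketch that times the same 
operation with both clocks.  ClockComparison, Result and time() are 
hypothetical names for illustration only, not anything in HBase or in the 
patch.  System.nanoTime is only meaningful for elapsed-time deltas, while 
System.currentTimeMillis follows the wall clock and can be coarser or be 
adjusted underneath us, so small differences between the two are expected.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of HBase: times one operation with both clocks.
final class ClockComparison {

    /** Elapsed time for the same operation as seen by each clock, in ms. */
    static final class Result {
        final long wallClockMillis;   // from System.currentTimeMillis()
        final long monotonicMillis;   // from System.nanoTime(), converted to ms

        Result(long wallClockMillis, long monotonicMillis) {
            this.wallClockMillis = wallClockMillis;
            this.monotonicMillis = monotonicMillis;
        }
    }

    static Result time(Runnable op) {
        long startMillis = System.currentTimeMillis();
        long startNanos = System.nanoTime();
        op.run();
        long elapsedNanos = System.nanoTime() - startNanos;
        long elapsedMillis = System.currentTimeMillis() - startMillis;
        return new Result(elapsedMillis, TimeUnit.NANOSECONDS.toMillis(elapsedNanos));
    }

    public static void main(String[] args) {
        Result r = time(() -> {
            try {
                Thread.sleep(50);   // stand-in for a slow call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("currentTimeMillis delta: " + r.wallClockMillis + " ms");
        System.out.println("nanoTime delta:          " + r.monotonicMillis + " ms");
    }
}
```

For a call that takes tens of milliseconds the two deltas should normally 
land close together; the gap matters more for very short operations, where 
millisecond granularity rounds most of the signal away.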

Also, what about these calls down in HRegion.get and deep in put where we do:
{code}
    long now = EnvironmentEdgeManager.currentTimeMillis();
{code}

I see that 'now' is then used to do things like:

{code}
      HRegion.incrTimeVaryingMetric(metricPrefix + "put_", after - now);
{code}

... is there overlap between these old metrics and what you are adding?  Or, 
since your additions work at a higher level up in HRegionServer rather than 
long after the get or put has started, should we be pulling these metrics up 
into HRS so they envelop more of the server operation than they currently do?
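
As an illustration of the difference between timing at the two levels, a 
minimal sketch follows.  EnvelopeTiming, regionPut and serverPut are 
hypothetical stand-ins, not the real HRegion or HRegionServer methods, and 
the LongSupplier is an assumed stand-in for an injectable clock.  The point 
is only that a timer started at the server entry point also counts work done 
before the region-level timer starts.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch, not HBase code: a timer started at the server entry
// point reports more time than one started inside the region-level call.
final class EnvelopeTiming {
    private final LongSupplier clock;   // assumed injectable millisecond clock
    long innerMillis;                   // time seen by the region-level timer
    long outerMillis;                   // time seen by the server-level timer

    EnvelopeTiming(LongSupplier clock) {
        this.clock = clock;
    }

    // Stand-in for the region-level put: only the write itself is timed.
    void regionPut(Runnable write) {
        long now = clock.getAsLong();
        write.run();
        long after = clock.getAsLong();
        innerMillis = after - now;
    }

    // Stand-in for the server-level handler: the timer also covers pre-work.
    void serverPut(Runnable preWork, Runnable write) {
        long start = clock.getAsLong();
        preWork.run();
        regionPut(write);
        outerMillis = clock.getAsLong() - start;
    }

    public static void main(String[] args) {
        long[] fakeNow = {0};   // fake clock so the numbers are deterministic
        EnvelopeTiming t = new EnvelopeTiming(() -> fakeNow[0]);
        // 7 ms of pre-work at the server level, then a 20 ms write in the region.
        t.serverPut(() -> fakeNow[0] += 7, () -> fakeNow[0] += 20);
        System.out.println("inner=" + t.innerMillis + "ms outer=" + t.outerMillis + "ms");
        // prints "inner=20ms outer=27ms"
    }
}
```

If both timers stay in place, the two numbers describe different spans of 
the same request, so they would overlap rather than duplicate each other.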

Good stuff.

                
> Add more metrics to HBase
> -------------------------
>
>                 Key: HBASE-5533
>                 URL: https://issues.apache.org/jira/browse/HBASE-5533
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.92.2, 0.94.0
>            Reporter: Shaneal Manek
>            Assignee: Shaneal Manek
>            Priority: Minor
>         Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, 
> TimingOverhead.java, hbase-5533-0.92.patch, hbase5533-0.92-v2.patch, 
> hbase5533-0.92-v3.patch, hbase5533-0.92-v5.patch, histogram_web_ui.png
>
>
> To debug/monitor production clusters, there are some more metrics I wish I 
> had available.
> In particular:
> - Although the average FS latencies are useful, a 'histogram' of recent 
> latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) 
> would be more useful
> - Similar histograms of latencies on common operations (GET, PUT, DELETE) 
> would be useful
> - Counting the number of accesses to each region to detect hotspotting
> - Exposing the current number of HLog files


        
