[jira] [Updated] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

Enis Soztutar (JIRA) Thu, 25 May 2017 17:25:02 -0700

     [ https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Enis Soztutar updated HBASE-15160:
----------------------------------
    Attachment: hbase-15160_v6.patch

[~carp84] how about this patch?
I've removed the extra counters and now pass a boolean down from the 
getMetaBlock() function so that metrics are not updated for the meta blocks.
We cannot move the timing and the metrics updates up the stack because the 
callers of readBlock() do not know whether the returned block was read from 
disk or came from the cache. Is it easy enough for you to replicate the 
YCSB tests? I've done some basic testing and did not find a meaningful perf 
regression. 
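
To illustrate the shape of the change, here is a minimal, self-contained 
sketch. It is not the code in the attached patch: the method names 
readBlock() and getMetaBlock() and the metric name fsReadLatency come from 
this issue, while ReadLatencyMetric, SketchReader, the updateMetrics 
parameter, the in-memory cache and readFromDisk() are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a latency metric; not HBase's metrics API. */
class ReadLatencyMetric {
    final AtomicLong samples = new AtomicLong();
    final AtomicLong totalNanos = new AtomicLong();

    void update(long nanos) {
        samples.incrementAndGet();
        totalNanos.addAndGet(nanos);
    }
}

/** Toy reader: names mirror the discussion, bodies are illustrative only. */
class SketchReader {
    // Assumption: a plain map stands in for the block cache.
    private final Map<Long, byte[]> cache = new HashMap<>();
    final ReadLatencyMetric fsReadLatency = new ReadLatencyMetric();

    /**
     * Only this method knows whether the block came from cache or from disk,
     * so the timing has to live here rather than in the callers.
     */
    byte[] readBlock(long offset, boolean updateMetrics) {
        byte[] cached = cache.get(offset);
        if (cached != null) {
            return cached; // cache hit: no filesystem read, nothing to time
        }
        long start = System.nanoTime();
        byte[] block = readFromDisk(offset);
        long elapsed = System.nanoTime() - start;
        if (updateMetrics) {
            fsReadLatency.update(elapsed);
        }
        cache.put(offset, block);
        return block;
    }

    /** Meta blocks pass false so they never touch the latency metric. */
    byte[] getMetaBlock(long offset) {
        return readBlock(offset, false);
    }

    // Hypothetical placeholder for the actual HDFS read.
    private byte[] readFromDisk(long offset) {
        return new byte[] {(byte) offset};
    }
}
```

In this sketch a second readBlock() call for the same offset is served from 
the cache and records no sample, and getMetaBlock() records none either.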

BTW, these metrics would have saved us days' worth of debugging in a recent 
case, so let's get this patch in one way or another. 

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-15160
>                 URL: https://issues.apache.org/jira/browse/HBASE-15160
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0, 1.1.2
>            Reporter: Yu Li
>            Assignee: Yu Li
>         Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, was removed. There was some 
> discussion about putting it back in a new JIRA, but that never happened. 
> In our experience, these metrics are useful for judging whether the issue 
> lies in HDFS when a slow request occurs, so we propose to put them back in 
> this JIRA and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
