[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-06-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039970#comment-16039970
 ] 

Hudson commented on HBASE-15160:


FAILURE: Integrated in Jenkins build HBase-1.4 #763 (See 
[https://builds.apache.org/job/HBase-1.4/763/])
HBASE-15160 Put back HFile's HDFS op latency sampling code and add (enis: rev 
ea3075e7fd2db40c7d8d74a4845a04b657b0d5d7)
* (add) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/io/MetricsIOWrapper.java
* (add) 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/MetricsIOWrapperImpl.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
* (add) 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/io/MetricsIOSourceImpl.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
* (add) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/io/MetricsIOSource.java
* (add) hbase-server/src/main/java/org/apache/hadoop/hbase/io/MetricsIO.java
* (edit) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceFactory.java
* (edit) 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceFactoryImpl.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV3.java
* (add) hbase-server/src/test/java/org/apache/hadoop/hbase/io/TestMetricsIO.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileEncryption.java


> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch, hbase-15160_v7.patch, hbase-15160_v8.branch-1.patch, 
> hbase-15160_v8.branch-1.patch, hbase-15160_v8.branch-1.patch, 
> hbase-15160_v8.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-06-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039635#comment-16039635
 ] 

Hadoop QA commented on HBASE-15160:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 14s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
12s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s 
{color} | {color:green} branch-1 passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s 
{color} | {color:green} branch-1 passed with JDK v1.7.0_131 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
34s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
50s {color} | {color:green} branch-1 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 52s 
{color} | {color:red} hbase-server in branch-1 has 1 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s 
{color} | {color:green} branch-1 passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s 
{color} | {color:green} branch-1 passed with JDK v1.7.0_131 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
3s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s 
{color} | {color:green} the patch passed with JDK v1.7.0_131 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 56s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
16m 16s {color} | {color:green} The patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s 
{color} | {color:green} the patch passed with JDK v1.7.0_131 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 19s 
{color} | {color:green} hbase-hadoop-compat in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s 
{color} | {color:green} hbase-hadoop2-compat in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 12s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
54s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 144m 59s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestReplicasClient |
|   | 

[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-06-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16037508#comment-16037508
 ] 

Hadoop QA commented on HBASE-15160:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 32s 
{color} | {color:red} Docker failed to build yetus/hbase:58c504e. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12871291/hbase-15160_v8.branch-1.patch
 |
| JIRA Issue | HBASE-15160 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7079/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch, hbase-15160_v7.patch, hbase-15160_v8.branch-1.patch, 
> hbase-15160_v8.branch-1.patch, hbase-15160_v8.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-06-05 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16037504#comment-16037504
 ] 

Enis Soztutar commented on HBASE-15160:
---

Docker failed to download Java 7. Let me reattach the patch. 

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch, hbase-15160_v7.patch, hbase-15160_v8.branch-1.patch, 
> hbase-15160_v8.branch-1.patch, hbase-15160_v8.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-06-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16037402#comment-16037402
 ] 

Hadoop QA commented on HBASE-15160:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 24s 
{color} | {color:red} Docker failed to build yetus/hbase:58c504e. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12871279/hbase-15160_v8.branch-1.patch
 |
| JIRA Issue | HBASE-15160 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7076/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch, hbase-15160_v7.patch, hbase-15160_v8.branch-1.patch, 
> hbase-15160_v8.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-06-03 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035892#comment-16035892
 ] 

Yu Li commented on HBASE-15160:
---

Good findings [~yangzhe1991], the analysis makes sense, let's get HBASE-18151 
fixed and recheck the perf of using nano.

And thanks for all the efforts [~enis].



> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch, hbase-15160_v7.patch, hbase-15160_v8.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-06-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035790#comment-16035790
 ] 

Hudson commented on HBASE-15160:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #3125 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/3125/])
HBASE-15160 Put back HFile's HDFS op latency sampling code and add (enis: rev 
118429cbac0d71d57d574db825b2f077146a961e)
* (edit) 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceFactoryImpl.java
* (add) 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/io/MetricsIOSourceImpl.java
* (edit) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceFactory.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
* (add) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/io/MetricsIOWrapper.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderImpl.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterImpl.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
* (add) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/io/MetricsIOSource.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileEncryption.java
* (add) 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/MetricsIOWrapperImpl.java
* (add) hbase-server/src/test/java/org/apache/hadoop/hbase/io/TestMetricsIO.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV3.java
* (add) hbase-server/src/main/java/org/apache/hadoop/hbase/io/MetricsIO.java


> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch, hbase-15160_v7.patch, hbase-15160_v8.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-06-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035718#comment-16035718
 ] 

Hadoop QA commented on HBASE-15160:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 35s {color} 
| {color:red} HBASE-15160 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.3.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12871092/hbase-15160_v8.patch |
| JIRA Issue | HBASE-15160 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7056/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch, hbase-15160_v7.patch, hbase-15160_v8.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-06-02 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035209#comment-16035209
 ] 

Enis Soztutar commented on HBASE-15160:
---

Good finding. The histogram is supposed to adjust itself everytime 
{{snapshotAndReset()}} is called which is every collection interval (10 secs 
for example). Maybe this mechanism is not working as well as it should. 
Anyways, we can pursue further in HBASE-18151. 

Using millis is fine for now until HBASE-18151 is fixed. We typically use nanos 
in newer code bases to have more granularity, but in this specific instance it 
is better to have millis metrics than no metrics at all. Let me commmit v7 
patch with a small renaming of latencyNanos to latencyMillis. 

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch, hbase-15160_v7.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-06-02 Thread Phil Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034517#comment-16034517
 ] 

Phil Yang commented on HBASE-15160:
---

File a new issue HBASE-18151 to fix the Histogram problems. I think we can use 
ms-level here before we fix the issue.

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch, hbase-15160_v7.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-06-02 Thread Phil Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034497#comment-16034497
 ] 

Phil Yang commented on HBASE-15160:
---

[~carp84] I think the performance issue is not because the performance of 
nanoTime (although it is a little slower than currentTimeMillis), the issue is 
Histogram use several buckets to estimate percentile and it assums a uniform 
distribution within a range and the max value is only 1000 or maybe less (see 
https://github.com/apache/hbase/blob/master/hbase-metrics/src/main/java/org/apache/hadoop/hbase/metrics/impl/FastLongHistogram.java#L69-L93
 ) So if you use nanoTime, all values is much longer than 1000 and they are all 
in the last bucket. Then the bottleneck is single one LongAdder.

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch, hbase-15160_v7.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-06-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034454#comment-16034454
 ] 

Hadoop QA commented on HBASE-15160:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 41s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
58s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
31s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
36s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
47s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s 
{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
6s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
32s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
30m 25s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
53s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 19s 
{color} | {color:green} hbase-hadoop-compat in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s 
{color} | {color:green} hbase-hadoop2-compat in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 48s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
26s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 69m 57s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.locking.TestLockProcedure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:757bf37 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870938/hbase-15160_v7.patch |
| JIRA Issue | HBASE-15160 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 316a455ddde0 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 
09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 5491714 |
| Default Java | 1.8.0_131 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7048/artifact/patchprocess/patch-unit-hbase-server.txt
 |
| unit test logs 

[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-06-02 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034377#comment-16034377
 ] 

Yu Li commented on HBASE-15160:
---

[~enis] Please let me know your thoughts on changing the counting granularity 
from nano to milli seconds. Thanks. 

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch, hbase-15160_v7.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-06-02 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034265#comment-16034265
 ] 

Yu Li commented on HBASE-15160:
---

Here is the performance testing result:
|| Case || Round || Throughput (ops/s)|| AverageLatency(us)||
| w/o patch| 1 | 120820.29 | 26309.65 |
| | 2|122079.26|26019.93|
|w/ patch v6| 1 | 85544.53| 37222.54|
| |2|87071.61|36563.49|


Test details:
{noformat}
# HBase Settings
-Xmx49152m -Xms49152m -Xmn6144m -XX:SurvivorRatio=2
hfile.block.cache.size => 0.16
hbase.regionserver.handler.count => 192
hbase.rpc.server.impl => org.apache.hadoop.hbase.ipc.NettyRpcServer

# YCSB settings
recordcount=1100 (11M)
fieldcount=1
fieldlength=1024
table schemaļ¼š{NAME => 'cf', DATA_BLOCK_ENCODING => 'DIFF', VERSIONS=> '1', 
COMPRESSION => 'SNAPPY', IN_MEMORY => 'false', BLOCKCACHE => 'true'},{SPLITS => 
(1..9).map {|i| "user#{1000+i*(-1000)/9}"}, METADATA => 
{'hbase.hstore.block.storage.policy' => 'ALL_SSD'}}

LRUCache hit ratio: 93%
{noformat}

Checking the patch in more details, I doubt the regression comes from the 
{{System.nanoTime}} call, and will change it to {{System.currentTimeMillis}} 
and re-test

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-05-31 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16031580#comment-16031580
 ] 

Enis Soztutar commented on HBASE-15160:
---

bq. Let me check YCSB to make sure no performance regression with it (should be 
since the current patch is quite similar to the one we're running online).
Thanks [~carp84]. Appreciate it. 

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-05-30 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030526#comment-16030526
 ] 

Yu Li commented on HBASE-15160:
---

bq. why do you think that updating the metrics should be pulled up the stack?
Since there's a synchronization in {{getMetaBlock}} and up the stack we could 
record the IO time of {{getMetaBlock}} outside the lock. But considering in 
real world meta is cached for most case, it's ok to exclude the IO time of it. 
So this is not an argument but just a clarification of my thought since you 
asked (smile).

I think reasons like making it easier for backport stands, so current patch 
LGTM. Let me check YCSB to make sure no performance regression with it (should 
be since the current patch is quite similar to the one we're running online).

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-05-30 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030045#comment-16030045
 ] 

Enis Soztutar commented on HBASE-15160:
---

bq. Previously the concern on readAtOffset completely make sense, but 
HBASE-17917 has removed the stream lock so no more stream read when pread is 
true, which makes it possible to move the updating of the metrics up to the 
caller (smile).
Agreed that with the stream lock gone, we can always know when it was a pread 
and when it was not from the caller. However, why do you think that updating 
the metrics should be pulled up the stack? Since there are no other 
synchronization points, they are equal in terms of cost. The reason I wanted to 
be pushed down the stack is that in some cases (for example checksum failure) 
we are doing two reads transparently to the caller. The metrics pulled up the 
stack will be incorrect slightly when things like this happens. Also I want to 
backport this to branch-1, so keeping the metrics update here should give us 
better portability of future patches. 


> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-05-27 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027352#comment-16027352
 ] 

Yu Li commented on HBASE-15160:
---

Have checked the patch and some comments:

bq. the callers of readBlock() do not know whether the returned block is read 
from disk, or comes from cache
I could see there's a {{if (cacheConf.isBlockCacheEnabled())}} check in 
{{HFileReaderImpl#readBlock}} where the cached block will be returned if hit, 
so we could simply update the metrics outside the if check? And with the same 
method we could also record the IO time of {{getMetaBlock}} in the finally 
clause (if cache missed). Wdyt?

Previously the concern on {{readAtOffset}} completely make sense, but 
HBASE-17917 has removed the stream lock so no more stream read when {{pread}} 
is true, which makes it possible to move the updating of the metrics up to the 
caller (smile).

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-05-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025893#comment-16025893
 ] 

Hadoop QA commented on HBASE-15160:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 36s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 41s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
46s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 24s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
2s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
12s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 
19s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 8s 
{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 32s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
0s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
0s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
69m 12s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 8s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 58s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s 
{color} | {color:green} hbase-hadoop-compat in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 48s 
{color} | {color:green} hbase-hadoop2-compat in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 210m 18s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 
8s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 323m 41s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAsyncProcedureAdminApi |
|   | hadoop.hbase.client.TestAsyncBalancerAdminApi |
| Timed out junit tests | 
org.apache.hadoop.hbase.filter.TestFuzzyRowFilterEndToEnd |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.03.0-ce Server=17.03.0-ce Image:yetus/hbase:757bf37 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869966/hbase-15160_v6.patch |
| JIRA Issue | HBASE-15160 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux ef96a7877583 4.8.3-std-1 #1 SMP Fri Oct 21 11:15:43 UTC 2016 
x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / f441ca0 |
| Default Java | 1.8.0_131 |
| findbugs | v3.0.0 |
| 

[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-05-25 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025754#comment-16025754
 ] 

Yu Li commented on HBASE-15160:
---

Thanks for reviving this [~enis], will check the patch and try the YCSB tests 
ASAP

bq. BTW, these metrics would have saved us days worth of debugging in a recent 
case, so let's get this patch in one way or the other.
Understood, the same in our case and it's the exact reason we already put this 
(a customized version) online for one and a half years (smile).

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Critical
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, 
> hbase-15160_v6.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2017-03-15 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15926523#comment-15926523
 ] 

Yu Li commented on HBASE-15160:
---

[~enis] Is it possible for us to get back on this and move on sir? In some 
[email 
thread|http://mail-archives.apache.org/mod_mbox/hbase-dev/201703.mbox/%3CCAM7-19%2BSVg%2BTWg13UO3%3DTQz3KQSoVu3FkaBLdR%3DdEb2fRimCKw%40mail.gmail.com%3E]
 user is asking when this could get committed. Thanks.

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-04-25 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15257501#comment-15257501
 ] 

Yu Li commented on HBASE-15160:
---

Sorry for the late response [~enis], busy resolving some online issues 
recently...

bq. are you also seeing these counters get reset?
No, never observed counters reset. Also checked source of {{MutableHistogram}} 
(both master and branch-1) and there should be no reset call to {{count}}. 
Could you tell more details about "the histograms are reset"? Thanks.

bq. Did you try with block cache disabled
Yes, the test was against a table with BLOCKCACHE=>false. Actually, this is 
part of the reason for the regression observed in HBASE-15619, you could find 
more details about the test there.

bq. since even if argument pread=false, we might end up doing a pread if we 
cannot get the lock
Oh yes, you are right, didn't quite notice this part... Then it's hard to 
decide whether to sacrifice such accuracy for performance... Personally, maybe 
I'd still choose performance.

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-04-22 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254361#comment-15254361
 ] 

Enis Soztutar commented on HBASE-15160:
---

bq. 1. From the latest patch, we're adding keys for read/write count, could you 
clarify the reason for this when we already have the num_ops couting in 
histogram?
What I have noticed is that, the num_ops coming from the histograms are reset 
everytime the histograms are reset. We are relying on these counts at the 
regionserver level as well (like get_numOps, etc), but I think it is wrong and 
very hard to interpret because 

bq. 2. Since the read op happens inside a lock in HFileReaderImpl#getMetaBlock, 
cost of update histogram hurts, and confirmed to be the root cause of the ~3% 
performance regression in my test
Thanks [~carp84] for the perf test. I did not put up this patch against YCSB 
yet. Did you try with block cache disabled? The metrics will only get updated 
when an actual read happens obviously, so I was thinking of doing the test with 
block cache turned off. Let me try your suggestion.

I originally changed the location for the histogram update to be inside the 
HFileBlock.readAtOffset() rather than at the HFileReader level since even if 
argument {{pread=false}}, we might end up doing a {{pread}} if we cannot get 
the lock. Otherwise reporting for pread vs read will be slightly wrong if there 
is contention for the input stream lock. 

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-04-21 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253243#comment-15253243
 ] 

Yu Li commented on HBASE-15160:
---

Thanks for the efforts [~enis]! Some review comments below:

1. From the latest patch, we're adding keys for read/write count, could you 
clarify the reason for this when we already have the num_ops couting in 
histogram?
{noformat}
  "beans" : [ {
"name" : "Hadoop:service=HBase,name=RegionServer,sub=IO",
"modelerType" : "RegionServer,sub=IO",
"tag.Context" : "regionserver",
"tag.Hostname" : "hadoop0166.su18.tbsite.net",
"FsWriteTime_num_ops" : 12049,
"FsWriteTime_min" : 46284,
"FsWriteTime_max" : 133946271,
{noformat}

2. Since the read op happens inside a lock in {{HFileReaderImpl#getMetaBlock}}, 
cost of update histogram hurts, and confirmed to be the root cause of the ~3% 
performance regression in my test. I'd suggest to record time of the whole 
{{readBlockData}} call instead of inside {{readAtOffset}} and update the 
histogram out of the synchronized block, which will save the performance 
although causing metrics not that accurate. Excerpt of codes below:
{code:title=HFileReaderImpl#getMetaBlock|borderStyle=solid}
// Per meta key from any given file, synchronize reads for said block. This
// is OK to do for meta blocks because the meta block index is always
// single-level.
synchronized (metaBlockIndexReader.getRootBlockKey(block)) {
  ...
  HFileBlock metaBlock = fsBlockReader.readBlockData(metaBlockOffset, 
blockSize, true).
  unpack(hfileContext, fsBlockReader);
  ...
  return metaBlock;
}
{code}

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253191#comment-15253191
 ] 

Hadoop QA commented on HBASE-15160:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
4s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 23s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
39s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
34s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
40s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 17s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 17s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
32s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
25m 21s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 15s 
{color} | {color:green} hbase-hadoop-compat in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s 
{color} | {color:green} hbase-hadoop2-compat in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 124m 50s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
42s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 174m 10s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12800123/hbase-15160_v5.patch |
| JIRA Issue | HBASE-15160 |
| Optional Tests 

[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15252944#comment-15252944
 ] 

Hadoop QA commented on HBASE-15160:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 7s {color} 
| {color:red} HBASE-15160 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.2.1/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12800099/hbase-15160_v4.patch |
| JIRA Issue | HBASE-15160 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/1542/console |
| Powered by | Apache Yetus 0.2.1   http://yetus.apache.org |


This message was automatically generated.



> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch, hbase-15160_v4.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-04-14 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241563#comment-15241563
 ] 

Yu Li commented on HBASE-15160:
---

bq. Mind if I finish that, and you can do a review?
Sure, then will wait for your patch sir, and will do review (actually I've got 
one at hand also :-)). Thanks.

bq. The JHM test using the mutable histograms was done against the mutable 
histogram implementation from HBase-1.2.0.
Ah I see, then no wonder the difference, my testing was using implementation of 
master branch.

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-04-14 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240975#comment-15240975
 ] 

Enis Soztutar commented on HBASE-15160:
---

Thanks [~carp84] for the update. I have a half-finished patch for the next 
version of this. Mind if I finish that, and you can do a review? 

bq. Benchmarking with the JMH case Enis wrote up we could see using 
MutableHistogram is less expensive thus promising. 
BTW, I could not update the parent issue. The JHM test using the mutable 
histograms was done against the mutable histogram implementation from 
HBase-1.2.0. This is Hadoop's slow histograms. HBASE-15222 makes a huge 
difference in terms of the micro-benchmark numbers, so the comparison between 
inline histogram update versus ArrayBlockingQueue is even higher. 

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-04-13 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239224#comment-15239224
 ] 

Yu Li commented on HBASE-15160:
---

Back to this JIRA. Confirmed that under heavy random read(get) load, the 
current patch using BlockingQueue to collect metrics will cause *~3%* 
performance regression. Benchmarking with the JMH case Enis wrote up we could 
see using MutableHistogram is less expensive thus promising. Will re-implement 
with MutableHistogram and see how it works out.

[~enis], mind if I take this one back? Or you already have something in 
progress? Thanks.

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-03-09 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188588#comment-15188588
 ] 

Yu Li commented on HBASE-15160:
---

Sure, please go ahead [~enis], and thanks for offering help. Sorry for the lag, 
kinda busy recently resolving online issues...

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-03-09 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188526#comment-15188526
 ] 

Enis Soztutar commented on HBASE-15160:
---

[~carp84] did you get a chance to update the patch? I can take this on if you 
want. 

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-03-03 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177513#comment-15177513
 ] 

Yu Li commented on HBASE-15160:
---

bq. rather than keeping track of individual points like the one in the patch 
and bulk updating the histograms frequently
Ok, got your point now, and makes sense. The BlockingQueue part is a simply 
revert of HBASE-11586, and I didn't think much about it... Will prepare a patch 
based on using histogram inline, thanks for the explanation [~enis]

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-03-02 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176718#comment-15176718
 ] 

Enis Soztutar commented on HBASE-15160:
---

bq. Yes, already made the change in the latest patch. 
Ok, I was looking at the following for why we are not using a histogram for 
this: 
{code}
+  private static final BlockingQueue fsReadLatenciesNanos =
+  new ArrayBlockingQueue(LATENCY_BUFFER_SIZE);
+  private static final BlockingQueue fsWriteLatenciesNanos =
+  new ArrayBlockingQueue(LATENCY_BUFFER_SIZE);
{code}

For every RPC and for every operation (get, etc), we already increment counters 
or histograms directly inline, rather than keeping track of individual points 
like the one in the patch and bulk updating the histograms frequently. Since 
num gets > num fs operations in theory, doing the counter updates inline should 
not be a perf regression. This is of course to be verified if possible. 

One other thing is that instead of using the histogram inline (which is based 
on FastLongHistogram / Counters and high perf counters) we are using a 
BlockingQueue which is using a RWLock and in-theory more costly. So doing this 
indirect way maybe even worse than doing inline updates. 

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-03-02 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175475#comment-15175475
 ] 

Yu Li commented on HBASE-15160:
---

Thanks for checking this [~enis]

bq. Yu Li did you see Elliott's review comment above. Using histograms that we 
use elsewhere is the correct way to go.
Yes, already made the change in the latest patch. Please allow me to quote my 
previous description of the latest patch:
{quote}
Update the patch to avoid missing spike. Changes include:
1. Instead of get and reset a single latency point, now we get all latencies of 
operations happened during a collection interval and insert them into the 
histogram in one go
2. Since the histogram updating time differs from the latencies' recording 
time, we introduce another gauge to show the average latency of the collection 
interval
3. With these two metrics, we could see the brief situation from latency gauge 
and check whether there's any spike in this interval from histogram.
{quote}

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-03-01 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174896#comment-15174896
 ] 

Enis Soztutar commented on HBASE-15160:
---

This is still important to get in if we can quantify the perf effect. 

bq. Rather than storing the times, we should just put them into a histogram in 
the metrics system.
[~carp84] did you see Elliott's review comment above. Using histograms that we 
use elsewhere is the correct way to go. 

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15136010#comment-15136010
 ] 

Hadoop QA commented on HBASE-15160:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
53s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s 
{color} | {color:green} master passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s 
{color} | {color:green} master passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 
27s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
35s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
36s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s 
{color} | {color:green} master passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s 
{color} | {color:green} master passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 11s 
{color} | {color:red} hbase-hadoop2-compat in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 53s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 4m 0s 
{color} | {color:red} Patch generated 1 new checkstyle issues in hbase-server 
(total was 126, now 124). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
22m 6s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
56s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 18s 
{color} | {color:green} hbase-hadoop2-compat in the patch passed with JDK 
v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 15s 
{color} | {color:green} hbase-hadoop-compat in the patch passed with JDK 
v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 112m 52s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_72. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 27s 
{color} | {color:green} hbase-hadoop2-compat in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 22s 
{color} | {color:green} hbase-hadoop-compat in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 113m 45s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.7.0_91. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 

[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-02-06 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15135871#comment-15135871
 ] 

Yu Li commented on HBASE-15160:
---

Just noticed this comment... Could you point me to the JIRA number sir? Thanks.

We didn't observe the perf downgrade in our applications, but maybe neglected 
somehow. To be cautious, I'll perform some comparison testing on the scan with 
multiple CF scenario and get back with the number later.

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, 
> HBASE-15160_v3.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-01-27 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15119317#comment-15119317
 ] 

Elliott Clark commented on HBASE-15160:
---

The actual removal was done by me when moving to metrics2. It gave us an almost 
40% throughput boost on scans of tables with lots of CF's.

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117857#comment-15117857
 ] 

Hadoop QA commented on HBASE-15160:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
52s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s 
{color} | {color:green} master passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s 
{color} | {color:green} master passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 
51s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
39s {color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 10s 
{color} | {color:red} hbase-server in master has 1 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} master passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s 
{color} | {color:green} master passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 12s 
{color} | {color:red} hbase-hadoop2-compat in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 11s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 11s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 4m 4s 
{color} | {color:red} Patch generated 1 new checkstyle issues in hbase-server 
(total was 126, now 124). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
24m 8s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s 
{color} | {color:green} hbase-hadoop2-compat in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 20s 
{color} | {color:green} hbase-hadoop-compat in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 30s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 53s 
{color} | {color:green} hbase-hadoop2-compat in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 30s 
{color} | {color:green} hbase-hadoop-compat in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 104m 43s 
{color} 

[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-01-26 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118094#comment-15118094
 ] 

Elliott Clark commented on HBASE-15160:
---

So one of the reasons that the metrics in the deep read paths were turned off 
is that they were really expensive in the tight loops of readers. Those 
errors/hangs seems more than normal on an apache run. Do we have any before and 
after metrics on reads ?

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-01-26 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118531#comment-15118531
 ] 

Yu Li commented on HBASE-15160:
---

Thanks for check here [~eclark]
bq. So one of the reasons that the metrics in the deep read paths were turned 
off is that they were really expensive in the tight loops of readers.
>From HBASE-11586 it seems the HDFS op latency sampling codes were removed 
>because we found them not used/referenced. Not sure whether we did any 
>performance comparison before/after HBASE-11586, but IMHO all metrics will 
>affect performance slightly. Since the HDFS op latency could help us judge 
>whether or not issue happens in HDFS, I think it's worthwhile to add the 
>sampling back.

bq. Those errors/hangs seems more than normal on an apache run
Agreed the errors/hangs seems abnormal, but I observed similar errors in UT of 
HBASE-15163 where changes of the patch should be safe enough. Allow me to 
resubmit patch and check the result here.

bq. Do we have any before and after metrics on reads ?
AFAICS, we will update get metrics at end of RSRpcServices#get, but nothing 
deeper inside for reads. While for writes we have the syncTime metrics which 
will get updated in postSync

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch, HBASE-15160_v2.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-01-25 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116770#comment-15116770
 ] 

Yu Li commented on HBASE-15160:
---

I failed to find any histogram for compaction time but I guess you mean like 
splitTimeHisto. Please check the updated patch in rb and let me know whether it 
follows your thoughts, thanks.

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring

2016-01-25 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116003#comment-15116003
 ] 

Elliott Clark commented on HBASE-15160:
---

Rather than storing the times, we should just put them into a histogram in the 
metrics system.

Something like compaction time does.

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -
>
> Key: HBASE-15160
> URL: https://issues.apache.org/jira/browse/HBASE-15160
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-15160.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, 
> fsPreadLatency and fsWriteLatency, have been removed. There was some 
> discussion about putting them back in a new JIRA but never happened. 
> According to our experience, these metrics are useful to judge whether issue 
> lies on HDFS when slow request occurs, so we propose to put them back in this 
> JIRA, and add the metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)