[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15590571#comment-15590571 ] Hudson commented on HBASE-16345: FAILURE: Integrated in Jenkins build HBase-1.4 #479 (See [https://builds.apache.org/job/HBase-1.4/479/]) HBASE-16345 RpcRetryingCallerWithReadReplicas#call() should catch some (esteban: rev a97aef51635539ea382699495613ebe1bf89e475) * (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallableWithReplicas.java * (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestReplicaWithCluster.java * (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/ResultBoundedCompletionService.java > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0, 1.3.0, 1.0.3, 1.4.0, 1.2.3, 1.1.7 >Reporter: huaxiang sun >Assignee: huaxiang sun > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-16345-v001.patch, HBASE-16345.branch-1.001.patch, > HBASE-16345.branch-1.001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch, HBASE-16345.master.005.patch, > HBASE-16345.master.005.patch, HBASE-16345.master.006.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15590208#comment-15590208 ] Hudson commented on HBASE-16345: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #1817 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1817/]) HBASE-16345 RpcRetryingCallerWithReadReplicas#call() should catch some (esteban: rev 72db95388664fb23314b2e6bb437b961c797f579) * (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallableWithReplicas.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestReplicaWithCluster.java * (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/ResultBoundedCompletionService.java * (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0, 1.3.0, 1.0.3, 1.4.0, 1.2.3, 1.1.7 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.branch-1.001.patch, > HBASE-16345.branch-1.001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch, HBASE-16345.master.005.patch, > HBASE-16345.master.005.patch, HBASE-16345.master.006.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587043#comment-15587043 ] Hadoop QA commented on HBASE-16345: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 37s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 12s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s {color} | {color:green} master passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 39s {color} | {color:red} hbase-server in master has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s {color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 31m 19s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1 or 3.0.0-alpha1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 0s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 11s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 132m 25s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Timed out junit tests | org.apache.hadoop.hbase.TestMultiVersions | | | org.apache.hadoop.hbase.constraint.TestConstraint | | | org.apache.hadoop.hbase.TestNamespace | | | org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDeletes | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834038/HBASE-16345.master.006.patch | | JIRA Issue | HBASE-16345 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux f5f3724bea20 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 6c89c62 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | findbugs |
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586696#comment-15586696 ] huaxiang sun commented on HBASE-16345: -- master-006.patch is the rebased one, thanks. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.branch-1.001.patch, > HBASE-16345.branch-1.001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch, HBASE-16345.master.005.patch, > HBASE-16345.master.005.patch, HBASE-16345.master.006.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586100#comment-15586100 ] huaxiang sun commented on HBASE-16345: -- Thanks [~esteban]. I will rebase the patch for the master. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.branch-1.001.patch, > HBASE-16345.branch-1.001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch, HBASE-16345.master.005.patch, > HBASE-16345.master.005.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585763#comment-15585763 ] Esteban Gutierrez commented on HBASE-16345: --- [~huaxiang] can you rebase your patch for master? thanks! > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.branch-1.001.patch, > HBASE-16345.branch-1.001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch, HBASE-16345.master.005.patch, > HBASE-16345.master.005.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585729#comment-15585729 ] Esteban Gutierrez commented on HBASE-16345: --- I tried to reproduce the test failure and some of tests that failed due timeouts but no luck. I think is ok to commit the latest patch. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.branch-1.001.patch, > HBASE-16345.branch-1.001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch, HBASE-16345.master.005.patch, > HBASE-16345.master.005.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584832#comment-15584832 ] Hadoop QA commented on HBASE-16345: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 49s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 8s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 18s {color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s {color} | {color:green} branch-1 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s {color} | {color:green} branch-1 passed with JDK v1.7.0_80 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 10s {color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 42s {color} | {color:green} branch-1 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 9s {color} | {color:red} hbase-server in branch-1 has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s {color} | {color:green} branch-1 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s {color} | {color:green} branch-1 passed with JDK v1.7.0_80 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 58s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 18m 50s {color} | {color:green} The patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 56s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 7s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 147m 54s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.regionserver.TestHRegionServerBulkLoad | | Timed out junit tests | org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat | | | org.apache.hadoop.hbase.filter.TestMultiRowRangeFilter | | |
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584396#comment-15584396 ] Esteban Gutierrez commented on HBASE-16345: --- Thanks [~huaxiang] can you want to resubmit your patch to see if the OOME persists? > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.branch-1.001.patch, > HBASE-16345.master.001.patch, HBASE-16345.master.002.patch, > HBASE-16345.master.003.patch, HBASE-16345.master.004.patch, > HBASE-16345.master.005.patch, HBASE-16345.master.005.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584202#comment-15584202 ] huaxiang sun commented on HBASE-16345: -- Unittest failure is due to running out of memory. Running org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery Exception in thread "process reaper" Exception in thread "Thread-2483" Exception in thread "Thread-2495" Exception in thread "Thread-2487" java.lang.OutOfMemoryError: Java heap space Exception in thread "Thread-2499" java.lang.OutOfMemoryError: Java heap space java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOfRange(Arrays.java:2694) at java.lang.String.(String.java:203) at java.lang.StringBuffer.toString(StringBuffer.java:561) at java.io.BufferedReader.readLine(BufferedReader.java:352) at java.io.BufferedReader.readLine(BufferedReader.java:382) at org.apache.maven.surefire.shade.org.apache.maven.shared.utils.cli.StreamPumper.run(StreamPumper.java:66) java.lang.OutOfMemoryError: Java heap space at java.lang.UNIXProcess$ProcessPipeInputStream.drainInputStream(UNIXProcess.java:320) at java.lang.UNIXProcess$ProcessPipeInputStream.processExited(UNIXProcess.java:333) at java.lang.UNIXProcess.processExited(UNIXProcess.java:241) at java.lang.UNIXProcess$4.run(UNIXProcess.java:229) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2367) at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130) at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114) at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:535) at java.lang.StringBuffer.append(StringBuffer.java:322) at java.io.BufferedReader.readLine(BufferedReader.java:363) at java.io.BufferedReader.readLine(BufferedReader.java:382) at org.apache.maven.surefire.shade.org.apache.maven.shared.utils.cli.StreamPumper.run(StreamPumper.java:66) Running org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Exception in thread "Thread-2505" Exception in thread "Thread-2507" java.lang.OutOfMemoryError: Java heap space > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.branch-1.001.patch, > HBASE-16345.master.001.patch, HBASE-16345.master.002.patch, > HBASE-16345.master.003.patch, HBASE-16345.master.004.patch, > HBASE-16345.master.005.patch, HBASE-16345.master.005.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584195#comment-15584195 ] huaxiang sun commented on HBASE-16345: -- The findbug warning is as the following, I am not sure if it is related with the patch. Bug type RV_RETURN_VALUE_IGNORED (click for details) In class org.apache.hadoop.hbase.regionserver.HRegionServer In method org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper() Called method java.util.concurrent.CountDownLatch.await(long, TimeUnit) At HRegionServer.java:[line 807] > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.branch-1.001.patch, > HBASE-16345.master.001.patch, HBASE-16345.master.002.patch, > HBASE-16345.master.003.patch, HBASE-16345.master.004.patch, > HBASE-16345.master.005.patch, HBASE-16345.master.005.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584161#comment-15584161 ] Hadoop QA commented on HBASE-16345: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 13s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 55s {color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s {color} | {color:green} branch-1 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s {color} | {color:green} branch-1 passed with JDK v1.7.0_80 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s {color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} branch-1 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 52s {color} | {color:red} hbase-server in branch-1 has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s {color} | {color:green} branch-1 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s {color} | {color:green} branch-1 passed with JDK v1.7.0_80 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 4s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 16m 48s {color} | {color:green} The patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 49s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 24s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 133m 17s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Timed out junit tests | org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat | | | org.apache.hadoop.hbase.mapreduce.TestSecureLoadIncrementalHFiles | | |
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572465#comment-15572465 ] Esteban Gutierrez commented on HBASE-16345: --- [~enis] we are going to commit this in the next 24 hours unless any concerns are brought up. Thanks! > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch, HBASE-16345.master.005.patch, > HBASE-16345.master.005.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566619#comment-15566619 ] huaxiang sun commented on HBASE-16345: -- Hi [~enis], any concerns/comments with v5 patch? Thanks a lot! > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch, HBASE-16345.master.005.patch, > HBASE-16345.master.005.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15524094#comment-15524094 ] Matteo Bertozzi commented on HBASE-16345: - patch looks good to me, [~enis] do you have more comments or are you ok with it? > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch, HBASE-16345.master.005.patch, > HBASE-16345.master.005.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497689#comment-15497689 ] huaxiang sun commented on HBASE-16345: -- Hi [~enis], any comments for v5 patch? Thanks. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch, HBASE-16345.master.005.patch, > HBASE-16345.master.005.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475996#comment-15475996 ] Hadoop QA commented on HBASE-16345: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 9s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s {color} | {color:green} master passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s {color} | {color:green} master passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 56s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s {color} | {color:green} master passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s {color} | {color:green} master passed with JDK v1.7.0_111 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 5s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s {color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 27m 58s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s {color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 1s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 101m 12s {color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 152m 29s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:date2016-09-09 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12827694/HBASE-16345.master.005.patch | | JIRA Issue | HBASE-16345 | | Optional Tests | asflicense javac javadoc unit
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475723#comment-15475723 ] huaxiang sun commented on HBASE-16345: -- Locally run TestMasterFailoverWithProcedures passed. Reattach to trigger a new run. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch, HBASE-16345.master.005.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475654#comment-15475654 ] Hadoop QA commented on HBASE-16345: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 19m 23s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 37s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 21s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s {color} | {color:green} master passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s {color} | {color:green} master passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 49s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 56s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s {color} | {color:green} master passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s {color} | {color:green} master passed with JDK v1.7.0_111 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 5s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s {color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 29m 9s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s {color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 2s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 118m 7s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 48s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 190m 43s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:date2016-09-08 | | JIRA Patch URL |
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475108#comment-15475108 ] huaxiang sun commented on HBASE-16345: -- HBASE-16590 is invalid so I reviewed ScannerCallableWithReplicas's code. Found that Consistenty.STRONG is not handled in a specific case which caused unittest failure. Addressed that and a new patch will be uploaded soon. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474792#comment-15474792 ] huaxiang sun commented on HBASE-16345: -- Hi [~enis], I uploaded v4 patch which addressed your latest comments. The unittest failure here is fixed with HBASE-16590 so I am not attaching a new patch for this jira. Thanks. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474788#comment-15474788 ] huaxiang sun commented on HBASE-16345: -- The unitest failure is fixed with the patch at HBASE-16590. Basically, TestRestoreSnapshotFromClientWithRegionReplicas is using scan with strong consistency to count rows. With the behavior fixed in this jira, it expose an issue which count-rows returnes TimeoutException while NoColumnFamilyException is expected. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474744#comment-15474744 ] huaxiang sun commented on HBASE-16345: -- Figured out the root cause, it is TestRestoreSnapshotFromClientWithRegionReplicas is using scan with STRONG consistency, which is wrong. I will upload a new patch with this testing case fixed shortly. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474506#comment-15474506 ] huaxiang sun commented on HBASE-16345: -- Looking at the unitest failure, I can reproduce locally. Seem I missed it last time. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15473463#comment-15473463 ] Hadoop QA commented on HBASE-16345: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 20s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 39s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 18s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s {color} | {color:green} master passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s {color} | {color:green} master passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 49s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 29s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 53s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s {color} | {color:green} master passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s {color} | {color:green} master passed with JDK v1.7.0_111 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 6s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s {color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 27m 59s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s {color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 3s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 122m 56s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 47s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 192m 2s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.client.TestRestoreSnapshotFromClientWithRegionReplicas | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.12.0 Server=1.12.0 Image:yetus/hbase:date2016-09-08 | | JIRA Patch URL |
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15473022#comment-15473022 ] huaxiang sun commented on HBASE-16345: -- Uploaded v4 patch which addresses Enis' latest comments. Run hbase unittest. Run newly added case w/o patch, it failed without patch and succeeded with patch. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch, > HBASE-16345.master.004.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15471295#comment-15471295 ] huaxiang sun commented on HBASE-16345: -- Thanks [~enis] for the comments, I will address the comments and upload a new patch shortly. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467863#comment-15467863 ] huaxiang sun commented on HBASE-16345: -- Hi [~enis], [~tedyu], any comments for version 3 patch? Thanks. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15456553#comment-15456553 ] huaxiang sun commented on HBASE-16345: -- Seems that rpc server had no response within 60 seconds, and this caused rpc timeout. Seems flaky to me. Reattach the patch to trigger a new run as it passed locally for 5 runs. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15456498#comment-15456498 ] huaxiang sun commented on HBASE-16345: -- Local run of TestRestoreSnapshotFromClient#testRestoreSchemaChange passed. I am looking at the stack trace to see what is going on. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15456459#comment-15456459 ] Hadoop QA commented on HBASE-16345: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 37s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 24s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s {color} | {color:green} master passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s {color} | {color:green} master passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 29s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 57s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s {color} | {color:green} master passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s {color} | {color:green} master passed with JDK v1.7.0_111 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 5s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s {color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 27m 41s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s {color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 0s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 1s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 135m 20s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.client.TestRestoreSnapshotFromClientWithRegionReplicas | | Timed out junit tests | org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer2 | | | org.apache.hadoop.hbase.io.hfile.TestScannerSelectionUsingTTL | | |
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15456135#comment-15456135 ] huaxiang sun commented on HBASE-16345: -- Hi [~enis], I just uploaded a new patch which addressed your comments (the major one is adding unittest cases). Could you review and provide feedback? Thanks! > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch, HBASE-16345.master.003.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447549#comment-15447549 ] huaxiang sun commented on HBASE-16345: -- Thanks [~enis], working on the comments. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15446663#comment-15446663 ] Enis Soztutar commented on HBASE-16345: --- bq. Yes, the default is 35 times. The retry number is configurable Thanks. Had left some comments @ RB. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15439358#comment-15439358 ] huaxiang sun commented on HBASE-16345: -- Yes, the default is 35 times. The retry number is configurable. In my testing program, I set {code} configuration.setInt("hbase.client.retries.number", 2). and configuration.setInt("hbase.client.primaryCallTimeout.get", 10); {code} to trigger the issue. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438800#comment-15438800 ] Enis Soztutar commented on HBASE-16345: --- bq. The cause is that for the primary replica, if its retry is exhausted too fast, f.get() [1] returns ExecutionException. This Exception needs to be ignored and continue with the replicas. Agreed. However, with the default settings, we retry 35 times, with sleeping 100ms between each attempt. The first timeout before we send the replica requests is 10 - 100ms, which makes it very unlikely to exhaust the retries from primary before sending the replica RPCs. However, with very limited number of retries and longer timeout for primary request this can indeed happen. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438058#comment-15438058 ] Hadoop QA commented on HBASE-16345: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 6s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s {color} | {color:green} master passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s {color} | {color:green} master passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} master passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} master passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 27m 30s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 12s {color} | {color:red} hbase-client generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 0s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 7s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 38m 19s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hbase-client | | | Possible null pointer dereference of f in org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.call(int) on exception path Dereferenced at RpcRetryingCallerWithReadReplicas.java:f in org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.call(int) on exception path Dereferenced at RpcRetryingCallerWithReadReplicas.java:[line 240] | | | Possible null pointer dereference of f in org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(int) on exception path
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437853#comment-15437853 ] huaxiang sun commented on HBASE-16345: -- Uploaded a new patch to both review board and here, addressing the unittest failures. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, > HBASE-16345.master.002.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435949#comment-15435949 ] huaxiang sun commented on HBASE-16345: -- Know the cause for the test failure, will change the code a little bit. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435693#comment-15435693 ] Hadoop QA commented on HBASE-16345: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 13s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s {color} | {color:green} master passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s {color} | {color:green} master passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} master passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} master passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 27m 50s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 12s {color} | {color:red} hbase-client generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 12s {color} | {color:red} hbase-client in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 8s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 40m 58s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hbase-client | | | Possible null pointer dereference of f in org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.call(int) on exception path Dereferenced at RpcRetryingCallerWithReadReplicas.java:f in org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.call(int) on exception path Dereferenced at RpcRetryingCallerWithReadReplicas.java:[line 240] | | Failed junit tests | hadoop.hbase.client.TestClientScanner | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435689#comment-15435689 ] huaxiang sun commented on HBASE-16345: -- Looking at the client testing failure. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435654#comment-15435654 ] Hadoop QA commented on HBASE-16345: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 3s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s {color} | {color:green} master passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s {color} | {color:green} master passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} master passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} master passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 27m 32s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 11s {color} | {color:red} hbase-client generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 9s {color} | {color:red} hbase-client in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 7s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 40m 28s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hbase-client | | | Possible null pointer dereference of f in org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.call(int) on exception path Dereferenced at RpcRetryingCallerWithReadReplicas.java:f in org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.call(int) on exception path Dereferenced at RpcRetryingCallerWithReadReplicas.java:[line 240] | | Failed junit tests | hadoop.hbase.client.TestClientScanner | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435568#comment-15435568 ] huaxiang sun commented on HBASE-16345: -- Hi [~enis], sorry for late comeback. Did some more debugging and came up with the first patch. When you have time, could you help to take a look at the patch to see if this is on the right track? I am trying to come up with unittests. The tests I did is on a cluster. Thanks > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-16345-v001.patch > > > Update for the description. Debugged more at this front based on the comments > from Enis. > The cause is that for the primary replica, if its retry is exhausted too > fast, f.get() [1] returns ExecutionException. This Exception needs to be > ignored and continue with the replicas. > The other issue is that after adding calls for the replicas, if the first > completed task gets ExecutionException (due to the retry exhausted), it > throws the exception to the client[2]. > In this case, it needs to loop through these tasks, waiting for the success > one. If no one succeeds, throw exception. > Similar for the scan as well > [1] > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197 > [2]https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406806#comment-15406806 ] huaxiang sun commented on HBASE-16345: -- Thanks [~enis] for info, I overlooked this part. Gathering more info now and will report back. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > > We run into the following exception during read replica testing. > {code} > Caused by: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: > org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server > not running > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.checkOpen(RSRpcServices.java:924) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1766) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31439) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107) > at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) > at java.lang.Thread.run(Thread.java:745) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) > at > org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) > at > org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:326) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas$ReplicaRegionServerCallable.call(RpcRetryingCallerWithReadReplicas.java:168) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas$ReplicaRegionServerCallable.call(RpcRetryingCallerWithReadReplicas.java:100) > at > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126) > ... 4 more > Caused by: > org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.regionserver.RegionServerStoppedException): > org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server > not running > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.checkOpen(RSRpcServices.java:924) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1766) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31439) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107) > at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) > at java.lang.Thread.run(Thread.java:745) > at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1200) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:31865) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas$ReplicaRegionServerCallable.call(RpcRetryingCallerWithReadReplicas.java:162) > ... 6 more > {code} > Checking the code, we need to catch a few exceptions from the primary region > server and continue with replicas. > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L211 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406611#comment-15406611 ] Enis Soztutar commented on HBASE-16345: --- Is the request failed, or you see this exception, but the requests was still successful? In this: https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L211 what we do is that we submit a Callable to the {{ResultBoundedCompletionService}} which in turn uses a retrying caller to execute that callable. We wait on that operation until timeout, so even if the primary receives RegionServerStoppedException, it won't get propagated to the RpcRetryingCallerWithReadReplicas level unless all of the 35 retries are exhausted. > RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer > Exceptions > -- > > Key: HBASE-16345 > URL: https://issues.apache.org/jira/browse/HBASE-16345 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > > We run into the following exception during read replica testing. > {code} > Caused by: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: > org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server > not running > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.checkOpen(RSRpcServices.java:924) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1766) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31439) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107) > at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) > at java.lang.Thread.run(Thread.java:745) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) > at > org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) > at > org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:326) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas$ReplicaRegionServerCallable.call(RpcRetryingCallerWithReadReplicas.java:168) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas$ReplicaRegionServerCallable.call(RpcRetryingCallerWithReadReplicas.java:100) > at > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126) > ... 4 more > Caused by: > org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.regionserver.RegionServerStoppedException): > org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server > not running > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.checkOpen(RSRpcServices.java:924) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1766) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31439) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107) > at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) > at java.lang.Thread.run(Thread.java:745) > at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1200) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:31865) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas$ReplicaRegionServerCallable.call(RpcRetryingCallerWithReadReplicas.java:162) > ... 6 more > {code} > Checking the code, we need to catch a few exceptions from the primary region > server and continue with replicas. >