[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651699#comment-16651699 ]

Hudson commented on HBASE-21266:

Results for branch master [build #549 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/549/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/549//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/549//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/549//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}

> Not running balancer because processing dead regionservers, but empty dead rs list
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-21266
>                 URL: https://issues.apache.org/jira/browse/HBASE-21266
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.4.8
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Major
>             Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 2.1.1, 2.0.3, 1.4.9
>
>         Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch,
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch,
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch,
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch,
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch,
> HBASE-21266-branch-1.patch, HBASE-21266.branch-2.1.001.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual
> attempts from the shell to run the balancer always return false and this is
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster:
> Not running balancer because processing dead regionserver(s):
> Note the empty list.
> This errant state did not recover without intervention by way of master
> restart, but the test environment was chaotic so needs investigation.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651629#comment-16651629 ]

Hudson commented on HBASE-21266:

Results for branch branch-2 [build #1396 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1396/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1396//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1396//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1396//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651567#comment-16651567 ]

Hudson commented on HBASE-21266:

Results for branch branch-2.1 [build #472 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/472/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/472//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/472//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/472//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651529#comment-16651529 ]

Hudson commented on HBASE-21266:

Results for branch branch-1.4 [build #509 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/509/]: (x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/509//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/509//JDK7_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/509//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651518#comment-16651518 ]

Hudson commented on HBASE-21266:

Results for branch branch-2.0 [build #957 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/957/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/957//General_Nightly_Build_Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/957//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/957//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651469#comment-16651469 ]

Hudson commented on HBASE-21266:

Results for branch branch-1.3 [build #505 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/505/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/505//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/505//JDK7_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/505//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651468#comment-16651468 ]

Hudson commented on HBASE-21266:

Results for branch branch-1 [build #511 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/511/]: (x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/511//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/511//JDK7_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/511//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 source release artifact{color}
-- See build output for details.
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650912#comment-16650912 ]

Hudson commented on HBASE-21266:

SUCCESS: Integrated in Jenkins build HBase-1.3-IT #491 (See [https://builds.apache.org/job/HBase-1.3-IT/491/])
HBASE-21266 Not running balancer because processing dead regionservers, (apurtell: rev 743f9a4ed0d94a34bde78ff801b1f3d9d2229aa2)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/DeadServer.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestEndToEndSplitTransaction.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDeadServer.java
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650840#comment-16650840 ]

Andrew Purtell commented on HBASE-21266:

Thanks I'll commit now
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650832#comment-16650832 ]

stack commented on HBASE-21266:

+1 from me. I tried it on my test cluster and balancer 'works' now (balancer has other issues... issues coming but this patch addresses the empty dead server list). Thanks.
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650783#comment-16650783 ]

Andrew Purtell commented on HBASE-21266:

Any concerns about committing this? The branch-2 forward port has a +1. I will commit this tomorrow to the branch-1s if no objection.
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649518#comment-16649518 ]

Andrew Purtell commented on HBASE-21266:

The 1B row ITBLL with serverKilling policy passed, twice
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648738#comment-16648738 ]

Hadoop QA commented on HBASE-21266:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} branch-2.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 55s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 14s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 24s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 51s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 11s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} branch-2.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 58s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 18s{color} | {color:red} hbase-server: The patch generated 2 new + 2 unchanged - 5 fixed = 4 total (was 7) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 8s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 9m 9s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}111m 39s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}152m 12s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-21266 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12943733/HBASE-21266.branch-2.1.001.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 02728695d635 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2.1 / 72af27b8c9 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/14680/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14680/testReport/ |
| Max. process+thread count | 4368 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648642#comment-16648642 ]

stack commented on HBASE-21266:

2.1.001 is forward-port of Andrew's patch; took a bit of jiggering.
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648600#comment-16648600 ]

Hadoop QA commented on HBASE-21266:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 1s | Findbugs executables are not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || branch-1 Compile Tests ||
| +1 | mvninstall | 8m 5s | branch-1 passed |
| +1 | compile | 1m 18s | branch-1 passed with JDK v1.8.0_181 |
| +1 | compile | 1m 2s | branch-1 passed with JDK v1.7.0_191 |
| +1 | checkstyle | 2m 4s | branch-1 passed |
| +1 | shadedjars | 3m 57s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 1m 15s | branch-1 passed with JDK v1.8.0_181 |
| +1 | javadoc | 1m 1s | branch-1 passed with JDK v1.7.0_191 |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 2m 37s | the patch passed |
| +1 | compile | 1m 21s | the patch passed with JDK v1.8.0_181 |
| +1 | javac | 1m 21s | the patch passed |
| +1 | compile | 1m 6s | the patch passed with JDK v1.7.0_191 |
| +1 | javac | 1m 6s | the patch passed |
| +1 | checkstyle | 1m 56s | hbase-server: The patch generated 0 new + 68 unchanged - 5 fixed = 68 total (was 73) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 3m 48s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 2m 36s | Patch does not cause any errors with Hadoop 2.7.4. |
| +1 | javadoc | 1m 6s | the patch passed with JDK v1.8.0_181 |
| +1 | javadoc | 1m 3s | the patch passed with JDK v1.7.0_191 |
|| || || || Other Tests ||
| -1 | unit | 144m 32s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 28s | The patch does not generate ASF License warnings. |
| | | 180m 14s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.TestMasterBalanceThrottling |
| | hadoop.hbase.regionserver.TestPerColumnFamilyFlush |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 |
| JIRA Issue | HBASE-21266 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12943710/HBASE-21266-branch-1.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648562#comment-16648562 ]

Andrew Purtell commented on HBASE-21266:

Just waiting for a 1B row ITBLL with serverKilling chaos policy to complete on the latest patch. Still running. Looks good so far. Previously, problems surfaced quickly. Unit tests all look good.
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648433#comment-16648433 ]

Hadoop QA commented on HBASE-21266:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 1s | Findbugs executables are not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || branch-1 Compile Tests ||
| +1 | mvninstall | 7m 57s | branch-1 passed |
| +1 | compile | 0m 43s | branch-1 passed with JDK v1.8.0_181 |
| +1 | compile | 0m 43s | branch-1 passed with JDK v1.7.0_191 |
| +1 | checkstyle | 1m 27s | branch-1 passed |
| +1 | shadedjars | 2m 58s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 0m 44s | branch-1 passed with JDK v1.8.0_181 |
| +1 | javadoc | 0m 43s | branch-1 passed with JDK v1.7.0_191 |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 58s | the patch passed |
| +1 | compile | 0m 45s | the patch passed with JDK v1.8.0_181 |
| +1 | javac | 0m 45s | the patch passed |
| +1 | compile | 0m 43s | the patch passed with JDK v1.7.0_191 |
| +1 | javac | 0m 43s | the patch passed |
| +1 | checkstyle | 1m 29s | hbase-server: The patch generated 0 new + 68 unchanged - 5 fixed = 68 total (was 73) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 3m 7s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 1m 59s | Patch does not cause any errors with Hadoop 2.7.4. |
| +1 | javadoc | 0m 30s | the patch passed with JDK v1.8.0_181 |
| +1 | javadoc | 0m 39s | the patch passed with JDK v1.7.0_191 |
|| || || || Other Tests ||
| +1 | unit | 113m 45s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 36s | The patch does not generate ASF License warnings. |
| | | 141m 33s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 |
| JIRA Issue | HBASE-21266 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12943695/HBASE-21266-branch-1.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 1fabe85234fb 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648338#comment-16648338 ]

Andrew Purtell commented on HBASE-21266:

If I remove an entry from the processing servers map when registering a new instance of the dead server coming back online, then I'll trip over an assert I added. Back soon.
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648268#comment-16648268 ]

Andrew Purtell commented on HBASE-21266:

Updated the patch with the suggestion from [~stack]; also fixed something dumb I did with logging.
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648171#comment-16648171 ]

Andrew Purtell commented on HBASE-21266:

bq. Are the checks for logging level and avoidance of interpolation because this is a branch-1 patch? e.g.

Yes.

bq. Any value in checking the add/remove to set return values? If only to log?

Yes, let me update. Back in a bit.
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648132#comment-16648132 ]

stack commented on HBASE-21266:

I just ran into this while testing the tip of branch-2.1.

Patch looks good to me.

Are the checks for logging level and avoidance of interpolation because this is a branch-1 patch? e.g.

  if (LOG.isDebugEnabled()) {
    LOG.debug("Removing old instance of server from processingServers set: " + sn +
      " (numProcessing=" + processingServers.size() + ")");
  }

Any value in checking the return values of the add/remove on the set? If only to log?
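For readers following the review, the two points raised above (guarding debug logging behind a level check, and using the boolean returned by a set's add/remove) can be sketched roughly as below. This is a minimal illustration and not the code from the attached patch: the class `ProcessingServersSketch`, its method names, and the use of `java.util.logging` in place of the logger behind `LOG.isDebugEnabled()` are all assumptions made for the example.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Illustrative only: a hypothetical stand-in, not the HBase class touched by the patch. */
final class ProcessingServersSketch {
  private static final Logger LOG =
      Logger.getLogger(ProcessingServersSketch.class.getName());

  // Hypothetical set of dead servers whose processing has not finished yet.
  private final Set<String> processingServers = ConcurrentHashMap.newKeySet();

  /** Returns true if the server was newly added, false if it was already tracked. */
  boolean startProcessing(String serverName) {
    boolean added = processingServers.add(serverName);
    // The level check avoids building the message string when FINE is disabled.
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine((added ? "Added " : "Already processing ") + serverName
          + " (numProcessing=" + processingServers.size() + ")");
    }
    return added;
  }

  /** Returns true if the server was tracked and has now been removed. */
  boolean finishProcessing(String serverName) {
    boolean removed = processingServers.remove(serverName);
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine((removed ? "Removed " : "Was not processing ") + serverName
          + " (numProcessing=" + processingServers.size() + ")");
    }
    return removed;
  }

  /** True while at least one dead server is still being processed. */
  boolean areDeadServersInProgress() {
    return !processingServers.isEmpty();
  }
}
```

Logging the return value makes an unbalanced add/remove visible in the master log, which is the kind of bookkeeping mismatch that would leave a "processing" indicator set while the list of servers is empty.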
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647339#comment-16647339 ]

Hadoop QA commented on HBASE-21266:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 1s | Findbugs executables are not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || branch-1 Compile Tests ||
| +1 | mvninstall | 8m 20s | branch-1 passed |
| +1 | compile | 0m 37s | branch-1 passed with JDK v1.8.0_181 |
| +1 | compile | 0m 40s | branch-1 passed with JDK v1.7.0_191 |
| +1 | checkstyle | 1m 29s | branch-1 passed |
| +1 | shadedjars | 2m 51s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 0m 36s | branch-1 passed with JDK v1.8.0_181 |
| +1 | javadoc | 0m 37s | branch-1 passed with JDK v1.7.0_191 |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 42s | the patch passed |
| +1 | compile | 0m 48s | the patch passed with JDK v1.8.0_181 |
| +1 | javac | 0m 48s | the patch passed |
| +1 | compile | 0m 45s | the patch passed with JDK v1.7.0_191 |
| +1 | javac | 0m 45s | the patch passed |
| +1 | checkstyle | 1m 41s | hbase-server: The patch generated 0 new + 68 unchanged - 5 fixed = 68 total (was 73) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 2m 56s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 1m 48s | Patch does not cause any errors with Hadoop 2.7.4. |
| +1 | javadoc | 0m 27s | the patch passed with JDK v1.8.0_181 |
| +1 | javadoc | 0m 38s | the patch passed with JDK v1.7.0_191 |
|| || || || Other Tests ||
| -1 | unit | 129m 43s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 28s | The patch does not generate ASF License warnings. |
| | | 156m 56s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestReplicasClient |
| | hadoop.hbase.client.TestAdmin2 |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 |
| JIRA Issue | HBASE-21266 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12943544/HBASE-21266-branch-1.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647112#comment-16647112 ]

Andrew Purtell commented on HBASE-21266:

In case it's not clear, I'm treating TestZKLessSplitOnCluster and TestEndToEndSplitTransaction like generic flaky tests at this point. I can include the test changes in the patch here or break them out to a subtask. Will do the former unless someone would like it done as the latter.
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647107#comment-16647107 ]

Andrew Purtell commented on HBASE-21266:

TestZKLessSplitOnCluster.testSSHCleanupDaugtherRegionsOfAbortedSplit does hand-rolled waits with 10 ms sleeps. Rewrote those to use Waiter#waitFor with the same timeout and period values as the other uses of Waiter#waitFor in this unit.

TestEndToEndSplitTransaction.testFromClientSideWhileSplitting uses a chore named RegionChecker, also with a 10 ms interval; I am increasing this to 100 ms. This isn't necessary beyond the fact that sleep(10) is obnoxious. It might as well just be a yield() or a spin-wait.
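To illustrate the difference between a hand-rolled sleep loop and a bounded polling wait of the kind described above, here is a minimal sketch. It is not HBase's `Waiter#waitFor`, whose signature and behavior differ; the class `PollingWaitSketch` and its `waitFor(timeoutMs, periodMs, condition)` method are hypothetical names chosen for the example.

```java
import java.util.function.BooleanSupplier;

/** Illustrative only: a hypothetical helper, not the HBase Waiter utility. */
final class PollingWaitSketch {
  private PollingWaitSketch() {}

  /**
   * Polls the condition every periodMs milliseconds until it holds or
   * timeoutMs milliseconds have elapsed. Returns true if the condition
   * became true, false on timeout or if the waiting thread was interrupted.
   */
  static boolean waitFor(long timeoutMs, long periodMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!condition.getAsBoolean()) {
      long remainingNs = deadline - System.nanoTime();
      if (remainingNs <= 0) {
        return false; // timed out without the condition holding
      }
      try {
        // Sleep one period, but never past the deadline (and at least 1 ms).
        Thread.sleep(Math.min(periodMs, Math.max(1L, remainingNs / 1_000_000L)));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for callers
        return false;
      }
    }
    return true;
  }
}
```

A test could then call something like `PollingWaitSketch.waitFor(60000, 100, () -> !deadServersInProgress())`, where `deadServersInProgress()` stands for whatever check the test needs, and fail explicitly on a `false` return instead of looping on `sleep(10)` with no upper bound.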
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647025#comment-16647025 ]

Andrew Purtell commented on HBASE-21266:

TestEndToEndSplitTransaction.testFromClientSideWhileSplitting doesn't involve dead server processing directly.

TestZKLessSplitOnCluster.testSSHCleanupDaugtherRegionsOfAbortedSplit waits on ServerManager#areDeadServersInProgress, so this change seems to have affected timing in this test, with the result of making it unstable. Will look into it.
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647007#comment-16647007 ]

Andrew Purtell commented on HBASE-21266:

Those test failures in precommit might be flakes; let me see if I can reproduce them.

I ran split, merge, assignment, and balancer tests, including the tests in question, and am not seeing any issues.

{noformat}
[INFO] Running org.apache.hadoop.hbase.util.TestMergeTable
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.85 s - in org.apache.hadoop.hbase.util.TestMergeTable
[INFO] Running org.apache.hadoop.hbase.util.TestMergeTool
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.995 s - in org.apache.hadoop.hbase.util.TestMergeTool
[INFO] Running org.apache.hadoop.hbase.util.TestRegionSplitter
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.962 s - in org.apache.hadoop.hbase.util.TestRegionSplitter
[INFO] Running org.apache.hadoop.hbase.util.TestRegionSplitCalculator
[INFO] Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.968 s - in org.apache.hadoop.hbase.util.TestRegionSplitCalculator
[INFO] Running org.apache.hadoop.hbase.wal.TestWALSplit
[INFO] Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 46.067 s - in org.apache.hadoop.hbase.wal.TestWALSplit
[INFO] Running org.apache.hadoop.hbase.wal.TestWALSplitBoundedLogWriterCreation
[WARNING] Tests run: 33, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 43.419 s - in org.apache.hadoop.hbase.wal.TestWALSplitBoundedLogWriterCreation
[INFO] Running org.apache.hadoop.hbase.wal.TestWALSplitCompressed
[INFO] Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 46.075 s - in org.apache.hadoop.hbase.wal.TestWALSplitCompressed
[INFO] Running org.apache.hadoop.hbase.mapred.TestSplitTable
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.774 s - in org.apache.hadoop.hbase.mapred.TestSplitTable
[INFO] Running org.apache.hadoop.hbase.regionserver.TestRegionSplitPolicy
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.836 s - in org.apache.hadoop.hbase.regionserver.TestRegionSplitPolicy
[INFO] Running org.apache.hadoop.hbase.regionserver.TestCompactSplitThread
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 30.692 s - in org.apache.hadoop.hbase.regionserver.TestCompactSplitThread
[INFO] Running org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
[INFO] Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 87.588 s - in org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
[INFO] Running org.apache.hadoop.hbase.regionserver.TestSplitWalDataLoss
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.796 s - in org.apache.hadoop.hbase.regionserver.TestSplitWalDataLoss
[INFO] Running org.apache.hadoop.hbase.regionserver.TestSplitTransaction
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.14 s - in org.apache.hadoop.hbase.regionserver.TestSplitTransaction
[INFO] Running org.apache.hadoop.hbase.regionserver.TestRegionMergeTransaction
[INFO] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.715 s - in org.apache.hadoop.hbase.regionserver.TestRegionMergeTransaction
[INFO] Running org.apache.hadoop.hbase.regionserver.TestZKLessSplitOnCluster
[INFO] Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 71.721 s - in org.apache.hadoop.hbase.regionserver.TestZKLessSplitOnCluster
[INFO] Running org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 59.082 s - in org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
[INFO] Running org.apache.hadoop.hbase.regionserver.TestSplitLogWorker
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.689 s - in org.apache.hadoop.hbase.regionserver.TestSplitLogWorker
[INFO] Running org.apache.hadoop.hbase.regionserver.TestZKLessMergeOnCluster
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 28.068 s - in org.apache.hadoop.hbase.regionserver.TestZKLessMergeOnCluster
[INFO] Running org.apache.hadoop.hbase.regionserver.TestRegionMergeTransactionOnCluster
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.386 s - in org.apache.hadoop.hbase.regionserver.TestRegionMergeTransactionOnCluster
[INFO] Running org.apache.hadoop.hbase.master.TestAssignmentManagerOnCluster
[INFO] Tests run: 23, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 185.692 s - in org.apache.hadoop.hbase.master.TestAssignmentManagerOnCluster
[INFO] Running org.apache.hadoop.hbase.master.TestDistributedLogSplitting
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 15, Time elapsed: 90.971 s - in org.apache.hadoop.hbase.master.TestDistributedLogSplitting
[INFO]
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645591#comment-16645591 ] Hadoop QA commented on HBASE-21266: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 2s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-1 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 14s{color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 30s{color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 4s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 20s{color} | {color:green} hbase-server: The patch generated 0 new + 3 unchanged - 5 fixed = 3 total (was 8) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 2m 39s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 1m 37s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}103m 49s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}130m 6s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.regionserver.TestZKLessSplitOnCluster | | | hadoop.hbase.regionserver.TestEndToEndSplitTransaction | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 | | JIRA Issue | HBASE-21266 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12943284/HBASE-21266-branch-1.patch | | Optional Tests | dupname asflicense javac javadoc unit findbugs
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645505#comment-16645505 ] Hadoop QA commented on HBASE-21266: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-1 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 4s{color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 41s{color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 17s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 31s{color} | {color:green} hbase-server: The patch generated 0 new + 3 unchanged - 5 fixed = 3 total (was 8) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 17s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 1m 47s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}107m 5s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}135m 16s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.client.TestReplicasClient | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 | | JIRA Issue | HBASE-21266 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12943273/HBASE-21266-branch-1.patch | | Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645489#comment-16645489 ] Hadoop QA commented on HBASE-21266: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-1 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 39s{color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 17s{color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 2m 37s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 17s{color} | {color:green} hbase-server: The patch generated 0 new + 3 unchanged - 5 fixed = 3 total (was 8) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 2m 45s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 1m 41s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green}111m 17s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}130m 0s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 | | JIRA Issue | HBASE-21266 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12943273/HBASE-21266-branch-1.patch | | Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 1bd0a354d67a 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645461#comment-16645461 ] Andrew Purtell commented on HBASE-21266: With the latest patch, this issue cannot be reproduced and the AM is stable in ITBLL testing of 500M rows with the serverKilling chaos policy, which completes successfully. Verified with added debug logging in DeadServer, periodic hbck invocation (the cluster always returned to a state with 0 inconsistencies detected), periodic balancer invocation, and the unit test suite. We no longer rely on an integer counter and a boolean to track the processing status of dead servers. Instead, DeadServer uses a Set from which the expected state checks are derived, logging is improved, and there is a new runtime-visible assert for incorrect API usage (which does not fire in any testing).
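The change described in this comment (deriving the "processing" status from a Set rather than from a separate counter and boolean) can be illustrated with a minimal sketch. This is not the patch itself: the class name `DeadServerSketch`, the use of plain strings for server names, and the choice of `IllegalStateException` to stand in for the runtime-visible assert are all assumptions made for illustration. Only the method names `add` and `finish` and the set names `deadServers` and `processingServers` appear elsewhere in this thread.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified stand-in for the master's dead server tracking.
// Server names are plain strings here; the real class is not reproduced.
class DeadServerSketch {
  // Servers known to be dead.
  private final Set<String> deadServers = new HashSet<>();
  // Dead servers whose recovery is still being processed.
  private final Set<String> processingServers = new HashSet<>();

  // A server was declared dead and its processing begins.
  synchronized void add(String serverName) {
    deadServers.add(serverName);
    processingServers.add(serverName);
  }

  // Processing of a dead server has completed.
  synchronized void finish(String serverName) {
    // Assumption: surface incorrect API usage loudly instead of
    // silently leaving the tracker in an inconsistent state.
    if (!processingServers.remove(serverName)) {
      throw new IllegalStateException("finish() without matching add(): " + serverName);
    }
  }

  // Derived directly from the set, so in this sketch the answer and the
  // set of servers being processed cannot disagree.
  synchronized boolean areDeadServersInProgress() {
    return !processingServers.isEmpty();
  }

  // Defensive copy so callers cannot mutate internal state.
  synchronized Set<String> getProcessingServers() {
    return new HashSet<>(processingServers);
  }

  public static void main(String[] args) {
    DeadServerSketch tracker = new DeadServerSketch();
    tracker.add("rs1");
    System.out.println(tracker.areDeadServersInProgress());
    tracker.finish("rs1");
    System.out.println(tracker.areDeadServersInProgress());
  }
}
```

With a single source of truth, a state such as "processing dead regionserver(s)" with an empty list would require the set itself to be non-empty and empty at once, which is the kind of mismatch the reported log line suggests.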
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645255#comment-16645255 ] Andrew Purtell commented on HBASE-21266: Update patch to address checkstyle ImportOrder warning
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645253#comment-16645253 ] Andrew Purtell commented on HBASE-21266: All unit tests pass for me locally. The remaining failures in precommit do not seem related to this change but I'll come back and look at them again. Now moving to ITBLL testing of the latest.
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644456#comment-16644456 ] Hadoop QA commented on HBASE-21266: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-1 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 46s{color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 15s{color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 2m 35s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 13s{color} | {color:red} hbase-server: The patch generated 1 new + 3 unchanged - 5 fixed = 4 total (was 8) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 2m 34s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 1m 41s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}146m 17s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}170m 48s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.mapreduce.TestLoadIncrementalHFilesUseSecurityEndPoint | | | hadoop.hbase.mapreduce.TestLoadIncrementalHFiles | | | hadoop.hbase.master.procedure.TestWALProcedureStoreOnHDFS | | | hadoop.hbase.mapreduce.TestSecureLoadIncrementalHFiles | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 | | JIRA Issue | HBASE-21266 | | JIRA Patch URL |
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644373#comment-16644373 ] Xu Cang commented on HBASE-21266: You are correct, Andrew. Thanks
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644333#comment-16644333 ] Andrew Purtell commented on HBASE-21266: bq. Two threads call #add and #finish respectively at the same time, though synchronized keyword helps nothing in this case. Hmm. My understanding is the Java memory model guarantees when one thread is executing a synchronized method all other threads that invoke synchronized methods for the same object will block trying to acquire the object monitor, until the first thread is finished executing the method.
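The point about synchronized methods can be checked with a small experiment. The sketch below is an illustration only: the class `MonitorSketch` and its fields are hypothetical and unrelated to the HBase code. Two threads call synchronized `add` and `finish` on the same object, and a flag records whether one call was ever observed running inside the other.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class: two synchronized methods on one object share
// that object's monitor, so their bodies never run at the same time.
class MonitorSketch {
  private int inProgress = 0;              // guarded by this object's monitor
  private boolean insideMonitor = false;   // true while a method body runs
  final AtomicBoolean overlapSeen = new AtomicBoolean(false);

  synchronized void add() {
    enter();
    inProgress++;
    exit();
  }

  synchronized void finish() {
    enter();
    inProgress--;
    exit();
  }

  synchronized int inProgress() {
    return inProgress;
  }

  // Called only while holding the monitor. If another thread were inside
  // add() or finish() concurrently, insideMonitor would already be true.
  private void enter() {
    if (insideMonitor) {
      overlapSeen.set(true);
    }
    insideMonitor = true;
  }

  private void exit() {
    insideMonitor = false;
  }

  // Runs one adder thread and one finisher thread to completion.
  static MonitorSketch runDemo(int iterations) {
    MonitorSketch sketch = new MonitorSketch();
    Thread adder = new Thread(() -> {
      for (int i = 0; i < iterations; i++) sketch.add();
    });
    Thread finisher = new Thread(() -> {
      for (int i = 0; i < iterations; i++) sketch.finish();
    });
    adder.start();
    finisher.start();
    try {
      adder.join();
      finisher.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    }
    return sketch;
  }

  public static void main(String[] args) {
    MonitorSketch result = runDemo(100_000);
    System.out.println("overlap seen: " + result.overlapSeen.get());
    System.out.println("final count: " + result.inProgress());
  }
}
```

Mutual exclusion of this kind says nothing about the order in which `add` and `finish` are called, so it does not by itself rule out sequencing mistakes in the callers.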
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644324#comment-16644324 ] Andrew Purtell commented on HBASE-21266: Back to PA
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644317#comment-16644317 ] Andrew Purtell commented on HBASE-21266: Yeah those precommit failures were expected. Let me set this back to open until ready
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644314#comment-16644314 ] Andrew Purtell commented on HBASE-21266: Better WIP patch. Same as previous WIP patch but toString dumps servers in both deadServers and processingServers sets, and puts an asterisk next to servers in the processing set, better for debugging > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, > HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, > HBASE-21266-branch-1.patch > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
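The debugging output described above (dump servers from both sets, with an asterisk next to those still in the processing set) could look roughly like the sketch below. This is only an illustration: the class name `DeadServerDump`, the `dump` method, the use of plain strings for server names, and the comma-separated layout are all assumptions, not the code in the attached patch.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper; not the HBase DeadServer#toString implementation. */
class DeadServerDump {
    /**
     * Lists every server found in either set, in sorted order, and appends
     * '*' to each server that is still in the processing set.
     */
    static String dump(Set<String> deadServers, Set<String> processingServers) {
        // Union of both sets; TreeSet gives a stable, sorted order for logs.
        Set<String> all = new TreeSet<>(deadServers);
        all.addAll(processingServers);

        StringBuilder sb = new StringBuilder();
        for (String serverName : all) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(serverName);
            if (processingServers.contains(serverName)) {
                sb.append('*'); // marks "still being processed"
            }
        }
        return sb.toString();
    }
}
```

With a dump like this, a log line that reports dead servers in progress would show which entries are still marked as processing, even when a server is missing from the dead server set.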
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644313#comment-16644313 ]

Hadoop QA commented on HBASE-21266:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange} 0m 0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 17s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 15s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 2m 24s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 11s{color} | {color:red} hbase-server: The patch generated 2 new + 1 unchanged - 5 fixed = 3 total (was 6) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 2m 27s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 1m 34s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}139m 21s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}163m 40s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.security.visibility.TestVisibilityLabelsWithACL |
| | hadoop.hbase.master.TestDeadServer |
| | hadoop.hbase.mapreduce.TestSecureLoadIncrementalHFiles |
| | hadoop.hbase.master.TestMasterBalanceThrottling |
| | hadoop.hbase.mapreduce.TestLoadIncrementalHFilesUseSecurityEndPoint |
| |
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644286#comment-16644286 ] Xu Cang commented on HBASE-21266: - how about making 'deadServers' concurrentHashMap? > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, > HBASE-21266-branch-1.patch > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644281#comment-16644281 ] Andrew Purtell commented on HBASE-21266: Still seeing issues in ITBLL testing > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, > HBASE-21266-branch-1.patch > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644265#comment-16644265 ] Andrew Purtell commented on HBASE-21266: I think Mingliang's point is that, because every method is synchronized, there can never be more than one thread accessing or updating {{numProcessing}} at once. If someone changes that, there should be a findbugs warning about a partially synchronized class. That said, I'm testing what we have now and it looks good, so I'm loath to make another change if it's not really needed. > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, > HBASE-21266-branch-1.patch > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
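To illustrate the point about synchronization, here is a minimal sketch of a plain `int` counter whose every access goes through a `synchronized` method. The class names `SynchronizedCounter` and `SynchronizedCounterDemo` are made up for this example and are not part of HBase. The sketch assumes, as the comment above does, that no unsynchronized code path touches the field.

```java
/** Hypothetical counter; every read and write holds the object's monitor. */
class SynchronizedCounter {
    private int numProcessing = 0;

    public synchronized void increment() {
        numProcessing++; // read-modify-write, but only one thread can be here
    }

    public synchronized void decrement() {
        numProcessing--;
    }

    public synchronized int get() {
        return numProcessing;
    }
}

class SynchronizedCounterDemo {
    /** Runs `threads` workers that each increment `iterations` times. */
    static int run(int threads, int iterations) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < iterations; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join(); // wait for all increments to complete
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // No updates are lost, so this prints threads * iterations.
        System.out.println(run(8, 100_000));
    }
}
```

An unguarded `numProcessing++` would be racy, which is the concern raised earlier in the thread. The guarantee here comes from the monitor, so it holds only while every accessor stays `synchronized`. An `AtomicInteger` would stay safe even if one of those keywords were later removed.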
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644226#comment-16644226 ] Xu Cang commented on HBASE-21266: - {quote}If all access to {{numProcessing}} is {{synchronized}}, we don't need the {{AtomicInteger}}. {quote} But ++ / -- is not thread-safe on a plain integer. It's still possible to get caught by a race condition, IMO. > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, > HBASE-21266-branch-1.patch > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644215#comment-16644215 ] Andrew Purtell commented on HBASE-21266: No but there’s no harm in this case either, like a perf issue. I don’t have a strong opinion either way. > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, > HBASE-21266-branch-1.patch > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644204#comment-16644204 ] Mingliang Liu commented on HBASE-21266: --- If all access to {{numProcessing}} is {{synchronized}}, we don't need the {{AtomicInteger}}. > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, > HBASE-21266-branch-1.patch > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644203#comment-16644203 ] Xu Cang commented on HBASE-21266: - The new fix (incrementAndGet in #add) makes sense to me. +1 > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, > HBASE-21266-branch-1.patch > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644169#comment-16644169 ] Andrew Purtell commented on HBASE-21266: Updated patch. Improved logging > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, > HBASE-21266-branch-1.patch > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644164#comment-16644164 ] Andrew Purtell commented on HBASE-21266: Updated patch. One of those mistakes that are obvious in retrospect, sorry about that. Spinning up for more ITBLL testing. Also have kicked off the test suite again locally and precommit checks here > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644143#comment-16644143 ] Andrew Purtell commented on HBASE-21266: This is going to need more work. With this change in place, a unit test in TestAssignmentManagerOnCluster will fail. Also, in ITBLL test scenarios with the serverKilling policy, if the master is terminated while a region is splitting, upon restart we can get this: 2018-10-09 20:59:46,242 WARN [ip-172-31-5-95:8100.activeMasterManager] master.AssignmentManager: Dropped splitting! Not in state good for SPLITTING; rs_p={332d04e88521c71ea4505592e434c9d1 state=SPLITTING, ts=1539118786241, server=ip-172-31-13-83.us-west-2.compute.internal,8120,1539118587733}, rs_a={1bbe77be39dfd903b31d00b98b02f842 state=OFFLINE, ts=1539118786229, server=null}, rs_b={6d6f67867f14d37c4fe35f3fe23f6cd8 state=OFFLINE, ts=1539118786230, server=null} and the daughter regions will remain unassigned and unavailable, requiring hbck -fixAssignments. I think I see a mistake in the patch. Let me try again. > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21266-branch-1.patch > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16643838#comment-16643838 ] Andrew Purtell commented on HBASE-21266: I plan to commit this today unless objection. I have run a couple of ITBLL workloads with serverKilling and stressAM policy with a shell invoking the balancer every minute. No issues with dead server processing observed. The earlier observed problem does not reproduce. This isn't a positive test for that change, though, because I think it was a race condition, but it is a negative test in the sense that it is very unlikely we broke the AM with this change. All unit tests pass. > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21266-branch-1.patch > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642801#comment-16642801 ] Mingliang Liu commented on HBASE-21266: --- {{areDeadServersInProgress()}} does not have to be {{synchronized}}. Otherwise +1 (non-binding) > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21266-branch-1.patch > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639109#comment-16639109 ] Xu Cang commented on HBASE-21266: - the patch LGTM. +1 > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21266-branch-1.patch > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638704#comment-16638704 ] Andrew Purtell commented on HBASE-21266: Agreed. > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638671#comment-16638671 ] Xu Cang commented on HBASE-21266: - Yes, I had the same doubt; 'processing' is redundant. Also, should we make 'numProcessing' an AtomicInteger? numProcessing++ and numProcessing-- are not thread-safe, and their calls can interleave in #notifyServer and #finish. > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638614#comment-16638614 ] Andrew Purtell commented on HBASE-21266: Let me make the above proposed changes and run another test with 'stressAM' and 'serverKilling' chaos policies. > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638611#comment-16638611 ]

Andrew Purtell commented on HBASE-21266:

bq. "Number of dead servers in processing should always be non-negative"

You are looking at that assert in DeadServer#finish, right? Those aren't evaluated unless the JVM is started with the -ea command line flag, which I didn't do. We can see from the log line I did see that the dead server map was empty at the time, so I agree we should look at accounting in DeadServer.java.

"Not running balancer because processing dead regionserver(s)" is printed from HMaster.java:1846 based on the result from ServerManager#areDeadServersInProgress, which passes through the result from DeadServer#areDeadServersInProgress, which is simply
{code}
public synchronized boolean areDeadServersInProgress() {
  return processing;
}
{code}
This boolean is cleared in DeadServer#finish when
{code}
if (numProcessing == 0) {
  processing = false;
}
{code}
So the first question I have is why do we even need this boolean field? It can easily be derived cheaply from other state. In areDeadServersInProgress just return the result of {{numProcessing > 0}}.

That assert you observed should be replaced by use of Preconditions so we will get a RuntimeException that will get noticed.

> Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > > Found during ITBLL testing. 
AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
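The accounting change proposed in the comment above (drop the separate boolean, derive the answer from the counter, and fail loudly instead of relying on `assert`) might be sketched as follows. This is a simplified stand-in, not the HBase `DeadServer` class: the class name `DeadServerSketch`, the string server names, and the choice to increment in `add` are assumptions for illustration. A plain `IllegalStateException` stands in for the Guava `Preconditions` check so that the example has no dependencies.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical, simplified stand-in for DeadServer; not the HBase class. */
class DeadServerSketch {
    private final Set<String> deadServers = new HashSet<>();
    private int numProcessing = 0; // guarded by "this"

    /** Records a dead server and counts it as being processed. */
    public synchronized void add(String serverName) {
        deadServers.add(serverName);
        numProcessing++;
    }

    /**
     * Marks processing of one dead server as finished. Unlike an assert,
     * this check runs even when the JVM is started without -ea.
     * serverName is kept only to mirror the call shape; it is not used here.
     */
    public synchronized void finish(String serverName) {
        if (numProcessing <= 0) {
            throw new IllegalStateException(
                "Number of dead servers in processing should always be non-negative");
        }
        numProcessing--;
    }

    /** Derived from the counter; there is no separate flag to get out of sync. */
    public synchronized boolean areDeadServersInProgress() {
        return numProcessing > 0;
    }
}

class DeadServerSketchDemo {
    public static void main(String[] args) {
        DeadServerSketch ds = new DeadServerSketch();
        ds.add("rs1.example.org,8120,1");
        System.out.println(ds.areDeadServersInProgress()); // true
        ds.finish("rs1.example.org,8120,1");
        System.out.println(ds.areDeadServersInProgress()); // false
    }
}
```

Because the result is computed from `numProcessing` on every call, a state where the flag says "processing" while nothing is being processed cannot arise from the flag itself. An accounting mistake shows up as an exception in `finish` rather than as a silently stuck balancer.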
[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list
[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16637939#comment-16637939 ] Xu Cang commented on HBASE-21266: - I took a quick look at the related code and have one question. [~apurtell], did you see this line in the log: "Number of dead servers in processing should always be non-negative"? If so, it could be a race condition on the int 'numProcessing' or the HashMap 'deadServers'. > Not running balancer because processing dead regionservers, but empty dead rs > list > -- > > Key: HBASE-21266 > URL: https://issues.apache.org/jira/browse/HBASE-21266 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.8 >Reporter: Andrew Purtell >Priority: Major > Fix For: 1.5.0, 1.4.9 > > > Found during ITBLL testing. AM in master gets into a state where manual > attempts from the shell to run the balancer always return false and this is > printed in the master log: > 2018-10-03 19:17:14,892 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: > Not running balancer because processing dead regionserver(s): > Note the empty list. > This errant state did not recover without intervention by way of master > restart, but the test environment was chaotic so needs investigation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)