[jira] [Comment Edited] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

Andrew Purtell (JIRA) Thu, 11 Oct 2018 15:23:06 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16647107#comment-16647107
 ]


Andrew Purtell edited comment on HBASE-21266 at 10/11/18 10:22 PM:
-------------------------------------------------------------------

TestZKLessSplitOnCluster.testSSHCleanupDaugtherRegionsOfAbortedSplit does 
hand-rolled waits with 10 ms sleeps. Rewrote those to use Waiter#waitFor with 
the same timeout and period values as the other uses of Waiter#waitFor in this 
unit. 
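
For illustration only, here is a minimal sketch of the poll-with-timeout 
pattern that replaces a hand-rolled sleep loop. PollingWait and its waitFor 
signature are hypothetical stand-ins written for this example; this is not the 
HBase Waiter class and not the code in the attached patch.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for a Waiter#waitFor-style helper; not the HBase class. */
final class PollingWait {
  private PollingWait() {}

  /**
   * Polls the condition every periodMs until it holds or timeoutMs elapses.
   * Returns true if the condition was observed to hold, false on timeout.
   */
  static boolean waitFor(long timeoutMs, long periodMs, BooleanSupplier condition)
      throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      // Check first so an already-true condition returns without sleeping.
      if (condition.getAsBoolean()) {
        return true;
      }
      long remainingNs = deadline - System.nanoTime();
      if (remainingNs <= 0) {
        return false;
      }
      // Sleep one period, but never past the deadline (and at least 1 ms).
      Thread.sleep(Math.min(periodMs, Math.max(1L, remainingNs / 1_000_000L)));
    }
  }
}
```

A test would then call something like PollingWait.waitFor(timeoutMs, periodMs, 
() -> someCondition()) and assert on the result, so the timeout and period 
live in one place instead of being repeated in each loop.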

TestEndToEndSplitTransaction.testFromClientSideWhileSplitting uses a chore 
named RegionChecker, also with a 10 ms interval; increased this to 100 ms. This 
isn't strictly necessary, but sleep(10) is obnoxious: it might as well be a 
yield() or a spin-wait. There are three instances of sleep(10) in this unit, 
all changed to sleep(100), which IMHO is the smallest reasonable value to use 
for short waits.


was (Author: apurtell):
TestZKLessSplitOnCluster.testSSHCleanupDaugtherRegionsOfAbortedSplit does 
hand-rolled waits with 10 ms sleeps. Rewrote those to use Waiter#waitFor with 
the same timeout and period values as the other uses of Waiter#waitFor in this 
unit. 

TestEndToEndSplitTransaction.testFromClientSideWhileSplitting uses a chore 
named RegionChecker, also with a 10 ms interval; increased this to 100 ms. This 
isn't strictly necessary, but sleep(10) is obnoxious: it might as well be a 
yield() or a spin-wait. 

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-21266
>                 URL: https://issues.apache.org/jira/browse/HBASE-21266
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.4.8
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Major
>             Fix For: 1.5.0, 1.4.9
>
>         Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
