[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16647107#comment-16647107 ]
Andrew Purtell edited comment on HBASE-21266 at 10/11/18 10:22 PM:
-------------------------------------------------------------------

TestZKLessSplitOnCluster.testSSHCleanupDaugtherRegionsOfAbortedSplit does hand-rolled waits with 10 ms sleeps. Rewrote those to use Waiter#waitFor with the same timeout and period values of other uses of Waiter#waitFor in this unit.

TestEndToEndSplitTransaction.testFromClientSideWhileSplitting utilizes a chore named RegionChecker also with a 10 ms interval, increasing this to 100. This isn't necessary beyond the fact that sleep(10) is obnoxious. Might as well just be a yield() or a spin-wait.

There are three instances of sleep(10) in this unit, changed to sleep(100) which IMHO is the smallest reasonable value you want to use if doing short waits.


was (Author: apurtell):
TestZKLessSplitOnCluster.testSSHCleanupDaugtherRegionsOfAbortedSplit does hand-rolled waits with 10 ms sleeps. Rewrote those to use Waiter#waitFor with the same timeout and period values of other uses of Waiter#waitFor in this unit.

TestEndToEndSplitTransaction.testFromClientSideWhileSplitting utilizes a chore named RegionChecker also with a 10 ms interval, increasing this to 100. This isn't necessary beyond the fact that sleep(10) is obnoxious. Might as well just be a yield() or a spin-wait.

> Not running balancer because processing dead regionservers, but empty dead rs
> list
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-21266
>                 URL: https://issues.apache.org/jira/browse/HBASE-21266
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.4.8
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Major
>             Fix For: 1.5.0, 1.4.9
>
>         Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch,
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch,
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch,
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual
> attempts from the shell to run the balancer always return false and this is
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster:
> Not running balancer because processing dead regionserver(s):
> Note the empty list.
> This errant state did not recover without intervention by way of master
> restart, but the test environment was chaotic so needs investigation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
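
For readers unfamiliar with the pattern, the move from hand-rolled sleep(10) loops to a bounded poll with an explicit timeout and period, as the comment describes, might look roughly like the sketch below. This is a standalone illustration only, not HBase's Waiter class and not the code in the attached patch: the class name PollingWait, the method signature, and the 5000 ms / 100 ms values are all assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a waitFor-style helper; not the HBase Waiter API.
public class PollingWait {

    /**
     * Polls the condition every periodMs milliseconds until it returns true
     * or timeoutMs milliseconds have elapsed.
     *
     * @return true if the condition held before the deadline, false on
     *         timeout or if the waiting thread was interrupted
     */
    public static boolean waitFor(long timeoutMs, long periodMs, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return false;
            }
            try {
                // Never sleep past the deadline; sleep at least 1 ms.
                Thread.sleep(Math.max(1L, Math.min(periodMs, remainingMs)));
            } catch (InterruptedException e) {
                // Preserve the interrupt status for the caller and give up.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger polls = new AtomicInteger();
        // Assumed values: 5 s timeout, 100 ms period (not 10 ms).
        boolean met = waitFor(5_000, 100, () -> polls.incrementAndGet() >= 3);
        System.out.println("condition met: " + met + ", polls: " + polls.get());
    }
}
```

Compared with a bare loop around sleep(10), a helper of this shape puts the timeout and the polling period in one place, which makes it easier to keep them consistent across a test class.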