[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644143#comment-16644143 ]
Andrew Purtell commented on HBASE-21266:
----------------------------------------
This is going to need more work.
With this change in place, a unit test in TestAssignmentManagerOnCluster fails.
Also, in ITBLL test scenarios with the serverKilling policy, if the master is
terminated while a region is splitting, then upon restart we can get this:
2018-10-09 20:59:46,242 WARN [ip-172-31-5-95:8100.activeMasterManager]
master.AssignmentManager: Dropped splitting! Not in state good for SPLITTING;
rs_p={332d04e88521c71ea4505592e434c9d1 state=SPLITTING, ts=1539118786241,
server=ip-172-31-13-83.us-west-2.compute.internal,8120,1539118587733},
rs_a={1bbe77be39dfd903b31d00b98b02f842 state=OFFLINE, ts=1539118786229,
server=null}, rs_b={6d6f67867f14d37c4fe35f3fe23f6cd8 state=OFFLINE,
ts=1539118786230, server=null}
and the daughter regions will remain unassigned and unavailable, requiring hbck
-fixAssignments.
I think I see a mistake in the patch. Let me try again.
> Not running balancer because processing dead regionservers, but empty dead rs
> list
> ----------------------------------------------------------------------------------
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.4.8
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Major
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual
> attempts from the shell to run the balancer always return false and this is
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster:
> Not running balancer because processing dead regionserver(s):
> Note the empty list.
> This errant state did not recover without intervention by way of a master
> restart, but the test environment was chaotic, so this needs investigation.
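One way a "processing dead regionservers" check can hold true while the printed list is empty is if the check and the list are backed by separate pieces of state that drift apart. The sketch below is a hypothetical illustration of that failure mode only. It is not HBase's code: the class name, method names, and the assumption that a counter gates the balancer are all invented for the example, and the actual cause in HBASE-21266 still needs investigation as noted above.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: NOT HBase code. All names here are invented.
final class DeadServerTrackerSketch {
    // Names of servers currently listed as dead and being processed.
    private final Set<String> deadServers = new TreeSet<>();
    // Separate counter that the (assumed) balancer guard consults.
    private int numProcessing = 0;

    // Called when processing of a dead server begins.
    synchronized void startProcessing(String serverName) {
        deadServers.add(serverName); // a Set ignores duplicates ...
        numProcessing++;             // ... but the counter does not
    }

    // Called when processing of a dead server completes.
    synchronized void finishProcessing(String serverName) {
        deadServers.remove(serverName);
        numProcessing--;
    }

    // Assumed guard: the balancer is skipped while this returns true.
    synchronized boolean areDeadServersInProgress() {
        return numProcessing != 0;
    }

    // Comma-separated list of dead servers, empty string if none.
    synchronized String deadServerNames() {
        return String.join(", ", deadServers);
    }

    // Returns the log line if the balancer would be skipped, else null.
    synchronized String balancerSkipMessage() {
        if (!areDeadServersInProgress()) {
            return null;
        }
        return "Not running balancer because processing dead regionserver(s): "
            + deadServerNames();
    }

    public static void main(String[] args) {
        DeadServerTrackerSketch tracker = new DeadServerTrackerSketch();
        // The same server is reported dead twice but finished only once.
        tracker.startProcessing("rs1.example.com,8120,1");
        tracker.startProcessing("rs1.example.com,8120,1");
        tracker.finishProcessing("rs1.example.com,8120,1");
        // The counter is still 1 while the set is empty.
        System.out.println(tracker.balancerSkipMessage());
    }
}
```

In this sketch the guard never clears on its own, which would match the observation that only a master restart recovered the state. Deriving the guard from the set itself (for example, checking whether the set is empty) would keep the two from diverging.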