[ https://issues.apache.org/jira/browse/HBASE-20671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-20671:
--------------------------
Fix Version/s: 2.0.2 (was: 2.0.1)
> Merged region brought back to life causing RS to be killed by Master
> --------------------------------------------------------------------
>
> Key: HBASE-20671
> URL: https://issues.apache.org/jira/browse/HBASE-20671
> Project: HBase
> Issue Type: Bug
> Components: amv2
> Affects Versions: 2.0.0
> Reporter: Josh Elser
> Assignee: Josh Elser
> Priority: Critical
> Fix For: 2.0.2
>
> Attachments: 0001-Test-for-HBASE-20671.patch,
> hbase-hbase-master-ctr-e138-1518143905142-336066-01-000003.hwx.site.log.zip,
> hbase-hbase-regionserver-ctr-e138-1518143905142-336066-01-000002.hwx.site.log.zip,
> workaround.txt
>
>
> Another bug coming out of a master restart and replay of the pv2 logs.
> The master successfully merged two regions into one and was then restarted,
> but after the restart it assigned the two original (pre-merge) regions back
> out to the cluster. There is a log message which appears to indicate that
> RegionStates acknowledges that it doesn't know what these regions are while
> it is replaying the pv2 WAL; however, it incorrectly assumes that each region
> is just OFFLINE and needs to be assigned.
> {noformat}
> 2018-05-30 04:26:00,055 INFO
> [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=20000] master.HMaster:
> Client=hrt_qa//172.27.85.11 Merge regions a7dd6606dcacc9daf085fc9fa2aecc0c
> and 4017a3c778551d4d258c785d455f9c0b
> 2018-05-30 04:28:27,525 DEBUG
> [master/ctr-e138-1518143905142-336066-01-000003:20000]
> procedure2.ProcedureExecutor: Completed pid=4368, state=SUCCESS;
> MergeTableRegionsProcedure table=tabletwo_merge,
> regions=[a7dd6606dcacc9daf085fc9fa2aecc0c, 4017a3c778551d4d258c785d455f9c0b],
> forcibly=false
> {noformat}
> {noformat}
> 2018-05-30 04:29:20,263 INFO
> [master/ctr-e138-1518143905142-336066-01-000003:20000]
> assignment.AssignmentManager: a7dd6606dcacc9daf085fc9fa2aecc0c
> regionState=null; presuming OFFLINE
> 2018-05-30 04:29:20,263 INFO
> [master/ctr-e138-1518143905142-336066-01-000003:20000]
> assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!!
> rit=OFFLINE, location=null, table=tabletwo_merge,
> region=a7dd6606dcacc9daf085fc9fa2aecc0c
> 2018-05-30 04:29:20,266 INFO
> [master/ctr-e138-1518143905142-336066-01-000003:20000]
> assignment.AssignmentManager: 4017a3c778551d4d258c785d455f9c0b
> regionState=null; presuming OFFLINE
> 2018-05-30 04:29:20,266 INFO
> [master/ctr-e138-1518143905142-336066-01-000003:20000]
> assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!!
> rit=OFFLINE, location=null, table=tabletwo_merge,
> region=4017a3c778551d4d258c785d455f9c0b
> {noformat}
> Eventually, the RS reports in its online regions, and the master tells it to
> kill itself:
> {noformat}
> 2018-05-30 04:29:24,272 WARN
> [RpcServer.default.FPBQ.Fifo.handler=26,queue=2,port=20000]
> assignment.AssignmentManager: Killing
> ctr-e138-1518143905142-336066-01-000002.hwx.site,16020,1527654546619: Not
> online: tabletwo_merge,,1527652130538.a7dd6606dcacc9daf085fc9fa2aecc0c.
> {noformat}
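
The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch and is not HBase code: the class `MergedRegionReplaySketch`, its `State` enum, and the methods `recordCompletedMerge`, `resolveNaive`, and `resolveGuarded` are hypothetical names invented for illustration, and the guarded variant is one possible way to avoid the problem, not necessarily the fix applied for this issue. It assumes the master has some durable record of which regions were merged away.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the replay problem; not the real AssignmentManager. */
final class MergedRegionReplaySketch {
    enum State { OFFLINE, OPEN, MERGED }

    // Stand-in for the master's in-memory region states (empty after a restart).
    private final Map<String, State> regionStates = new HashMap<>();

    // Assumed durable record of regions that a completed merge has retired.
    private final Set<String> mergedAway = new HashSet<>();

    void recordCompletedMerge(String... parentRegions) {
        mergedAway.addAll(Arrays.asList(parentRegions));
    }

    /** Mirrors the logged behaviour: an unknown region is presumed OFFLINE. */
    State resolveNaive(String encodedName) {
        return regionStates.computeIfAbsent(encodedName, k -> State.OFFLINE);
    }

    /** Guarded variant: consult the merge record before presuming OFFLINE. */
    State resolveGuarded(String encodedName) {
        if (mergedAway.contains(encodedName)) {
            return State.MERGED;
        }
        return regionStates.computeIfAbsent(encodedName, k -> State.OFFLINE);
    }

    /** In this model only OFFLINE regions are candidates for assignment. */
    static boolean isAssignable(State state) {
        return state == State.OFFLINE;
    }

    public static void main(String[] args) {
        MergedRegionReplaySketch master = new MergedRegionReplaySketch();
        String a = "a7dd6606dcacc9daf085fc9fa2aecc0c";
        String b = "4017a3c778551d4d258c785d455f9c0b";
        master.recordCompletedMerge(a, b);

        // Naive replay treats the retired region as assignable again.
        System.out.println("naive:   " + master.resolveNaive(a)
            + " assignable=" + isAssignable(master.resolveNaive(a)));
        // Guarded replay keeps the retired region out of assignment.
        System.out.println("guarded: " + master.resolveGuarded(a)
            + " assignable=" + isAssignable(master.resolveGuarded(a)));
    }
}
```

In this model the naive path marks a merged-away region as OFFLINE and therefore assignable, which corresponds to the "regionState=null; presuming OFFLINE" messages in the log; the guarded path never hands such a region back out.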