[
https://issues.apache.org/jira/browse/HBASE-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565123#comment-13565123
]
ramkrishna.s.vasudevan commented on HBASE-7701:
-----------------------------------------------
bq. Closed regions are not removed from assignments
I think we do this in ClosedRegionHandler. In your case, I suspect that the SSH
(ServerShutdownHandler) for the RS that went down had already started before
the CLOSED callback even reached the master.
Can you attach the entire log? The sequence of events will help us decode
this bug; the ordering of events may tell us something.
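To make the suspected ordering concrete, here is a toy model of it. This is
only a sketch and not HBase code: the class, the method names, and the
single-map bookkeeping are all hypothetical stand-ins, and the real
AssignmentManager/RegionStates logic is more involved. It only shows how the
shutdown handler's decision could depend on whether the CLOSED callback has
already cleared the old assignment.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model; none of these names exist in HBase.
class AssignmentOrderingSketch {
    // Stand-in for the master's region -> server bookkeeping.
    private final Map<String, String> assignments = new HashMap<>();

    void regionOpenedOn(String region, String server) {
        assignments.put(region, server);
    }

    // Models the master handling the CLOSED callback: drop the old assignment.
    void onClosed(String region) {
        assignments.remove(region);
    }

    // Models the shutdown handler for deadServer: it reassigns the region
    // unless some other server is still recorded as holding it.
    boolean shutdownHandlerReassigns(String region, String deadServer) {
        String holder = assignments.get(region);
        return holder == null || holder.equals(deadServer);
    }

    // Plays both events in the requested order and reports whether the
    // shutdown handler decided to reassign the region.
    static boolean run(boolean closedHandledFirst) {
        AssignmentOrderingSketch master = new AssignmentOrderingSketch();
        master.regionOpenedOn("r1", "rs-old");
        if (closedHandledFirst) {
            master.onClosed("r1");
        }
        boolean reassigned = master.shutdownHandlerReassigns("r1", "rs-target");
        if (!closedHandledFirst) {
            master.onClosed("r1");
        }
        return reassigned;
    }

    public static void main(String[] args) {
        System.out.println("CLOSED first: reassigned=" + run(true));
        System.out.println("SSH first: reassigned=" + run(false));
    }
}
```

In this model the region is reassigned only when CLOSED is handled first; when
the shutdown handler runs first it sees the stale entry for rs-old and skips
the region, which resembles the "Skip assigning region" line in the log. The
full log would show whether the real ordering matches.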
> inconsistent state in AssignmentManager for moving region
> ---------------------------------------------------------
>
> Key: HBASE-7701
> URL: https://issues.apache.org/jira/browse/HBASE-7701
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.96.0
> Reporter: Sergey Shelukhin
>
> Closed regions are not removed from assignments. I am not sure if it's a
> general state problem or just a small bug; for now, one manifestation is
> that a moved region is ignored by the SSH of the target server if the target
> server dies before updating ZK.
> {code}
> 2013-01-22 17:59:00,524 DEBUG [IPC Server handler 3 on 50658]
> master.AssignmentManager(1475): Sent CLOSE to 10.11.2.92,51231,1358906285048
> for region
> IntegrationTestRebalanceAndKillServersTargeted,66666660,1358906196709.0200b366bc37c5afd1185f7d487c7dfb.
> 2013-01-22 17:59:00,997 DEBUG
> [RS_CLOSE_REGION-10.11.2.92,51231,1358906285048-1]
> handler.CloseRegionHandler(167): set region closed state in zk successfully
> for region
> IntegrationTestRebalanceAndKillServersTargeted,66666660,1358906196709.0200b366bc37c5afd1185f7d487c7dfb.
> sn name: 10.11.2.92,51231,1358906285048
> 2013-01-22 17:59:01,088 INFO
> [MASTER_CLOSE_REGION-10.11.2.92,50658,1358906192673-0]
> master.RegionStates(242): Region {NAME =>
> 'IntegrationTestRebalanceAndKillServersTargeted,66666660,1358906196709.0200b366bc37c5afd1185f7d487c7dfb.',
> STARTKEY => '66666660', ENDKEY => '7333332c',
> ENCODED => 0200b366bc37c5afd1185f7d487c7dfb,} transitioned from
> {IntegrationTestRebalanceAndKillServersTargeted,66666660,1358906196709.0200b366bc37c5afd1185f7d487c7dfb.
> state=CLOSED, ts=1358906341087, server=null} to
> {IntegrationTestRebalanceAndKillServersTargeted,66666660,1358906196709.0200b366bc37c5afd1185f7d487c7dfb.
> state=OFFLINE, ts=1358906341088, server=null}
> 2013-01-22 17:59:01,128 INFO
> [MASTER_CLOSE_REGION-10.11.2.92,50658,1358906192673-0]
> master.AssignmentManager(1596): Assigning region
> IntegrationTestRebalanceAndKillServersTargeted,66666660,1358906196709.0200b366bc37c5afd1185f7d487c7dfb.
> to 10.11.2.92,50661,1358906192942
> ... (50661 didn't update ZK to OPEN, only OPENING)
> 2013-01-22 17:59:06,605 INFO
> [MASTER_SERVER_OPERATIONS-10.11.2.92,50658,1358906192673-2]
> handler.ServerShutdownHandler(202): Reassigning 7 region(s) that
> 10.11.2.92,50661,1358906192942 was carrying (skipping 0 regions(s) that are
> already in transition)
> 2013-01-22 17:59:06,605 DEBUG
> [MASTER_SERVER_OPERATIONS-10.11.2.92,50658,1358906192673-2]
> handler.ServerShutdownHandler(219): Skip assigning region
> IntegrationTestRebalanceAndKillServersTargeted,66666660,1358906196709.0200b366bc37c5afd1185f7d487c7dfb.
> because it has been opened in 10.11.2.92,51231,1358906285048
> {code}
> Note the server in the last line - the one that has long closed the region.