[ https://issues.apache.org/jira/browse/HBASE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-2482:
-------------------------
Attachment: 2482-unittest.txt
First cut at a unit test. It needs an edit but looks to be working. Adds a
protected 'kill' to regionserver which simulates an RS kill (it does no cleanup
other than shutting down the socket). Also adds a new HMsg called
TEST_MSG_BLOCK_RS. When an RS receives this from the master, it just waits until
it is closed, aborted or killed, which blocks the worker queue.
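To illustrate the blocking behaviour described above, here is a minimal sketch.
It is not the code in 2482-unittest.txt: BlockableWorker, toDo, runScenario and
the "CLOSE"/"OPEN" message strings are hypothetical stand-ins, and a single
latch stands in for "closed, aborted or killed".

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a region server's message worker; not HBase code. */
class BlockableWorker implements Runnable {
    static final String TEST_MSG_BLOCK_RS = "TEST_MSG_BLOCK_RS";

    /** Messages from the (simulated) master, handled in arrival order. */
    final BlockingQueue<String> toDo = new LinkedBlockingQueue<>();
    /** Messages the worker actually handled. */
    final List<String> processed = new CopyOnWriteArrayList<>();
    /** Counted down once the worker has parked on the block message. */
    final CountDownLatch blocked = new CountDownLatch(1);
    /** Assumption: one latch models "closed, aborted or killed". */
    private final CountDownLatch stopped = new CountDownLatch(1);

    /** Simulated kill: no cleanup, just release the worker so it can exit. */
    void kill() {
        stopped.countDown();
    }

    @Override
    public void run() {
        try {
            while (stopped.getCount() > 0) {
                String msg = toDo.poll(50, TimeUnit.MILLISECONDS);
                if (msg == null) {
                    continue; // nothing queued yet; re-check the stop flag
                }
                if (TEST_MSG_BLOCK_RS.equals(msg)) {
                    blocked.countDown();
                    stopped.await(); // later messages stay queued, unprocessed
                    return;
                }
                processed.add(msg);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Queues a close, the block message and an open, then kills the worker. */
    static List<String> runScenario() {
        BlockableWorker worker = new BlockableWorker();
        Thread thread = new Thread(worker);
        thread.start();
        worker.toDo.add("CLOSE region-a");
        worker.toDo.add(TEST_MSG_BLOCK_RS);
        worker.toDo.add("OPEN region-a"); // queued behind the block message
        try {
            worker.blocked.await(); // wait until the worker is parked
            worker.kill();
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return worker.processed;
    }
}
```

In this sketch the close is handled, while the open queued behind the block
message is never handled because the worker exits as soon as it is killed.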
The way the test works is that it adds a new RS to a small cluster, then waits
on the load balancer to move some regions to the new server. As soon as some are
open, we send a close for all of them followed by a TEST_MSG_BLOCK_RS. The
closes go through and the balancer assigns the new server some of the closed
regions, but by then the TEST_MSG_BLOCK_RS is in place and the worker queue is
blocked.
Let me make a patch that includes Todd's patch and a cleaned-up test next.
> regions in transition do not get reassigned by master when RS crashes
> ---------------------------------------------------------------------
>
> Key: HBASE-2482
> URL: https://issues.apache.org/jira/browse/HBASE-2482
> Project: Hadoop HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.20.5
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Blocker
> Fix For: 0.20.5, 0.21.0
>
> Attachments: 2482-unittest.txt, hbase-2482.txt
>
>
> Very similar to HBASE-1928, but for the general case (not just ROOT/META):
> If a region is in transition on a RS when the RS crashes, the master does not
> remove it from regionsInTransition when processing the RS shutdown. This is
> fairly easy to trigger by bringing up a RS and kill -9ing it just as it
> starts to get regions assigned. Those regions will get permanently lost since
> they're stuck in regionsInTransition and thus don't get assigned by the
> metascanner.
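
As a rough illustration of the behaviour the description above asks for, the
toy model below drops a dead server's entries from an in-transition map during
shutdown processing, so those regions become eligible for assignment again.
This is a sketch under assumptions, not the change in hbase-2482.txt: ToyMaster
and its methods are hypothetical names, and the real master keeps more state
than a single map.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the master's bookkeeping; names are illustrative, not HBase's API. */
class ToyMaster {
    /** Region name -> server the region was being assigned to (sorted for stable output). */
    private final Map<String, String> regionsInTransition = new TreeMap<>();

    /** Records that a region has been handed to a server but is not yet open. */
    void startAssignment(String region, String server) {
        regionsInTransition.put(region, server);
    }

    boolean isInTransition(String region) {
        return regionsInTransition.containsKey(region);
    }

    /**
     * Shutdown processing for a crashed server: remove its in-transition
     * entries and return the regions that are free to be assigned again.
     */
    List<String> processServerShutdown(String deadServer) {
        List<String> freed = new ArrayList<>();
        Iterator<Map.Entry<String, String>> it = regionsInTransition.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            if (entry.getValue().equals(deadServer)) {
                freed.add(entry.getKey());
                it.remove(); // without this step the region stays stuck in transition
            }
        }
        return freed;
    }
}
```

Regions in transition on other servers are left untouched; only the entries
that belong to the crashed server are released.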