[ https://issues.apache.org/jira/browse/HBASE-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881673#action_12881673 ]
Jonathan Gray commented on HBASE-2700:
--------------------------------------
1. Master knows he is a failed-over master.
2. Determine whether any RegionServers crashed by diffing the ephemeral
   list against the master-managed list.
   If so, process the dead servers.
3. Get list of nodes in UNASSIGNED and handle each by its state:
   (CLOSING) -> Wait. This should enter CLOSED eventually.
                If it times out, deal with it the same way we
                would deal with a timeout w/o failover.
   (CLOSED)  -> Generate a destination RS. Send that RS an open
                message.
                The RS that gets the open message will
                only open if it can switch the node from
                CLOSED to OPENING. This ensures an
                open only occurs in one place.
   (OPENING) -> Wait. This should enter OPENED eventually.
                If it times out, deal with it the same way we
                would deal with a timeout w/o failover.
   (OPENED)  -> Remove the znode; this region is not in transition
                any more.
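
A rough sketch of steps 2 and 3, in case it helps make the flow concrete. This is not HBase code: it is plain Python with an in-memory dict standing in for the znodes under UNASSIGNED, and every name in it (State, UnassignedNodes, find_dead_servers, region_server_open, process_unassigned, pick_server) is hypothetical. In a real setup the CLOSED -> OPENING switch would presumably be a conditional update in ZK rather than a dict write.

```python
from enum import Enum


class State(Enum):
    CLOSING = "CLOSING"
    CLOSED = "CLOSED"
    OPENING = "OPENING"
    OPENED = "OPENED"


def find_dead_servers(ephemeral, master_managed):
    """Step 2: servers the master tracks that no longer have an ephemeral node."""
    return set(master_managed) - set(ephemeral)


class UnassignedNodes:
    """In-memory stand-in for the nodes in UNASSIGNED (hypothetical)."""

    def __init__(self, nodes):
        self.nodes = dict(nodes)  # region name -> State

    def compare_and_set(self, region, expected, new):
        """Switch a node's state only if it still holds the expected state."""
        if self.nodes.get(region) is not expected:
            return False
        self.nodes[region] = new
        return True

    def remove(self, region):
        del self.nodes[region]


def region_server_open(nodes, region):
    """RS side: open only if this RS wins the CLOSED -> OPENING switch."""
    return nodes.compare_and_set(region, State.CLOSED, State.OPENING)


def process_unassigned(nodes, pick_server):
    """Step 3: decide what the failed-over master does for each node."""
    actions = {}
    for region, state in list(nodes.nodes.items()):
        if state in (State.CLOSING, State.OPENING):
            # Wait for CLOSED / OPENED; a timeout takes the normal
            # non-failover path, which is not modelled here.
            actions[region] = "wait"
        elif state is State.CLOSED:
            # Pick a destination RS; the RS itself does the state switch.
            actions[region] = "open on " + pick_server(region)
        elif state is State.OPENED:
            # Region is no longer in transition, so drop its node.
            nodes.remove(region)
            actions[region] = "removed"
    return actions


if __name__ == "__main__":
    nodes = UnassignedNodes(
        {"r1": State.CLOSING, "r2": State.CLOSED, "r3": State.OPENED}
    )
    print(find_dead_servers({"rs1", "rs2"}, {"rs1", "rs2", "rs3"}))
    print(process_unassigned(nodes, lambda region: "rs2"))
    print(region_server_open(nodes, "r2"), region_server_open(nodes, "r2"))
```

The second call to region_server_open on the same region fails because the node is no longer CLOSED, which is the "open only occurs in one place" guarantee from the CLOSED case.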
> Handle master failover for regions in transition
> ------------------------------------------------
>
> Key: HBASE-2700
> URL: https://issues.apache.org/jira/browse/HBASE-2700
> Project: HBase
> Issue Type: Sub-task
> Components: master, zookeeper
> Reporter: Jonathan Gray
> Priority: Critical
> Fix For: 0.21.0
>
>
> So far in the HBASE-2692 tasks we have moved everything for regions in
> transition into ZK, but we have not fully handled the master failover case.
> This issue is to deal with that case and to write tests for it.