[
https://issues.apache.org/jira/browse/HBASE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870922#action_12870922
]
Jean-Daniel Cryans commented on HBASE-2611:
-------------------------------------------
My solution would require the use of 4 "types" of znodes:
- Intention: When a node fails, a machine first writes locally its intention of
locking the other's folder of queues
- Lock: After writing the intention a node "locks" another's folder of queues
by creating a znode that contains its startcode
- Tag: After successfully locking a znode, the winning node puts a znode locally
that gives the locked RS' start code and lists the queues that are going to be
copied
- Delete: After copying everything that was listed in a tag, the node creates
this znode beside the lock to mark those queues as being under deletion (because
there's no atomic recursive delete in ZK)
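To make the order of the four znode types concrete, here is a minimal sketch of one takeover. It is not the HBase implementation: an in-memory map stands in for ZooKeeper, and the class name QueueTakeover, the method takeOver, the /rs/... paths and the tag encoding are all hypothetical. The cleanup by a node that loses the lock race is also an assumption.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: an in-memory map stands in for ZooKeeper (path -> data).
final class QueueTakeover {
    final ConcurrentMap<String, String> znodes = new ConcurrentHashMap<>();

    // Returns true if `self` won the lock on `dead` and moved the listed queues.
    boolean takeOver(String self, String selfStartCode,
                     String dead, String deadStartCode, List<String> queues) {
        String intention = "/rs/" + self + "/intention";
        String lock = "/rs/" + dead + "/lock";
        String tag = "/rs/" + self + "/tag";
        String deleteMarker = "/rs/" + dead + "/delete";

        // 1. Intention: record locally which node we are about to lock.
        znodes.put(intention, dead);

        // 2. Lock: putIfAbsent mimics a create that fails if the znode exists.
        if (znodes.putIfAbsent(lock, selfStartCode) != null) {
            znodes.remove(intention); // assumption: a live loser cleans up after itself
            return false;
        }

        // 3. Tag: the locked RS' start code plus the queues about to be copied.
        znodes.put(tag, deadStartCode + ":" + String.join(",", queues));
        for (String q : queues) {
            String data = znodes.getOrDefault("/rs/" + dead + "/" + q, "");
            znodes.put("/rs/" + self + "/" + dead + "-" + q, data);
        }

        // 4. Delete: mark the source queues as under deletion, then remove them
        //    one by one, since there is no atomic recursive delete.
        znodes.put(deleteMarker, "");
        for (String q : queues) {
            znodes.remove("/rs/" + dead + "/" + q);
        }
        znodes.remove(deleteMarker);
        znodes.remove(lock);

        // Local cleanup: tag first, then intention.
        znodes.remove(tag);
        znodes.remove(intention);
        return true;
    }
}
```

Every step leaves a znode behind before the next one starts, so a node that dies partway through leaves enough state for whoever fails it over to tell how far it got.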
Then let's say we have the following machines:
Machine A: The first machine that dies
Machine B: The machine that's trying to failover A but that fails while doing
it
Machine C: The machine that's trying to failover B and that acquires the lock
successfully
Here's what node C should do when B fails after each step:
||Machine B fails:||Machine C||
|After writing intention|Reads the intention, sees that B doesn't own the lock, removes the intention and proceeds with failover of just B|
|After writing lock|Reads the intention, sees the lock is owned by B, repeats the failover process going to the deepest node|
|After writing tag|Same as above, but takes care not to copy what's in the tag when doing the failover of B|
|After copying some znodes|Same behavior: as long as the tag is there, what's in it is considered dirty|
|After writing delete marker|Reads the intention, sees the lock, sees the delete marker, finishes deleting all znodes, deletes the tag, then fails over B|
|After deleting the local tag|Reads the intention, sees the znode is gone, deletes the intention znode, fails over B|
|After deleting the intention|Basic failover|
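The table above can be read as a decision function over the znodes that C finds left behind by B. The sketch below is one possible encoding, not code from HBase: FailoverPlanner, Observed and the Action names are hypothetical, and the assumption that C still fails over B after recursing into the locked node is my reading of the "After writing tag" row.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of machine C's choices, one branch per group of table rows.
final class FailoverPlanner {
    enum Action {
        REMOVE_INTENTION, RECURSE_INTO_LOCKED_NODE, FINISH_DELETE, DELETE_TAG,
        FAILOVER_B, FAILOVER_B_SKIPPING_TAGGED
    }

    // What C observes among B's leftovers and on the node B intended to lock.
    static final class Observed {
        final boolean intention, lockedNodeExists, lockOwnedByB, tag, deleteMarker;

        Observed(boolean intention, boolean lockedNodeExists, boolean lockOwnedByB,
                 boolean tag, boolean deleteMarker) {
            this.intention = intention;
            this.lockedNodeExists = lockedNodeExists;
            this.lockOwnedByB = lockOwnedByB;
            this.tag = tag;
            this.deleteMarker = deleteMarker;
        }
    }

    static List<Action> plan(Observed o) {
        List<Action> plan = new ArrayList<>();
        if (!o.intention) {
            // B died after deleting its intention: basic failover.
            plan.add(Action.FAILOVER_B);
        } else if (!o.lockedNodeExists || !o.lockOwnedByB) {
            // B died before taking the lock, or after the locked znode was removed.
            plan.add(Action.REMOVE_INTENTION);
            plan.add(Action.FAILOVER_B);
        } else if (o.deleteMarker) {
            // Copy was complete: finish the deletion B started.
            plan.add(Action.FINISH_DELETE);
            plan.add(Action.DELETE_TAG);
            plan.add(Action.FAILOVER_B);
        } else {
            // B held the lock mid-copy: redo the failover from the deepest node,
            // treating anything listed in a tag as dirty.
            plan.add(Action.RECURSE_INTO_LOCKED_NODE);
            plan.add(o.tag ? Action.FAILOVER_B_SKIPPING_TAGGED : Action.FAILOVER_B);
        }
        return plan;
    }

    public static void main(String[] args) {
        // B died after writing the tag but before the delete marker.
        System.out.println(plan(new Observed(true, true, true, true, false)));
    }
}
```

The delete-marker branch does not remove the intention because the corresponding table row does not list that step; a real implementation would need to decide when it goes away.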
> Handle RS that fails while processing the failure of another one
> ----------------------------------------------------------------
>
> Key: HBASE-2611
> URL: https://issues.apache.org/jira/browse/HBASE-2611
> Project: HBase
> Issue Type: Sub-task
> Reporter: Jean-Daniel Cryans
> Assignee: Jean-Daniel Cryans
> Fix For: 0.21.0
>
>
> HBASE-2223 doesn't manage region servers that fail while doing the transfer
> of HLogs queues from other region servers that failed. Devise a reliable way
> to do it.