[ 
https://issues.apache.org/jira/browse/HBASE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870922#action_12870922
 ] 

Jean-Daniel Cryans commented on HBASE-2611:
-------------------------------------------

My solution would require the use of 4 "types" of znodes:

- Intention: When a node fails, a machine first writes locally its intention of 
locking the failed node's folder of queues
- Lock: After writing the intention a node "locks" another's folder of queues 
by creating a znode that contains its startcode
- Tag: After successfully locking a znode, the winning node puts a znode locally 
that gives the locked RS' start code and lists the queues that are going to be 
copied
- Delete: After copying everything that was listed in a tag, the node creates 
this znode next to the lock to mark those queues as being under deletion 
(because there's no atomic recursive delete in ZK)
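
To make the ordering of those four znodes concrete, here's a rough sketch in 
Python that uses a plain dict as a stand-in for ZooKeeper. The path layout 
(/rs/<startcode>/...), the function name failover and the stop_after parameter 
are all made up for illustration. This isn't HBase code, it only covers the 
simple case where the dead server wasn't itself in the middle of a failover, 
and the only race it handles is a lock that already exists.

```python
def failover(zk, me, dead, stop_after=None):
    """Move the queues of `dead` under `me` in the dict `zk` (path -> data).

    `stop_after` names a step after which we return early, simulating a crash
    of `me`. Returns True only if the whole sequence completed.
    """
    intention = f"/rs/{me}/intention-{dead}"
    lock = f"/rs/{dead}/lock"
    tag = f"/rs/{me}/tag-{dead}"
    marker = f"/rs/{dead}/delete"
    prefix = f"/rs/{dead}/queues/"

    # Intention: written locally, before touching the dead server's folder.
    zk[intention] = ""
    if stop_after == "intention":
        return False

    # Lock: created in the dead server's folder, holding our start code.
    if lock in zk:
        del zk[intention]  # someone else won, back off
        return False
    zk[lock] = me
    if stop_after == "lock":
        return False

    # Tag: written locally, lists the queues that are about to be copied.
    names = sorted(p[len(prefix):] for p in zk if p.startswith(prefix))
    zk[tag] = names
    if stop_after == "tag":
        return False

    # Copy every queue listed in the tag under our own folder.
    for name in names:
        zk[f"/rs/{me}/queues/{name}-{dead}"] = zk[prefix + name]
    if stop_after == "copy":
        return False

    # Delete marker: created next to the lock, since the removal below is
    # not atomic and may be interrupted half-way.
    zk[marker] = ""
    if stop_after == "delete_marker":
        return False

    # Remove the dead server's znodes (queues, lock and marker included).
    for path in [p for p in zk if p.startswith(f"/rs/{dead}/")]:
        del zk[path]

    # Clean up the local bookkeeping: tag first, intention last.
    del zk[tag]
    if stop_after == "delete_tag":
        return False
    del zk[intention]
    return True


if __name__ == "__main__":
    store = {"/rs/A/queues/q1": "hlog-1"}
    print(failover(store, "B", "A"), store)
```

Calling failover(store, "B", "A", stop_after="tag") leaves the intention, the 
lock and the tag behind, which is the state a third machine would have to 
interpret.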

Then let's say we have the following machines:

Machine A: The first machine that dies
Machine B: The machine that's trying to fail over A but that fails while doing 
it
Machine C: The machine that's trying to fail over B and that acquires the lock 
successfully

Here's what machine C should do, depending on the step after which B fails:

||Machine B fails:||Machine C||
|After writing intention|Reads the intention, sees that B doesn't own the lock, removes the intention and proceeds with failover of just B|
|After writing lock|Reads the intention, sees the lock is owned by B, repeats the failover process going to the deepest node|
|After writing tag|Same, but takes care not to copy what's in the tag when doing the failover of B|
|After copying some znodes|Same behavior; as long as the tag is there, what's in it is considered dirty|
|After writing delete marker|Reads the intention, sees the lock, sees the delete marker, finishes deleting all znodes, deletes the tag, then fails over B|
|After deleting the local tag|Reads the intention, sees the znode is gone, deletes the intention znode, fails over B|
|After deleting the intention|Basic failover|
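
The table can also be read as a decision function for machine C. Here's a 
minimal sketch of it in Python. The function name plan_for_c, its boolean 
parameters and the action strings are hypothetical, and reading "going to the 
deepest node" as "fail over A first, then B" is my interpretation of the row, 
not something the table spells out.

```python
from typing import List


def plan_for_c(*, intention: bool, a_znode: bool, b_owns_lock: bool,
               tag: bool, delete_marker: bool) -> List[str]:
    """Return the actions C takes, given what it finds after B died.

    intention     -- B's intention znode for A still exists
    a_znode       -- A's folder of queues still exists
    b_owns_lock   -- the lock in A's folder holds B's start code
    tag           -- B's local tag for A still exists
    delete_marker -- the delete marker exists next to the lock
    """
    if not intention:
        # B finished (or never started) the failover of A.
        return ["failover B"]
    if not a_znode:
        # B died after deleting its tag: A is already gone.
        return ["delete intention", "failover B"]
    if not b_owns_lock:
        # B died right after writing the intention.
        return ["delete intention", "failover B"]
    if delete_marker:
        # Everything was copied; only the cleanup is left.
        return ["finish deleting A's znodes", "delete tag", "failover B"]
    if tag:
        # Queues listed in the tag are dirty and must not be copied from B.
        return ["failover A", "failover B, skipping queues listed in the tag"]
    # B held the lock but had not listed anything yet.
    return ["failover A", "failover B"]


if __name__ == "__main__":
    print(plan_for_c(intention=True, a_znode=True, b_owns_lock=True,
                     tag=True, delete_marker=False))
```

Checking the intention first keeps the common case (no nested failure) down to 
a single read.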

> Handle RS that fails while processing the failure of another one
> ----------------------------------------------------------------
>
>                 Key: HBASE-2611
>                 URL: https://issues.apache.org/jira/browse/HBASE-2611
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.21.0
>
>
> HBASE-2223 doesn't manage region servers that fail while doing the transfer 
> of HLogs queues from other region servers that failed. Devise a reliable way 
> to do it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
