[ 
https://issues.apache.org/jira/browse/HBASE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401779#comment-13401779
 ] 

Himanshu Vashishtha commented on HBASE-2611:
--------------------------------------------

I looked at this issue from the perspective of using the ZooKeeper#multi 
operation (present in ZooKeeper 3.4). This API guarantees that a list of Op 
objects runs as a single transaction: if any Op fails, all of them are rolled 
back. I tested this functionality as a standalone case (where the transaction 
was to move a bunch of znodes from one parent to another), and it works well: 
out of N threads that race to do the transfer, only one succeeds, and on a 
failure all the Ops done so far are rolled back. I can attach the sample code 
if required.
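
The all-or-nothing behaviour described above can be sketched without a 
ZooKeeper ensemble. What follows is a minimal in-memory model, not the 
ZooKeeper client and not the sample code mentioned above: the class 
AtomicMoveSketch, its nested Op type, and the /rs/... paths are hypothetical 
names chosen for illustration. With the real client, the corresponding calls 
would be Op.create, Op.delete and ZooKeeper#multi.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** In-memory stand-in for a flat znode store. Hypothetical; NOT the ZooKeeper client. */
class AtomicMoveSketch {
    /** One step of a transaction: create a path, or delete a path. */
    static final class Op {
        final boolean isCreate;
        final String path;
        final String data;
        private Op(boolean isCreate, String path, String data) {
            this.isCreate = isCreate; this.path = path; this.data = data;
        }
        static Op create(String path, String data) { return new Op(true, path, data); }
        static Op delete(String path) { return new Op(false, path, null); }
    }

    private final Map<String, String> nodes = new HashMap<>();

    synchronized void put(String path, String data) { nodes.put(path, data); }
    synchronized boolean exists(String path) { return nodes.containsKey(path); }

    /** Applies every op or none: work on a copy and publish it only if all ops succeed. */
    synchronized boolean multi(List<Op> ops) {
        Map<String, String> scratch = new HashMap<>(nodes);
        for (Op op : ops) {
            if (op.isCreate) {
                if (scratch.putIfAbsent(op.path, op.data) != null) return false; // already exists
            } else if (scratch.remove(op.path) == null) {
                return false; // nothing to delete, so the whole transaction is dropped
            }
        }
        nodes.clear();
        nodes.putAll(scratch);
        return true;
    }

    /** Builds the create+delete pairs that move each child from one parent to another. */
    static List<Op> moveOps(String from, String to, List<String> children) {
        List<Op> ops = new ArrayList<>();
        for (String child : children) {
            ops.add(Op.create(to + "/" + child, "moved")); // placeholder data
            ops.add(Op.delete(from + "/" + child));
        }
        return ops;
    }

    /** Lets the given number of threads race to move the same children; returns how many succeeded. */
    static int race(int threads) {
        AtomicMoveSketch store = new AtomicMoveSketch();
        List<String> logs = Arrays.asList("log-1", "log-2", "log-3");
        for (String log : logs) store.put("/rs/dead/" + log, "pos-0");

        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        List<Thread> pool = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            final String target = "/rs/live-" + i; // each racer moves into its own parent
            Thread t = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (store.multi(moveOps("/rs/dead", target, logs))) winners.incrementAndGet();
            });
            pool.add(t);
            t.start();
        }
        start.countDown();
        try {
            for (Thread t : pool) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println("winners = " + race(8));
    }
}
```

In this model a losing thread gets as far as its first create in the scratch 
copy, then fails on the delete because the winner has already removed the 
source node, so none of its creates become visible.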

Here is the approach I used to utilize multi for this issue:
a) All the active regionservers try to "move" the logs of peers under the 
dead regionserver znode. This involves creating Op objects for creating the 
new znodes and deleting the old ones. As per the multi API guarantee, only one 
regionserver will be successful in moving the znodes.

b) The regionservers will "keep on trying to move" the znodes from the dead 
regionserver until they are sure that the node is deleted (by the successful 
regionserver), or there is no log to process. This guards against corner cases 
in which logs of the dead regionserver would otherwise be missed. The number 
of attempts can be made configurable.

c) In case of cascading failure (when the successful regionserver dies before 
it gets the notification from zk about the successful move), other 
regionservers will get this new event and will proceed as normal (will try to 
move all the znodes from this newly dead regionserver znode).
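
The retry rule in (b) can be sketched as a bounded loop. This is an 
illustration under stated assumptions, not the attached patch: the 
QueueStore interface, the Outcome enum, and the claim method are hypothetical 
names, and tryAtomicMove stands in for a multi-based move that either commits 
fully or not at all.

```java
import java.util.List;

/** Hypothetical sketch of the "keep trying to move" rule; not taken from the patch. */
class ClaimRetrySketch {
    /** Assumed view of the replication znodes; not an HBase interface. */
    interface QueueStore {
        boolean deadNodeExists(String deadRs);
        List<String> listLogs(String deadRs);
        /** Returns true only if the whole move committed atomically. */
        boolean tryAtomicMove(String deadRs, String selfRs);
    }

    enum Outcome { CLAIMED, NOTHING_TO_DO, GAVE_UP }

    static Outcome claim(QueueStore store, String deadRs, String selfRs, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // The dead server's znode is gone: another regionserver completed the move.
            if (!store.deadNodeExists(deadRs)) return Outcome.NOTHING_TO_DO;
            // The znode is still there but holds no logs: nothing left to process.
            if (store.listLogs(deadRs).isEmpty()) return Outcome.NOTHING_TO_DO;
            // Otherwise race for the logs; at most one regionserver can win.
            if (store.tryAtomicMove(deadRs, selfRs)) return Outcome.CLAIMED;
        }
        return Outcome.GAVE_UP; // attempt budget (assumed configurable) exhausted
    }
}
```

A regionserver that ends up with CLAIMED would then be the one whose death 
triggers the cascading case in (c). A real loop would probably also pause 
between attempts, which this sketch leaves out.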


It would be good to know what others think about this approach. Are there 
other rogue conditions that can happen?

Attached is a patch based on this approach. I tested it by manually killing 
regionservers at random (not totally random: killing one, and then killing the 
successful one just after it has transferred the logs). It is difficult to 
kill a regionserver during the transfer itself, as that is an atomic operation 
now. Any ideas/suggestions for more direct testing are welcome.
                
> Handle RS that fails while processing the failure of another one
> ----------------------------------------------------------------
>
>                 Key: HBASE-2611
>                 URL: https://issues.apache.org/jira/browse/HBASE-2611
>             Project: HBase
>          Issue Type: Sub-task
>          Components: replication
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>         Attachments: HBase-2611-upstream-v1.patch
>
>
> HBASE-2223 doesn't manage region servers that fail while doing the transfer 
> of HLogs queues from other region servers that failed. Devise a reliable way 
> to do it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
