[jira] [Commented] (HDFS-4721) Speed up lease/block recovery when DN fails and a block goes into recovery

Varun Sharma (JIRA) Sun, 21 Apr 2013 11:17:16 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637611#comment-13637611
 ]


Varun Sharma commented on HDFS-4721:
------------------------------------

As a second action item, it would also be nice to be able to skip "stale 
nodes" during reconciliation at the primary DN. Currently, the following 
happens:

1) The 1st recoverLease call from HBase is bound to fail, since it picks the 
bad DN as primary.
2) The 2nd recoverLease call from HBase picks a correct DN as primary. At the 
primary DN, we still try to reconcile blocks against the stale/bad DN, causing 
the recovery to take as long as dfs.socket.timeout (default 60 seconds).

If we avoid picking stale nodes (nodes with no heartbeat for, say, 20-30 
seconds) as primary and also avoid them during the reconciliation phase, 
lease recovery will be a lot faster.
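
A minimal sketch of the reconciliation-side idea, written as standalone Java 
rather than against the real DataNode code. The class StaleReplicaFilter, the 
ReplicaLocation type, the 30-second threshold and the fall-back to the full 
list when every replica looks stale are all assumptions made for illustration; 
none of them are existing HDFS APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; not the HDFS DataNode implementation. */
class StaleReplicaFilter {

    /** Hypothetical view of one replica location and its last heartbeat time. */
    static final class ReplicaLocation {
        final String datanode;
        final long lastHeartbeatMs;

        ReplicaLocation(String datanode, long lastHeartbeatMs) {
            this.datanode = datanode;
            this.lastHeartbeatMs = lastHeartbeatMs;
        }
    }

    /** A replica is stale if its last heartbeat is older than the interval. */
    static boolean isStale(ReplicaLocation r, long nowMs, long staleIntervalMs) {
        return nowMs - r.lastHeartbeatMs > staleIntervalMs;
    }

    /** Replicas the primary DN should contact, skipping stale ones. */
    static List<ReplicaLocation> reconciliationTargets(
            List<ReplicaLocation> replicas, long nowMs, long staleIntervalMs) {
        List<ReplicaLocation> live = new ArrayList<>();
        for (ReplicaLocation r : replicas) {
            if (!isStale(r, nowMs, staleIntervalMs)) {
                live.add(r);
            }
        }
        // Assumption: if every replica looks stale, keep the original list
        // instead of reconciling against nothing.
        return live.isEmpty() ? new ArrayList<>(replicas) : live;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        List<ReplicaLocation> replicas = Arrays.asList(
                new ReplicaLocation("DN1", now - 45_000L),  // failed node
                new ReplicaLocation("DN2", now - 2_000L),
                new ReplicaLocation("DN3", now - 1_000L));
        for (ReplicaLocation r : reconciliationTargets(replicas, now, 30_000L)) {
            System.out.println("reconcile against " + r.datanode);
        }
    }
}
```

With DN1 filtered out up front, the primary would never open a connection to 
it, so the reconciliation would not have to wait for dfs.socket.timeout.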
                
> Speed up lease/block recovery when DN fails and a block goes into recovery
> --------------------------------------------------------------------------
>
>                 Key: HDFS-4721
>                 URL: https://issues.apache.org/jira/browse/HDFS-4721
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.0.3-alpha
>            Reporter: Varun Sharma
>
> This was observed while doing HBase WAL recovery. HBase uses append to write 
> to its write-ahead log. So initially the pipeline is set up as
> DN1 --> DN2 --> DN3
> This WAL needs to be read when DN1 fails since it houses the HBase 
> regionserver for the WAL.
> HBase first recovers the lease on the WAL file. During recovery, we choose 
> DN1 as the primary DN to do the recovery even though DN1 has failed and is 
> not heartbeating any more.
> Avoiding the stale DN1 would speed up recovery and reduce HBase MTTR. There 
> are two options.
> a) Ride on HDFS-3703 and, if stale node detection is turned on, do not 
> choose stale datanodes (typically no heartbeat for 20-30 seconds) as 
> primary DN(s)
> b) Sort the replicas in order of last heartbeat and always pick the ones 
> which sent the most recent heartbeat
> Going to the dead datanode increases lease + block recovery time since the 
> block goes into UNDER_RECOVERY state even though no one is actively 
> recovering it. Please let me know if this makes sense and, if so, whether we 
> should move forward with a) or b).
> Thanks
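
The two options from the description can be compared side by side in a small 
standalone sketch. PrimaryDatanodeChooser, Replica and both method names are 
hypothetical and are not NameNode code; the stale interval is passed in as a 
parameter because the description only gives a rough 20-30 second range.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of options a) and b); not the NameNode implementation. */
class PrimaryDatanodeChooser {

    /** Hypothetical replica record: datanode name plus last heartbeat time. */
    static final class Replica {
        final String datanode;
        final long lastHeartbeatMs;

        Replica(String datanode, long lastHeartbeatMs) {
            this.datanode = datanode;
            this.lastHeartbeatMs = lastHeartbeatMs;
        }
    }

    /** Option a): first replica in pipeline order that is not stale. */
    static Optional<Replica> firstNonStale(
            List<Replica> replicas, long nowMs, long staleIntervalMs) {
        return replicas.stream()
                .filter(r -> nowMs - r.lastHeartbeatMs <= staleIntervalMs)
                .findFirst();
    }

    /** Option b): replica with the most recent heartbeat. */
    static Optional<Replica> mostRecentHeartbeat(List<Replica> replicas) {
        return replicas.stream()
                .max(Comparator.comparingLong((Replica r) -> r.lastHeartbeatMs));
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        List<Replica> pipeline = Arrays.asList(
                new Replica("DN1", now - 45_000L),  // failed, no longer heartbeating
                new Replica("DN2", now - 5_000L),
                new Replica("DN3", now - 1_000L));
        firstNonStale(pipeline, now, 30_000L)
                .ifPresent(r -> System.out.println("option a picks " + r.datanode));
        mostRecentHeartbeat(pipeline)
                .ifPresent(r -> System.out.println("option b picks " + r.datanode));
    }
}
```

In this sketch option a) depends on stale node detection being enabled and on 
a threshold, while option b) needs no threshold but can still return a dead 
node if every replica has stopped heartbeating.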

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
