[jira] Commented: (HDFS-782) dynamic replication

Ning Zhang (JIRA) Fri, 20 Nov 2009 17:40:05 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780872#action_12780872
 ]


Ning Zhang commented on HDFS-782:
---------------------------------

To elaborate on the proposal: a data node keeps statistics on how many 
clients are requesting a given block. If the number exceeds a certain 
threshold, the data node can send the block to a number of other data nodes 
(children) and ask them to replicate it (one heuristic is to choose from the 
data nodes that asked for the block). If a child data node accepts the 
replication request (e.g., it doesn't already hold the block), it goes 
through the same protocol as adding a new replica, acknowledged by the name 
node. The reason we propose datanode->datanode replication rather than 
datanode->namenode->datanode replication is that the former is much faster 
than the latter, which, depending on the workload of the name node, could 
take minutes. If the children also get too many requests, they can 
proactively replicate the block themselves, recursively, until the requests 
are distributed across a sufficient number of replicas. 
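
As a rough illustration of the bookkeeping described above, here is a minimal 
sketch in Python (not HDFS code). The class name, the sliding time window, and 
the parameters threshold, window_secs and max_children are all assumptions 
made up for this example; the proposal only says "a certain threshold". 

```python
from collections import defaultdict, deque


class HotBlockTracker:
    """Per-data-node request statistics (hypothetical, for illustration only)."""

    def __init__(self, threshold, window_secs, max_children):
        self.threshold = threshold        # requests per window that mark a block "hot"
        self.window_secs = window_secs    # assumed sliding-window length, in seconds
        self.max_children = max_children  # how many children to push the block to
        self._requests = defaultdict(deque)  # block_id -> deque of (timestamp, requester)

    def record_request(self, block_id, requester, now):
        """Record one read request and drop entries older than the window."""
        q = self._requests[block_id]
        q.append((now, requester))
        while q and now - q[0][0] > self.window_secs:
            q.popleft()

    def is_hot(self, block_id):
        """True when the recent request count exceeds the threshold."""
        return len(self._requests.get(block_id, ())) > self.threshold

    def pick_children(self, block_id, holders):
        """Prefer nodes that recently asked for the block and do not hold it yet."""
        chosen = []
        for _, requester in reversed(self._requests.get(block_id, ())):
            if requester not in holders and requester not in chosen:
                chosen.append(requester)
        return chosen[: self.max_children]
```

A child chosen this way would still register its new replica with the name 
node through the normal protocol; that step is not modeled here. 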

Currently the name node cleans up extra replicas periodically. To support 
DN->DN dynamic replication, we need to add a heuristic so that it cleans up 
extra replicas only when they have not been accessed for a certain period. 
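
A minimal sketch of such a cleanup heuristic, again in Python and not actual 
name node code. The function name, the idle_secs parameter, and the idea that 
the name node has a last-access timestamp per replica are assumptions for 
illustration; how the name node would learn access times is left open here. 

```python
def replicas_to_remove(replicas, target_replication, now, idle_secs):
    """Pick excess replicas that are safe to drop (hypothetical helper).

    replicas: dict mapping data node name -> last access timestamp (seconds).
    Only replicas idle for at least idle_secs are candidates, and the block
    is never taken below target_replication.
    """
    excess = len(replicas) - target_replication
    if excess <= 0:
        return []
    idle = [node for node, last in replicas.items() if now - last >= idle_secs]
    idle.sort(key=lambda node: replicas[node])  # least recently accessed first
    return idle[:excess]
```

With this rule a hot block keeps its extra replicas for as long as they are 
being read, and falls back to the configured replication once traffic stops. 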

Any suggestions?

> dynamic replication
> -------------------
>
>                 Key: HDFS-782
>                 URL: https://issues.apache.org/jira/browse/HDFS-782
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Ning Zhang
>
> In a large and busy cluster, a block can be requested by many clients at the 
> same time. HDFS-767 tries to solve the failure case where the # of retries 
> exceeds the maximum # of retries. However, that patch doesn't solve the 
> performance issue, since all failing clients have to wait a certain period 
> before retrying, and the # of retries could be high. 
> One way to address the performance issue is to increase the # of replicas 
> for this "hot" block dynamically when it is requested many times within a 
> short period. The name node needs to be aware of this situation and clean up 
> extra replicas only when they have not been accessed recently. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
