[ 
https://issues.apache.org/jira/browse/HDFS-782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529678#comment-13529678
 ] 

omkar vinit joshi commented on HDFS-782:
----------------------------------------

Hi Putu,

We have just implemented this for Hadoop 0.20.2, using a two-part approach:
1) We change the list of replicas returned by the namenode, based on each 
datanode's xceiver count. The probability of a particular node being selected 
is inversely proportional to its number of active connections.
2) We increase and decrease the number of replicas as needed, but only when 
we see that all the nodes hosting a block are loaded (a hot spot).
At present we have kept various parameters configurable, such as the maximum 
number of blocks to be replicated dynamically (and thereby the maximum extra 
space used) and the number of extra replicas to create (currently 2).
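
To make the two parts concrete, here is a rough, standalone sketch in Java. It is not the actual patch and does not use any Hadoop classes. The class and method names, the load threshold of 64 xceivers, and the weight 1 / (1 + xceiverCount) are all assumptions made for illustration. The +1 only avoids division by zero for idle nodes, and returning a full weighted ordering rather than a single pick is one possible reading of "changing the list".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class DynamicReplicationSketch {

    // Hypothetical values: the comment only says such parameters are configurable.
    static final int LOADED_XCEIVER_THRESHOLD = 64;
    static final int EXTRA_REPLICAS = 2; // "at present 2" in the comment

    /**
     * Part 1: returns replica indices in a random order. At each pick, a node is
     * chosen with probability proportional to 1 / (1 + xceiverCount), so nodes
     * with fewer active connections tend to come first.
     */
    public static List<Integer> orderReplicas(int[] xceiverCounts, Random rng) {
        List<Integer> remaining = new ArrayList<>();
        for (int i = 0; i < xceiverCounts.length; i++) {
            remaining.add(i);
        }
        List<Integer> ordered = new ArrayList<>();
        while (!remaining.isEmpty()) {
            double total = 0.0;
            for (int idx : remaining) {
                total += 1.0 / (1 + xceiverCounts[idx]);
            }
            double r = rng.nextDouble() * total;
            int chosenPos = remaining.size() - 1; // fallback guards against rounding
            for (int pos = 0; pos < remaining.size(); pos++) {
                r -= 1.0 / (1 + xceiverCounts[remaining.get(pos)]);
                if (r < 0) {
                    chosenPos = pos;
                    break;
                }
            }
            ordered.add(remaining.remove(chosenPos)); // remove(int) removes by position
        }
        return ordered;
    }

    /** Part 2: a block is a hot spot only if every node hosting it is loaded. */
    public static boolean isHotSpot(int[] xceiverCounts) {
        if (xceiverCounts.length == 0) {
            return false;
        }
        for (int count : xceiverCounts) {
            if (count < LOADED_XCEIVER_THRESHOLD) {
                return false;
            }
        }
        return true;
    }

    /** Replication to aim for: extra replicas only while the block is a hot spot. */
    public static int targetReplication(int baseReplication, int[] xceiverCounts) {
        return isHotSpot(xceiverCounts) ? baseReplication + EXTRA_REPLICAS : baseReplication;
    }

    public static void main(String[] args) {
        int[] counts = {0, 9, 99};
        int[] firstPicks = new int[counts.length];
        Random rng = new Random(42);
        for (int i = 0; i < 10000; i++) {
            firstPicks[orderReplicas(counts, rng).get(0)]++;
        }
        // How often each node was listed first; the idle node should dominate.
        System.out.println(Arrays.toString(firstPicks));
        System.out.println(targetReplication(3, new int[] {70, 80, 90})); // all loaded: 5
        System.out.println(targetReplication(3, new int[] {70, 10, 90})); // one free node: 3
    }
}
```

The sketch leaves out the cap on the number of dynamically replicated blocks and the removal of extra replicas once the load drops. Both would need namenode state that is not described here.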
                
> dynamic replication
> -------------------
>
>                 Key: HDFS-782
>                 URL: https://issues.apache.org/jira/browse/HDFS-782
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Ning Zhang
>
> In a large and busy cluster, a block can be requested by many clients at the 
> same time. HDFS-767 tries to solve the failing case when the # of retries 
> exceeds the maximum # of retries. However, that patch doesn't solve the 
> performance issue since all failing clients have to wait a certain period 
> before retry, and the # of retries could be high. 
> One way to solve the performance issue is to increase the # of replicas 
> for this "hot" block dynamically when it is requested many times in a short 
> period. The name node needs to be aware of such a situation and only clean 
> up extra replicas when they have not been accessed recently. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
