[ https://issues.apache.org/jira/browse/HDFS-782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529678#comment-13529678 ]
omkar vinit joshi commented on HDFS-782:
----------------------------------------
Hi Putu,
We have just implemented this for Hadoop 0.20.2, using a two-part
approach:
1) We change the list of replicas returned by the namenode (based on
xceivercount). The probability of a particular node being selected is
inversely proportional to its number of active connections.
2) We increase and decrease the number of replicas as needed (only when we
see that all the nodes hosting a block are loaded, i.e. a hot spot).
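To illustrate point 1, here is a minimal sketch of load-weighted replica selection. It is not the code from our patch: the class name InverseLoadPicker and the method pick are hypothetical, and the +1 added to each connection count is an assumption made so that an idle node gets a finite weight (so the probability is inversely proportional to connections + 1 rather than to the raw count).

```java
import java.util.Random;

// Hypothetical helper, not part of Hadoop: picks a replica with probability
// inversely proportional to (active connections + 1).
final class InverseLoadPicker {
    // activeConnections[i] is the load reported for replica i (assumed to be
    // a non-negative count such as xceivercount). Returns the chosen index.
    static int pick(int[] activeConnections, Random rng) {
        if (activeConnections.length == 0) {
            throw new IllegalArgumentException("no replicas to choose from");
        }
        double[] weights = new double[activeConnections.length];
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            // Assumption: +1 avoids division by zero for idle nodes.
            weights[i] = 1.0 / (activeConnections[i] + 1);
            total += weights[i];
        }
        // Roulette-wheel selection over the weights.
        double r = rng.nextDouble() * total;
        for (int i = 0; i < weights.length; i++) {
            r -= weights[i];
            if (r < 0) {
                return i;
            }
        }
        return weights.length - 1; // guard against floating-point rounding
    }
}
```

With loads of {0, 9, 99}, the weights are 1, 0.1 and 0.01, so the idle node is chosen in roughly 90% of calls while the busiest node is still picked occasionally.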
At present we have made several parameters configurable, such as the maximum
number of blocks to be replicated dynamically (and thereby the maximum extra
space used) and the number of extra replicas to create (currently 2).
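The hot-spot rule and the configurable limits could be sketched roughly as follows. Again this is only an illustration under stated assumptions, not the patch itself: HotBlockPolicy, loadThreshold and maxBoostedBlocks are hypothetical names, and the idea that a node counts as "loaded" once it reaches a fixed connection threshold is an assumption.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not Hadoop code: decides how many extra replicas a
// block should have, given the load on the nodes that host it.
final class HotBlockPolicy {
    private final int loadThreshold;    // assumed: connections at which a node counts as loaded
    private final int extraReplicas;    // extra replicas for a hot block (2 in the comment above)
    private final int maxBoostedBlocks; // cap on blocks that may hold extra replicas at once
    private final Set<Long> boosted = new HashSet<Long>();

    HotBlockPolicy(int loadThreshold, int extraReplicas, int maxBoostedBlocks) {
        this.loadThreshold = loadThreshold;
        this.extraReplicas = extraReplicas;
        this.maxBoostedBlocks = maxBoostedBlocks;
    }

    // A block is a hot spot only when every node hosting it is loaded.
    boolean isHot(int[] connectionsPerReplica) {
        if (connectionsPerReplica.length == 0) {
            return false;
        }
        for (int c : connectionsPerReplica) {
            if (c < loadThreshold) {
                return false;
            }
        }
        return true;
    }

    // Returns the number of extra replicas the block should have right now.
    int update(long blockId, int[] connectionsPerReplica) {
        if (isHot(connectionsPerReplica)) {
            if (boosted.contains(blockId) || boosted.size() < maxBoostedBlocks) {
                boosted.add(blockId);
                return extraReplicas;
            }
            return 0; // budget for dynamically replicated blocks is used up
        }
        boosted.remove(blockId); // block has cooled down; release its extras
        return 0;
    }
}
```

This sketch drops the extra replicas as soon as any hosting node falls below the threshold. A real implementation would probably want some delay before scaling down, so that the replica count does not flap.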
> dynamic replication
> -------------------
>
> Key: HDFS-782
> URL: https://issues.apache.org/jira/browse/HDFS-782
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Ning Zhang
>
> In a large and busy cluster, a block can be requested by many clients at the
> same time. HDFS-767 tries to solve the failing case when the # of retries
> exceeds the maximum # of retries. However, that patch doesn't solve the
> performance issue since all failing clients have to wait a certain period
> before retrying, and the # of retries could be high.
> One solution to the performance issue is to increase the # of replicas
> for this "hot" block dynamically when it is requested many times in a short
> period. The name node needs to be aware of such a situation and only clean up
> extra replicas when they have not been accessed recently.