[ https://issues.apache.org/jira/browse/HDFS-782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624895#comment-16624895 ]
Christopher Burns commented on HDFS-782:
----------------------------------------
Another thought along the same lines is to allow a dynamic replication policy
that would "age-down" blocks to a minimum replication factor from some higher
initial replication factor, on the premise that newer data will be more useful
and thus more frequently accessed than older data.
This obviously depends on the dataset, which is why such a policy would be
manually configured for a given directory. Once the policy is set, the aging
mechanism would run automatically.
Various policy "shapes" could include linear, exponential, or binary (X
replicas for files less than _d_ days old; otherwise Y).
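To make the three shapes concrete, here is a rough sketch of how a target
replication factor might be derived from file age. Everything in it is an
assumption for illustration: the function name, the parameters, the use of
whole days, and the choice to treat horizon_days as the full decay window for
"linear", as a half-life for "exponential", and as _d_ for "binary". None of
it is existing HDFS API.

```python
import math


def target_replication(age_days, initial, minimum, shape, horizon_days):
    """Hypothetical age-down policy: map file age to a replication factor.

    initial      -- replication factor for brand-new files (X)
    minimum      -- floor the policy ages down to (Y)
    shape        -- "linear", "exponential", or "binary"
    horizon_days -- decay window (linear), half-life (exponential),
                    or cutoff age d (binary); an assumed convention
    """
    if age_days < 0 or horizon_days <= 0:
        raise ValueError("age_days must be >= 0 and horizon_days > 0")
    if minimum < 1 or minimum > initial:
        raise ValueError("need 1 <= minimum <= initial")

    extra = initial - minimum  # replicas above the floor at age zero

    if shape == "binary":
        # X for files less than d days old; otherwise Y.
        return initial if age_days < horizon_days else minimum

    if shape == "linear":
        # Shed the extra replicas evenly over the window, rounding up so
        # a replica is only dropped once its share of the window has passed.
        remaining = max(horizon_days - age_days, 0)
        kept = (extra * remaining + horizon_days - 1) // horizon_days
        return minimum + kept

    if shape == "exponential":
        # Halve the extra replicas every horizon_days, rounding down.
        kept = math.floor(extra * 0.5 ** (age_days / horizon_days))
        return minimum + kept

    raise ValueError("unknown shape: " + shape)
```

With initial=6, minimum=3, and horizon_days=30, the binary shape holds 6
replicas until day 30 and then drops straight to 3, while the linear shape
steps down through 5 and 4 before reaching 3 at day 30. A real implementation
would also have to decide how often files are re-evaluated and how replica
removal is throttled, which this sketch ignores.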
> dynamic replication
> -------------------
>
> Key: HDFS-782
> URL: https://issues.apache.org/jira/browse/HDFS-782
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Ning Zhang
> Priority: Major
>
> In a large and busy cluster, a block can be requested by many clients at the
> same time. HDFS-767 tries to solve the failing case when the # of retries
> exceeds the maximum # of retries. However, that patch doesn't solve the
> performance issue, since all failing clients have to wait a certain period
> before retrying, and the # of retries could be high.
> One solution to the performance issue is to increase the # of replicas
> for this "hot" block dynamically when it is requested many times in a short
> period. The name node needs to be aware of such a situation and only clean up
> extra replicas when they have not been accessed recently.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)