[ https://issues.apache.org/jira/browse/MAPREDUCE-2117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919331#action_12919331 ]

Doug Cutting commented on MAPREDUCE-2117:
-----------------------------------------
HDFS support for something like hard links would make this even faster, no?
One could hard-link to blocks in a tree to checkpoint it. Hard links would be
a bigger, deeper change to HDFS, requiring the maintenance of link counts per
block, but might provide a better long-term solution for such checkpoints.
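
As a rough illustration of the link-count bookkeeping mentioned above, here is a
minimal in-memory sketch. It is not HDFS code: the class BlockLinkCounts and its
methods are hypothetical names invented for this example, and a real
implementation would have to persist the counts in the namenode and cope with
replication and failures.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of per-block link counts. Each file that references a
 * block holds one link; a checkpoint adds a link instead of copying data.
 */
final class BlockLinkCounts {
    // Block id -> number of files currently linking to that block.
    private final Map<Long, Integer> linkCounts = new HashMap<>();

    /** Adds one link to the block and returns the new count. */
    int link(long blockId) {
        return linkCounts.merge(blockId, 1, Integer::sum);
    }

    /** Links every block of a file, e.g. when a checkpoint references it. */
    void linkAll(long[] blockIds) {
        for (long blockId : blockIds) {
            link(blockId);
        }
    }

    /**
     * Drops one link. Returns true when the last link is gone, meaning the
     * block's storage could be reclaimed.
     */
    boolean unlink(long blockId) {
        Integer count = linkCounts.get(blockId);
        if (count == null) {
            throw new IllegalStateException("No links for block " + blockId);
        }
        if (count == 1) {
            linkCounts.remove(blockId);
            return true;
        }
        linkCounts.put(blockId, count - 1);
        return false;
    }

    /** Current number of links to the block (0 if unknown). */
    int count(long blockId) {
        return linkCounts.getOrDefault(blockId, 0);
    }
}
```

In this sketch, deleting the original file after a checkpoint only decrements
the counts; a block becomes reclaimable once unlink() returns true for it.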

> Superfast Distcp when copying data within the same hdfs cluster
> ---------------------------------------------------------------
>
> Key: MAPREDUCE-2117
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2117
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: distcp
> Reporter: dhruba borthakur
>
> There are use cases where distcp is used to copy a set of files/directories
> from one part of the HDFS namespace to another within the same HDFS
> cluster. Such a copy would be superfast if we could instruct the relevant
> datanodes to make local replicas of the relevant blocks, keeping network
> usage to a minimum. This would be especially useful for letting HBase back
> up a region with minimal downtime.