[ 
https://issues.apache.org/jira/browse/HDFS-8878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16627227#comment-16627227
 ] 

Steve Loughran commented on HDFS-8878:
--------------------------------------

HDFS-12090 will provide the API needed to do per-block uploads. A version of 
distcp running at the MR layer can then partition a source file by blocks, 
copy those blocks in parallel across the cluster and, again, concatenate the 
uploaded pieces into the final file.
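
As a rough illustration of the partition-then-concat idea, here is a minimal
sketch that simulates it on a local filesystem. It does not use any Hadoop or
HDFS-12090 API: the function names (split_into_block_ranges, copy_range,
concat_parts, blockwise_copy) and the part-file naming are hypothetical, the
loop stands in for what would be independent tasks, and plain file appends
stand in for the HDFS concat operation.

```python
import os
import tempfile


def split_into_block_ranges(file_length, block_size):
    """Return (offset, length) pairs covering the file, one pair per block."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [(offset, min(block_size, file_length - offset))
            for offset in range(0, file_length, block_size)]


def copy_range(src_path, offset, length, part_path):
    """Copy one byte range of the source into its own part file.

    In a distributed copy this would be the unit of work for one task.
    """
    with open(src_path, "rb") as src, open(part_path, "wb") as part:
        src.seek(offset)
        part.write(src.read(length))


def concat_parts(part_paths, dst_path):
    """Join the part files, in order, into the destination file."""
    with open(dst_path, "wb") as dst:
        for part_path in part_paths:
            with open(part_path, "rb") as part:
                dst.write(part.read())


def blockwise_copy(src_path, dst_path, block_size, work_dir):
    """Partition src by blocks, copy each block separately, then concatenate."""
    ranges = split_into_block_ranges(os.path.getsize(src_path), block_size)
    part_paths = []
    # Sequential here for simplicity; each iteration is independent of the
    # others, so the ranges could be handed out to parallel workers.
    for index, (offset, length) in enumerate(ranges):
        part_path = os.path.join(work_dir, "part-%05d" % index)
        copy_range(src_path, offset, length, part_path)
        part_paths.append(part_path)
    concat_parts(part_paths, dst_path)
    return ranges
```

The only ordering constraint in this sketch is at the final step: parts can be
written in any order, but they have to be concatenated in offset order.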

> An HDFS built-in DistCp 
> ------------------------
>
>                 Key: HDFS-8878
>                 URL: https://issues.apache.org/jira/browse/HDFS-8878
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Linxiao Jin
>            Assignee: Linxiao Jin
>            Priority: Major
>
> For now, we use DistCp to copy directories, which works quite well. However, 
> it would be better to have an efficient directory copy tool built into HDFS. 
> It could be faster because it cuts out the redundant communication between 
> HDFS, YARN and MapReduce. It would also free the resources DistCp consumes in 
> the job tracker and YARN, and be easier to debug.
> We need more discussion on a new protocol between the NNs and DNs of different 
> clusters to achieve HDFS-level command sending and data transfer. One 
> available hacky solution: the source NN gets the block distribution of the 
> target file and asks each datanode to start a DFSClient and copy its local 
> short-circuited blocks as files in the dst cluster. After all the block files 
> in the dst cluster are complete, a DFSClient concats them together to form 
> the target destination file. There might be a more optimized solution that 
> implements a newly designed protocol for communication across clusters, 
> rather than using DFSClient, and uses lower-level methods.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
