[ 
https://issues.apache.org/jira/browse/HDFS-13916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899337#comment-16899337
 ] 

Wei-Chiu Chuang commented on HDFS-13916:
----------------------------------------

Attached 006 patch to address [~xyao]'s comment. [^HDFS-13916.006.patch] 

I experimented with adding an iterator-based snapshot diff, but realized it 
would require a fairly large rewrite, so I'll file a separate JIRA for that.
I also found that distcp makes two getSnapshotDiffReport() calls 
on the target file system. If the target is HDFS (hdfs:// or webhdfs://), this 
puts unnecessary load on the NameNode. I'll try to find a way to avoid it.

> Distcp SnapshotDiff not completely implemented for supporting WebHdfs
> ---------------------------------------------------------------------
>
>                 Key: HDFS-13916
>                 URL: https://issues.apache.org/jira/browse/HDFS-13916
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: distcp, webhdfs
>    Affects Versions: 3.0.1, 3.1.1
>            Reporter: Xun REN
>            Assignee: Xun REN
>            Priority: Major
>              Labels: easyfix, newbie, patch
>         Attachments: HDFS-13916.002.patch, HDFS-13916.003.patch, 
> HDFS-13916.004.patch, HDFS-13916.005.patch, HDFS-13916.006.patch, 
> HDFS-13916.patch
>
>
> [~ljain] worked on HDFS-13052 to make snapshot-diff-based DistCp possible 
> with WebHdfsFileSystem. However, that patch does not modify the Java class 
> that is actually used when launching the command "hadoop distcp ...".
>  
> You can check in the latest version here:
> [https://github.com/apache/hadoop/blob/branch-3.1.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java#L96-L100]
> In the method "preSyncCheck" of the class "DistCpSync", we still check 
> whether the file system is DFS. 
> So I propose changing the class DistCpSync to take into account 
> what was committed by Lokesh Jain.
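
As a rough illustration of the change proposed in the description above, the 
file system type check could be relaxed along the following lines. This is only 
a sketch, not the actual patch: the nested classes are stand-ins named after 
the Hadoop types so the example compiles without Hadoop on the classpath, 
supportsSnapshotDiff() is a hypothetical helper, and the preSyncCheck() 
signature here is simplified and is not the one in DistCpSync.

```java
// Stand-in types so the sketch compiles without Hadoop on the classpath.
// The real classes are org.apache.hadoop.fs.FileSystem,
// org.apache.hadoop.hdfs.DistributedFileSystem and
// org.apache.hadoop.hdfs.web.WebHdfsFileSystem.
class PreSyncCheckSketch {
    static class FileSystem {}
    static class DistributedFileSystem extends FileSystem {}
    static class WebHdfsFileSystem extends FileSystem {}
    static class LocalFileSystem extends FileSystem {}

    /** Hypothetical helper: true if fs is a type assumed to support snapshot diff. */
    static boolean supportsSnapshotDiff(FileSystem fs) {
        return fs instanceof DistributedFileSystem
            || fs instanceof WebHdfsFileSystem;
    }

    /** Simplified check: reject the sync only if either side is unsupported. */
    static void preSyncCheck(FileSystem sourceFs, FileSystem targetFs) {
        if (!supportsSnapshotDiff(sourceFs) || !supportsSnapshotDiff(targetFs)) {
            throw new IllegalArgumentException(
                "Snapshot-diff-based distcp needs DistributedFileSystem"
                + " or WebHdfsFileSystem on both sides");
        }
    }

    public static void main(String[] args) {
        // hdfs:// source with webhdfs:// target passes the relaxed check.
        preSyncCheck(new DistributedFileSystem(), new WebHdfsFileSystem());
        try {
            // Any other file system type is still rejected.
            preSyncCheck(new LocalFileSystem(), new DistributedFileSystem());
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only that the existing "is it DFS" test becomes an 
"is it DFS or WebHDFS" test on both the source and the target; everything else 
in the sync path is left out.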



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
