[ https://issues.apache.org/jira/browse/HDFS-13916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17369831#comment-17369831 ]
Ayush Saxena commented on HDFS-13916:
-------------------------------------
{quote}When there are millions of diffs between two snapshots, the old
getSnapshotDiffReport() isn't scalable. NameNode find itself creating huge RPC
messages for the snapshot diff items, which creates GC memory pressure;
application produces big memory spikes too.
{quote}
Yes, true. I have seen a bunch of cases where this causes problems on the
NameNode side, so we should ultimately support
{{getSnapshotDiffReportListing}} in {{WebHDFS}}.
Otherwise the code changes LGTM. If [~weichiu] is convinced, we can have a
follow-up Jira to support {{getSnapshotDiffReportListing}} in {{WebHDFS}} and
get this in now to provide the basic functionality with {{WebHDFS}}.
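
To illustrate why a paged listing call relieves the memory pressure described
in the quote, here is a minimal sketch of cursor-based paging. It is not the
HDFS implementation: {{PagedDiffSketch}}, {{Page}}, {{fetchPage}} and
{{consumeAll}} are hypothetical names, and an in-memory list stands in for the
NameNode. The point is only that each response is bounded by the page size
rather than by the total number of diff entries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from Hadoop.
class PagedDiffSketch {
    /** One bounded page of diff entries plus the cursor for the next call (-1 = done). */
    static final class Page {
        final List<String> entries;
        final int nextIndex;

        Page(List<String> entries, int nextIndex) {
            this.entries = entries;
            this.nextIndex = nextIndex;
        }
    }

    /** Stand-in for the server side: returns at most pageSize entries from startIndex. */
    static Page fetchPage(List<String> allDiffs, int startIndex, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int end = Math.min(startIndex + pageSize, allDiffs.size());
        List<String> slice = new ArrayList<>(allDiffs.subList(startIndex, end));
        return new Page(slice, end < allDiffs.size() ? end : -1);
    }

    /** Client side: requests pages until the cursor signals the end; returns entries seen. */
    static int consumeAll(List<String> allDiffs, int pageSize) {
        int seen = 0;
        int cursor = 0;
        while (cursor != -1) {
            Page page = fetchPage(allDiffs, cursor, pageSize);
            seen += page.entries.size(); // process this page, then drop the reference
            cursor = page.nextIndex;
        }
        return seen;
    }
}
```

With this shape, neither side ever holds more than one page of entries for a
single response, which is the property the one-shot report lacks.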
> Distcp SnapshotDiff to support WebHDFS
> --------------------------------------
>
> Key: HDFS-13916
> URL: https://issues.apache.org/jira/browse/HDFS-13916
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: distcp, webhdfs
> Affects Versions: 3.0.1, 3.1.1
> Reporter: Xun REN
> Assignee: Xun REN
> Priority: Major
> Labels: easyfix, newbie, patch
> Attachments: HDFS-13916.002.patch, HDFS-13916.003.patch,
> HDFS-13916.004.patch, HDFS-13916.005.patch, HDFS-13916.006.patch,
> HDFS-13916.007.patch, HDFS-13916.patch
>
>
> [~ljain] has worked on the JIRA HDFS-13052 to make it possible to run DistCp
> with SnapshotDiff against WebHdfsFileSystem. However, that patch does not
> modify the Java class that is actually used when launching the command
> "hadoop distcp ...".
>
> You can check in the latest version here:
> [https://github.com/apache/hadoop/blob/branch-3.1.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java#L96-L100]
> In the method "preSyncCheck" of the class "DistCpSync", we still check
> whether the file system is DFS.
> So I propose changing the class DistCpSync to take into account what was
> committed by Lokesh Jain.
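
The kind of change proposed in the description can be sketched as a widened
type check. This is a self-contained illustration, not the attached patch: the
nested types {{Fs}}, {{Dfs}}, {{WebHdfs}} and {{LocalFs}} are hypothetical
stand-ins so the example compiles without Hadoop on the classpath, and
checking both the source and the target is an assumption of this sketch.

```java
import java.util.List;

// Hypothetical stand-ins only; the real Hadoop classes are not used here.
class SnapshotDiffFsCheck {
    interface Fs {
        String scheme();
    }

    static final class Dfs implements Fs {
        public String scheme() { return "hdfs"; }
    }

    static final class WebHdfs implements Fs {
        public String scheme() { return "webhdfs"; }
    }

    static final class LocalFs implements Fs {
        public String scheme() { return "file"; }
    }

    /** Accepts the DFS stand-in and, as proposed, the WebHDFS stand-in too. */
    static boolean supportsSnapshotDiff(Fs fs) {
        return fs instanceof Dfs || fs instanceof WebHdfs;
    }

    /** Assumption of this sketch: both ends of the copy must pass the check. */
    static boolean preSyncCheck(Fs source, Fs target) {
        return supportsSnapshotDiff(source) && supportsSnapshotDiff(target);
    }

    public static void main(String[] args) {
        for (Fs fs : List.of(new Dfs(), new WebHdfs(), new LocalFs())) {
            System.out.println(fs.scheme() + " -> " + supportsSnapshotDiff(fs));
        }
    }
}
```

A check restricted to the DFS stand-in alone would reject the WebHDFS case
even when the file system itself can serve a snapshot diff, which is the gap
the description points out.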