[ https://issues.apache.org/jira/browse/HDFS-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jing Zhao updated HDFS-7535:
----------------------------
Status: Patch Available (was: Open)
> Utilize Snapshot diff report for distcp
> ---------------------------------------
>
> Key: HDFS-7535
> URL: https://issues.apache.org/jira/browse/HDFS-7535
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: distcp, snapshots
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Attachments: HDFS-7535.000.patch, HDFS-7535.001.patch
>
>
> Currently, the HDFS snapshot diff report can identify file/directory creation,
> deletion, rename, and modification under a snapshottable directory. We can use
> the diff report for distcp between the primary cluster and a backup cluster
> to avoid unnecessary data copying. This is especially useful when a big
> directory is renamed in the primary cluster: the current distcp cannot
> detect the rename operation, so the rename usually leads to a large amount of
> actual data copying.
> More details of the approach will come in the first comment.