Jing Zhao created HDFS-7535:
-------------------------------
Summary: Utilize Snapshot diff report for distcp
Key: HDFS-7535
URL: https://issues.apache.org/jira/browse/HDFS-7535
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Jing Zhao
Assignee: Jing Zhao
Currently the HDFS snapshot diff report can identify file/directory creation,
deletion, rename, and modification under a snapshottable directory. We can use
the diff report when running distcp between the primary cluster and a backup
cluster to avoid unnecessary data copying. This is especially useful when a
large directory is renamed on the primary cluster: distcp currently cannot
detect the rename operation, so the rename usually leads to copying a large
amount of real data.
More details of the approach will come in the first comment.