[ 
https://issues.apache.org/jira/browse/HDFS-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982590#comment-15982590
 ] 

Benjamin Huo commented on HDFS-7535:
------------------------------------

I have one question regarding the following comment:
"This snapshot diff report represents the delta that should be applied to the 
backup cluster. For changes like deletion and rename we can directly apply the 
same operations (following some specific order based on their dependency) in 
the backup cluster. For changes like creation, append, and other metadata 
modification we keep using the functionality of the current distcp."

I'm not very clear about what "we keep using the functionality of the current 
distcp" means.
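
To make sure I read the split in that comment correctly, here is a rough 
sketch in Python. It is only my interpretation, not code from the patch: 
DiffEntry and split_diff are hypothetical names, the entry kinds follow the 
"+", "-", "M", "R" markers printed by the snapshot diff report, and the 
dependency ordering of deletes and renames mentioned in the comment is not 
modelled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DiffEntry:
    """One line of a snapshot diff report (hypothetical model, not a Hadoop class)."""
    kind: str                     # "+" create, "-" delete, "M" modify, "R" rename
    path: str
    target: Optional[str] = None  # rename destination, only set for "R"


def split_diff(entries: List[DiffEntry]) -> Tuple[List[DiffEntry], List[DiffEntry]]:
    """Split diff entries into the two groups described in the quoted comment.

    direct:  deletions and renames, replayed as-is on the backup cluster
    to_copy: creations and modifications, left to the regular copy step
    """
    direct = [e for e in entries if e.kind in ("-", "R")]
    to_copy = [e for e in entries if e.kind in ("+", "M")]
    return direct, to_copy


if __name__ == "__main__":
    report = [
        DiffEntry("R", "/data/a", "/data/b"),
        DiffEntry("-", "/data/old"),
        DiffEntry("+", "/data/new"),
        DiffEntry("M", "/data/file"),
    ]
    direct, to_copy = split_diff(report)
    print("apply directly:", [(e.kind, e.path) for e in direct])
    print("hand to copy step:", [(e.kind, e.path) for e in to_copy])
```

The part I am unsure about is how the second group (to_copy above) is 
actually determined and copied.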

After the HDFS-7535 fix, is the list of created and modified files generated 
from the diff between snapshots s1 and s2 on the source cluster, or from a 
comparison between the source cluster and the destination cluster?

Thanks
Ben



> Utilize Snapshot diff report for distcp
> ---------------------------------------
>
>                 Key: HDFS-7535
>                 URL: https://issues.apache.org/jira/browse/HDFS-7535
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: distcp, snapshots
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>             Fix For: 2.7.0
>
>         Attachments: HDFS-7535.000.patch, HDFS-7535.001.patch, 
> HDFS-7535.002.patch, HDFS-7535.003.patch, HDFS-7535.004.patch
>
>
> Currently HDFS snapshot diff report can identify file/directory creation, 
> deletion, rename and modification under a snapshottable directory. We can use 
> the diff report for distcp between the primary cluster and a backup cluster 
> to avoid unnecessary data copy. This is especially useful when there is a big 
> directory rename happening in the primary cluster: the current distcp cannot 
> detect the rename op thus this rename usually leads to large amounts of real 
> data copy.
> More details of the approach will come in the first comment.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
