[ 
https://issues.apache.org/jira/browse/HDFS-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510391#comment-15510391
 ] 

Yongjun Zhang commented on HDFS-10314:
--------------------------------------

Hi [~jingzhao],

For clarity, and as a recap, here is a comparison table between -diff and the 
proposed -rdiff, which shows the symmetry between the two:

||Comparison||-diff s1 s2 <src> <tgt>||-rdiff s2 s1 <src> <tgt>||
|Current feature state|Existing in distcp|Proposed Addition |
|Functionality| Given <tgt>'s current state is s1, make <tgt>'s current state 
the same as newer snapshot s2 | Given <tgt>'s current state is s2, make <tgt>'s 
current state the same as older snapshot s1 | 
|Requirements| # <src> and <tgt> need to be different paths
# both <src> and <tgt> have snapshot s1 with exact same content 
# <src> has snapshot s2
# s2 is newer than s1
# <tgt>'s current state is the same as s1
# <tgt> doesn't have snapshot s2 | # <src> and <tgt> can be the same or 
different paths
# both <src> and <tgt> have snapshot s1 with exact same content
# <tgt> has snapshot s2
#  s2 is newer than s1 
# <tgt>'s current state is the same as s2
# <src> may or may not have snapshot s2 |
|Steps|# calculate snapshotDiff<s1,s2> at <src> 
# apply rename/delete part of snapshotDiff on <tgt> 
# copy modified part of snapshotDiff from s2 of <src> to <tgt> | # calculate 
snapshotDiff<s2,s1> at <tgt> 
# apply rename/delete part of snapshotDiff on <tgt> 
# copy modified part of snapshotDiff from s1 of <src> to <tgt> |
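
To make the symmetry in the Steps row concrete, here is a minimal sketch in 
Python. It is not the distcp implementation: directory states are modeled as 
plain dicts (path -> content), renames are folded into delete plus create, and 
the names snapshot_diff and sync are hypothetical. It also assumes that the 
-diff case reads changed files from the newer snapshot s2 of <src>, since that 
is the state <tgt> has to reach, while -rdiff reads them from s1 of <src>.

```python
def snapshot_diff(frm, to):
    """Return (deleted, changed) paths needed to turn state `frm` into `to`."""
    deleted = sorted(p for p in frm if p not in to)
    changed = sorted(p for p in to if p not in frm or frm[p] != to[p])
    return deleted, changed


def sync(target, diff_from, diff_to, copy_source):
    """Apply snapshotDiff<diff_from, diff_to> to `target`.

    Changed files are copied from `copy_source`, which stands in for the
    snapshot of <src> that holds the desired content.
    """
    deleted, changed = snapshot_diff(diff_from, diff_to)  # step 1
    result = dict(target)
    for path in deleted:   # step 2: delete part (renames modeled as delete + create)
        result.pop(path, None)
    for path in changed:   # step 3: copy created/modified files
        result[path] = copy_source[path]
    return result


# Hypothetical snapshot contents, for illustration only.
s1 = {"/a": "v1", "/b": "v1"}
s2 = {"/a": "v2", "/c": "v1"}

# -diff s1 s2: <tgt> starts at s1, the diff is computed at <src>.
after_diff = sync(target=s1, diff_from=s1, diff_to=s2, copy_source=s2)

# -rdiff s2 s1: <tgt> starts at s2, the diff is computed at <tgt>.
after_rdiff = sync(target=s2, diff_from=s2, diff_to=s1, copy_source=s1)
```

In this toy model after_diff ends up equal to s2 and after_rdiff ends up equal 
to s1, so the two operations differ only in the direction of the diff and in 
which snapshot of <src> supplies the data.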

The original thinking was to add -rdiff to distcp (solution A), but because of 
the concern that its semantics would be confusing, it was suggested that we 
introduce a new command here instead (solution B). 

Thanks.


> A new tool to sync current HDFS view to specified snapshot
> ----------------------------------------------------------
>
>                 Key: HDFS-10314
>                 URL: https://issues.apache.org/jira/browse/HDFS-10314
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-10314.001.patch
>
>
> HDFS-9820 proposed adding a -rdiff switch to distcp, as the reverse operation 
> of the -diff switch. 
> Upon discussion with [~jingzhao], we will introduce a new tool that wraps 
> around distcp to achieve the same purpose.
> I'm thinking about calling the new tool "rsync", similar to unix/linux 
> command "rsync". The "r" here means remote.
> The syntax that simulates the -rdiff behavior proposed in HDFS-9820 is
> {code}
> rsync <fromSnapshotName>  <toSnapshotName>  <source> <target>
> {code}
> This command ensures <fromSnapshotName>  is newer than <toSnapshotName>.
> I think that, in the future, we can add another command to provide the 
> functionality of the -diff switch of distcp.
> {code}
> sync <fromSnapshotName>  <toSnapshotName>  <source> <target>
> {code}
> that ensures <fromSnapshotName>  is older than <toSnapshotName>.
> Thanks [~jingzhao].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
