[ 
https://issues.apache.org/jira/browse/HDFS-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15524359#comment-15524359
 ] 

Yongjun Zhang commented on HDFS-10314:
--------------------------------------

Hi [~jingzhao],

To better address your comments, I discussed with [~atm], and we think there 
are the following possible options:

Option 1. To get closer to what you suggested, we can introduce a new tool, 
distsync, and let it support all of the following semantics:
* distsync -diff s1 s2 src tgt
* distsync -rdiff s2 s1 src tgt
* distsync tgt s1

Implementation-wise, we can move the common DistCp code into AbstractDistCp and 
let both DistCp and DistSync inherit from AbstractDistCp. Accordingly, 
DistCpOption would need to be refactored in a similar fashion.
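
As a rough illustration of the Option 1 class layout, here is a minimal sketch. It assumes a shared base class that validates a "<switch> <from> <to> <src> <tgt>" argument list; the methods plan(), supportedSwitches() and toolName() are hypothetical and are not taken from the Hadoop code base. The two-argument "distsync tgt s1" form is left out for brevity.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: the class names follow the proposal, but none of
// the methods below exist in Hadoop. They only illustrate the inheritance idea.
abstract class AbstractDistCp {
    // Switches accepted by the concrete tool, e.g. "-diff" or "-rdiff".
    protected abstract List<String> supportedSwitches();

    // Command name used in messages, e.g. "distcp" or "distsync".
    protected abstract String toolName();

    // Common logic shared by both tools: validate the five-argument form
    // "<switch> <from> <to> <src> <tgt>" and describe the planned operation.
    public final String plan(String... args) {
        if (args.length != 5 || !supportedSwitches().contains(args[0])) {
            throw new IllegalArgumentException(
                toolName() + " does not accept: " + String.join(" ", args));
        }
        return String.format("%s %s from=%s to=%s source=%s target=%s",
            toolName(), args[0], args[1], args[2], args[3], args[4]);
    }
}

// Stand-in for the existing tool: only the forward "-diff" switch.
class DistCp extends AbstractDistCp {
    @Override
    protected List<String> supportedSwitches() {
        return Arrays.asList("-diff");
    }

    @Override
    protected String toolName() {
        return "distcp";
    }
}

// Stand-in for the proposed tool: both "-diff" and "-rdiff".
class DistSync extends AbstractDistCp {
    @Override
    protected List<String> supportedSwitches() {
        return Arrays.asList("-diff", "-rdiff");
    }

    @Override
    protected String toolName() {
        return "distsync";
    }
}

class DistSyncDemo {
    public static void main(String[] args) {
        System.out.println(new DistCp().plan("-diff", "s1", "s2", "src", "tgt"));
        System.out.println(new DistSync().plan("-rdiff", "s2", "s1", "src", "tgt"));
    }
}
```

In this sketch the subclasses differ only in which switches they accept, which is the kind of split that would let DistCp and DistSync share the rest of the copy logic.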

With this route, the "-diff" switch in distcp would probably be deprecated eventually.

Option 2. Given that we already have the -diff switch in DistCp, we can add 
"-rdiff" to DistCp and support:
* distcp -diff s1 s2 src tgt (currently supported)
* distcp -rdiff s2 s1 src tgt

Would you please share what you think?

Thanks a lot.

> A new tool to sync current HDFS view to specified snapshot
> ----------------------------------------------------------
>
>                 Key: HDFS-10314
>                 URL: https://issues.apache.org/jira/browse/HDFS-10314
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-10314.001.patch
>
>
> HDFS-9820 proposed adding a -rdiff switch to distcp, as the reverse operation of 
> the -diff switch. 
> Upon discussion with [~jingzhao], we will introduce a new tool that wraps 
> around distcp to achieve the same purpose.
> I'm thinking about calling the new tool "rsync", similar to the Unix/Linux 
> command "rsync". The "r" here means remote.
> The syntax that simulates the -rdiff behavior proposed in HDFS-9820 is
> {code}
> rsync <fromSnapshotName>  <toSnapshotName>  <source> <target>
> {code}
> This command ensures that <fromSnapshotName> is newer than <toSnapshotName>.
> I think that, in the future, we can add another command to provide the 
> functionality of the -diff switch of distcp.
> {code}
> sync <fromSnapshotName>  <toSnapshotName>  <source> <target>
> {code}
> that ensures <fromSnapshotName>  is older than <toSnapshotName>.
> Thanks [~jingzhao].



