[ https://issues.apache.org/jira/browse/HDFS-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15524359#comment-15524359 ]
Yongjun Zhang commented on HDFS-10314:
--------------------------------------

Hi [~jingzhao],

To better address your comments, I discussed this with [~atm], and we think the following options are possible:

Option 1. To get closer to what you suggested, we can introduce a new tool, distsync, and have it support all of the following semantics:
* distsync -diff s1 s2 src tgt
* distsync -rdiff s2 s1 src tgt
* distsync tgt s1

Implementation-wise, we can move the common DistCp code into AbstractDistCp and let both DistCp and DistSync inherit from AbstractDistCp. Accordingly, DistCpOption would need to be refactored in a similar fashion. With this route, the "-diff" switch in distcp will probably be obsoleted eventually.

Option 2. Given that we already have the -diff switch in DistCp, we can add "-rdiff" to DistCp and support:
* distcp -diff s1 s2 src tgt (currently supported)
* distcp -rdiff s2 s1 src tgt

Would you please share what you think?

Thanks a lot.

> A new tool to sync current HDFS view to specified snapshot
> ----------------------------------------------------------
>
>                 Key: HDFS-10314
>                 URL: https://issues.apache.org/jira/browse/HDFS-10314
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-10314.001.patch
>
>
> HDFS-9820 proposed adding a -rdiff switch to distcp, as the reverse operation
> of the -diff switch.
> Upon discussion with [~jingzhao], we will introduce a new tool that wraps
> around distcp to achieve the same purpose.
> I'm thinking about calling the new tool "rsync", similar to the unix/linux
> command "rsync". The "r" here means remote.
> The syntax that simulates the -rdiff behavior proposed in HDFS-9820 is
> {code}
> rsync <fromSnapshotName> <toSnapshotName> <source> <target>
> {code}
> This command ensures <fromSnapshotName> is newer than <toSnapshotName>.
> I think that, in the future, we can add another command to provide the
> functionality of the -diff switch of distcp.
> {code}
> sync <fromSnapshotName> <toSnapshotName> <source> <target>
> {code}
> that ensures <fromSnapshotName> is older than <toSnapshotName>.
> Thanks [~jingzhao].
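
As a rough illustration of Option 1, the three proposed distsync invocation forms could be told apart along the following lines. This is only a sketch under stated assumptions: DistSyncCommand, its Mode enum, and parse() are hypothetical names that exist neither in Hadoop nor in the attached patch, and TARGET_TO_SNAPSHOT is merely a label for the "distsync tgt s1" form, not a statement of its final semantics.

```java
import java.util.Arrays;

/** Hypothetical sketch only: this type does not exist in Hadoop. */
final class DistSyncCommand {
    /** Labels for the three invocation forms listed under Option 1. */
    enum Mode { DIFF, RDIFF, TARGET_TO_SNAPSHOT }

    final Mode mode;
    final String[] operands; // arguments that follow the switch, in order

    private DistSyncCommand(Mode mode, String[] operands) {
        this.mode = mode;
        this.operands = operands;
    }

    /** Classifies the raw arguments into one of the three proposed forms. */
    static DistSyncCommand parse(String... args) {
        if (args.length == 5 && args[0].equals("-diff")) {
            // distsync -diff s1 s2 src tgt
            return new DistSyncCommand(Mode.DIFF, Arrays.copyOfRange(args, 1, 5));
        }
        if (args.length == 5 && args[0].equals("-rdiff")) {
            // distsync -rdiff s2 s1 src tgt
            return new DistSyncCommand(Mode.RDIFF, Arrays.copyOfRange(args, 1, 5));
        }
        if (args.length == 2 && !args[0].startsWith("-")) {
            // distsync tgt s1
            return new DistSyncCommand(Mode.TARGET_TO_SNAPSHOT, args.clone());
        }
        throw new IllegalArgumentException(
            "Unrecognized distsync arguments: " + String.join(" ", args));
    }
}
```

For example, DistSyncCommand.parse("-rdiff", "s2", "s1", "/src", "/tgt") would be classified as RDIFF with the four operands kept in order. Under Option 2, the same distinction would instead be made inside DistCp's existing option handling.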