[ https://issues.apache.org/jira/browse/HDFS-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15530595#comment-15530595 ]
Yongjun Zhang commented on HDFS-10314:
--------------------------------------

Had a discussion with [~jingzhao], and we reached the following agreement:

1. For now, he is fine with option 2 stated in
https://issues.apache.org/jira/browse/HDFS-10314?focusedCommentId=15524359&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15524359
as long as we document it well, even though it's not his favorite. In that case, we can continue to work on HDFS-9820.

2. When creating a new tool in the future (HDFS-10314), we need to do the following:
* Refactor the DistCp code to separate out the snapshot sync part (which handles rename/delete per snapshot diff) and the copyList calculation part into their own class, e.g., DistCpPrepare.
* Let both DistCp and DistSync call DistCpPrepare for the functionality they need.
* Modify DistCp to take an optional new argument, copyListing.
* Let DistSync call DistCpPrepare to do the snapshot sync part and the copyListing creation part, and then pass the copyListing to DistCp.

Please feel free to correct/add if I'm inaccurate or missed anything.

Thanks much Jing.

> A new tool to sync current HDFS view to specified snapshot
> ----------------------------------------------------------
>
>                 Key: HDFS-10314
>                 URL: https://issues.apache.org/jira/browse/HDFS-10314
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-10314.001.patch
>
>
> HDFS-9820 proposed adding a -rdiff switch to distcp, as the reverse operation of
> the -diff switch.
> Upon discussion with [~jingzhao], we will introduce a new tool that wraps
> around distcp to achieve the same purpose.
> I'm thinking about calling the new tool "rsync", similar to the Unix/Linux
> command "rsync". The "r" here means remote.
> The syntax that simulates the -rdiff behavior proposed in HDFS-9820 is
> {code}
> rsync <fromSnapshotName> <toSnapshotName> <source> <target>
> {code}
> This command ensures <fromSnapshotName> is newer than <toSnapshotName>.
> I think that, in the future, we can add another command to have the functionality
> of the -diff switch of distcp:
> {code}
> sync <fromSnapshotName> <toSnapshotName> <source> <target>
> {code}
> that ensures <fromSnapshotName> is older than <toSnapshotName>.
> Thanks [~jingzhao].
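
For illustration only, the refactoring agreed on in item 2 of the comment might look roughly like the sketch below. Everything here is an assumption made for the example: the class names DiffEntry, DistCpSketch and DistSyncSketch, the method names syncOps, copyListing and run, and the use of plain strings in place of real file system operations are all hypothetical and do not reflect the actual DistCp code or the attached patch. Only the name DistCpPrepare and the idea of an optional copyListing argument come from the discussion.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical snapshot diff entry: an operation plus the path(s) involved. */
final class DiffEntry {
    enum Op { RENAME, DELETE, CREATE, MODIFY }

    final Op op;
    final String path;
    final String renameTarget; // only meaningful for RENAME, otherwise null

    DiffEntry(Op op, String path, String renameTarget) {
        this.op = op;
        this.path = path;
        this.renameTarget = renameTarget;
    }
}

/** Shared preparation step (name from the discussion, methods assumed). */
final class DistCpPrepare {
    /** Snapshot sync part: collect the rename/delete operations from the diff. */
    static List<String> syncOps(List<DiffEntry> diff) {
        List<String> ops = new ArrayList<>();
        for (DiffEntry e : diff) {
            if (e.op == DiffEntry.Op.RENAME) {
                ops.add("rename " + e.path + " -> " + e.renameTarget);
            } else if (e.op == DiffEntry.Op.DELETE) {
                ops.add("delete " + e.path);
            }
        }
        return ops;
    }

    /** Copy list calculation part: created or modified paths need copying. */
    static List<String> copyListing(List<DiffEntry> diff) {
        List<String> listing = new ArrayList<>();
        for (DiffEntry e : diff) {
            if (e.op == DiffEntry.Op.CREATE || e.op == DiffEntry.Op.MODIFY) {
                listing.add(e.path);
            }
        }
        return listing;
    }
}

/** Stand-in for DistCp that accepts an optional, precomputed copy listing. */
final class DistCpSketch {
    static List<String> run(List<DiffEntry> diff, List<String> copyListing) {
        // When no listing is passed in, fall back to computing it here.
        List<String> listing = (copyListing != null)
                ? copyListing
                : DistCpPrepare.copyListing(diff);
        List<String> log = new ArrayList<>();
        for (String p : listing) {
            log.add("copy " + p);
        }
        return log;
    }
}

/** Stand-in for the new tool: prepare first, then hand the listing to DistCp. */
final class DistSyncSketch {
    static List<String> run(List<DiffEntry> diff) {
        List<String> log = new ArrayList<>(DistCpPrepare.syncOps(diff));
        List<String> listing = DistCpPrepare.copyListing(diff);
        log.addAll(DistCpSketch.run(diff, listing));
        return log;
    }

    public static void main(String[] args) {
        List<DiffEntry> diff = new ArrayList<>();
        diff.add(new DiffEntry(DiffEntry.Op.RENAME, "/a", "/b"));
        diff.add(new DiffEntry(DiffEntry.Op.DELETE, "/c", null));
        diff.add(new DiffEntry(DiffEntry.Op.MODIFY, "/d", null));
        for (String line : run(diff)) {
            System.out.println(line);
        }
    }
}
```

In this sketch both entry points share the same preparation code, and the DistCp stand-in only recomputes the listing when the caller does not supply one, which is one way to read the "optional new argument" in the plan.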