[ https://issues.apache.org/jira/browse/HDFS-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911849#comment-16911849 ]
Wei-Chiu Chuang commented on HDFS-13123:
----------------------------------------

This patch uses distcp + snapshot. [~smeng] FYI.

Given how much "experience" we have associated with distcp + snapshot, I want to be very careful with this patch.

You should make sure that both the source and the destination directories are snapshottable before running this tool.

It is probably not a good idea to hard-code the snapshot names as "s1" and "s2". Use randomly generated names instead.

I don't understand why you create two snapshots in the source cluster almost immediately after each other. If you do so, you only pick up the files added or deleted between the two snapshots.

The distcp -diff command is meant for a read-only destination. The state of the "s1" snapshot on the source should be exactly the same as the state of the "s1" snapshot on the destination. You'll hit various strange issues if the destination is not a mirror of the source. This is either not the right way to use the tool, or not the right tool for the use case.

Additionally, make sure you delete the snapshots even if the prior steps hit errors. Otherwise you'll end up with thousands of leftover snapshots.

> RBF: Add a balancer tool to move data across subcluster
> --------------------------------------------------------
>
>                 Key: HDFS-13123
>                 URL: https://issues.apache.org/jira/browse/HDFS-13123
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Wei Yan
>            Assignee: hemanthboyina
>            Priority: Major
>         Attachments: HDFS Router-Based Federation Rebalancer.pdf, HDFS-13123.patch
>
>
> Follow the discussion in HDFS-12615. This Jira is to track effort for
> building a rebalancer tool, used by router-based federation to move data
> among subclusters.
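
To illustrate two of the points in the comment above (randomly generated snapshot names, and deleting snapshots even when an earlier step fails), here is a minimal sketch. It is not code from the patch. `SnapshotOps`, `SnapshotBody`, `InMemoryOps`, and `withTempSnapshot` are hypothetical names invented for this example, and the in-memory implementation only exists so the sketch runs without a cluster. In a real tool the two calls would presumably map to `FileSystem#createSnapshot(Path, String)` and `FileSystem#deleteSnapshot(Path, String)`.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

public class SnapshotCleanupSketch {

    /** Hypothetical stand-in for the snapshot calls a real tool would issue against HDFS. */
    interface SnapshotOps {
        void createSnapshot(String dir, String name) throws Exception;
        void deleteSnapshot(String dir, String name) throws Exception;
    }

    /** Hypothetical unit of work that needs the snapshot, e.g. a step that launches distcp. */
    interface SnapshotBody {
        void run(String snapshotName) throws Exception;
    }

    /** In-memory implementation (an assumption for illustration only). */
    static class InMemoryOps implements SnapshotOps {
        final Set<String> live = new HashSet<>();

        @Override
        public void createSnapshot(String dir, String name) {
            live.add(dir + "@" + name);
        }

        @Override
        public void deleteSnapshot(String dir, String name) {
            live.remove(dir + "@" + name);
        }
    }

    /** Builds a name that will not collide with user snapshots such as "s1" or "s2". */
    static String randomSnapshotName(String prefix) {
        return prefix + "-" + UUID.randomUUID();
    }

    /** Creates a randomly named snapshot, runs the body, and always deletes the snapshot. */
    static void withTempSnapshot(SnapshotOps ops, String dir, SnapshotBody body) throws Exception {
        String name = randomSnapshotName("rbf-balance");
        ops.createSnapshot(dir, name);
        try {
            body.run(name);
        } finally {
            // Runs on success and on failure, so no snapshot is left behind.
            ops.deleteSnapshot(dir, name);
        }
    }

    public static void main(String[] args) throws Exception {
        InMemoryOps ops = new InMemoryOps();
        try {
            withTempSnapshot(ops, "/data", name -> {
                throw new RuntimeException("simulated copy failure");
            });
        } catch (RuntimeException e) {
            System.out.println("copy failed: " + e.getMessage());
        }
        // prints "leftover snapshots: 0"
        System.out.println("leftover snapshots: " + ops.live.size());
    }
}
```

The sketch deliberately covers only naming and cleanup. It does not address the other concerns raised above, such as checking that both directories are snapshottable or keeping the destination an exact mirror of the source before running distcp -diff.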