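Hi Ravi,

With distcp you usually run the job on the target cluster, specifying the source with an "hftp://" URI and the target with an "hdfs://" URI. hftp is a read-only, HTTP-based way to access the data, so it cannot be used as the destination. That is why your stack trace shows "Not supported" coming from HftpFileSystem.delete: distcp tried to delete a path on the hftp target.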
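
As a rough sketch of the invocation described above, something like the following could be run from a node on the target cluster. The host names and the hftp port are taken from your message; the hdfs:// port 8020 is only an assumption, so substitute whatever your target NameNode actually listens on. The command is echoed rather than executed so you can check it first.

```shell
# Run on the target cluster (hadoop2): read via hftp, write via hdfs.
# Source URI as given in the original message.
SRC="hftp://nn1.hadoop1:50070/data"
# Target URI: port 8020 is an ASSUMPTION, check the target NameNode's RPC port.
DST="hdfs://nn2.hadoop2:8020/user/hadoop/"

# Print the command for review; remove "echo" to actually start the copy.
echo hadoop distcp "$SRC" "$DST"
```

The same reasoning applies to your second command: /data/logs is fine as a source, but the destination would still need to be a writable file system such as hdfs://, not hftp://.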
Lars

On Tue, Jan 4, 2011 at 11:10 PM, Ravi Phulari <[email protected]> wrote:
> Hello Hadoopers,
> I need to distcp data across two clusters. For security reasons I can not
> use hdfs based distcp.
> HFTP based distcp is failing with following Ioexception.
>
> Stack trace.
>
> Copy failed: java.io.IOException: Not supported
> at org.apache.hadoop.hdfs.HftpFileSystem.delete(HftpFileSystem.java:360)
> at org.apache.hadoop.tools.DistCp.fullyDelete(DistCp.java:939)
> at org.apache.hadoop.tools.DistCp.copy(DistCp.java:655)
> at org.apache.hadoop.tools.DistCp.run(DistCp.java:857)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> at org.apache.hadoop.tools.DistCp.main(DistCp.java:884)
>
> I am using following command for distcp.
>
> Hadoop distcp hftp://nn1.hadoop1:50070/data
> hftp://nn2.hadoop2:50070/user/hadoop/
> Hadoop distcp /data/logs hftp://nn2.hadoop2:50070/user/hadoop/
>
> Any idea why this distcp could be failing.
> I don’t see any logs in JT and NN.
>
> Any help will be greatly appreciated.
>
> -
> Thanks,
> Ravi
>
