Try hadoop distcp. More info here: http://hadoop.apache.org/core/docs/current/distcp.html The documentation is for the current release, but running hadoop distcp on its own should print out a help message.
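A rough sketch of what the copy might look like, reusing the placeholders from your message. grid2NameNode is a made-up name for the second grid's NameNode, portNo still has to be filled in with each NameNode's real port, and RUN_DISTCP is just a guard variable I invented so that nothing gets copied by accident. distcp runs as a MapReduce job, so launch it from a machine that can submit jobs to one of the grids. I have not checked this against 0.15.0, so compare it with the help output first.

```shell
# Placeholders: replace portNo with each NameNode's actual port.
# grid2NameNode is a hypothetical host name for the destination grid.
SRC="hdfs://grid1NameNode:portNo/foo"
DST="hdfs://grid2NameNode:portNo/foo"

# Build the command first so it can be inspected before anything is copied.
DISTCP_CMD="hadoop distcp $SRC $DST"
echo "$DISTCP_CMD"

# Only launch the copy when explicitly asked to (RUN_DISTCP is an invented guard).
if [ "${RUN_DISTCP:-0}" = "1" ]; then
  $DISTCP_CMD
fi
```

Run it once as is to see the command it would issue, then again with RUN_DISTCP=1 once the URIs look right.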
Thanks,
Lohit

----- Original Message ----
From: C G <[email protected]>
To: [email protected]
Sent: Wednesday, December 17, 2008 7:18:51 PM
Subject: Copy data between HDFS instances...

Hi All:

I am setting up 2 grids, each with its own HDFS. The grids are unaware of each other but exist on the same network. I'd like to copy data from one HDFS to the other. Is there a way to do this simply, or do I need to cobble together scripts to copy from HDFS on one side and pipe to a dfs -cp on the other side?

I tried something like this:

hadoop dfs -ls hdfs://grid1NameNode:portNo/

from grid2, trying to ls on grid1, but got a "wrong FS" error message.

I also tried:

hadoop dfs -ls hdfs://grid1NameNode:portNo/foo

on grid2, where "/foo" exists on grid1, and got 0 files found.

I assume there is some way to do this and I just don't have the right command line magic. This is Hadoop 0.15.0.

Any help appreciated.

Thanks,
C G
