Yes, just run something along the lines of:

hadoop distcp hdfs://local-namenode/path hdfs://ec2-namenode/path

on the job tracker of a MapReduce cluster. Make sure that your EC2 security group setup allows HDFS access from the local HDFS cluster and from wherever you run the MapReduce job. Also, I believe both HDFS setups still need to be running the same version of Hadoop.

More here: http://hadoop.apache.org/common/docs/r0.20.0/distcp.html

Cheers,
Anthony

On Mon, Sep 7, 2009 at 10:37 PM, stchu<[email protected]> wrote:
> Hi,
>
> Does Distcp support to copy data from my local cluster (1 master+3 slaves,
> fs=hdfs) to the EC2 cluster (1master+2slaves, fs=hdfs)?
> If it's supported, how can I do? I appreciate for any guide or suggestion.
>
> stchu
>
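
For reference, the distcp invocation suggested in the reply could be wrapped in a small script along these lines. This is only a rough sketch: local-namenode, ec2-namenode and /path are the placeholders from the reply, while the SRC_NN, DST_NN, SRC_PATH, DST_PATH and RUN variable names are made up for illustration. By default it only prints the command, so it can be checked before anything is copied.

```shell
#!/bin/sh
# Placeholder hostnames and paths taken from the reply; replace with your own.
# The variable names here are hypothetical, not part of Hadoop.
SRC_NN="${SRC_NN:-local-namenode}"
DST_NN="${DST_NN:-ec2-namenode}"
SRC_PATH="${SRC_PATH:-/path}"
DST_PATH="${DST_PATH:-/path}"

# Compose the distcp command line from the pieces above and print it.
distcp_cmd() {
    echo "hadoop distcp hdfs://${SRC_NN}${SRC_PATH} hdfs://${DST_NN}${DST_PATH}"
}

if [ "${RUN:-0}" = "1" ]; then
    # Print the local Hadoop version so it can be compared by hand
    # with the version running on the EC2 cluster.
    hadoop version
    # Start the copy; distcp runs as a MapReduce job.
    hadoop distcp "hdfs://${SRC_NN}${SRC_PATH}" "hdfs://${DST_NN}${DST_PATH}"
else
    # Dry run: just show what would be executed.
    distcp_cmd
fi
```

Running it with no variables set prints the command exactly as given in the reply; setting RUN=1 on a machine with the hadoop client installed starts the copy.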
