The simplest and fastest way is to use the 'distcp' command from the Hadoop shell. It runs as a MapReduce job, so your data is copied in parallel.
The shell command would look something like this:

hadoop distcp hdfs://source-namenode-host:port/source-path/ hdfs://dest-namenode-host:port/dest-path
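
If it helps, here is a rough sketch of the same command wrapped in a small script, so you can check the URIs before starting a large copy. The host names and paths are the placeholders from above, the port 8020 is only an assumed NameNode port (use whatever your clusters are configured with), and the hdfs_uri helper and RUN variable are names I made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build an hdfs:// URI from host, port, and path.
hdfs_uri() {
    printf 'hdfs://%s:%s%s' "$1" "$2" "$3"
}

# Placeholder values; replace with your own NameNode hosts, port, and paths.
# 8020 is an assumption, not something taken from your cluster configuration.
SRC=$(hdfs_uri source-namenode-host 8020 /source-path/)
DEST=$(hdfs_uri dest-namenode-host 8020 /dest-path)

# Always show the command that would be run.
echo "hadoop distcp $SRC $DEST"

# Start the copy only when RUN=1 is set in the environment.
if [ "${RUN:-0}" = "1" ]; then
    hadoop distcp "$SRC" "$DEST"
fi
```

Run without RUN set, the script only prints the command. Once the URIs look right, run it again with RUN=1 to start the copy.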

Hope this helps

-Ankur

ma qiang wrote:
Hi all,
    I have a large dataset stored in a Hadoop cluster, and I want
to copy this data to another Hadoop cluster. Can anyone tell me how?
    Thank you very much !
    Best wishes !

maqiang
