The simplest and fastest way is to use the 'distcp' command from the Hadoop shell.
It copies all your data in parallel using MapReduce.
The shell command would look something like this:
  hadoop distcp hdfs://source-namenode-host:port/source-path/ \
      hdfs://dest-namenode-host:port/dest-path
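
If it helps to check the URIs before launching the job, here is a minimal sketch of a dry-run helper. The function name build_distcp_cmd is hypothetical, and the host names, paths, and port 8020 are placeholders I am assuming for illustration; substitute the values for your own namenodes. It only prints the command and does not run it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): assemble the distcp command line
# from a source and a destination namenode address plus their paths.
# Arguments: 1 = source host:port, 2 = source path,
#            3 = destination host:port, 4 = destination path.
build_distcp_cmd() {
    printf 'hadoop distcp hdfs://%s%s hdfs://%s%s\n' "$1" "$2" "$3" "$4"
}

# Dry run with placeholder values (port 8020 is an assumption; use your own).
build_distcp_cmd "source-namenode-host:8020" "/source-path/" \
                 "dest-namenode-host:8020" "/dest-path"
```

Once the printed line looks right, you can paste it into a shell on a machine that can reach both clusters.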
Hope this helps
-Ankur
ma qiang wrote:
Hi all,
I have a large dataset stored in a Hadoop cluster, and now I want
to copy this data from this cluster into another Hadoop
cluster. Can anyone tell me how?
Thank you very much!
Best wishes!
maqiang