DistCp is capable of running large copies like this in distributed fashion, implemented as a MapReduce job.

http://hadoop.apache.org/docs/r2.7.1/hadoop-distcp/DistCp.html

A lot of the literature on DistCp talks about use cases for copying across different clusters, but it's also completely legitimate to run DistCp within the same cluster.

--Chris Nauroth

From: Gavin Yue <[email protected]>
Date: Friday, January 8, 2016 at 4:45 PM
To: "[email protected]" <[email protected]>
Subject: how to quickly fs -cp dir with thousand files?

I want to cp a dir with over 8000 files to another dir in the same HDFS, but the copy process is really slow since it is copying one file at a time. Is there a faster way to copy this using the Java FileSystem or FileUtil API?

Thanks.
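For readers who, like the original poster, want to drive the copy from Java rather than the command line, DistCp can also be invoked programmatically. The following is a minimal sketch, assuming Hadoop 2.7-era client jars (including `hadoop-distcp`) on the classpath and a reachable HDFS cluster; the paths shown are hypothetical examples, not anything from the thread:

```java
// Sketch only: assumes hadoop-distcp 2.7.x on the classpath and a running
// HDFS cluster. Source and target paths below are hypothetical examples.
import java.util.Collections;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.tools.DistCp;
import org.apache.hadoop.tools.DistCpOptions;

public class IntraClusterCopy {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        Path source = new Path("hdfs:///user/alice/input"); // hypothetical source dir
        Path target = new Path("hdfs:///user/alice/copy");  // hypothetical target dir

        // The Hadoop 2.7-era constructor takes a list of source paths and a target.
        DistCpOptions options = new DistCpOptions(
                Collections.singletonList(source), target);

        // Submits a MapReduce job that copies the files in parallel across
        // mappers, instead of sequentially as a client-side FileUtil.copy would.
        DistCp distCp = new DistCp(conf, options);
        distCp.execute(); // blocks until the copy job completes
    }
}
```

The command-line equivalent, run against the same cluster, would be `hadoop distcp hdfs:///user/alice/input hdfs:///user/alice/copy`; both routes launch the same MapReduce copy job, which is what makes DistCp fast for directories with thousands of files.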
