When I use -overwrite, everything gets copied over fine and the files are not corrupt.
When I use the -update option for distcp, I constantly get this WARN + exception. What is it trying to do and what is failing?

11/08/23 22:43:06 WARN hdfs.DFSClient: src=/analytics_hive_tables/web_etl_tables/mass_actions_table/mass_actions_table_2011-01-01-2011-08-16.out, datanodes[0].getName()=10.177.1.218:50010
java.io.EOFException
    at java.io.DataInputStream.readShort(DataInputStream.java:298)
    at org.apache.hadoop.hdfs.DFSClient.getFileChecksum(DFSClient.java:782)
    at org.apache.hadoop.hdfs.DFSClient.getFileChecksum(DFSClient.java:719)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:553)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:53)
    at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1261)
    at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
    at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)

-Ayon
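
For context, the trace passes through DistCp.sameFile into getFileChecksum, which suggests that -update is comparing the source and destination files to decide whether a copy can be skipped, and that the checksum request to the datanode is the step that fails. Below is a rough local-file sketch of that kind of check in Python. The helper names (file_checksum, same_file) are hypothetical, SHA-256 stands in for whatever checksum HDFS computes, and the "treat as same when no checksum is available" fallback is an assumption for illustration. It is not DistCp's actual code.

```python
import hashlib
import os
import tempfile


def file_checksum(path):
    """Return a SHA-256 hex digest of the file, or None if it cannot be read.

    Hypothetical stand-in for a filesystem checksum call.
    """
    try:
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()
    except OSError:
        return None


def same_file(src, dst):
    """Decide whether dst already matches src, so a copy could be skipped.

    Order of checks: destination exists, lengths match, checksums match.
    """
    if not os.path.exists(dst):
        return False
    if os.path.getsize(src) != os.path.getsize(dst):
        return False
    src_sum = file_checksum(src)
    dst_sum = file_checksum(dst)
    # Assumption: if a checksum is unavailable, fall back to the length check.
    if src_sum is None or dst_sum is None:
        return True
    return src_sum == dst_sum


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "src.out")
    dst = os.path.join(workdir, "dst.out")
    with open(src, "wb") as f:
        f.write(b"row1\nrow2\n")
    with open(dst, "wb") as f:
        f.write(b"row1\nrow2\n")
    print(same_file(src, dst))
```

In this sketch a failed checksum read only makes the comparison less strict. Whether DistCp behaves the same way after the EOFException above is not something the trace alone confirms.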