Hey folks,

We're running a 100-node cluster on Hadoop 0.18.3 using Amazon Elastic MapReduce.
We've been uploading data to this cluster via SCP and using hadoop fs -copyFromLocal to get it into HDFS. Generally this works fine, but on our last run this operation failed with a message that said only "RuntimeError". So we blew away the destination directory in HDFS and tried the copyFromLocal again. This time it failed because it thinks one of the files it's trying to copy is already in HDFS; I don't see how that's possible if we just blew away the destination's parent directory. Subsequent attempts produce identical results. hadoop fsck reports a HEALTHY filesystem.

We do see a lot of errors like those below in the namenode log. Are these normal, or perhaps related to the problem described above? Would appreciate any advice or suggestions.

2010-01-25 16:34:19,762 INFO org.apache.hadoop.dfs.StateChange (IPC Server handler 12 on 9000): BLOCK* NameSystem.addToInvalidates: blk_-3060969094589165545 is added to invalidSet of 10.245.103.240:9200
2010-01-25 16:34:19,762 INFO org.apache.hadoop.dfs.StateChange (IPC Server handler 12 on 9000): BLOCK* NameSystem.addToInvalidates: blk_-3060969094589165545 is added to invalidSet of 10.242.25.206:9200
2010-01-25 16:34:19,762 INFO org.apache.hadoop.dfs.StateChange (IPC Server handler 12 on 9000): BLOCK* NameSystem.addToInvalidates: blk_5935615666845780861 is added to invalidSet of 10.242.15.111:9200
2010-01-25 16:34:19,762 INFO org.apache.hadoop.dfs.StateChange (IPC Server handler 12 on 9000): BLOCK* NameSystem.addToInvalidates: blk_5935615666845780861 is added to invalidSet of 10.244.107.18:9200
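For reference, this is roughly the sequence of commands we're running (the paths here are placeholders, not our real ones):

```shell
# Copy the data we SCP'd onto the master node into HDFS:
hadoop fs -copyFromLocal /mnt/upload/run /data/run

# After the RuntimeError, remove the destination directory and retry:
hadoop fs -rmr /data/run
hadoop fs -copyFromLocal /mnt/upload/run /data/run   # fails, claims a file already exists

# Filesystem check still reports HEALTHY:
hadoop fsck /data
```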
