I'm trying out HDFS append with r0.21.0 in a small test application before 
using it in my production code. The app creates a new file, writes out 50K 
records, closes the file, then reopens it for append, writes another 100K 
records, and closes it again. Everything is fine up until it goes to close 
the append stream; at that point I get the errors below. When I check the 
logfile on one of the datanodes that had the issue, I also see errors 
relating to that block. The first excerpt is the output from my test 
application; the second is from the datanode logfile.
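For reference, the test application boils down to roughly the following (a 
minimal sketch, not my exact code; the path and record contents are 
placeholders, and it assumes the 0.21 client libraries on the classpath and 
a running cluster with append enabled):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Append support must be switched on for fs.append() to work
        conf.setBoolean("dfs.support.append", true);
        FileSystem fs = FileSystem.get(conf);
        Path p = new Path("/user/abaff/test/append/test4.out");

        // Create the file and write 50K records, then close it
        FSDataOutputStream out = fs.create(p, true);
        for (int i = 0; i < 50000; i++) {
            out.writeBytes("record " + i + "\n");
        }
        out.close();

        // Reopen in append mode and write another 100K records;
        // the close() here is where the pipeline errors are thrown
        out = fs.append(p);
        for (int i = 0; i < 100000; i++) {
            out.writeBytes("record " + i + "\n");
        }
        out.close();
    }
}
```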

I can't run a hadoop fsck at this moment due to some production data being 
used (very small cluster, we're a pretty new Hadoop shop), so while I'm 
fairly sure the fsck would come back clean, I want to wait until we're done 
with that data before running it, or before asking the sysadmin to restart 
the DFS. That will be early tomorrow at the earliest, but I can try just 
about any other suggestions. Help!
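For the record, the check I plan to run once production is done is just the 
stock fsck tool, scoped to the append test directory (path is the one from 
the error output below):

```shell
# Report per-file status, block IDs, and replica locations for the
# test directory only, rather than fsck'ing the whole namespace
hadoop fsck /user/abaff/test/append -files -blocks -locations
```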

--Aaron

03-21-11 15:58:17 [INFO ] Exception in createBlockOutputStream 
java.io.EOFException
03-21-11 15:58:17 [WARN ] Error Recovery for block 
blk_8212105008236569520_123591 in pipeline 10.10.11.50:50010, 
10.10.11.51:50010, 10.10.11.52:50010: bad datanode 10.10.11.50:50010
03-21-11 15:58:18 [INFO ] Exception in createBlockOutputStream 
java.io.EOFException
03-21-11 15:58:18 [WARN ] Error Recovery for block 
blk_8212105008236569520_123591 in pipeline 10.10.11.51:50010, 
10.10.11.52:50010: bad datanode 10.10.11.51:50010
03-21-11 15:58:18 [INFO ] Exception in createBlockOutputStream 
java.io.EOFException
03-21-11 15:58:18 [WARN ] DataStreamer Exception: java.lang.NullPointerException
            at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:480)

03-21-11 15:58:18 [WARN ] DFSOutputStream ResponseProcessor exception for 
block blk_8212105008236569520_123591
java.lang.NullPointerException
            at 
org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:531)
            at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:616)

03-21-11 15:58:19 [ERROR] java.io.IOException - All datanodes 10.10.11.52:50010 
are bad. Aborting...
java.io.IOException: All datanodes 10.10.11.52:50010 are bad. Aborting...
            at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:753)
            at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:431)
03-21-11 15:58:19 [ERROR] Exception closing file 
/user/abaff/test/append/test4.out : java.io.IOException: All datanodes 
10.10.11.52:50010 are bad. Aborting...
java.io.IOException: All datanodes 10.10.11.52:50010 are bad. Aborting...
            at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:753)
            at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:431)




2011-03-21 15:58:17,694 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving block blk_8212105008236569520_123591 src: /10.8.4.6:3812 dest: 
/10.10.11.50:50010
2011-03-21 15:58:17,695 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Appending to replica FinalizedReplica, blk_8212105008236569520_123591, FINALIZED
  getNumBytes()     = 1538890
  getBytesOnDisk()  = 1538890
  getVisibleLength()= 1538890
  getVolume()       = 
/hadoop/hadoop-datastore/hadoop-hadoop/dfs/data/current/finalized
  getBlockFile()    = 
/hadoop/hadoop-datastore/hadoop-hadoop/dfs/data/current/finalized/subdir14/subdir45/blk_8212105008236569520
  unlinked=false
2011-03-21 15:58:17,735 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
IOException in BlockReceiver constructor. Cause is
2011-03-21 15:58:17,736 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
writeBlock blk_8212105008236569520_123591 received exception 
java.io.IOException: Failed to get link count on file 
/hadoop/hadoop-datastore/hadoop-hadoop/dfs/data/current/finalized/subdir14/subdir45/blk_8212105008236569520:
 message=null; error=stat: illegal option -- c; exit value=1
2011-03-21 15:58:17,736 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(10.10.11.50:50010, 
storageID=DS-1360244006-10.10.11.50-50010-1290637962096, infoPort=50075, 
ipcPort=50020):DataXceiver
java.io.IOException: Failed to get link count on file 
/hadoop/hadoop-datastore/hadoop-hadoop/dfs/data/current/finalized/subdir14/subdir45/blk_8212105008236569520:
 message=null; error=stat: illegal option -- c; exit value=1
            at 
org.apache.hadoop.fs.FileUtil.createIOException(FileUtil.java:709)
            at org.apache.hadoop.fs.FileUtil.access$000(FileUtil.java:42)
            at 
org.apache.hadoop.fs.FileUtil$HardLink.getLinkCount(FileUtil.java:682)
            at 
org.apache.hadoop.hdfs.server.datanode.ReplicaInfo.unlinkBlock(ReplicaInfo.java:215)
            at 
org.apache.hadoop.hdfs.server.datanode.FSDataset.append(FSDataset.java:1116)
            at 
org.apache.hadoop.hdfs.server.datanode.FSDataset.append(FSDataset.java:1099)
            at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:112)
            at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.opWriteBlock(DataXceiver.java:258)
            at 
org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.opWriteBlock(DataTransferProtocol.java:390)
            at 
org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.processOp(DataTransferProtocol.java:331)
            at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:111)
            at java.lang.Thread.run(Thread.java:619)
