I'm trying out HDFS append with r0.21.0 in a small test application before
using it in my production code. It creates a new file, writes out 50K records,
closes the file, then reopens the file in append mode, writes another 100K
records, and closes the file again. Everything is fine until it goes to close
the append stream; at that point I get the error below. Furthermore, when I
check the datanode log on one of the nodes that had the issue, I see entries
relating to that block. First is the output from my small application; second
is the output from the datanode log.
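The create/close/append/close sequence described above can be sketched as
follows. This is a minimal local-filesystem stand-in, not the actual test
program: the class name AppendSketch, the run() helper, and the "record-N"
format are made up for illustration. Against HDFS the two opens would go
through FileSystem.create(Path) and FileSystem.append(Path) rather than
java.nio, and the failure reported here shows up when the second stream is
closed.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class AppendSketch {
    // Writes `count` newline-terminated records, numbered from `start`.
    static void writeRecords(BufferedWriter out, int start, int count) throws IOException {
        for (int i = start; i < start + count; i++) {
            out.write("record-" + i);
            out.newLine();
        }
    }

    // Hypothetical helper: create + write + close, then reopen in append mode + write + close.
    static void run(Path file, int first, int appended) throws IOException {
        // Step 1: create (or truncate) the file, write the first batch, close.
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE)) {
            writeRecords(out, 0, first);
        }
        // Step 2: reopen the same file in append mode, write the second batch.
        // Leaving this try block closes the append stream, which is the step
        // that fails in the HDFS run described in this post.
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            writeRecords(out, first, appended);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("append-test", ".out");
        run(file, 50_000, 100_000);
        System.out.println(Files.size(file) + " bytes written to " + file);
    }
}
```

Running the same sequence locally only confirms that the record-writing logic
itself is sound; it says nothing about the datanode pipeline.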
I can't run a hadoop fsck at the moment because some production data is in use
(very small cluster, we're a pretty new Hadoop shop). While I'm pretty sure an
fsck would come back clean, I want to wait until we're done with that data
before running it or asking the sysadmin to restart the DFS. That won't be
until early tomorrow at the earliest, but I can try just about any other
suggestions. Help!
--Aaron
03-21-11 15:58:17 [INFO ] Exception in createBlockOutputStream
java.io.EOFException
03-21-11 15:58:17 [WARN ] Error Recovery for block blk_8212105008236569520_123591 in pipeline 10.10.11.50:50010, 10.10.11.51:50010, 10.10.11.52:50010: bad datanode 10.10.11.50:50010
03-21-11 15:58:18 [INFO ] Exception in createBlockOutputStream
java.io.EOFException
03-21-11 15:58:18 [WARN ] Error Recovery for block blk_8212105008236569520_123591 in pipeline 10.10.11.51:50010, 10.10.11.52:50010: bad datanode 10.10.11.51:50010
03-21-11 15:58:18 [INFO ] Exception in createBlockOutputStream
java.io.EOFException
03-21-11 15:58:18 [WARN ] DataStreamer Exception: java.lang.NullPointerException
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:480)
03-21-11 15:58:18 [WARN ] DFSOutputStream ResponseProcessor exception for block blk_8212105008236569520_123591
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:531)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:616)
03-21-11 15:58:19 [ERROR] java.io.IOException - All datanodes 10.10.11.52:50010 are bad. Aborting...
java.io.IOException: All datanodes 10.10.11.52:50010 are bad. Aborting...
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:753)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:431)
03-21-11 15:58:19 [ERROR] Exception closing file /user/abaff/test/append/test4.out : java.io.IOException: All datanodes 10.10.11.52:50010 are bad. Aborting...
java.io.IOException: All datanodes 10.10.11.52:50010 are bad. Aborting...
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:753)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:431)
2011-03-21 15:58:17,694 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_8212105008236569520_123591 src: /10.8.4.6:3812 dest: /10.10.11.50:50010
2011-03-21 15:58:17,695 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Appending to replica FinalizedReplica, blk_8212105008236569520_123591, FINALIZED
  getNumBytes() = 1538890
  getBytesOnDisk() = 1538890
  getVisibleLength()= 1538890
  getVolume() = /hadoop/hadoop-datastore/hadoop-hadoop/dfs/data/current/finalized
  getBlockFile() = /hadoop/hadoop-datastore/hadoop-hadoop/dfs/data/current/finalized/subdir14/subdir45/blk_8212105008236569520
  unlinked=false
2011-03-21 15:58:17,735 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: IOException in BlockReceiver constructor. Cause is
2011-03-21 15:58:17,736 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_8212105008236569520_123591 received exception java.io.IOException: Failed to get link count on file /hadoop/hadoop-datastore/hadoop-hadoop/dfs/data/current/finalized/subdir14/subdir45/blk_8212105008236569520: message=null; error=stat: illegal option -- c; exit value=1
2011-03-21 15:58:17,736 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.10.11.50:50010, storageID=DS-1360244006-10.10.11.50-50010-1290637962096, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Failed to get link count on file /hadoop/hadoop-datastore/hadoop-hadoop/dfs/data/current/finalized/subdir14/subdir45/blk_8212105008236569520: message=null; error=stat: illegal option -- c; exit value=1
    at org.apache.hadoop.fs.FileUtil.createIOException(FileUtil.java:709)
    at org.apache.hadoop.fs.FileUtil.access$000(FileUtil.java:42)
    at org.apache.hadoop.fs.FileUtil$HardLink.getLinkCount(FileUtil.java:682)
    at org.apache.hadoop.hdfs.server.datanode.ReplicaInfo.unlinkBlock(ReplicaInfo.java:215)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.append(FSDataset.java:1116)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.append(FSDataset.java:1099)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:112)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.opWriteBlock(DataXceiver.java:258)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.opWriteBlock(DataTransferProtocol.java:390)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.processOp(DataTransferProtocol.java:331)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:111)
    at java.lang.Thread.run(Thread.java:619)
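
The datanode error ("stat: illegal option -- c") suggests that the stat
binary on that host does not accept a -c flag. That can be checked outside
Hadoop without touching the DFS. The sketch below is only an approximation of
what the datanode might be doing: the exact command line "stat -c%h <file>"
is an assumption inferred from the error text (%h is the hard-link-count
format in GNU stat), and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;

class StatLinkCountCheck {
    // Parses the single integer that `stat -c%h` is expected to print.
    static int parseLinkCount(String statOutput) {
        return Integer.parseInt(statOutput.trim());
    }

    // Runs `stat -c%h <file>` (assumed command) and returns the link count,
    // or throws with the tool's own message if stat rejects the option.
    static int linkCount(File file) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder("stat", "-c%h", file.getAbsolutePath());
        pb.redirectErrorStream(true); // fold stderr into stdout so errors are captured too
        Process proc = pb.start();
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (InputStream in = proc.getInputStream()) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
        }
        int exit = proc.waitFor();
        String output = buf.toString("UTF-8");
        if (exit != 0) {
            throw new IOException("stat failed: " + output.trim() + "; exit value=" + exit);
        }
        return parseLinkCount(output);
    }

    public static void main(String[] args) throws Exception {
        // Pass a block file path as the first argument, or fall back to a temp file.
        File f = args.length > 0 ? new File(args[0]) : File.createTempFile("linkcount", ".tmp");
        System.out.println("link count of " + f + " = " + linkCount(f));
    }
}
```

If this throws with the same "illegal option -- c" text when run as the
datanode user on 10.10.11.50, the problem is likely the stat flavor on that
machine rather than the block itself.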