Hey guys,
I have a cluster with 11 nodes (1 NN and 10 DNs) which is running and working.
However, my datanodes keep logging the same errors over and over.

I googled the problem and tried different JVM flags (e.g. 
-XX:MaxDirectMemorySize=2G) and different configs (xceivers=8192), but could 
not solve it.

Does anyone know what the problem is and how I can solve it? (The stack trace 
is at the end.)

I am running:
Java 1.7
Hadoop 0.20.2
HBase 0.90.6
ZooKeeper 3.3.5

% top -> shows a low load average (6% most of the time, up to 60%), already 
adjusted for the number of CPUs
% vmstat -> shows no swapping at all
% sar -> shows 75% idle CPU in the worst case

Hope you guys can help me.
Thanks in advance,
Pablo

2012-07-20 00:03:44,455 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /DN01:50010, dest: /DN01:43516, bytes: 396288, op: HDFS_READ, cliID: DFSClient_hb_rs_DN01,60020,1342734302945_1342734303427, offset: 54956544, srvID: DS-798921853-DN01-50010-1328651609047, blockid: blk_914960691839012728_14061688, duration: 480061254006
2012-07-20 00:03:44,455 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(DN01:50010, storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075, ipcPort=50020):Got exception while serving blk_914960691839012728_14061688 to /DN01:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/DN01:50010 remote=/DN01:43516]
        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:397)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:493)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:279)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:175)

2012-07-20 00:03:44,455 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(DN01:50010, storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/DN01:50010 remote=/DN01:43516]
        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:397)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:493)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:279)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:175)

2012-07-20 00:12:11,949 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_4602445008578088178_5707787
2012-07-20 00:12:11,962 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-8916344806514717841_14081066 received exception java.net.SocketTimeoutException: 63000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/DN01:36634 remote=/DN03:50010]
2012-07-20 00:12:11,962 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(DN01:50010, storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketTimeoutException: 63000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/DN01:36634 remote=/DN03:50010]
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:116)
        at java.io.FilterInputStream.read(FilterInputStream.java:83)
        at java.io.DataInputStream.readShort(DataInputStream.java:312)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:447)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:183)


2012-07-20 00:12:20,670 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_7238561256016868237_3555939
2012-07-20 00:12:22,541 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-7028120671250332363_14081073 src: /DN03:50331 dest: /DN01:50010
2012-07-20 00:12:22,544 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_-7028120671250332363_14081073 java.io.EOFException: while trying to read 65557 bytes
2012-07-20 00:12:22,544 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_-7028120671250332363_14081073 Interrupted.
2012-07-20 00:12:22,544 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_-7028120671250332363_14081073 terminating
2012-07-20 00:12:22,544 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-7028120671250332363_14081073 received exception java.io.EOFException: while trying to read 65557 bytes
2012-07-20 00:12:22,544 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(DN01:50010, storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 65557 bytes
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:290)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:334)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:398)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:577)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:494)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:183)


2012-07-20 00:12:34,266 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-1834839455324747507_14081046 src: /DN05:59897 dest: /DN01:50010
2012-07-20 00:12:34,267 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_-1834839455324747507_14081046 java.io.EOFException: while trying to read 65557 bytes
2012-07-20 00:12:34,268 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_-1834839455324747507_14081046 Interrupted.
2012-07-20 00:12:34,268 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_-1834839455324747507_14081046 terminating
2012-07-20 00:12:34,268 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-1834839455324747507_14081046 received exception java.io.EOFException: while trying to read 65557 bytes
2012-07-20 00:12:34,268 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(DN01:50010, storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 65557 bytes
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:290)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:334)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:398)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:577)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:494)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:183)
2012-07-20 00:12:34,269 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_3941134611454287401_14080990 src: /DN03:50345 dest: /DN01:50010
2012-07-20 00:12:34,270 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_3941134611454287401_14080990 java.io.EOFException: while trying to read 65557 bytes
2012-07-20 00:12:34,270 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_3941134611454287401_14080990 Interrupted.
2012-07-20 00:12:34,271 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_3941134611454287401_14080990 terminating
2012-07-20 00:12:34,271 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_3941134611454287401_14080990 received exception java.io.EOFException: while trying to read 65557 bytes
2012-07-20 00:12:34,271 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(DN01:50010, storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 65557 bytes
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:290)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:334)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:398)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:577)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:494)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:183)
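
To see which of these errors dominates, a DataNode log like the one above 
could be tallied with a short script. This is only a rough sketch and not 
something from the cluster setup: the names tally, RECORD_START, EXCEPTION, 
TIMEOUT and SAMPLE are made up for illustration, and SAMPLE holds three 
records copied from the excerpt above with the stack frames left out. It 
assumes each log record starts with a timestamp in the format shown above.

```python
import re
from collections import Counter

# A log record starts with a timestamp such as "2012-07-20 00:03:44,455 WARN ".
RECORD_START = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} (\w+) ", re.M)
# Fully qualified Java exception class, e.g. java.net.SocketTimeoutException.
EXCEPTION = re.compile(r"\b(?:[a-z_]\w*\.)+\w*Exception\b")
# Timeout length and direction as printed in the messages above.
TIMEOUT = re.compile(
    r"(\d+) millis timeout while waiting for channel to be ready for (\w+)")

# Three records copied from the excerpt above (stack frames omitted).
SAMPLE = """\
2012-07-20 00:03:44,455 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
DatanodeRegistration(DN01:50010,
storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075,
ipcPort=50020):DataXceiver
java.net.SocketTimeoutException: 480000 millis timeout while waiting for
channel to be ready for write. ch : java.nio.channels.SocketChannel[connected
local=/DN01:50010 remote=/DN01:43516]
2012-07-20 00:12:11,962 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
writeBlock blk_-8916344806514717841_14081066 received exception
java.net.SocketTimeoutException: 63000 millis timeout while waiting for channel
to be ready for read. ch : java.nio.channels.SocketChannel[connected
local=/DN01:36634 remote=/DN03:50010]
2012-07-20 00:12:22,544 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
DatanodeRegistration(DN01:50010,
storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075,
ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 65557 bytes
"""


def tally(log_text, levels=("WARN", "ERROR")):
    """Count log records per (level, exception class, timeout detail)."""
    counts = Counter()
    starts = list(RECORD_START.finditer(log_text))
    for i, match in enumerate(starts):
        level = match.group(1)
        if level not in levels:
            continue
        # A record runs until the next timestamped line (or end of input).
        end = starts[i + 1].start() if i + 1 < len(starts) else len(log_text)
        # Collapse whitespace so messages wrapped across lines still match.
        record = " ".join(log_text[match.start():end].split())
        exc = EXCEPTION.search(record)
        if exc is None:
            continue
        timeout = TIMEOUT.search(record)
        detail = ""
        if timeout:
            detail = f"{timeout.group(1)} ms {timeout.group(2)} timeout"
        counts[(level, exc.group(0), detail)] += 1
    return counts


if __name__ == "__main__":
    for (level, exc, detail), n in tally(SAMPLE).most_common():
        print(n, level, exc, detail)
```

On a real log, the contents of the DataNode log file would be passed to 
tally() instead of SAMPLE. Counting only WARN and ERROR records avoids 
counting the same failure twice, since the INFO lines repeat the exception 
that the following ERROR record reports.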
