Hi Pablo,

Are you sure that Hadoop 0.20.2 is supported on Java 1.7? (AFAIK it's Java 1.6.)

Thanks,
Anil

On Fri, Jul 20, 2012 at 6:07 AM, Pablo Musa <pa...@psafe.com> wrote:
> Hey guys,
> I have a cluster with 11 nodes (1 NN and 10 DNs) which is up and
> running. However, my datanodes keep hitting the same errors, over and
> over.
>
> I googled the problems and tried different flags (e.g.
> -XX:MaxDirectMemorySize=2G) and different configs (xceivers=8192), but
> could not solve them.
>
> Does anyone know what the problem is and how I can solve it? (The
> stack traces are at the end.)
>
> I am running:
> Java 1.7
> Hadoop 0.20.2
> HBase 0.90.6
> ZooKeeper 3.3.5
>
> % top    -> shows a low load average (6% most of the time, up to 60%),
>             already considering the number of CPUs
> % vmstat -> shows no swap at all
> % sar    -> shows 75% idle CPU in the worst case
>
> Hope you guys can help me.
> Thanks in advance,
> Pablo
>
> 2012-07-20 00:03:44,455 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace:
>     src: /DN01:50010, dest: /DN01:43516, bytes: 396288, op: HDFS_READ,
>     cliID: DFSClient_hb_rs_DN01,60020,1342734302945_1342734303427,
>     offset: 54956544, srvID: DS-798921853-DN01-50010-1328651609047,
>     blockid: blk_914960691839012728_14061688, duration: 480061254006
> 2012-07-20 00:03:44,455 WARN org.apache.hadoop.hdfs.server.datanode.DataNode:
>     DatanodeRegistration(DN01:50010,
>     storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075,
>     ipcPort=50020): Got exception while serving
>     blk_914960691839012728_14061688 to /DN01:
> java.net.SocketTimeoutException: 480000 millis timeout while waiting
>     for channel to be ready for write. ch :
>     java.nio.channels.SocketChannel[connected local=/DN01:50010
>     remote=/DN01:43516]
>     at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>     at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>     at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>     at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:397)
>     at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:493)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:279)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:175)
>
> 2012-07-20 00:03:44,455 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>     DatanodeRegistration(DN01:50010,
>     storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075,
>     ipcPort=50020): DataXceiver
> java.net.SocketTimeoutException: 480000 millis timeout while waiting
>     for channel to be ready for write. ch :
>     java.nio.channels.SocketChannel[connected local=/DN01:50010
>     remote=/DN01:43516]
>     at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>     at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>     at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>     at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:397)
>     at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:493)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:279)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:175)
>
> 2012-07-20 00:12:11,949 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner:
>     Verification succeeded for blk_4602445008578088178_5707787
> 2012-07-20 00:12:11,962 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     writeBlock blk_-8916344806514717841_14081066 received exception
>     java.net.SocketTimeoutException: 63000 millis timeout while waiting
>     for channel to be ready for read. ch :
>     java.nio.channels.SocketChannel[connected local=/DN01:36634
>     remote=/DN03:50010]
> 2012-07-20 00:12:11,962 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>     DatanodeRegistration(DN01:50010,
>     storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075,
>     ipcPort=50020): DataXceiver
> java.net.SocketTimeoutException: 63000 millis timeout while waiting
>     for channel to be ready for read. ch :
>     java.nio.channels.SocketChannel[connected local=/DN01:36634
>     remote=/DN03:50010]
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:116)
>     at java.io.FilterInputStream.read(FilterInputStream.java:83)
>     at java.io.DataInputStream.readShort(DataInputStream.java:312)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:447)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:183)
>
> 2012-07-20 00:12:20,670 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner:
>     Verification succeeded for blk_7238561256016868237_3555939
> 2012-07-20 00:12:22,541 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     Receiving block blk_-7028120671250332363_14081073 src: /DN03:50331
>     dest: /DN01:50010
> 2012-07-20 00:12:22,544 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     Exception in receiveBlock for block blk_-7028120671250332363_14081073
>     java.io.EOFException: while trying to read 65557 bytes
> 2012-07-20 00:12:22,544 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     PacketResponder 0 for block blk_-7028120671250332363_14081073 Interrupted.
> 2012-07-20 00:12:22,544 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     PacketResponder 0 for block blk_-7028120671250332363_14081073 terminating
> 2012-07-20 00:12:22,544 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     writeBlock blk_-7028120671250332363_14081073 received exception
>     java.io.EOFException: while trying to read 65557 bytes
> 2012-07-20 00:12:22,544 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>     DatanodeRegistration(DN01:50010,
>     storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075,
>     ipcPort=50020): DataXceiver
> java.io.EOFException: while trying to read 65557 bytes
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:290)
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:334)
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:398)
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:577)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:494)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:183)
>
> 2012-07-20 00:12:34,266 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     Receiving block blk_-1834839455324747507_14081046 src: /DN05:59897
>     dest: /DN01:50010
> 2012-07-20 00:12:34,267 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     Exception in receiveBlock for block blk_-1834839455324747507_14081046
>     java.io.EOFException: while trying to read 65557 bytes
> 2012-07-20 00:12:34,268 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     PacketResponder 0 for block blk_-1834839455324747507_14081046 Interrupted.
> 2012-07-20 00:12:34,268 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     PacketResponder 0 for block blk_-1834839455324747507_14081046 terminating
> 2012-07-20 00:12:34,268 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     writeBlock blk_-1834839455324747507_14081046 received exception
>     java.io.EOFException: while trying to read 65557 bytes
> 2012-07-20 00:12:34,268 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>     DatanodeRegistration(DN01:50010,
>     storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075,
>     ipcPort=50020): DataXceiver
> java.io.EOFException: while trying to read 65557 bytes
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:290)
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:334)
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:398)
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:577)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:494)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:183)
> 2012-07-20 00:12:34,269 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     Receiving block blk_3941134611454287401_14080990 src: /DN03:50345
>     dest: /DN01:50010
> 2012-07-20 00:12:34,270 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     Exception in receiveBlock for block blk_3941134611454287401_14080990
>     java.io.EOFException: while trying to read 65557 bytes
> 2012-07-20 00:12:34,270 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     PacketResponder 0 for block blk_3941134611454287401_14080990 Interrupted.
> 2012-07-20 00:12:34,271 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     PacketResponder 0 for block blk_3941134611454287401_14080990 terminating
> 2012-07-20 00:12:34,271 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>     writeBlock blk_3941134611454287401_14080990 received exception
>     java.io.EOFException: while trying to read 65557 bytes
> 2012-07-20 00:12:34,271 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>     DatanodeRegistration(DN01:50010,
>     storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075,
>     ipcPort=50020): DataXceiver
> java.io.EOFException: while trying to read 65557 bytes
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:290)
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:334)
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:398)
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:577)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:494)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:183)

--
Thanks & Regards,
Anil Gupta
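For reference, the two timeouts in these traces map to tunable HDFS settings: the 480000 ms write timeout is `dfs.datanode.socket.write.timeout`, and the 63000 ms read timeout is derived from `dfs.socket.timeout` (60 s by default, plus a small per-pipeline extension the DataNode adds). The xceiver limit Pablo mentions is `dfs.datanode.max.xcievers` (the misspelling is part of the actual property name). A sketch of the relevant hdfs-site.xml entries follows; the values are illustrative starting points, not a verified fix for this cluster, and raising timeouts only masks whatever is making the client or disk that slow:

```xml
<!-- hdfs-site.xml (Hadoop 0.20.x property names); values are illustrative -->
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <!-- default 480000 ms (8 min): the write timeout seen in the log -->
  <value>960000</value>
</property>
<property>
  <name>dfs.socket.timeout</name>
  <!-- default 60000 ms; the base of the 63000 ms read timeout in the log -->
  <value>120000</value>
</property>
<property>
  <name>dfs.datanode.max.xcievers</name>
  <!-- concurrent DataXceiver threads; HBase docs recommend raising this -->
  <value>8192</value>
</property>
```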