Good evening, has anyone seen this in your logs? It could be something simple that we are missing. We are also seeing that DataNodes can't be reached on web port 50075 every once in a while.
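In case it helps anyone reproduce this, here is a quick probe for the web-port symptom. This is only a sketch: it assumes the default ports, a single DataNode JVM per host, and that curl and jstack are on the path (the IP is just the one from the logs below).

    #!/bin/sh
    # Hit the DataNode web UI with a short timeout; a hang here matches the
    # "can't be reached every once in a while" symptom.
    DN_HOST=10.101.6.5   # substitute your own DataNode address

    curl --max-time 5 -sS -o /dev/null \
      -w "HTTP %{http_code} in %{time_total}s\n" \
      "http://${DN_HOST}:50075/" || echo "web UI did not answer within 5s"

    # Count DataXceiver threads in the DataNode JVM; if this count sits near
    # the configured ceiling whenever the UI stalls, both symptoms likely
    # point at the same resource exhaustion.
    DN_PID=$(pgrep -f datanode.DataNode | head -1)
    jstack "$DN_PID" | grep -c DataXceiver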
-Jack

On Mon, Mar 28, 2011 at 4:19 PM, Jack Levin <[email protected]> wrote:
> Hello guys, we are getting these errors:
>
> 2011-03-28 15:08:33,485 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.101.6.5:50010, dest: /10.101.6.5:51365, bytes: 66564, op: HDFS_READ, cliID: DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053, offset: 4191232, srvID: DS-1528941561-10.101.6.5-50010-1299713950021, blockid: blk_-3087497822408705276_723501, duration: 14409579
> 2011-03-28 15:08:33,492 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.101.6.5:50010, dest: /10.101.6.5:51366, bytes: 14964, op: HDFS_READ, cliID: DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053, offset: 67094016, srvID: DS-1528941561-10.101.6.5-50010-1299713950021, blockid: blk_-3224146686136187733_731011, duration: 8855000
> 2011-03-28 15:08:33,495 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.101.6.5:50010, dest: /10.101.6.5:51368, bytes: 51600, op: HDFS_READ, cliID: DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053, offset: 0, srvID: DS-1528941561-10.101.6.5-50010-1299713950021, blockid: blk_-6384334583345199846_731014, duration: 2053969
> 2011-03-28 15:08:33,503 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.101.6.5:50010, dest: /10.101.6.5:42553, bytes: 462336, op: HDFS_READ, cliID: DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053, offset: 327680, srvID: DS-1528941561-10.101.6.5-50010-1299713950021, blockid: blk_-4751283294726600221_724785, duration: 480254862706
> 2011-03-28 15:08:33,504 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.101.6.5:50010, storageID=DS-1528941561-10.101.6.5-50010-1299713950021, infoPort=50075, ipcPort=50020):Got exception while serving blk_-4751283294726600221_724785 to /10.101.6.5:
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.101.6.5:50010 remote=/10.101.6.5:42553]
>         at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>         at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>         at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:350)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:436)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:197)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:110)
>
> 2011-03-28 15:08:33,504 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.101.6.5:50010, storageID=DS-1528941561-10.101.6.5-50010-1299713950021, infoPort=50075, ipcPort=50020):DataXceiver
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.101.6.5:50010 remote=/10.101.6.5:42553]
>         at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>         at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>         at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:350)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:436)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:197)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:110)
> 2011-03-28 15:08:33,504 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.101.6.5:50010, dest: /10.101.6.5:51369, bytes: 66564, op: HDFS_READ, cliID: DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053, offset: 4781568, srvID: DS-1528941561-10.101.6.5-50010-1299713950021, blockid: blk_-3087497822408705276_723501, duration: 11478016
> 2011-03-28 15:08:33,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.101.6.5:50010, dest: /10.101.6.5:51370, bytes: 66564, op: HDFS_READ, cliID: DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053, offset: 66962944, srvID: DS-1528941561-10.101.6.5-50010-1299713950021, blockid: blk_-3224146686136187733_731011, duration: 7643688
>
> This is the RS talking to the DN, and we are getting timeouts. There are no ulimit issues AFAIK, since we start the daemons with a 32k open-file limit. Any ideas what the deal is?
>
> -Jack
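For what it's worth, the 480000 ms in that stack trace is the DataNode's write-side timeout (dfs.datanode.socket.write.timeout defaults to 480000 ms, i.e. 8 minutes, in this generation of Hadoop), so the DataNode waited the full 8 minutes for the region server to drain the read before giving up. The fourth clienttrace line above reports duration 480254862706, which is about 480 s in nanoseconds, matching that same timed-out read. With HBase on top of HDFS, the usual first suspect is the transceiver ceiling rather than ulimit. Below is a sketch of the hdfs-site.xml knobs commonly raised for HBase clusters; the values are the oft-quoted recommendations, not a verified fix for this particular trace:

    <!-- hdfs-site.xml -->
    <property>
      <!-- Cap on concurrent DataXceiver threads per DataNode. The stock
           default (256) is widely reported as too low under HBase; the
           historical misspelling "xcievers" is the actual key name. -->
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value>
    </property>
    <property>
      <!-- The write-side timeout behind the "480000 millis" message. Raising
           it (or setting 0 to disable it) hides slow readers rather than
           fixing them, so treat it as a last resort. -->
      <name>dfs.datanode.socket.write.timeout</name>
      <value>480000</value>
    </property>

If the xceiver count from the probe above is pinned at the ceiling when these timeouts fire, raising dfs.datanode.max.xcievers (and restarting the DataNodes) is the cheaper experiment to run first.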
