Are you logging GC activity for the datanodes?
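
If not, something along these lines in conf/hadoop-env.sh should turn it on. This is only a sketch: it assumes a HotSpot JVM of the Java 6 era, GC_LOG_DIR is a made-up variable for the example (not a Hadoop setting), and the log file name is arbitrary.

```shell
# conf/hadoop-env.sh (sketch; adjust paths to your own layout)
# GC_LOG_DIR is a hypothetical helper variable for this example only.
GC_LOG_DIR="${HADOOP_LOG_DIR:-/tmp}"

# Append HotSpot GC logging flags to whatever DataNode options are already set.
export HADOOP_DATANODE_OPTS="${HADOOP_DATANODE_OPTS:-} -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:${GC_LOG_DIR}/gc-datanode.log"
```

After restarting the datanode, check whether any long pauses in that log line up with the times the web port stops answering.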

On Mon, Mar 28, 2011 at 9:28 PM, Jack Levin <[email protected]> wrote:
> Good evening. Has anyone seen this in their logs? It could be something
> simple that we are missing. We are also seeing that the DataNodes can't be
> reached on web port 50075 every once in a while.
>
> -Jack
>
> On Mon, Mar 28, 2011 at 4:19 PM, Jack Levin <[email protected]> wrote:
>> Hello guys, we are getting these errors:
>>
>>
>> 2011-03-28 15:08:33,485 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>> /10.101.6.5:50010, dest: /10.101.6.5:51365, bytes: 66564, op:
>> HDFS_READ, cliID:
>> DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053,
>> offset: 4191232, srvID: DS-1528941561-10.101.6.5-50010-1299713950021,
>> blockid: blk_-3087497822408705276_723501, duration: 14409579
>> 2011-03-28 15:08:33,492 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>> /10.101.6.5:50010, dest: /10.101.6.5:51366, bytes: 14964, op:
>> HDFS_READ, cliID:
>> DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053,
>> offset: 67094016, srvID: DS-1528941561-10.101.6.5-50010-1299713950021,
>> blockid: blk_-3224146686136187733_731011, duration: 8855000
>> 2011-03-28 15:08:33,495 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>> /10.101.6.5:50010, dest: /10.101.6.5:51368, bytes: 51600, op:
>> HDFS_READ, cliID:
>> DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053,
>> offset: 0, srvID: DS-1528941561-10.101.6.5-50010-1299713950021,
>> blockid: blk_-6384334583345199846_731014, duration: 2053969
>> 2011-03-28 15:08:33,503 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>> /10.101.6.5:50010, dest: /10.101.6.5:42553, bytes: 462336, op:
>> HDFS_READ, cliID:
>> DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053,
>> offset: 327680, srvID: DS-1528941561-10.101.6.5-50010-1299713950021,
>> blockid: blk_-4751283294726600221_724785, duration: 480254862706
>> 2011-03-28 15:08:33,504 WARN
>> org.apache.hadoop.hdfs.server.datanode.DataNode:
>> DatanodeRegistration(10.101.6.5:50010,
>> storageID=DS-1528941561-10.101.6.5-50010-1299713950021,
>> infoPort=50075, ipcPort=50020):Got exception while serving
>> blk_-4751283294726600221_724785 to /10.101.6.5:
>> java.net.SocketTimeoutException: 480000 millis timeout while waiting
>> for channel to be ready for write. ch :
>> java.nio.channels.SocketChannel[connected local=/10.101.6.5:50010
>> remote=/10.101.6.5:42553]
>>        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>>        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>>        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>>        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:350)
>>        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:436)
>>        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:197)
>>        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:110)
>>
>> 2011-03-28 15:08:33,504 ERROR
>> org.apache.hadoop.hdfs.server.datanode.DataNode:
>> DatanodeRegistration(10.101.6.5:50010,
>> storageID=DS-1528941561-10.101.6.5-50010-1299713950021,
>> infoPort=50075, ipcPort=50020):DataXceiver
>> java.net.SocketTimeoutException: 480000 millis timeout while waiting
>> for channel to be ready for write. ch :
>> java.nio.channels.SocketChannel[connected local=/10.101.6.5:50010
>> remote=/10.101.6.5:42553]
>>        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>>        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>>        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>>        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:350)
>>        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:436)
>>        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:197)
>>        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:110)
>> 2011-03-28 15:08:33,504 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>> /10.101.6.5:50010, dest: /10.101.6.5:51369, bytes: 66564, op:
>> HDFS_READ, cliID:
>> DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053,
>> offset: 4781568, srvID: DS-1528941561-10.101.6.5-50010-1299713950021,
>> blockid: blk_-3087497822408705276_723501, duration: 11478016
>> 2011-03-28 15:08:33,506 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>> /10.101.6.5:50010, dest: /10.101.6.5:51370, bytes: 66564, op:
>> HDFS_READ, cliID:
>> DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053,
>> offset: 66962944, srvID: DS-1528941561-10.101.6.5-50010-1299713950021,
>> blockid: blk_-3224146686136187733_731011, duration: 7643688
>>
>>
>> The RS is talking to the DN and we are getting timeouts. As far as I
>> know there are no issues like ulimit, since we start them with 32k.
>> Any ideas what the deal is?
>>
>> -Jack
>>
>
