Hi,
Recently we've been seeing frequent SocketTimeoutExceptions (STEs) on our
datanodes. We had previously fixed this issue by raising the handler count
and max.xciever (note: this is misspelled in the code as well, so we're
just being consistent). We're using 0.19 with a couple of patches, none of
which should affect any of the areas in the stack trace.

We've addressed this before by upping the xciever limits, but these
settings already seem very high. We're running 102 nodes.

Any hints would be appreciated.

<property>
    <name>dfs.datanode.handler.count</name>
    <value>300</value>
</property>
<property>
    <name>dfs.namenode.handler.count</name>
    <value>300</value>
</property>
<property>
    <name>dfs.datanode.max.xcievers</name>
    <value>2000</value>
</property>
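For what it's worth, one way to check how close a datanode actually gets to the max.xcievers cap is to count DataXceiver threads in a jstack dump of the datanode process. A rough sketch (the inline sample dump is just a stand-in; on a live node you'd run `jstack <datanode_pid>` instead):

```shell
# Sketch: count active DataXceiver threads in a datanode thread dump.
# On a real node: jstack <datanode_pid> > /tmp/dn.dump
# Here a tiny hypothetical sample dump keeps the pipeline self-contained.
cat > /tmp/dn.dump <<'EOF'
"DataXceiver" daemon prio=10 tid=0x1 nid=0x1 runnable
"DataXceiver" daemon prio=10 tid=0x2 nid=0x2 runnable
"IPC Server handler 0 on 50020" daemon prio=10 tid=0x3 nid=0x3 waiting
EOF
# Each xceiver runs in its own thread named "DataXceiver", so a simple
# count of those thread headers approximates the current xceiver load.
grep -c '^"DataXceiver"' /tmp/dn.dump   # prints 2 for this sample dump
```

If that count sits nowhere near 2000 when the STEs fire, the xciever limit probably isn't the bottleneck this time.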


2009-09-24 17:48:13,648 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.16.160.79:50010, storageID=DS-1662533511-10.16.160.79-50010-1219665628349, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.16.160.79:50010 remote=/10.16.134.78:34280]
        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:185)
        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:293)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:387)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:179)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:94)
        at java.lang.Thread.run(Thread.java:619)
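
Incidentally, the 480000 millis in the exception matches the default datanode socket write timeout (8 minutes). If the remote clients are legitimately slow readers (e.g. long-running tasks streaming a block), one workaround we've seen suggested is raising or disabling that timeout rather than touching the xcievers again. A sketch, assuming the 0.19-era property name dfs.datanode.socket.write.timeout:

<property>
    <name>dfs.datanode.socket.write.timeout</name>
    <!-- 0 disables the write timeout; a positive value is milliseconds -->
    <value>0</value>
</property>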
