I don't think setting the timeout to 0 is a good idea - after all, we have a
lot of writes going on, so it's expected that at times a resource isn't
available immediately. Am I missing something, or what's your reasoning for
assuming that the timeout value is the problem?
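For clarity, here is what the change you're suggesting would look like (in hdfs-site.xml; as I understand it, a value of 0 disables the 480s default write timeout entirely rather than making writes fail immediately - please correct me if that reading is wrong):

```xml
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <!-- 0 = disable the datanode socket write timeout (default 480000 ms) -->
  <value>0</value>
</property>
```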

On Thu, Sep 24, 2009 at 2:19 PM, Amandeep Khurana <[email protected]> wrote:

> When do you get this error?
>
> Try making the timeout to 0. That'll remove the timeout of 480s. Property
> name: dfs.datanode.socket.write.timeout
>
> -ak
>
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Thu, Sep 24, 2009 at 1:36 PM, Florian Leibert <[email protected]> wrote:
>
> > Hi,
> > recently we've been seeing frequent STEs (SocketTimeoutExceptions) in our
> > datanodes. We had previously fixed this issue by upping the handler count
> > and max.xciever (note this is misspelled in the code as well - so we're
> > just being consistent).
> > We're using 0.19 with a couple of patches - none of which should affect
> > any of the areas in the stacktrace.
> >
> > We've seen this before upping the limits on the xcievers - but these
> > settings seem very high already. We're running 102 nodes.
> >
> > Any hints would be appreciated.
> >
> >  <property>
> >    <name>dfs.datanode.handler.count</name>
> >    <value>300</value>
> >  </property>
> >  <property>
> >    <name>dfs.namenode.handler.count</name>
> >    <value>300</value>
> >  </property>
> >  <property>
> >    <name>dfs.datanode.max.xcievers</name>
> >    <value>2000</value>
> >  </property>
> >
> >
> > 2009-09-24 17:48:13,648 ERROR
> > org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
> > 10.16.160.79:50010,
> > storageID=DS-1662533511-10.16.160.79-50010-1219665628349, infoPort=50075,
> > ipcPort=50020):DataXceiver
> > java.net.SocketTimeoutException: 480000 millis timeout while waiting for
> > channel to be ready for write. ch :
> > java.nio.channels.SocketChannel[connected local=/10.16.160.79:50010
> > remote=/10.16.134.78:34280]
> >        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:185)
> >        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
> >        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
> >        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:293)
> >        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:387)
> >        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:179)
> >        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:94)
> >        at java.lang.Thread.run(Thread.java:619)
> >
>
