RE: hbase error caused by DFS timeout

Buttler, David Tue, 10 Aug 2010 08:00:58 -0700

Thanks Ryan.
I actually already had this in the hbase-site.xml, but somehow missed putting 
it in the core-site.xml.
The cluster seems to be much happier.
Dave



-----Original Message-----
From: Ryan Rawson [mailto:ryano...@gmail.com] 
Sent: Monday, August 09, 2010 12:01 PM
To: user@hbase.apache.org
Subject: Re: hbase error caused by DFS timeout

Try this config:

<property>
<name>dfs.datanode.socket.write.timeout</name>
<value>0</value>
</property>

in both hbase-site.xml and core-site.xml in the hadoop configs.
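
A quick way to confirm the property really is present in both files is sketched below. This is a minimal Python sketch, not something from the thread itself: the helper names `read_property` and `check_files` are hypothetical, and it assumes the usual Hadoop layout of <property> elements with <name> and <value> children. It only reads the files given on the command line; it does not follow includes or apply Hadoop's own override rules.

```python
import sys
import xml.etree.ElementTree as ET

# Property name taken from the config snippet above.
PROPERTY = "dfs.datanode.socket.write.timeout"


def read_property(path, name):
    """Return the <value> text of the first <property> whose <name> matches,
    or None if the file does not define it."""
    root = ET.parse(path).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def check_files(paths, name=PROPERTY):
    """Map each config file path to the value found there (None if absent)."""
    return {path: read_property(path, name) for path in paths}


if __name__ == "__main__":
    # Pass the config files to check as arguments, e.g. the paths to
    # hbase-site.xml and core-site.xml on your own cluster.
    for path, value in check_files(sys.argv[1:]).items():
        print(f"{path}: {value if value is not None else 'NOT SET'}")
```

Run it with the paths to your own hbase-site.xml and core-site.xml as arguments; any file reported as NOT SET is missing the property.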

-ryan

On Mon, Aug 9, 2010 at 10:02 AM, Buttler, David <buttl...@llnl.gov> wrote:
> Hi all,
> I seem to get this error far too frequently:
>
> 2010-08-09 09:54:03,685 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Started compaction of 1 file(s) in annotations of 
> doc,293817e024ed1d54a11e9e7c9b836dd837badbbc,1281372823189, 
> hasReferences=true, into /hbase/doc/compaction.dir/237345967, seqid=1182913218
> 2010-08-09 09:54:03,784 WARN org.apache.hadoop.hdfs.DFSClient: 
> DFSOutputStream ResponseProcessor exception  for block 
> blk_-4556852958383799371_431518java.net.SocketTimeoutException: 6000 millis 
> timeout while waiting for channel to be ready for read. ch : 
> java.nio.channels.SocketChannel[connected local=/10.220.5.35:49924 
> remote=/10.220.5.14:50010]
>        at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>        at 
> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>        at 
> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>        at java.io.DataInputStream.readFully(DataInputStream.java:178)
>        at java.io.DataInputStream.readLong(DataInputStream.java:399)
>        at 
> org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:119)
>        at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2424)
>
> This is consistently taking one of my clusters down.  Is there an 
> obvious thing I can do about this?
> I have seen this across three different clusters with radically different 
> hardware, leading me to believe that I have misconfigured something in either 
> hbase or hdfs.
>
> Any ideas of where to look?
>
> Thanks,
> Dave
>
