Hi,
My Hadoop runs fine as long as the HBase service is not started, and the
network is normal; I checked!
As soon as I restart the HBase service, the HDFS read timeouts occur!
I need your help. Thanks!
[email protected]
From: Jean-Marc Spaggiari
Date: 2014-11-07 20:57
To: user
Subject: Re: hbase cannot normally start regionserver in the environment of big data.
Hi,
Have you checked that your Hadoop is running fine? Have you checked that
the network between your servers is fine too?
JM
2014-11-07 5:22 GMT-05:00 [email protected] <[email protected]>:
> I've deployed a "2+4" cluster that has been running normally for a long
> time and holds more than 40 TB of data. When I intentionally shut down
> the HBase service and try to restart it, the regionservers die.
>
> The regionserver log shows that all the regions are opened, but the
> datanode logs contain WARN and ERROR entries.
> Below are the logs in detail:
>
> 2014-11-07 14:47:21,584 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.230.63.12:50010, dest: /10.230.63.9:39405, bytes: 4696, op: HDFS_READ, cliID: DFSClient_hb_rs_salve1,60020,1415342303886_-2037622978_29, offset: 31996928, srvID: bb0032a3-1170-4a34-b85b-e2cfa0d56cb2, blockid: BP-1731746090-10.230.63.3-1406195669990:blk_1078709392_4968828, duration: 7978822
> 2014-11-07 14:47:21,596 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: exception:
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.230.63.12:50010 remote=/10.230.63.11:41511]
>   at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>   at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
>   at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
>   at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
>   at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:712)
>   at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:479)
>   at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:110)
>   at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
>   at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:229)
>   at java.lang.Thread.run(Thread.java:744)
> 2014-11-07 14:47:21,599 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.230.63.12:50010, dest: /10.230.63.11:41511, bytes: 726528, op: HDFS_READ, cliID: DFSClient_hb_rs_salve3,60020,1415342303807_1094119849_29, offset: 0, srvID: bb0032a3-1170-4a34-b85b-e2cfa0d56cb2, blockid: BP-1731746090-10.230.63.3-1406195669990:blk_1078034913_4294168, duration: 480190668115
> 2014-11-07 14:47:21,599 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.230.63.12, datanodeUuid=bb0032a3-1170-4a34-b85b-e2cfa0d56cb2, infoPort=50075, ipcPort=50020, storageInfo=lv=-55;cid=cluster12;nsid=395652542;c=0):Got exception while serving BP-1731746090-10.230.63.3-1406195669990:blk_1078034913_4294168 to /10.230.63.11:41511
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.230.63.12:50010 remote=/10.230.63.11:41511]
>   at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>   at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
>   at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
>   at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
>   at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:712)
>   at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:479)
>   at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:110)
>   at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
>   at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:229)
>   at java.lang.Thread.run(Thread.java:744)
> 2014-11-07 14:47:21,600 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: salve4:50010:DataXceiver error processing READ_BLOCK operation src: /10.230.63.11:41511 dest: /10.230.63.12:50010
>
>
> I personally think this is caused by the load during the region-open
> stage, when the cluster's disk I/O can be very high and the pressure
> can be huge.
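>
> The 480000 millis in the exception seems to match the default DataNode
> write timeout (dfs.datanode.socket.write.timeout, 480 s), so I guess the
> datanode gives up on a reader that stalls for that long. Would raising
> it in hdfs-site.xml on the datanodes help? A sketch of what I mean (the
> value below is only illustrative, not a recommendation):
>
>   <property>
>     <!-- how long the datanode waits for a slow reader before aborting;
>          the default is 480000 ms (8 minutes) -->
>     <name>dfs.datanode.socket.write.timeout</name>
>     <value>960000</value>
>   </property>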
>
> I wonder what causes the read errors while reading the HFiles, and what
> leads to the timeouts.
> Are there any ways to throttle the speed of region opening and reduce
> the pressure on the cluster?
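>
> For instance, would lowering hbase.regionserver.executor.openregion.threads
> (the pool of threads a regionserver uses to open regions; I believe the
> default is 3) throttle the open storm? A minimal hbase-site.xml sketch,
> assuming fewer concurrent opens means a gentler I/O spike at startup:
>
>   <property>
>     <!-- fewer threads opening regions concurrently -> lower disk I/O -->
>     <name>hbase.regionserver.executor.openregion.threads</name>
>     <value>1</value>
>   </property>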
>
> I need help!
>
> Thanks!
>
> [email protected]
>