Hadoop 1.0
HBase 0.94.11
Here is the datanode log from 192.168.10.45. Why did it shut itself down?
2014-04-21 20:33:59,309 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-7969006819959471805_202154 received exception java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[closed]. 0 millis timeout left.
2014-04-21 20:33:59,310 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.10.45:50010, storageID=DS-1676697306-192.168.10.45-50010-1392029190949, infoPort=50075, ipcPort=50020):DataXceiver
java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[closed]. 0 millis timeout left.
    at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:349)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:149)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:265)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:312)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:376)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:532)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:398)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:107)
    at java.lang.Thread.run(Thread.java:722)
2014-04-21 20:33:59,310 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.10.45:50010, storageID=DS-1676697306-192.168.10.45-50010-1392029190949, infoPort=50075, ipcPort=50020):DataXceiver
java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[closed]. 466924 millis timeout left.
    at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:349)
    at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:245)
    at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
    at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:350)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:436)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:197)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:99)
    at java.lang.Thread.run(Thread.java:722)
2014-04-21 20:34:00,291 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Waiting for threadgroup to exit, active threads is 0
2014-04-21 20:34:00,404 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
2014-04-21 20:34:00,405 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
2014-04-21 20:34:00,413 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2014-04-21 20:34:00,424 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at app-hbase-1/192.168.10.45
************************************************************/
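Regarding the out-of-memory question below: one quick way to check is to look for kernel OOM-killer entries in dmesg and for java.lang.OutOfMemoryError in the datanode's .out file, run on 192.168.10.45. A rough sketch follows; the log directory is an assumption and needs to be adjusted to the actual install.

#!/usr/bin/env python
# Rough check for the two common OOM signatures on the failed datanode.
# HADOOP_LOG_DIR is an assumption (not taken from this thread) -- point it
# at wherever the datanode .out files are written.
import glob
import subprocess

HADOOP_LOG_DIR = "/usr/local/hadoop/logs"  # assumption, adjust as needed

# 1. Kernel OOM killer leaves "Out of memory: Kill process ..." in dmesg.
dmesg = subprocess.check_output(["dmesg"]).decode("utf-8", "replace")
for line in dmesg.splitlines():
    if "Out of memory" in line or "oom-killer" in line:
        print("kernel OOM killer: " + line.strip())

# 2. A JVM OutOfMemoryError goes to stderr, i.e. the daemon's *.out file.
for path in glob.glob(HADOOP_LOG_DIR + "/*datanode*.out*"):
    with open(path) as f:
        for line in f:
            if "OutOfMemoryError" in line:
                print(path + ": " + line.strip())

If neither turns anything up, the shutdown was probably not an OOM crash.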
On Tue, Apr 22, 2014 at 11:25 AM, Ted Yu <[email protected]> wrote:
> bq. one datanode failed
>
> Was the crash due to an out of memory error?
> Can you post the tail of data node log on pastebin ?
>
> Giving us versions of hadoop and hbase would be helpful.
>
>
> On Mon, Apr 21, 2014 at 7:39 PM, Li Li <[email protected]> wrote:
>
>> I have a small HBase cluster with 1 namenode, 1 secondary namenode, and 4
>> datanodes. The HBase master is on the same machine as the namenode, and
>> the 4 HBase slaves are on the datanode machines.
>> The average load is about 10,000 requests per second, and the cluster
>> crashed. I found the reason is that one datanode failed.
>>
>> Each datanode has about 4 CPU cores and 10GB of memory.
>> Is my cluster overloaded?
>>