Hi, Todd:
I downloaded hadoop-0.20.2+320 and hbase-0.89.20100621+17 from CDH3 and
inserted data at full load; after a while the HBase regionserver crashed.
I checked the system with "iostat -x 5" and noticed the disk was pretty
busy. Then I modified my client code to reduce the insertion rate by a
factor of six, and the test runs fine. Is there any way the regionserver
could be modified so that it at least doesn't crash under heavy load? I
tried the Apache HBase 0.20.5 distribution as well and the same problem
happens. I am thinking that when the regionserver is too busy, it should
throttle the incoming data rate to protect itself. Could this be done?
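For reference, the client-side change I made amounts to a token-bucket throttle in front of the put calls. Here is a simplified sketch of it (my own helper class, not anything from the HBase API; the rate constant is just whatever the disks seem able to sustain):

```java
// Hypothetical client-side throttle: a minimal token-bucket rate limiter.
// Class and method names here are my own, not part of HBase.
public class RateLimiter {
    private final double permitsPerSecond;
    private double available;       // tokens currently in the bucket
    private long lastRefillNanos;   // last time we refilled

    public RateLimiter(double permitsPerSecond) {
        this.permitsPerSecond = permitsPerSecond;
        this.available = permitsPerSecond; // allow an initial burst of one second's worth
        this.lastRefillNanos = System.nanoTime();
    }

    // Block until one permit is available, refilling based on elapsed time.
    public synchronized void acquire() {
        while (true) {
            long now = System.nanoTime();
            available += (now - lastRefillNanos) / 1e9 * permitsPerSecond;
            if (available > permitsPerSecond) {
                available = permitsPerSecond; // cap the burst at one second's worth
            }
            lastRefillNanos = now;
            if (available >= 1.0) {
                available -= 1.0;
                return;
            }
            try {
                // Sleep roughly until the next permit accrues.
                Thread.sleep((long) Math.ceil((1.0 - available) / permitsPerSecond * 1000.0));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return; // stop waiting if interrupted
            }
        }
    }
}
```

In the insert loop I just call limiter.acquire() before each table.put(...), so the client never outruns the configured rate. It would be nicer if the regionserver could push back like this on its own.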
Do you also know when the official CDH3 release will come out? The one I
downloaded is a beta version.
Jimmy
2010-07-13 02:24:34,389 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed Spam_MsgEventTable,56-2010-05-19 10:09:02\x099a420f4f31748828fd24aeea1d06b294,1278973678315.01dd22f517dabf53ddd135709b68ba6c.
2010-07-13 02:24:34,389 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at: m0002029.ppops.net,60020,1278969481450
2010-07-13 02:24:34,389 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Closed connection with ZooKeeper; /hbase/root-region-server
2010-07-13 02:24:34,389 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020 exiting
2010-07-13 02:24:34,608 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=Thread[Thread-10,5,main]
2010-07-13 02:24:34,608 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Starting fs shutdown hook thread.
2010-07-13 02:24:34,608 ERROR org.apache.hadoop.hdfs.DFSClient: Exception closing file /hbase/.logs/m0002029.ppops.net,60020,1278969481450/10.110.24.79%3A60020.1278987220794 : java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: Error Recovery for block blk_-1605696159279298313_2395924 failed because recovery from primary datanode 10.110.24.80:50010 failed 6 times. Pipeline was 10.110.24.80:50010. Aborting...
java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: Error Recovery for block blk_-1605696159279298313_2395924 failed because recovery from primary datanode 10.110.24.80:50010 failed 6 times. Pipeline was 10.110.24.80:50010. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3214)
        at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
        at org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:944)
        at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:124)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.hflush(HLog.java:826)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1004)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.append(HLog.java:817)
        at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:1531)
        at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1447)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1703)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.multiPut(HRegionServer.java:2361)
        at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:576)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:919)
2010-07-13 02:24:34,610 ERROR org.apache.hadoop.hdfs.DFSClient: Exception closing file /hbase/Spam_MsgEventTable/079c7de876422e57e5f09fef5d997e06/.tmp/6773658134549268273 : java.io.IOException: All datanodes 10.110.24.80:50010 are bad. Aborting...
java.io.IOException: All datanodes 10.110.24.80:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2603)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2139)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2306)
2010-07-13 02:24:34,729 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook finished.