Hi,
I am running HBase 0.90.3 (just upgraded for testing). It is configured
with a 1.5G heap, which seemed to be a good setting for HBase 0.20.6. When
running a stress test that writes to three HBase data nodes from
24 processes, with the goal of inserting one billion simple rows, I get
OOMs on two of the three region servers after about 75% of the work is done.
Here is the first OOM:
2011-07-09 23:34:40,988 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Applied 924, skipped 1105, firstSequenceidInLog=162957072, maxSequenceidInLog=163841413
2011-07-09 23:34:40,988 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for tir_items,customer/7/8CC6E17710156EE5518325B96E5F5EB9FF3278D2F2E8848E859E90CC7445AE8E,1309973529621.39f9da510435c2bc053fab116af0d4d6., current region memstore size 270.7k; wal is null, using passed sequenceid=163841413
2011-07-09 23:34:40,989 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished snapshotting, commencing flushing stores
2011-07-09 23:34:43,266 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://tirmaster:9000/hbase/tir_items/0fb951f11fe3caef6c5ad5595ffda9ea/original1/2395129059875563550, isReference=false, isBulkLoadResult=false, seqid=150362469, majorCompaction=false
2011-07-09 23:34:51,788 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://tirmaster:9000/hbase/tir_items/0fb951f11fe3caef6c5ad5595ffda9ea/original1/2547547152617947847, isReference=false, isBulkLoadResult=false, seqid=163671317, majorCompaction=false
2011-07-09 23:34:58,652 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://tirmaster:9000/hbase/tir_items/0fb951f11fe3caef6c5ad5595ffda9ea/original1/2867700810527601701, isReference=false, isBulkLoadResult=false, seqid=150617582, majorCompaction=false
2011-07-09 23:35:35,067 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_RS_OPEN_REGION
java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readAllIndex(HFile.java:805)
    at org.apache.hadoop.hbase.io.hfile.HFile$Reader.loadFileInfo(HFile.java:832)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.loadFileInfo(StoreFile.java:1002)
    at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:382)
    at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:438)
    at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:266)
    at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:208)
    at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2008)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:346)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2551)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2537)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:272)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:99)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:156)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
More OOMs follow until something fatal happens.
Now:
1. Is there any way to configure a stable heap size? Where is the
leak? This is really frustrating (it took a while to figure out that 1.5G
was "somehow good" for 0.20.6).
2. Wouldn't it make sense to let the region server die at the first OOM
and have it restarted quickly, rather than letting it go on in some
likely broken state after the OOM until it eventually dies anyway?
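To make point 2 concrete, here is a rough sketch of the fail-fast behavior I have in mind. It is plain Java and not HBase code: the class FailFastOnOom and its methods are names I made up for illustration, and I am assuming the check would be called from a catch-all block such as the one that logged "Caught throwable while processing event" above.

```java
final class FailFastOnOom {
    private FailFastOnOom() {}

    /** Returns true if t, or any throwable in its cause chain, is an OutOfMemoryError. */
    static boolean isOutOfMemory(Throwable t) {
        // Bound the walk so a pathological cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++) {
            if (t instanceof OutOfMemoryError) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }

    /**
     * Hypothetical hook for a catch-all block: halts the JVM immediately if
     * the throwable is (or wraps) an OOM, otherwise returns normally.
     */
    static void haltIfOutOfMemory(Throwable t) {
        if (isOutOfMemory(t)) {
            // halt() skips shutdown hooks and finalizers, which may themselves
            // need memory that is no longer available.
            Runtime.getRuntime().halt(1);
        }
    }

    public static void main(String[] args) {
        try {
            throw new RuntimeException("region open failed",
                    new OutOfMemoryError("Java heap space"));
        } catch (Throwable t) {
            System.err.println("OOM detected: " + isOutOfMemory(t));
            // haltIfOutOfMemory(t); // would terminate the process with exit status 1
        }
    }
}
```

An external supervisor would then have to restart the process, which is the "restarted quickly" part of the question and is not shown here.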
But, on the good side, 0.90.3 is notably faster at writing than 0.20.6.
Thanks,
*Henning Blohm*
*ZFabrik Software KG*