We use 0.20.6 with HBASE-2473. As you can see from the following region server log snippet, an OOME happened on this RS:
2010-08-11 03:59:12,760 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 17 on 60020' on region 2__HB_NOINC_GRID_0809-THREEGPPSPEECHCALLS-1281499094297,\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E,1281499095128: memstore size 1.0g is >= than blocking 1.0g size
2010-08-11 03:59:16,853 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 24 on 60020' on region 2__HB_NOINC_GRID_0809-THREEGPPSPEECHCALLS-1281499094297,\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E\x0E,1281499095128: memstore size 1.0g is >= than blocking 1.0g size
2010-08-11 03:59:44,524 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError, aborting.
java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:825)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:419)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.run(HBaseServer.java:318)
2010-08-11 03:59:44,525 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0.0, regions=9, stores=22, storefiles=4, storefileIndexSize=5, memstoreSize=1502, compactionQueueSize=0, usedHeap=*3929*, maxHeap=3973, blockCacheSize=6836104, blockCacheFree=826362424, blockCacheCount=0, blockCacheHitRatio=0, fsReadLatency=0, fsWriteLatency=0, fsSyncLatency=0

Among the other RSes, the highest usedHeap is 1750.

On Sat, Jul 31, 2010 at 3:31 PM, Ryan Rawson <ryano...@gmail.com> wrote:
> Hi,
>
> #3 is going to be tricky... due to the ebb and flow of the GC this value
> isn't as accurate as one would wish. Furthermore, we flush memstores based
> on RAM pressure.
>
> Any algorithm would have to have the property of being stable and
> conservative... rebalancing is not a zero-impact operation.
>
> There are JIRAs open for rebalancing based on load. To date it hasn't been
> a practical problem here at SU in our prod clusters, however.
>
> On Jul 31, 2010 3:18 PM, "Ted Yu" <yuzhih...@gmail.com> wrote:
> > Hi,
> > Currently load balancing only considers region count.
> > See ServerManager.getAverageLoad()
> >
> > I think load balancing should consider the following three factors for
> > each RS:
> > 1. number of regions it hosts
> > 2. number of requests it serves within a given period
> > 3. how close usedHeap is to maxHeap
> >
> > Please comment on how we should weigh the above three factors in deciding
> > the regions to offload from each RS.
> >
> > Thanks
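
As a rough illustration of one way the three factors could be combined, here is a minimal sketch. The ServerStats holder, the weights, and loadScore() below are hypothetical for discussion only; they are not part of ServerManager or any HBase API.

import java.util.Locale;

public final class WeightedLoad {

  /** Illustrative per-RS snapshot; not an HBase class. */
  static final class ServerStats {
    final int regionCount;        // factor 1: regions hosted
    final long requestsPerPeriod; // factor 2: requests in the sampling window
    final long usedHeapMB;        // factor 3: heap actually in use
    final long maxHeapMB;

    ServerStats(int regionCount, long requestsPerPeriod,
                long usedHeapMB, long maxHeapMB) {
      this.regionCount = regionCount;
      this.requestsPerPeriod = requestsPerPeriod;
      this.usedHeapMB = usedHeapMB;
      this.maxHeapMB = maxHeapMB;
    }
  }

  // Example weights; how to pick them is exactly the open question in this thread.
  static final double W_REGIONS  = 0.5;
  static final double W_REQUESTS = 0.3;
  static final double W_HEAP     = 0.2;

  /**
   * Returns a score roughly in [0, 1]; higher means the RS is a better
   * candidate to shed regions. Each factor is normalized against the
   * cluster-wide maximum so the weights stay comparable.
   */
  static double loadScore(ServerStats s, int maxRegions, long maxRequests) {
    double regionLoad  = maxRegions  == 0 ? 0 : (double) s.regionCount / maxRegions;
    double requestLoad = maxRequests == 0 ? 0 : (double) s.requestsPerPeriod / maxRequests;
    double heapLoad    = (double) s.usedHeapMB / s.maxHeapMB;
    return W_REGIONS * regionLoad + W_REQUESTS * requestLoad + W_HEAP * heapLoad;
  }

  public static void main(String[] args) {
    // Heap numbers loosely based on the metrics dump above (usedHeap=3929, maxHeap=3973
    // vs. 1750 on the least loaded RS); request counts are made up.
    ServerStats hot  = new ServerStats(9, 120000, 3929, 3973);
    ServerStats cool = new ServerStats(9, 40000, 1750, 3973);
    System.out.printf(Locale.ROOT, "hot=%.3f cool=%.3f%n",
        loadScore(hot, 9, 120000), loadScore(cool, 9, 120000));
  }
}

Normalizing each factor before weighting keeps the score stable as the cluster grows, but as Ryan points out, the heap term ebbs with GC, so in practice it would need smoothing (e.g. an average over several samples) before being fed into any rebalancing decision.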