[ https://issues.apache.org/jira/browse/HBASE-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187922#comment-14187922 ]
Nick Dimiduk commented on HBASE-12369:
--------------------------------------
I don't like percentage-based decision making, because 5% means something very
different when you have half a TB of RAM than it does in the usual case. It's
the same reason I don't like the existing checker for Memstore + BlockCache =
80%. I don't know how much "headroom" the JVM needs in direct memory. I vaguely
remember seeing the number 64m, but I don't recall where. If we can decide on
what that value should be, it would be good to have a check + fatal exception
as you describe. We should make it possible to disable the constraint checking,
though.
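A minimal sketch of the kind of check discussed here, assuming a fixed headroom
rather than a percentage. The class name, the method, and the 64 MB default are
hypothetical (64m is only the half-remembered number mentioned above, not a
confirmed JVM requirement), and the caller is assumed to supply both sizes in
bytes. This is not HBase's actual code.

```java
final class DirectMemoryHeadroomCheck {
    // Assumption: 64 MB of headroom; the real value is still undecided.
    static final long ASSUMED_HEADROOM_BYTES = 64L * 1024 * 1024;

    /**
     * Fails fast if the bucket cache would leave less than headroomBytes of
     * direct memory free. Does nothing when the check is disabled.
     */
    static void check(long bucketCacheBytes, long maxDirectBytes,
                      long headroomBytes, boolean enabled) {
        if (!enabled) {
            return; // constraint checking switched off by the operator
        }
        long remaining = maxDirectBytes - bucketCacheBytes;
        if (remaining < headroomBytes) {
            throw new IllegalStateException(String.format(
                "bucket cache size (%d bytes) leaves only %d bytes of direct memory; "
                    + "at least %d bytes of headroom are required (max direct memory is %d bytes)",
                bucketCacheBytes, remaining, headroomBytes, maxDirectBytes));
        }
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        // 5 GB cache with exactly 64 MB spare: passes the fixed-headroom check.
        check(5 * gb, 5 * gb + ASSUMED_HEADROOM_BYTES, ASSUMED_HEADROOM_BYTES, true);
        System.out.println("headroom ok");
    }
}
```

Because the threshold is an absolute byte count, the outcome is the same on a
host with half a TB of RAM as on a small one, and passing enabled=false skips
the check entirely.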
> Warn if hbase.bucketcache.size too close or equal to MaxDirectMemorySize
> ------------------------------------------------------------------------
>
> Key: HBASE-12369
> URL: https://issues.apache.org/jira/browse/HBASE-12369
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Esteban Gutierrez
>
> Our ref guide currently says that it's required to leave some room in
> DirectMemory. However, if hbase.bucketcache.size is too close or equal to
> MaxDirectMemorySize, it can trigger OOMEs:
> {code}
> 2014-10-28 16:14:41,585 INFO [master//172.16.0.101:16020] util.ByteBufferArray: Allocating buffers total=5.00 GB, sizePerBuffer=4 MB, count=1280, direct=true
> 2014-10-28 16:14:41,604 INFO [172.16.0.101:16020.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 1, slept for 99 ms, expecting minimum of 2, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
> 2014-10-28 16:14:43,144 INFO [172.16.0.101:16020.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 1, slept for 1639 ms, expecting minimum of 2, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
> 2014-10-28 16:14:44,057 INFO [master//172.16.0.101:16020] regionserver.HRegionServer: STOPPED: Failed initialization
> 2014-10-28 16:14:44,058 ERROR [master//172.16.0.101:16020] regionserver.HRegionServer: Failed init
> java.lang.OutOfMemoryError: Direct buffer memory
> at java.nio.Bits.reserveMemory(Bits.java:658)
> at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
> at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
> at org.apache.hadoop.hbase.util.ByteBufferArray.<init>(ByteBufferArray.java:65)
> at org.apache.hadoop.hbase.io.hfile.bucket.ByteBufferIOEngine.<init>(ByteBufferIOEngine.java:47)
> at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:310)
> at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:218)
> at org.apache.hadoop.hbase.io.hfile.CacheConfig.getL2(CacheConfig.java:513)
> at org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:536)
> at org.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:213)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1259)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:818)
> at java.lang.Thread.run(Thread.java:724)
> {code}
> It would be helpful to print a warning message when hbase.bucketcache.size is
> too close or equal to MaxDirectMemorySize.