[ 
https://issues.apache.org/jira/browse/HBASE-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187949#comment-14187949
 ] 

Esteban Gutierrez commented on HBASE-12369:
-------------------------------------------

[~ndimiduk] Agreed: for 10s or 100s of GBs, 5% is probably too much. I ran a 
very simple test without any workload, and with 5 GB of direct memory the 
overhead was close to 10 MB (~0.2%) before hitting an OOME when trying to 
allocate a 5 GB bucket cache. Another possibility is to call 
DirectMemoryUtils.getDirectMemoryUsage() before attempting to allocate the 
direct buffer. However, if something else then allocates direct memory and the 
margin is too small, we would just be shifting the OOME somewhere else.
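
To make the idea concrete, here is a rough sketch of such a pre-allocation 
check. It is not HBase code: the class and method names (DirectMemoryHeadroom, 
tooClose, etc.) and the 64 MB margin are made up for illustration. It reads 
direct memory usage from the JDK's BufferPoolMXBean instead of 
DirectMemoryUtils, so that it runs standalone. It assumes the limit is set 
explicitly with -XX:MaxDirectMemorySize and treats the limit as unknown 
otherwise.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

public class DirectMemoryHeadroom {

  /** Parses a JVM-style size such as "5g", "512m" or "1024" into bytes. */
  static long parseSize(String value) {
    String v = value.trim().toLowerCase();
    long multiplier = 1L;
    char last = v.charAt(v.length() - 1);
    if (last == 'k') multiplier = 1L << 10;
    else if (last == 'm') multiplier = 1L << 20;
    else if (last == 'g') multiplier = 1L << 30;
    else if (last == 't') multiplier = 1L << 40;
    if (multiplier != 1L) v = v.substring(0, v.length() - 1);
    return Long.parseLong(v) * multiplier;
  }

  /** Returns -XX:MaxDirectMemorySize in bytes, or -1 if the flag is absent. */
  static long maxDirectMemoryFromArgs(List<String> jvmArgs) {
    final String prefix = "-XX:MaxDirectMemorySize=";
    long result = -1L;
    for (String arg : jvmArgs) {
      // Keep the last occurrence if the flag is repeated.
      if (arg.startsWith(prefix)) result = parseSize(arg.substring(prefix.length()));
    }
    return result;
  }

  /** Bytes currently held by direct buffers, as reported by the JDK. */
  static long directMemoryUsed() {
    for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
      if ("direct".equals(pool.getName())) return pool.getMemoryUsed();
    }
    return 0L;
  }

  /** True if requested + already used + margin does not fit under the limit. */
  static boolean tooClose(long requested, long maxDirect, long used, long minHeadroom) {
    if (maxDirect <= 0) return false; // limit unknown, nothing to compare against
    return requested + used + minHeadroom > maxDirect;
  }

  public static void main(String[] args) {
    // Hypothetical bucket cache size; in HBase this would come from hbase.bucketcache.size.
    long bucketCacheBytes = parseSize(args.length > 0 ? args[0] : "5g");
    long maxDirect = maxDirectMemoryFromArgs(ManagementFactory.getRuntimeMXBean().getInputArguments());
    long minHeadroom = 64L << 20; // arbitrary 64 MB margin, an assumption for this sketch
    if (tooClose(bucketCacheBytes, maxDirect, directMemoryUsed(), minHeadroom)) {
      System.err.println("WARN: bucket cache size " + bucketCacheBytes
          + " is too close or equal to MaxDirectMemorySize " + maxDirect);
    }
  }
}
```

With the numbers from my test (5 GB limit, 5 GB bucket cache, about 10 MB 
already in use) the check trips even with a zero margin. It is only a 
snapshot though, so it does not help if another allocation happens between 
the check and the bucket cache allocation.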


> Warn if hbase.bucketcache.size too close or equal to MaxDirectMemorySize
> ------------------------------------------------------------------------
>
>                 Key: HBASE-12369
>                 URL: https://issues.apache.org/jira/browse/HBASE-12369
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Esteban Gutierrez
>
> Our ref guide currently says that it is required to leave some room in 
> direct memory. However, if hbase.bucketcache.size is too close or equal to 
> MaxDirectMemorySize, it can trigger OOMEs: 
> {code}
> 2014-10-28 16:14:41,585 INFO  [master//172.16.0.101:16020] util.ByteBufferArray: Allocating buffers total=5.00 GB, sizePerBuffer=4 MB, count=1280, direct=true
> 2014-10-28 16:14:41,604 INFO  [172.16.0.101:16020.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 1, slept for 99 ms, expecting minimum of 2, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
> 2014-10-28 16:14:43,144 INFO  [172.16.0.101:16020.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 1, slept for 1639 ms, expecting minimum of 2, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
> 2014-10-28 16:14:44,057 INFO  [master//172.16.0.101:16020] regionserver.HRegionServer: STOPPED: Failed initialization
> 2014-10-28 16:14:44,058 ERROR [master//172.16.0.101:16020] regionserver.HRegionServer: Failed init
> java.lang.OutOfMemoryError: Direct buffer memory
>       at java.nio.Bits.reserveMemory(Bits.java:658)
>       at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
>       at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
>       at org.apache.hadoop.hbase.util.ByteBufferArray.<init>(ByteBufferArray.java:65)
>       at org.apache.hadoop.hbase.io.hfile.bucket.ByteBufferIOEngine.<init>(ByteBufferIOEngine.java:47)
>       at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:310)
>       at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:218)
>       at org.apache.hadoop.hbase.io.hfile.CacheConfig.getL2(CacheConfig.java:513)
>       at org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:536)
>       at org.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:213)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1259)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:818)
>       at java.lang.Thread.run(Thread.java:724)
> {code}
> It would be helpful to log a warning when hbase.bucketcache.size is too 
> close or equal to MaxDirectMemorySize.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
