-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44573/
-----------------------------------------------------------

(Updated March 10, 2016, 12:49 p.m.)


Review request for Ambari, Arpit Gupta, enis, and Vitalyi Brodetskyi.


Bugs: AMBARI-15355
    https://issues.apache.org/jira/browse/AMBARI-15355


Repository: ambari


Description
-------

On one of the clusters I ran an EU (Express Upgrade) from 2.2.x to 2.3.x.

During the upgrade, the HBase service checks for the region servers failed, 
so the upgrade was paused.

The region server fails to start with the following error:

{code}
2016-03-03 19:55:31,203 ERROR [regionserver:16020] regionserver.HRegionServer: Failed init
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:658)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
at org.apache.hadoop.hbase.util.ByteBufferArray.<init>(ByteBufferArray.java:65)
at org.apache.hadoop.hbase.io.hfile.bucket.ByteBufferIOEngine.<init>(ByteBufferIOEngine.java:47)
at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:307)
at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:217)
at org.apache.hadoop.hbase.io.hfile.CacheConfig.getBucketCache(CacheConfig.java:614)
at org.apache.hadoop.hbase.io.hfile.CacheConfig.getL2(CacheConfig.java:553)
at org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:637)
at org.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:231)
at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1361)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:899)
at java.lang.Thread.run(Thread.java:745)
2016-03-03 19:55:31,206 FATAL [regionserver:16020] regionserver.RSRpcServices: Run out of memory; RSRpcServices will abort itself immediately
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:658)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
at org.apache.hadoop.hbase.util.ByteBufferArray.<init>(ByteBufferArray.java:65)
at org.apache.hadoop.hbase.io.hfile.bucket.ByteBufferIOEngine.<init>(ByteBufferIOEngine.java:47)
at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:307)
at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:217)
at org.apache.hadoop.hbase.io.hfile.CacheConfig.getBucketCache(CacheConfig.java:614)
at org.apache.hadoop.hbase.io.hfile.CacheConfig.getL2(CacheConfig.java:553)
at org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:637)
at org.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:231)
at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1361)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:899)
at java.lang.Thread.run(Thread.java:745)
2016-03-03 19:55:35,138 INFO  [main] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.6-3485--1, built on 12/16/2015 02:35 GMT
{code}

The issue is not that hbase_max_direct_memory_size is unset. The value coming 
from "HBase off-heap MaxDirectMemorySize", which I assume populates the 
hbase_max_direct_memory_size template variable, is set to 12288, but the bucket 
cache is set to 18G:
hbase.bucketcache.size 18432
Since the bucket cache is an off-heap cache, hbase_max_direct_memory_size should 
be greater than hbase.bucketcache.size.

This was seen on the following cluster: 
https://s.c:8443/#/main/services/HBASE/configs

The bucket cache (which is an off-heap cache) is set to 18G, but 
hbase_max_direct_memory_size is set to 12G. hbase_max_direct_memory_size should 
always be higher than the off-heap cache size. Both are configured from Ambari. 
The problem reproduces on nodes with a lot of RAM (>23 GB); the stack advisor is 
what recommends the value of hbase_max_direct_memory_size.
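
For illustration only, the mismatch described above can be checked with a few 
lines of Python. This is a minimal sketch, not the code in stack_advisor.py: 
the function names are hypothetical, and it assumes both settings are plain 
megabyte values, as the 12288 and 18432 figures in this report suggest.

```python
# Hypothetical helpers, not part of Ambari's stack_advisor.py.
# Assumption: both values are expressed in megabytes.

def direct_memory_is_sufficient(max_direct_memory_mb, bucketcache_size_mb):
    """Return True when the direct-memory limit exceeds the bucket cache size."""
    return max_direct_memory_mb > bucketcache_size_mb


def direct_memory_shortfall_mb(max_direct_memory_mb, bucketcache_size_mb):
    """Return how many MB the bucket cache exceeds the direct-memory limit by (0 if none)."""
    return max(0, bucketcache_size_mb - max_direct_memory_mb)


# Values from this report: hbase_max_direct_memory_size = 12288,
# hbase.bucketcache.size = 18432.
reported_ok = direct_memory_is_sufficient(12288, 18432)
reported_shortfall = direct_memory_shortfall_mb(12288, 18432)
```

With the reported values the check fails and the shortfall is 6144 MB, which 
matches the region server being unable to allocate the bucket cache in 
direct memory.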


Diffs (updated)
-----

  ambari-server/src/main/resources/stacks/HDP/2.2/services/stack_advisor.py 2518528 
  ambari-server/src/test/python/stacks/2.3/common/test_stack_advisor.py 11818ba 

Diff: https://reviews.apache.org/r/44573/diff/


Testing
-------

mvn clean test


Thanks,

Dmitro Lisnichenko
