[ 
https://issues.apache.org/jira/browse/HBASE-21436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675865#comment-16675865
 ] 

Vladimir Rodionov commented on HBASE-21436:
-------------------------------------------

Hi, I do not see anything worth fixing here. First, 2MB per region is 
an overhead the customer must accept and live with. Second, 3K regions per 
RS is too many for a 4GB heap; this goes against best practices and 
recommendations.
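
For scale, here is a minimal sketch of the arithmetic behind those two numbers. It assumes exactly one 2MB chunk per region and nothing else on the heap; the class and method names are hypothetical and exist only for illustration, they are not HBase APIs.

```java
public class ChunkFootprint {
    static final long MB = 1024L * 1024L;

    /** Bytes pinned if every region holds exactly one chunk (an assumption of this sketch). */
    static long chunkBytes(int regions, long chunkSizeBytes) {
        return regions * chunkSizeBytes;
    }

    /** True if the chunks alone would not fit in the given heap. */
    static boolean exceedsHeap(int regions, long chunkSizeBytes, long heapBytes) {
        return chunkBytes(regions, chunkSizeBytes) > heapBytes;
    }

    public static void main(String[] args) {
        long heap = 4096 * MB;      // 4GB heap, from the report
        int regions = 3000;         // 3K regions per RS, from the report
        long chunk = 2 * MB;        // 2MB per region, from the comment above

        System.out.println("chunk memory (MB): " + chunkBytes(regions, chunk) / MB);
        System.out.println("heap (MB):         " + heap / MB);
        System.out.println("exceeds heap:      " + exceedsHeap(regions, chunk, heap));
    }
}
```

Under these assumptions, the chunks for 3K regions would need more memory than the whole heap once every region has received a write.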






>  Getting OOM frequently if hold many regions
> --------------------------------------------
>
>                 Key: HBASE-21436
>                 URL: https://issues.apache.org/jira/browse/HBASE-21436
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 1.4.8, 2.1.1, 2.0.2
>            Reporter: Zephyr Guo
>            Priority: Major
>         Attachments: HBASE-21436-UT.patch
>
>
> Recently, a customer complained to me about NotServingRegionException being 
> thrown at intervals. I examined his cluster and found quite a lot of OOM 
> logs, even though throughput was quite low. In this customer's case, each 
> RS has 3k regions and a heap size of 4G. I dumped the heap when an OOM 
> occurred and found a large number of Chunk objects (as many as 1700).
> Piecing all this evidence together, I came to the conclusion that:
>  * The root cause is that a global flush is triggered by the total size of 
> all memstores, rather than the total size of all chunks.
>  * A chunk is always allocated for each region, even if we only write a 
> small amount of data to the region.
> In this case, a total of 3.4G of memory was consumed by 1700 chunks, 
> although throughput was very low.
>  Although 3K regions is too many for an RS with 4G of memory, it is still 
> wise to improve RS stability in such a scenario (in fact, most customers buy 
> a small HBase deployment on the cloud).
>   
>  I have provided a patch (containing only a UT) that reproduces this case 
> (it just sends a batch).
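
The mismatch described in the report can be illustrated with a toy model. This is only a sketch under stated assumptions, not HBase code: the class and method names are hypothetical, the 10KB written per region is an invented figure standing in for "a small amount of data", and the global flush limit is assumed to be 40% of the heap. The 2MB chunk size, the 1700 chunks, and the 4G heap come from the messages above.

```java
public class FlushTriggerSketch {
    static final long KB = 1024L;
    static final long MB = 1024L * KB;
    static final long CHUNK = 2 * MB;   // 2MB per chunk, as stated in the comment

    /** Bytes of cell data actually written across all regions. */
    static long dataBytes(int regions, long bytesPerRegion) {
        return regions * bytesPerRegion;
    }

    /** Bytes held by chunks: a region with any data holds whole chunks. */
    static long chunkBytes(int regions, long bytesPerRegion) {
        if (bytesPerRegion == 0) {
            return 0;
        }
        long chunksPerRegion = (bytesPerRegion + CHUNK - 1) / CHUNK; // round up
        return regions * chunksPerRegion * CHUNK;
    }

    public static void main(String[] args) {
        long heap = 4096 * MB;
        long flushLimit = heap * 2 / 5;   // assumed: 40% of the heap
        int regions = 1700;               // regions that each received a write
        long perRegion = 10 * KB;         // assumed: tiny write per region

        long data = dataBytes(regions, perRegion);
        long chunks = chunkBytes(regions, perRegion);

        // A trigger based on data size never fires here, while the memory
        // actually held by chunks is far above the same limit.
        System.out.println("data-based trigger fires:  " + (data >= flushLimit));
        System.out.println("chunk-based trigger fires: " + (chunks >= flushLimit));
        System.out.println("chunk memory (MB): " + chunks / MB);
    }
}
```

In this model the written data totals only a few megabytes, so a size-of-memstores check stays quiet, while the chunks backing that data hold 3400MB, which matches the 3.4G figure in the report.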



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
