[ 
https://issues.apache.org/jira/browse/HBASE-9399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850125#comment-13850125
 ] 

Matt Corgan commented on HBASE-9399:
------------------------------------

It could be cool to flush the memstore into the block cache periodically.  The 
block cache would hold in-memory copies of hfiles, where the blocks are labeled 
as transient so they don't get evicted.  Several in-memory hfiles could build 
up in the block cache before a flush that merges them together while writing to 
disk (or while writing back to the block cache).  This would reduce the memory 
footprint of the data by eliminating significant CSLM (ConcurrentSkipListMap) overhead, and it could be 
further reduced with block encoding.  It would also let us give a greater % of 
the memory to the block cache where the eviction algorithm can do better 
prioritization of what should be evicted.  Maybe some regions can grow to 2GB 
of transient memstore blocks while other regions are persisted at 64MB.

(sorry this is out of place on this jira)

> Up the memstore flush size
> --------------------------
>
>                 Key: HBASE-9399
>                 URL: https://issues.apache.org/jira/browse/HBASE-9399
>             Project: HBase
>          Issue Type: Task
>          Components: regionserver
>    Affects Versions: 0.98.0, 0.96.0
>            Reporter: Elliott Clark
>            Assignee: Elliott Clark
>             Fix For: 0.98.0
>
>
> As heap sizes get bigger, we are still recommending that users keep their 
> number of regions to a minimum.  This leads to lots of unused memstore 
> memory.
> For example, I have a region server machine with 48 GB of RAM, of which 
> 30 GB are for the region server.  With current defaults, the reserved 
> global memstore size is 8 GB.
> The per-region memstore size is 128 MB right now.  That means that I need 80 
> regions actively taking writes to reach the global memstore size.  That 
> number is way out of line with what our split policies currently give users.  
> They are given far fewer regions by default.
> We should raise hbase.hregion.memstore.flush.size.  Ideally we should 
> auto-tune everything.  But until then I think something like 512 MB would help 
> a lot with our write throughput on clusters that don't have several hundred 
> regions per RS.
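
The arithmetic in the description quoted above can be checked with a rough sketch like the one below. It assumes binary units (1 GB = 1024 MB) and that every region fills its memstore up to the flush size before flushing; the function name is made up for illustration and is not part of HBase.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def regions_to_fill(global_memstore_bytes, flush_size_bytes):
    """Hypothetical helper: how many actively written regions, each holding
    a memstore at the flush size, add up to the global memstore limit."""
    # Ceiling division, so a partial region's worth still counts as a region.
    return -(-global_memstore_bytes // flush_size_bytes)


if __name__ == "__main__":
    # Compare the current 128 MB flush size with the proposed 512 MB.
    for global_gib in (8, 10):
        for flush_mib in (128, 512):
            n = regions_to_fill(global_gib * GIB, flush_mib * MIB)
            print(f"{global_gib} GB global, {flush_mib} MB flush size: {n} regions")
```

Under these assumptions an 8 GB global memstore at 128 MB per region works out to 64 regions, and the figure of 80 corresponds to roughly 10 GB. Either way, moving to 512 MB cuts the number of actively written regions needed by a factor of four.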



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
