[jira] Commented: (HBASE-3327) For increment workloads, retain memstores in memory after flushing them

Jonathan Gray (JIRA) Wed, 15 Dec 2010 13:10:34 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971837#action_12971837
 ]


Jonathan Gray commented on HBASE-3327:
--------------------------------------

I actually disagree that the biggest benefit comes from retaining 1 memstore 
plus snapshot.  That would then cover flushes but not compactions.  As stated, 
flushing w/ cacheOnWrite would be virtually the same but consume 25% of the 
memory.  So for this case, I don't see a clear benefit of retaining the 
snapshot vs. cacheOnWrite of the flushed file.

This change is significant and would require a good number of modifications to 
the tracking of aggregate MemStore sizes and the rules around eviction when 
under global heap pressure.  I still do like this idea in general, but I'm not 
sure it's the best direction for effort to be spent right now.

> For increment workloads, retain memstores in memory after flushing them
> -----------------------------------------------------------------------
>
>                 Key: HBASE-3327
>                 URL: https://issues.apache.org/jira/browse/HBASE-3327
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Karthik Ranganathan
>
> This is an improvement based on our observation of what happens in an 
> increment workload. The working set is typically small and is contained in 
> the memstores. 
> 1. The memstores get flushed because the limit on the number of WAL logs 
> gets hit. 
> 2. This in turn triggers compactions, which evict the block cache. 
> 3. Flushing the memstore and evicting the block cache cause disk reads for 
> increments coming in after this, because the data is no longer in memory.
> We could solve this elegantly by retaining the memstores AFTER they are 
> flushed into files. This would mean we can quickly populate the new memstore 
> with the working set of data from memory itself without having to hit disk. 
> We can throttle the number of such memstores we retain, or the memory 
> allocated to them. In fact, allocating a percentage of the block cache to 
> this would give us a huge boost.
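
To make the proposal in the description concrete, here is a minimal Java sketch of retaining flushed memstores under a byte budget. It is an illustration only, not HBase code: the class RetainedSnapshots, its Snapshot type, the String-to-Long cell model, and the caller-supplied size estimate are all assumptions made for this example. It also assumes the live memstore is consulted first and that a miss here falls through to the block cache and disk.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: keeps recently flushed memstores in memory under a byte budget. */
final class RetainedSnapshots {

    /** One flushed memstore: sorted row -> counter value, plus an estimated heap size. */
    static final class Snapshot {
        final TreeMap<String, Long> cells;
        final long sizeBytes;

        Snapshot(Map<String, Long> cells, long sizeBytes) {
            this.cells = new TreeMap<>(cells);
            this.sizeBytes = sizeBytes;
        }
    }

    private final long maxBytes;   // the throttle: total memory allowed for retained snapshots
    private long usedBytes = 0;
    // Newest snapshot at the head, oldest at the tail.
    private final Deque<Snapshot> snapshots = new ArrayDeque<>();

    RetainedSnapshots(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Called after a flush completes: keep the snapshot, evicting the oldest to stay in budget. */
    synchronized void retain(Snapshot s) {
        if (s.sizeBytes > maxBytes) {
            // The newest data cannot be kept, so older snapshots could serve stale
            // values for the same rows. Drop everything rather than risk that.
            snapshots.clear();
            usedBytes = 0;
            return;
        }
        snapshots.addFirst(s);
        usedBytes += s.sizeBytes;
        while (usedBytes > maxBytes) {
            // Evicting from the tail is safe: newer snapshots shadow older ones.
            usedBytes -= snapshots.removeLast().sizeBytes;
        }
    }

    /** Returns the most recent retained value for a row, or null if the caller must go to disk. */
    synchronized Long get(String row) {
        for (Snapshot s : snapshots) {   // iterates head to tail, i.e. newest first
            Long v = s.cells.get(row);
            if (v != null) {
                return v;
            }
        }
        return null;
    }

    synchronized long usedBytes() {
        return usedBytes;
    }

    synchronized int count() {
        return snapshots.size();
    }
}
```

An increment for a row missing from the live memstore would call get(row) before going to disk. Note that this budget would have to be counted alongside the aggregate memstore size and block cache when the region server is under global heap pressure, which the sketch does not attempt.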

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
