[jira] Commented: (HBASE-3327) For increment workloads, retain memstores in memory after flushing them

Kannan Muthukkaruppan (JIRA) Thu, 09 Dec 2010 12:55:26 -0800

    [ https://issues.apache.org/jira/browse/HBASE-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969944#action_12969944 ]


Kannan Muthukkaruppan commented on HBASE-3327:
----------------------------------------------

Ryan: If this happened only for recent HFiles or compactions of recent files,
and not for, say, bigger compactions, then yes, the two schemes start to have
more similarities. The trouble with writing to the block cache on all HFile
creations (i.e. not just flushes but also all compactions) is that too much
old data could be rewritten, and you might have storms that fully clear out
items in the block cache. Jonathan has suggested knobs to throttle how much
"write through" happens, but they are size-based rather than based on the
recency of the data.

But I agree your suggestion sounds like a viable alternative with the right 
tweaks.
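
To make the size-based versus recency-based distinction concrete, here is a
minimal sketch of a recency-based write-through check. It is not HBase code:
the class WriteThroughPolicy, its method, and the idea of keying the decision
on the oldest cell timestamp in the new file are all assumptions made for
illustration.

```java
import java.time.Duration;

/** Hypothetical policy, not an HBase API: decides whether a newly written file's blocks go to the block cache. */
final class WriteThroughPolicy {
    private final long recencyWindowMs;

    WriteThroughPolicy(Duration recencyWindow) {
        this.recencyWindowMs = recencyWindow.toMillis();
    }

    /**
     * Cache on write only when every cell in the new file is recent, i.e. the
     * oldest cell timestamp falls inside the recency window. A flush or a
     * compaction of recent files passes; a large compaction that rewrites old
     * data does not, whatever the size of its output.
     */
    boolean shouldCacheOnWrite(long oldestCellTimestampMs, long nowMs) {
        return nowMs - oldestCellTimestampMs <= recencyWindowMs;
    }
}
```

With a ten-minute window, a flush whose oldest cell is a minute old would be
written through, while a compaction whose output contains day-old cells would
bypass the block cache.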


> For increment workloads, retain memstores in memory after flushing them
> -----------------------------------------------------------------------
>
>                 Key: HBASE-3327
>                 URL: https://issues.apache.org/jira/browse/HBASE-3327
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Karthik Ranganathan
>
> This is an improvement based on our observation of what happens in an 
> increment workload. The working set is typically small and is contained in 
> the memstores. 
> 1. The memstores get flushed because the limit on the number of WAL logs 
> gets hit. 
> 2. This in turn triggers compactions, which evict the block cache. 
> 3. Flushing the memstore and evicting the block cache cause disk reads for 
> increments coming in after this, because the data is no longer in memory.
>
> We could solve this elegantly by retaining the memstores AFTER they are 
> flushed into files. This would mean we can quickly populate the new memstore 
> with the working set of data from memory itself without having to hit disk. 
> We can throttle the number of such memstores we retain, or the memory 
> allocated to them. In fact, allocating a percentage of the block cache to 
> this would give us a huge boost.
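
The proposal in the description can be sketched roughly as follows. This is a
toy model under stated assumptions, not the HBase implementation: the class
IncrementStore, the in-memory "disk" map, and the diskReads counter are all
hypothetical, and the throttle is a simple cap on the number of retained
memstores rather than a memory budget.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical toy model of retaining flushed memstores; not HBase code. */
class IncrementStore {
    private final int maxRetained;                       // throttle: how many flushed memstores to keep
    private TreeMap<String, Long> active = new TreeMap<>();
    private final Deque<TreeMap<String, Long>> retained = new ArrayDeque<>(); // newest first
    private final Map<String, Long> disk = new HashMap<>();                   // stands in for flushed files
    int diskReads = 0;                                   // counts lookups that fell through to "disk"

    IncrementStore(int maxRetained) {
        if (maxRetained < 0) {
            throw new IllegalArgumentException("maxRetained must be >= 0");
        }
        this.maxRetained = maxRetained;
    }

    /** Applies an increment, reading the current value from the cheapest place that has it. */
    long increment(String key, long delta) {
        Long current = active.get(key);
        if (current == null) {
            current = lookupRetained(key);               // memory hit on a retained memstore
        }
        if (current == null) {
            diskReads++;                                 // nothing in memory: simulated disk read
            current = disk.getOrDefault(key, 0L);
        }
        long updated = current + delta;
        active.put(key, updated);
        return updated;
    }

    /** Flushes the active memstore to "disk" but keeps it in memory, up to the cap. */
    void flush() {
        disk.putAll(active);
        retained.addFirst(active);
        while (retained.size() > maxRetained) {
            retained.removeLast();                       // drop the oldest retained memstore
        }
        active = new TreeMap<>();
    }

    /** Searches retained memstores from newest to oldest. */
    private Long lookupRetained(String key) {
        for (TreeMap<String, Long> snapshot : retained) {
            Long value = snapshot.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```

In this model, with maxRetained set to 1, an increment that arrives right
after a flush is served from the retained memstore; with maxRetained set to 0
the same increment falls through to the simulated disk read.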

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
