[ https://issues.apache.org/jira/browse/HBASE-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13804537#comment-13804537 ]
Gaurav Menghani commented on HBASE-3149:
----------------------------------------
The basic idea is to track, for each column family's memstore, the smallest LSN
among the edits it currently holds. When we decide to flush a set of memstores,
we find the smallest LSN among the memstores that we are not flushing, say X;
the logs for any edits with an LSN less than X can then be removed. A memstore
is chosen for flushing if it occupies more than 't' bytes, where the global
memstore size threshold is 'T' (t/T = 1/4 in our configuration). If no memstore
has >= t bytes but the total size of all the memstores is above T, we flush all
the memstores.
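
As a rough illustration of the rule above (this is not the code in the attached
patch), the selection step and the log-truncation point could be sketched as
follows. The class, method, and field names are hypothetical, and the sketch
treats the per-family threshold as inclusive (>= t), which is an assumption
since the description uses both "more than" and ">=".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

public class PerFamilyFlushSketch {

    /** Hypothetical snapshot of one column family's memstore. */
    static final class MemstoreState {
        final long sizeBytes;    // current memstore size
        final long smallestLsn;  // smallest LSN among the edits it holds

        MemstoreState(long sizeBytes, long smallestLsn) {
            this.sizeBytes = sizeBytes;
            this.smallestLsn = smallestLsn;
        }
    }

    /**
     * Picks the families to flush: every memstore with at least t bytes,
     * or all of them if none qualifies but the total exceeds T.
     */
    static List<String> selectFamiliesToFlush(
            Map<String, MemstoreState> memstores, long t, long T) {
        List<String> selected = new ArrayList<>();
        long total = 0;
        for (Map.Entry<String, MemstoreState> e : memstores.entrySet()) {
            total += e.getValue().sizeBytes;
            if (e.getValue().sizeBytes >= t) {
                selected.add(e.getKey());
            }
        }
        if (selected.isEmpty() && total > T) {
            selected.addAll(memstores.keySet());
        }
        return selected;
    }

    /**
     * Returns X, the smallest LSN among the memstores that are NOT being
     * flushed. Logs for edits with LSN < X can be removed. Empty means
     * every memstore is being flushed, so no retained memstore sets a bound.
     */
    static OptionalLong smallestRetainedLsn(
            Map<String, MemstoreState> memstores, List<String> flushing) {
        return memstores.entrySet().stream()
                .filter(e -> !flushing.contains(e.getKey()))
                .mapToLong(e -> e.getValue().smallestLsn)
                .min();
    }

    public static void main(String[] args) {
        long T = 128L * 1024 * 1024; // global threshold (illustrative value)
        long t = T / 4;              // per-family threshold, t/T = 1/4

        Map<String, MemstoreState> memstores = new LinkedHashMap<>();
        memstores.put("big", new MemstoreState(40L * 1024 * 1024, 100L));
        memstores.put("small", new MemstoreState(1L * 1024 * 1024, 250L));

        List<String> flushing = selectFamiliesToFlush(memstores, t, T);
        OptionalLong x = smallestRetainedLsn(memstores, flushing);
        System.out.println("flush=" + flushing + " removeLogsBelow=" + x);
    }
}
```

In this example only "big" is flushed, so the edits of "small" stay in memory
and the log can only be trimmed up to the smallest LSN that "small" still holds.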
> Make flush decisions per column family
> --------------------------------------
>
> Key: HBASE-3149
> URL: https://issues.apache.org/jira/browse/HBASE-3149
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: Karthik Ranganathan
> Assignee: Gaurav Menghani
> Priority: Critical
> Fix For: 0.89-fb
>
> Attachments: Per-CF-Memstore-Flush.diff
>
>
> Today, the flush decision is made using the aggregate size of all column
> families. When large and small column families co-exist, this causes many
> small flushes of the smaller CF. We need to make per-CF flush decisions.