[ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhangduo updated HBASE-10201:
-----------------------------
Attachment: HBASE-10201-0.98.patch
I ported the 3149-trunk-v1.txt patch to branch 0.98 (a "just make it work"
version, not the final version).
Porting to master is more difficult because of the rewrite of HLog.
Flushing per CF means we need to record the oldest sequence id per store
instead of per region, so the patch adds a seqNum parameter when adding a kv
to a store, which means we need to know the seqNum before we add the kv to the
store.
This is easy on branch 0.98: we just need to change the order of the WAL
appendNoSync and the write to the memstore (am I right?). But on master, HLog
seems to use an event-driven framework, and I am not sure when the seqNum is
determined.
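
To make the 0.98 ordering concrete, here is a minimal sketch, not the patch
itself. RegionSketch, put, addToStore and oldestSeqNum are hypothetical names
for illustration, and the list called wal only stands in for appendNoSync. It
assumes the seqNum is assigned by the WAL append, so the append runs first and
the store records the seqNum of its first edit since the last flush.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a region; none of these names are HBase APIs.
class RegionSketch {
    // Returned for a store that holds no unflushed edits.
    static final long NO_EDITS = Long.MAX_VALUE;

    private final AtomicLong sequenceId = new AtomicLong(0);
    private final List<String> wal = new ArrayList<>();
    private final Map<String, List<String>> memstoreByStore = new HashMap<>();
    private final Map<String, Long> oldestSeqNumByStore = new HashMap<>();

    // Append to the WAL first so the seqNum is known, then write the memstore.
    synchronized long put(String family, String kv) {
        long seqNum = sequenceId.incrementAndGet();
        wal.add(seqNum + ":" + family + ":" + kv); // stand-in for appendNoSync
        addToStore(family, kv, seqNum);
        return seqNum;
    }

    // The extra seqNum parameter is the per-store bookkeeping described above.
    private void addToStore(String family, String kv, long seqNum) {
        memstoreByStore.computeIfAbsent(family, f -> new ArrayList<>()).add(kv);
        // Only the first edit since the last flush sets the oldest seqNum.
        oldestSeqNumByStore.putIfAbsent(family, seqNum);
    }

    synchronized long oldestSeqNum(String family) {
        return oldestSeqNumByStore.getOrDefault(family, NO_EDITS);
    }
}
```

With puts to cf1, cf2 and cf1 again, the oldest seqNum of cf1 stays at the
first put, while cf2 keeps its own, later value.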
The second problem is the flushSeqId. On 0.98, it is just a simple incAndGet,
but on master it uses a method in HLog. So on 0.98, if we flush only some of
the stores, we can set the flushSeqId to the oldest seqNum held by the stores
that are not being flushed, without incrementing sequenceId. But on master, I
do not know the side effects of that method. Is it OK to remove the method
call, or do we still need to log something?
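
The 0.98 rule for flushSeqId can be sketched as below. This is only an
illustration under assumptions: FlushSeqIdSketch and chooseFlushSeqId are
hypothetical names, the map of oldest seqNums per store is assumed to be
maintained elsewhere, and the full-flush branch simply mirrors the incAndGet
mentioned above.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper; not an HBase API.
class FlushSeqIdSketch {
    static long chooseFlushSeqId(Map<String, Long> oldestSeqNumByStore,
                                 Set<String> storesToFlush,
                                 AtomicLong sequenceId) {
        long oldestUnflushed = Long.MAX_VALUE;
        for (Map.Entry<String, Long> e : oldestSeqNumByStore.entrySet()) {
            // Only stores that keep their memstore constrain the flushSeqId.
            if (!storesToFlush.contains(e.getKey())) {
                oldestUnflushed = Math.min(oldestUnflushed, e.getValue());
            }
        }
        if (oldestUnflushed == Long.MAX_VALUE) {
            // Every store with edits is being flushed: behave as before.
            return sequenceId.incrementAndGet();
        }
        // Partial flush: use the oldest unflushed seqNum, sequenceId untouched.
        return oldestUnflushed;
    }
}
```

For example, with oldest seqNums cf1=3, cf2=7, cf3=5 and only cf1 being
flushed, the sketch picks 5 and leaves sequenceId unchanged.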
> Port 'Make flush decisions per column family' to trunk
> ------------------------------------------------------
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
> Issue Type: Improvement
> Reporter: Ted Yu
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch
>
>
> Currently the flush decision is made using the aggregate size of all column
> families. When large and small column families co-exist, this causes many
> small flushes of the smaller CF. We need to make per-CF flush decisions.