[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

zhangduo (JIRA) Mon, 08 Dec 2014 23:30:23 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239084#comment-14239084
 ]


zhangduo commented on HBASE-10201:
----------------------------------

[~stack] I followed RegionSplitPolicy when writing FlushPolicy, except that 
[~tedyu] suggested using a FlushPolicyFactory and placing the factory method in 
it instead of in FlushPolicy. Maybe the RegionSplitPolicy code is old and needs 
refactoring too...

{quote}
The FlushPolicy api is a little odd. It implements Configured but where do you 
do a setConf on it? Then in the configureForRegion method, you take a Region 
but all it is used for is to emit region name on Strings and to get instance of 
HTableDescriptor. The flush takes a list of stores. Can't it get them from the 
region it was given when configuredForRegion? This is a nit comment. Ignore for 
now.
{quote}
ReflectionUtils.newInstance(clazz, conf) will call setConf. And I agree that 
if we implement configureForRegion, then the list of stores is not necessary 
when selecting the stores to flush. This can be fixed later.
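
To illustrate where setConf gets called, here is a minimal sketch of the 
reflection-based factory pattern. It does not use the real Hadoop or HBase 
classes: FlushPolicySketch, its nested Configurable interface, the nested 
FlushPolicy and FlushAllStoresPolicy classes, and the "sketch.key" setting are 
all hypothetical stand-ins, and a plain Map takes the place of Configuration.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins only; these are NOT the real Hadoop/HBase classes.
final class FlushPolicySketch {

    /** Stand-in for a "configurable" contract: the object accepts a configuration. */
    interface Configurable {
        void setConf(Map<String, String> conf);
        Map<String, String> getConf();
    }

    /** Plays the role of a policy base class that stores its configuration. */
    static abstract class FlushPolicy implements Configurable {
        private Map<String, String> conf;

        @Override
        public void setConf(Map<String, String> conf) {
            this.conf = conf;
        }

        @Override
        public Map<String, String> getConf() {
            return conf;
        }
    }

    /** A concrete policy with only the implicit no-arg constructor. */
    static class FlushAllStoresPolicy extends FlushPolicy {
    }

    /** Creates an instance reflectively, then injects conf if the type accepts one. */
    static <T> T newInstance(Class<T> clazz, Map<String, String> conf) {
        try {
            Constructor<T> ctor = clazz.getDeclaredConstructor();
            ctor.setAccessible(true);
            T instance = ctor.newInstance();
            if (instance instanceof Configurable) {
                // The caller never invokes setConf directly; the factory does it here.
                ((Configurable) instance).setConf(conf);
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + clazz.getName(), e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("sketch.key", "value"); // hypothetical setting
        FlushPolicy policy = newInstance(FlushAllStoresPolicy.class, conf);
        System.out.println(policy.getConf().get("sketch.key"));
    }
}
```

So the policy class itself needs no explicit setConf call site; the factory 
injects the configuration right after construction.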

[~jeffreyz] I think the biggest problem is that this patch changes the 
flushSeqId generation. flushSeqId will not be bumped if we do not flush all 
stores. I think flushSeqId should be called "highestFlushedToDiskSeqId" 
in this patch. Actually, I do not know where we use FlushMarker, so I do not 
know the meaning of flushSeqId in the marker...
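
A rough sketch of what I mean by "highestFlushedToDiskSeqId" is below. It is 
only one reading of the idea, not the code in the patch: the class 
SelectiveFlushSeqIdSketch, the method name, the NO_UNFLUSHED_EDITS constant and 
the store names are all hypothetical. The assumption is that each store tracks 
the lowest sequence id still in its memstore, and that the value we can safely 
report is one less than the oldest edit remaining in any store that was not 
flushed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; names and logic are assumptions, not the patch's code.
final class SelectiveFlushSeqIdSketch {

    /** Marker meaning "this store has nothing in its memstore". */
    static final long NO_UNFLUSHED_EDITS = -1L;

    /**
     * @param oldestUnflushed store name -> lowest sequence id still in its memstore
     * @param storesToFlush   stores selected for this flush
     * @param flushOpSeqId    sequence id assigned to this flush operation
     * @return highest sequence id such that every edit at or below it is on disk
     */
    static long highestFlushedToDiskSeqId(Map<String, Long> oldestUnflushed,
                                          Set<String> storesToFlush,
                                          long flushOpSeqId) {
        long oldestRemaining = Long.MAX_VALUE;
        for (Map.Entry<String, Long> entry : oldestUnflushed.entrySet()) {
            if (storesToFlush.contains(entry.getKey())) {
                continue; // this store's memstore is written out by the flush
            }
            long seqId = entry.getValue();
            if (seqId != NO_UNFLUSHED_EDITS) {
                oldestRemaining = Math.min(oldestRemaining, seqId);
            }
        }
        // Nothing left in memory: everything up to the flush itself is on disk.
        // Otherwise we can only vouch for edits older than the oldest one remaining.
        return oldestRemaining == Long.MAX_VALUE ? flushOpSeqId : oldestRemaining - 1;
    }

    public static void main(String[] args) {
        Map<String, Long> oldest = new LinkedHashMap<>();
        oldest.put("bigCF", 10L);
        oldest.put("smallCF", 4L);

        Set<String> all = new HashSet<>(Arrays.asList("bigCF", "smallCF"));
        Set<String> onlyBig = new HashSet<>(Arrays.asList("bigCF"));

        System.out.println(highestFlushedToDiskSeqId(oldest, all, 100L));     // 100
        System.out.println(highestFlushedToDiskSeqId(oldest, onlyBig, 100L)); // 3
    }
}
```

In the second call only bigCF is flushed, so the result stays at 3 no matter 
how large the flush operation's own sequence id is. That is the sense in which 
the value is not bumped by a partial flush.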

Thanks.

> Port 'Make flush decisions per column family' to trunk
> ------------------------------------------------------
>
>                 Key: HBASE-10201
>                 URL: https://issues.apache.org/jira/browse/HBASE-10201
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: Ted Yu
>            Assignee: zhangduo
>            Priority: Critical
>             Fix For: 1.0.0, 2.0.0, 0.98.9
>
>         Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
