[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

stack (JIRA) Mon, 08 Dec 2014 22:53:38 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239062#comment-14239062
 ]


stack commented on HBASE-10201:
-------------------------------

Here is some feedback on reading through latest version of the patch. Lets 
address these minor items after we are sure the most important part is working, 
the sequenceid handling (I'm running tests here but its taking a while -- first 
I need to prove that hbase 1.0 branch is healthy, then I intro your patch... 
and [~jeffreyz], Mr SequenceId is going to take a look too).  So hold on making 
a patch till testing and Jeffrey's review are done.

Sorry this is taking so long to get in.  On the one hand, you can tell we are 
excited about getting this patch in because the improvement is really nice but 
it touches a very sensitive part of hbase, the region sequenceid'ing, so we 
need to exercise extra caution.  Thanks for you patience [~Apache9]

Nits to be addressed on commit or if you make a new version of the patch 
(you've done enough as it is -- smile -- and I could do below on commit np).

I can go over the javadoc on commit. Small edit would fix it all up nicely.

Below is a nit that can be addressed in a follow-on:

This config is not general.  It belongs to a particular policy ("If 
FlushLargeStoresPolicy is used..."): 
hbase.hregion.percolumnfamilyflush.size.lower.bound  Should probably have the 
policy it is for in its name.  Maybe just don't mention is in 
hbase-default.xml.  Let uses find it if they need it (16MB is a nice default 
low-bound).

It is odd that this is public:

public static Class<? extends FlushPolicy> getFlushPolicyClass(

It is nice that the master tests that we can load a policy but it does not even 
use flush policy (if we fail to load fall back to default with big warning?)  
And flush policies are over in regionserver package so here we have master 
reaching over and into the regionserver package. Would be good to avoid doing 
this x-package reach especially when it does not seem to be needed.

I would think this would be an internal method for the factory to use?

Also in HTD, you call it getFlushPolicyClassName but here you call it 
getFlushPolicyClass... would be good to be same.

This policy stuff you've added is nicer than what was here previous.  Good one.

Should these two strings just be the same?

FLUSH_SIZE_LOWER_BOUND_KEY and DEFAULT_FLUSH_SIZE_LOWER_BOUND even though they 
are read from different places?  No harm the key being the same especially 
since in HTD, you hide the key by providing getter/setters.

The FlushPolicy api is a little odd.  It implements Configured but where do you 
do a setConf on it? Then in the configureForRegion method, you take a Region 
but all it is used for is to emit region name on Strings and to get instance of 
HTableDescriptor.  The flush takes a list of stores.  Can't it get them from 
the region it was given when configuredForRegion?  This is a nit comment.  
Ignore for now.

... Stopped at sequence id  changes.... will be back. Thanks.

> Port 'Make flush decisions per column family' to trunk
> ------------------------------------------------------
>
>                 Key: HBASE-10201
>                 URL: https://issues.apache.org/jira/browse/HBASE-10201
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: Ted Yu
>            Assignee: zhangduo
>            Priority: Critical
>             Fix For: 1.0.0, 2.0.0, 0.98.9
>
>         Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

Reply via email to