[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

stack (JIRA) Tue, 09 Dec 2014 22:54:59 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240737#comment-14240737
 ]


stack commented on HBASE-10201:
-------------------------------

[~jeffreyz] When you say...

bq. ... This issue may only happen in 0.98 though.

Is that because we are not doing DLR (distributed log replay) in 0.98, or for 
some other reason?  I'd say this patch is unlikely to make it back to 0.98.

On the fix for 1.) above: hfiles will be written out with the store's flushed 
seqid, but we will keep on telling the master the oldest unflushed edit 
(oldestUnflushedSeqId).  Since flush policies can return any set of Stores 
without regard to sequenceid, we could have edits in memstores with sequenceids 
that are earlier than those of persisted hfiles.  Telling the master 
oldestUnflushedSeqId does not guarantee that it will be available at recovery 
time (it is in the master's memory only IIRC, and the master may crash and lose 
it), so when the region opens post-recovery, we look at the sequenceids in the 
hfiles to figure the region's sequenceid.  Will this mean we drop edits because 
the region thinks its sequenceid is higher than it should be?
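
To make the concern concrete, here is a toy model of the scenario, not HBase 
code.  Store, replay and the other names are made up for illustration, and the 
sketch assumes that replay skips every WAL edit at or below the region-level 
sequenceid derived from the hfiles, which is exactly the behaviour being asked 
about, not something confirmed here.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Toy stand-in for one column family's store (hypothetical, not the HBase class)."""
    name: str
    memstore: list = field(default_factory=list)          # unflushed sequence ids
    hfile_max_seqids: list = field(default_factory=list)  # one entry per flushed hfile

    def flush(self):
        """Persist the memstore as one 'hfile' stamped with its highest seqid."""
        if self.memstore:
            self.hfile_max_seqids.append(max(self.memstore))
            self.memstore = []


def oldest_unflushed_seqid(stores):
    """Lowest seqid still held only in memory, or None if everything is flushed."""
    pending = [s for store in stores for s in store.memstore]
    return min(pending) if pending else None


def seqid_from_hfiles(stores):
    """Region seqid as derived from the hfiles alone (-1 if there are none)."""
    flushed = [s for store in stores for s in store.hfile_max_seqids]
    return max(flushed) if flushed else -1


def replay(wal, region_seqid):
    """Assumed replay rule: skip every WAL edit at or below region_seqid."""
    return [(cf, seqid) for cf, seqid in wal if seqid > region_seqid]


def scenario():
    small, big = Store("small"), Store("big")
    wal = []
    # Interleave edits across the two stores: small gets 1 and 3, big gets 2 and 4.
    for seqid, store in enumerate([small, big, small, big], start=1):
        store.memstore.append(seqid)
        wal.append((store.name, seqid))

    # A per-CF flush policy picks only the big store.
    big.flush()
    oldest = oldest_unflushed_seqid([small, big])
    from_hfiles = seqid_from_hfiles([small, big])

    # Crash: memstore contents are gone, only the WAL and hfiles survive.
    in_memory_only = [(small.name, s) for s in small.memstore]
    replayed_from_hfiles = replay(wal, from_hfiles)
    replayed_from_oldest = replay(wal, oldest - 1)
    lost = [e for e in in_memory_only if e not in replayed_from_hfiles]
    return oldest, from_hfiles, lost, replayed_from_oldest


if __name__ == "__main__":
    oldest, from_hfiles, lost, replayed_from_oldest = scenario()
    print("oldestUnflushedSeqId:", oldest)
    print("seqid derived from hfiles:", from_hfiles)
    print("edits lost when replaying from the hfile seqid:", lost)
    print("edits replayed when starting from oldestUnflushedSeqId:", replayed_from_oldest)
```

In this model the small store's edits 1 and 3 are never replayed when the 
region trusts the hfile-derived sequenceid, while replaying from 
oldestUnflushedSeqId recovers them (and re-applies the big store's edits that 
were already flushed).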

3. is a 'known' cost.  Good to know that DLR won't have this issue.

4. is a good point (as is 2.)


> Port 'Make flush decisions per column family' to trunk
> ------------------------------------------------------
>
>                 Key: HBASE-10201
>                 URL: https://issues.apache.org/jira/browse/HBASE-10201
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: Ted Yu
>            Assignee: zhangduo
>             Fix For: 1.0.0, 2.0.0
>
>         Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
