[ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240419#comment-14240419
 ] 

zhangduo commented on HBASE-10201:
----------------------------------

{quote}
1) There may be a correctness issue for same version(same row key & version) 
updates...
{quote}
I think you mean that KVScannerComparator falls back to the sequenceId when two 
keys compare equal. Yes, this is a problem I missed. I think we need to change 
the code below as you suggested and use the store's max seqId instead of 
flushSeqId here.
{code}
        for (Store s : storesToFlush) {
          totalFlushableSizeOfFlushableStores += s.getFlushableSize();
          storeFlushCtxs.add(s.createFlushContext(flushSeqId));
          committedFiles.put(s.getFamily().getName(), null); // for writing stores to WAL
        }
{code}
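To make the tie-break concern concrete, here is a minimal sketch of a comparator that orders by key and, on equal keys, prefers the entry with the larger sequence id. The class SeqIdTieBreakSketch, the Entry type and the winner() helper are hypothetical names for illustration only; this is not the actual KVScannerComparator code, just the general idea under the assumption that a larger sequence id is treated as newer.

```java
import java.util.Comparator;

public class SeqIdTieBreakSketch {
    // Hypothetical stand-in for the current cell of a scanner.
    static final class Entry {
        final String key;      // row key and version, flattened to one string for the sketch
        final long sequenceId; // sequence id attached to the source (file or memstore) of the cell
        final String value;

        Entry(String key, long sequenceId, String value) {
            this.key = key;
            this.sequenceId = sequenceId;
            this.value = value;
        }
    }

    // Orders by key first; on equal keys the larger sequence id sorts first.
    static final Comparator<Entry> NEWEST_FIRST =
        Comparator.comparing((Entry e) -> e.key)
            .thenComparing(Comparator.comparingLong((Entry e) -> e.sequenceId).reversed());

    // Returns the value of whichever entry the comparator would surface first.
    static String winner(Entry a, Entry b) {
        return NEWEST_FIRST.compare(a, b) <= 0 ? a.value : b.value;
    }
}
```

Because the sequence id decides between cells with the same key, the id recorded for a flushed store directly affects which cell a reader sees, which is why the value passed to createFlushContext matters.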
{quote}
2) We have a feature where we force a flush...
{quote}
That's why I introduced a FlushPolicy. For now the policy is simple: we only 
consider the size of a store. So if a store goes unflushed for a long time, 
there will be a request to force-flush all stores, which may generate 
unnecessary small files. I think we can introduce a new FlushPolicy later to 
handle this better.
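As a rough illustration of a size-only policy, the sketch below picks the stores whose memstore size has reached a threshold. The class name, the selectStoresToFlush() method, the map-based input and the fallback to flushing every store when none qualifies are all assumptions made for this sketch; they are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SizeBasedFlushPolicySketch {
    /**
     * Selects the stores (by family name) whose memstore size is at least
     * thresholdBytes. If no store qualifies, every store is selected
     * (a fallback assumed for this sketch only).
     */
    static List<String> selectStoresToFlush(Map<String, Long> memstoreSizes, long thresholdBytes) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Long> e : memstoreSizes.entrySet()) {
            if (e.getValue() >= thresholdBytes) {
                selected.add(e.getKey());
            }
        }
        if (selected.isEmpty()) {
            // Nothing is large enough on its own, so fall back to all stores.
            selected.addAll(memstoreSizes.keySet());
        }
        return selected;
    }
}
```

A policy like this never looks at how long a small store has gone without a flush, which is the gap a later FlushPolicy could address.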
{quote}
3) For region server recovery...
{quote}
I think the issue in "1)" also makes this problem worse, because the flushSeqId 
passed to createFlushContext will be used as maxSeqId in a store...I will fix 
it in the next patch. And if we want to skip WAL entries exactly, then we need 
to report a familyName->seqId map to the master, which will change the RPC 
protocol (and the format of the zk data in distributed log replay). This is a 
big change, so I think we can reopen HBASE-12405 to handle it after HBASE-10201 
gets in.
{quote}
4) Relating to your FlushMarker question...
{quote}
I will fix getFamilyNames(), thanks. Is there anything else that would break 
read replicas? I'm not familiar with read replicas, so I may have missed 
something.

Thanks~

> Port 'Make flush decisions per column family' to trunk
> ------------------------------------------------------
>
>                 Key: HBASE-10201
>                 URL: https://issues.apache.org/jira/browse/HBASE-10201
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: Ted Yu
>            Assignee: zhangduo
>             Fix For: 1.0.0, 2.0.0, 0.98.10
>
>         Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
