[jira] [Commented] (HBASE-16994) Region report a last flushed sequence id that is less than the previous last flushed sequence id

Yu Li (JIRA) Wed, 02 Nov 2016 07:34:31 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-16994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629141#comment-15629141
 ]


Yu Li commented on HBASE-16994:
-------------------------------

I came across HBASE-16721 while checking the branch-1 commit history of 
{{HRegion.java}}, and I think it is a similar (though not identical) issue. I 
think we could borrow the approach from the branch-1 code, shown below:
{code}
    MultiVersionConcurrencyControl.WriteEntry writeEntry = mvcc.begin();
    // wait for all in-progress transactions to commit to WAL before
    // we can start the flush. This prevents
    // uncommitted transactions from being written into HFiles.
    // We have to block before we start the flush, otherwise keys that
    // were removed via a rollbackMemstore could be written to Hfiles.
    mvcc.completeAndWait(writeEntry);
    // set writeEntry to null to prevent mvcc.complete from being called again
    // inside finally block
    writeEntry = null;
{code}
placed before {{startCacheFlush}}. I think this is safer than clearing 
{{oldestUnflushedStoreSequenceIds}}. Does this approach address your concern, 
[~Apache9]?
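
To illustrate why waiting on the write entry acts as a barrier, here is a minimal toy sketch in Java. It is not HBase's MultiVersionConcurrencyControl: the class {{ToyMvcc}}, its {{Entry}} type, and {{readPoint()}} are hypothetical and only mimic the begin / complete / completeAndWait shape of the snippet above, assuming the read point may only advance over a completed prefix of writes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy stand-in for an MVCC barrier; hypothetical, NOT HBase's implementation. */
public class ToyMvcc {
    /** One in-flight write, identified by its write number. */
    public static final class Entry {
        final long number;
        boolean done;
        Entry(long number) { this.number = number; }
    }

    private final Deque<Entry> pending = new ArrayDeque<>();
    private long writePoint = 0; // last write number handed out
    private long readPoint = 0;  // every write <= readPoint has completed

    public synchronized Entry begin() {
        Entry e = new Entry(++writePoint);
        pending.addLast(e);
        return e;
    }

    public synchronized void complete(Entry e) {
        e.done = true;
        // Advance the read point over the completed prefix only.
        while (!pending.isEmpty() && pending.peekFirst().done) {
            readPoint = pending.removeFirst().number;
        }
        notifyAll();
    }

    /** Completes e, then blocks until every earlier write has completed too. */
    public synchronized void completeAndWait(Entry e) throws InterruptedException {
        complete(e);
        while (readPoint < e.number) {
            wait();
        }
    }

    public synchronized long readPoint() { return readPoint; }

    public static void main(String[] args) throws Exception {
        ToyMvcc mvcc = new ToyMvcc();
        Entry append = mvcc.begin();          // an append still in flight
        Thread writer = new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) { }
            mvcc.complete(append);
        });
        writer.start();
        Entry barrier = mvcc.begin();         // the flush takes a barrier entry
        mvcc.completeAndWait(barrier);        // returns only after the append completed
        System.out.println("append done before flush: " + append.done);
        writer.join();
    }
}
```

In this toy model the flush thread cannot get past {{completeAndWait}} until the earlier append has completed, which is the property the branch-1 snippet relies on before calling {{startCacheFlush}}.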

[~stack] and [~enis], would you mind taking a look? This looks much like 
HBASE-16721, but it is a case we neglected to address on the master branch. Thanks.

> Region report a last flushed sequence id that is less than the previous last 
> flushed sequence id 
> -------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16994
>                 URL: https://issues.apache.org/jira/browse/HBASE-16994
>             Project: HBase
>          Issue Type: Bug
>            Reporter: binlijin
>         Attachments: HBASE-16994_master_v1.patch, HBASE-16994_master_v2.patch
>
>
> Since an append is published to the RingBuffer and handled asynchronously, it is 
> possible for one append of the region (say append-X) to be handled by 
> RingBufferEventHandler between startCacheFlush and getNextSequenceId, and to 
> reset the FSHLog#oldestUnflushedStoreSequenceIds entry that was just cleared in 
> #startCacheFlush. This can disturb ServerManager#flushedSequenceIdByRegion 
> as shown below (assume region-A has two CFs: cfA and cfB):
>    
> 1. flush-A runs to startCacheFlush and will flush both cfA and cfB; 
> oldestUnflushedStoreSequenceIds of region-A gets cleared
> 2. append-X on cfB is handled by RingBufferEventHandler; 
> oldestUnflushedStoreSequenceIds is set to 10, for example
> 3. flush-A runs to getNextSequenceId, which returns 11
> 4. ServerManager#flushedSequenceIdByRegion for region-A is set to 11
> 5. flush-A finishes
> 6. flush-B starts and only flushes cfA; getNextSequenceId returned 10, and 
> flushedSeqId will return 9, causing a warning in ServerManager
>
> Since append-X will also get flushed, we should clear 
> oldestUnflushedStoreSequenceIds again to make sure we won't disturb 
> ServerManager#flushedSequenceIdByRegion.
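
The six steps in the description can be replayed with a small single-threaded model. This is only a sketch under stated assumptions, not HBase code: {{FlushSeqIdRace}} and all of its methods are hypothetical stand-ins, and the model assumes a flush reports its own sequence id when no store has unflushed edits left, and (oldest unflushed id - 1) otherwise.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy replay of the race in the description; hypothetical names, not HBase code. */
public class FlushSeqIdRace {
    static final long NO_SEQNUM = -1L;

    // family -> oldest unflushed sequence id (stand-in for oldestUnflushedStoreSequenceIds)
    private final Map<String, Long> oldestUnflushed = new HashMap<>();
    private long nextSeqId = 10;

    long getNextSequenceId() { return nextSeqId++; }

    /** An append takes a sequence id and records it if the family has none yet. */
    void append(String family) {
        oldestUnflushed.putIfAbsent(family, getNextSequenceId());
    }

    /** Clears the flushed families and returns the oldest id still unflushed. */
    long startCacheFlush(String... families) {
        for (String f : families) {
            oldestUnflushed.remove(f);
        }
        long oldest = NO_SEQNUM;
        for (long seq : oldestUnflushed.values()) {
            if (oldest == NO_SEQNUM || seq < oldest) {
                oldest = seq;
            }
        }
        return oldest;
    }

    /** Assumed reporting rule: own id if nothing is left unflushed, else oldest - 1. */
    static long flushedSeqId(long earliestUnflushed, long flushOpSeqId) {
        return earliestUnflushed == NO_SEQNUM ? flushOpSeqId : earliestUnflushed - 1;
    }

    /** Returns { id reported by flush-A, id reported by flush-B }. */
    static long[] runScenario() {
        FlushSeqIdRace wal = new FlushSeqIdRace();
        long earliestA = wal.startCacheFlush("cfA", "cfB"); // step 1: map cleared
        wal.append("cfB");                                  // step 2: cfB -> 10
        long opA = wal.getNextSequenceId();                 // step 3: 11
        long reportedA = flushedSeqId(earliestA, opA);      // step 4: 11
        // step 5: flush-A finishes, but the stale cfB -> 10 entry remains
        long earliestB = wal.startCacheFlush("cfA");        // step 6: sees cfB -> 10
        long opB = wal.getNextSequenceId();
        long reportedB = flushedSeqId(earliestB, opB);      // 10 - 1 = 9
        return new long[] { reportedA, reportedB };
    }

    public static void main(String[] args) {
        long[] r = runScenario();
        System.out.println("flush-A reported " + r[0] + ", flush-B reported " + r[1]);
    }
}
```

Running {{main}} prints "flush-A reported 11, flush-B reported 9": the second report is lower than the first, which is the regression that triggers the ServerManager warning.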



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
