[
https://issues.apache.org/jira/browse/HBASE-16994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629141#comment-15629141
]
Yu Li commented on HBASE-16994:
-------------------------------
I came across HBASE-16721 while checking the branch-1 commit history of
{{HRegion.java}}, and I think it's a similar (but not identical) issue. I
think we could borrow the approach from the branch-1 code, as below:
{code}
MultiVersionConcurrencyControl.WriteEntry writeEntry = mvcc.begin();
// wait for all in-progress transactions to commit to WAL before
// we can start the flush. This prevents
// uncommitted transactions from being written into HFiles.
// We have to block before we start the flush, otherwise keys that
// were removed via a rollbackMemstore could be written to Hfiles.
mvcc.completeAndWait(writeEntry);
// set writeEntry to null to prevent mvcc.complete from being called again
// inside finally block
writeEntry = null;
{code}
This would go before {{startCacheFlush}}, and I think it's safer than clearing
{{oldestUnflushedStoreSequenceIds}}. Would this approach address your concern,
[~Apache9]?
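To illustrate why {{completeAndWait}} acts as a barrier here, below is a minimal sketch. It is not the HBase implementation: {{ToyMvcc}}, {{MvccBarrierDemo}} and {{flushWaitedForWriter}} are hypothetical names made up for this example, and the toy only assumes that the read point advances once every earlier write entry has completed.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy stand-in for MultiVersionConcurrencyControl (hypothetical, not the HBase class). */
final class ToyMvcc {
    static final class WriteEntry {
        final long writeNumber;
        boolean completed;
        WriteEntry(long writeNumber) { this.writeNumber = writeNumber; }
    }

    private final Deque<WriteEntry> writeQueue = new ArrayDeque<>();
    private long writePoint = 0;
    private long readPoint = 0;

    /** Registers a new in-progress transaction. */
    synchronized WriteEntry begin() {
        WriteEntry e = new WriteEntry(++writePoint);
        writeQueue.addLast(e);
        return e;
    }

    /** Marks the entry done; the read point only moves past a completed prefix of the queue. */
    synchronized void complete(WriteEntry e) {
        e.completed = true;
        while (!writeQueue.isEmpty() && writeQueue.peekFirst().completed) {
            readPoint = writeQueue.pollFirst().writeNumber;
        }
        notifyAll();
    }

    /** Completes the entry, then blocks until every earlier entry has completed too. */
    synchronized void completeAndWait(WriteEntry e) {
        complete(e);
        try {
            while (readPoint < e.writeNumber) {
                wait();
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", ex);
        }
    }
}

final class MvccBarrierDemo {
    /** Returns true if the "flush" thread only proceeded after the in-flight write finished. */
    static boolean flushWaitedForWriter() {
        ToyMvcc mvcc = new ToyMvcc();
        ToyMvcc.WriteEntry inFlight = mvcc.begin(); // an append that has not committed yet
        AtomicBoolean writerDone = new AtomicBoolean(false);
        Thread writer = new Thread(() -> {
            try {
                Thread.sleep(50); // simulate a slow WAL sync
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
            writerDone.set(true);
            mvcc.complete(inFlight);
        });
        writer.start();

        // Same shape as the snippet above: begin, then completeAndWait as a barrier.
        ToyMvcc.WriteEntry barrier = mvcc.begin();
        mvcc.completeAndWait(barrier);
        boolean sawWriterDone = writerDone.get();

        try {
            writer.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return sawWriterDone;
    }

    public static void main(String[] args) {
        System.out.println("flush waited for in-flight write: " + flushWaitedForWriter());
    }
}
```

In this toy, the barrier entry cannot pass until the earlier entry completes, which is the property we would rely on before calling {{startCacheFlush}}.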
[~stack] and [~enis], would you mind taking a look here, since this is much
like HBASE-16721 but is a case we neglected to address on the master branch? Thanks.
> Region report a last flushed sequence id that is less than the previous last
> flushed sequence id
> -------------------------------------------------------------------------------------------------
>
> Key: HBASE-16994
> URL: https://issues.apache.org/jira/browse/HBASE-16994
> Project: HBase
> Issue Type: Bug
> Reporter: binlijin
> Attachments: HBASE-16994_master_v1.patch, HBASE-16994_master_v2.patch
>
>
> Since appends are published to the RingBuffer and handled asynchronously, it's
> possible for an append on the region (say append-X) to be handled by
> RingBufferEventHandler between startCacheFlush and getNextSequenceId, which
> resets the FSHLog#oldestUnflushedStoreSequenceIds entry that we just cleared
> in #startCacheFlush. This might disturb ServerManager#flushedSequenceIdByRegion
> as shown below (assume region-A has two CFs: cfA and cfB):
>
> 1. flush-A runs to startCacheFlush and will flush both cfA and cfB;
> oldestUnflushedStoreSequenceIds of regionA is cleared
> 2. append-X on cfB is handled by RingBufferEventHandler;
> oldestUnflushedStoreSequenceIds is set to 10, for example
> 3. flush-A runs to getNextSequenceId, which returns 11
> 4. ServerManager#flushedSequenceIdByRegion for regionA is set to 11
> 5. flush-A finishes
> 6. flush-B starts and flushes only cfA; getNextSequenceId returned 10, so
> flushedSeqId will be 9, which causes a warning in ServerManager
> Since append-X will also get flushed, we should clear
> oldestUnflushedStoreSequenceIds again to make sure we won't disturb
> ServerManager#flushedSequenceIdByRegion.
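
The six steps in the description can be replayed with a small single-threaded sketch. This is a toy model, not FSHLog or HRegion code: {{ToyWal}}, {{FlushRaceDemo}} and the {{NO_SEQNUM}} value are hypothetical, and the rule "report (oldest remaining unflushed id - 1) if one exists, else the flush sequence id" is an assumption chosen to match the numbers 11 and 9 in the steps.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

/** Toy WAL bookkeeping for one region (hypothetical, heavily simplified). */
final class ToyWal {
    static final long NO_SEQNUM = -1L;

    private long sequenceId;
    // column family -> oldest sequence id that has not been flushed yet
    private final Map<String, Long> oldestUnflushed = new HashMap<>();

    ToyWal(long startSequenceId) { this.sequenceId = startSequenceId; }

    /** Clears the families being flushed and returns the oldest id left, or NO_SEQNUM. */
    long startCacheFlush(Collection<String> families) {
        for (String family : families) {
            oldestUnflushed.remove(family);
        }
        long min = NO_SEQNUM;
        for (long id : oldestUnflushed.values()) {
            if (min == NO_SEQNUM || id < min) {
                min = id;
            }
        }
        return min;
    }

    /** Handles an append: assigns a sequence id and records it if the family had none. */
    long append(String family) {
        long id = ++sequenceId;
        oldestUnflushed.putIfAbsent(family, id);
        return id;
    }

    long getNextSequenceId() { return ++sequenceId; }
}

final class FlushRaceDemo {
    /** Assumed reporting rule; see the lead-in. */
    static long flushedSeqId(long earliestRemaining, long flushOpSeqId) {
        return earliestRemaining == ToyWal.NO_SEQNUM ? flushOpSeqId : earliestRemaining - 1;
    }

    /** Returns {value reported by flush-A, value reported by flush-B}. */
    static long[] simulate() {
        ToyWal wal = new ToyWal(9); // start at 9 so that append-X gets id 10

        // Step 1: flush-A clears both families.
        long earliestA = wal.startCacheFlush(Arrays.asList("cfA", "cfB"));
        // Step 2: append-X on cfB sneaks in and re-populates the map with 10.
        wal.append("cfB");
        // Steps 3-5: flush-A takes sequence id 11 and reports it.
        long reportedA = flushedSeqId(earliestA, wal.getNextSequenceId());

        // Step 6: flush-B flushes only cfA; the stale cfB entry (10) is still there.
        long earliestB = wal.startCacheFlush(Arrays.asList("cfA"));
        long reportedB = flushedSeqId(earliestB, wal.getNextSequenceId());

        return new long[] { reportedA, reportedB };
    }

    public static void main(String[] args) {
        long[] r = simulate();
        System.out.println("flush-A reported " + r[0] + ", flush-B reported " + r[1]);
    }
}
```

Under these assumptions flush-A reports 11 and flush-B reports 9, i.e. the last flushed sequence id goes backwards, which is the condition ServerManager warns about.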