[ 
https://issues.apache.org/jira/browse/HBASE-16994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629054#comment-15629054
 ] 

Duo Zhang commented on HBASE-16994:
-----------------------------------

Thanks for pointing this out. I think the steps to reproduce the bug are correct.

On the fix, I think we need to do the reset work after fencing mvcc. Otherwise 
you cannot be sure that the RingBufferEventHandler has finished its sequence 
id accounting work. And if we do not have such fencing during flush, then I 
think this is a very critical bug, because we may lose data...

> Region report a last flushed sequence id that is less than the previous last 
> flushed sequence id 
> -------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16994
>                 URL: https://issues.apache.org/jira/browse/HBASE-16994
>             Project: HBase
>          Issue Type: Bug
>            Reporter: binlijin
>         Attachments: HBASE-16994_master_v1.patch, HBASE-16994_master_v2.patch
>
>
> Since an append is published to the RingBuffer and handled asynchronously, it's 
> possible that one append (say append-X) of the region is handled by 
> RingBufferEventHandler between startCacheFlush and getNextSequenceId, and 
> resets FSHLog#oldestUnflushedStoreSequenceIds, which we just cleared in 
> #startCacheFlush. This might disturb ServerManager#flushedSequenceIdByRegion 
> as shown below (assume region-A has two CFs: cfA and cfB):
>
> 1. flush-A runs to startCacheFlush and will flush both cfA and cfB; 
> oldestUnflushedStoreSequenceIds of regionA gets cleared
> 2. append-X on cfB is handled by RingBufferEventHandler; 
> oldestUnflushedStoreSequenceIds is set to 10, for example
> 3. flush-A runs to getNextSequenceId, which returns 11
> 4. ServerManager#flushedSequenceIdByRegion for regionA is set to 11
> 5. flush-A finishes
> 6. flush-B starts and only flushes cfA; getNextSequenceId returns 10, 
> flushedSeqId returns 9, and this causes a warning in ServerManager
>
> Since append-X will also get flushed, we should clear 
> oldestUnflushedStoreSequenceIds again to make sure we won't disturb
> ServerManager#flushedSequenceIdByRegion.
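
The six steps quoted above can be replayed with a small single-threaded toy 
model. This is only a sketch, not HBase code: the class FlushRaceSketch and all 
of its methods are hypothetical stand-ins named after the identifiers in the 
description. It assumes that a flush of all families reports the id returned by 
getNextSequenceId, and that a partial flush reports the oldest unflushed id 
minus one. It does not model the mvcc fencing discussed in the comment; the 
"second clear" is simply applied once append-X is known to have been handled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Toy model of one region's sequence id accounting (hypothetical, not HBase code).
class FlushRaceSketch {
    // Stand-in for FSHLog#oldestUnflushedStoreSequenceIds: family -> oldest unflushed id.
    private final Map<String, Long> oldestUnflushed = new HashMap<>();
    // Last sequence id handed out; starts at 9 so that append-X gets 10.
    private long lastSeqId = 9;

    // Forget the oldest unflushed id of every family that is about to be flushed.
    void startCacheFlush(String... families) {
        for (String family : families) {
            oldestUnflushed.remove(family);
        }
    }

    // Simulates the ring buffer handler processing one append.
    long handleAppend(String family) {
        long id = ++lastSeqId;
        oldestUnflushed.putIfAbsent(family, id);
        return id;
    }

    long getNextSequenceId() {
        return ++lastSeqId;
    }

    // Assumed rule for a partial flush: oldest unflushed id minus one,
    // or a fresh sequence id when nothing is recorded as unflushed.
    long partialFlushedSeqId() {
        OptionalLong oldest =
            oldestUnflushed.values().stream().mapToLong(Long::longValue).min();
        return oldest.isPresent() ? oldest.getAsLong() - 1 : getNextSequenceId();
    }

    // Replays steps 1-6 and returns the ids reported by flush-A and flush-B.
    static long[] replay(boolean clearAgain) {
        FlushRaceSketch region = new FlushRaceSketch();
        region.startCacheFlush("cfA", "cfB");       // step 1
        region.handleAppend("cfB");                 // step 2: append-X gets id 10
        long flushA = region.getNextSequenceId();   // step 3: 11
        if (clearAgain) {
            region.startCacheFlush("cfA", "cfB");   // the proposed second clear
        }
        region.startCacheFlush("cfA");              // step 6: flush-B, cfA only
        long flushB = region.partialFlushedSeqId();
        return new long[] {flushA, flushB};
    }

    public static void main(String[] args) {
        long[] buggy = replay(false);
        long[] fixed = replay(true);
        System.out.println("without second clear: " + buggy[0] + " then " + buggy[1]);
        System.out.println("with second clear:    " + fixed[0] + " then " + fixed[1]);
    }
}
```

Without the second clear the model reports 11 and then 9, which is the 
regression from the description; with it, the second report is 12. In a real 
concurrent setting the second clear is only safe once the handler is known to 
have processed append-X, which is the ordering concern raised in the comment.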



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
