[
https://issues.apache.org/jira/browse/HBASE-16994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629060#comment-15629060
]
Yu Li commented on HBASE-16994:
-------------------------------
Some background: we've observed many warnings like the one below in the
HMaster log in our production environment, and the issue description is what
we found after investigation.
{noformat}
2016-09-07 21:17:09,559 WARN [PriorityRpcServer.handler=14,queue=0,port=60100]
master.ServerManager: RegionServer
hadoop0676.et2.tbsite.net,16020,1472107731858 indicates a last flushed sequence
id (26683793) that is less than the
previous last flushed sequence id (26683796) for region
main_result_b,1879,1465227739374.a5b18fc39144b7333dec8bad22d56f11.
Ignoring.
{noformat}
> Region report a last flushed sequence id that is less than the previous last
> flushed sequence id
> -------------------------------------------------------------------------------------------------
>
> Key: HBASE-16994
> URL: https://issues.apache.org/jira/browse/HBASE-16994
> Project: HBase
> Issue Type: Bug
> Reporter: binlijin
> Attachments: HBASE-16994_master_v1.patch, HBASE-16994_master_v2.patch
>
>
> Since appends are published to the RingBuffer and handled asynchronously, it's
> possible that one append (say append-X) of the region is handled by
> RingBufferEventHandler between startCacheFlush and getNextSequenceId, and
> resets the FSHLog#oldestUnflushedStoreSequenceIds entry that we just cleared in
> #startCacheFlush. This might disturb ServerManager#flushedSequenceIdByRegion
> as shown below (assume region-A has two CFs: cfA and cfB):
>
> 1. flush-A runs to startCacheFlush and will flush both cfA and cfB;
> oldestUnflushedStoreSequenceIds of regionA gets cleared
> 2. append-X on cfB is handled by RingBufferEventHandler;
> oldestUnflushedStoreSequenceIds is set to 10, for example
> 3. flush-A runs to getNextSequenceId, which returns 11
> 4. ServerManager#flushedSequenceIdByRegion for regionA is set to 11
> 5. flush-A finishes
> 6. flush-B starts and only flushes cfA; getNextSequenceId returns 10, so
> flushedSeqId will be 9, which causes the warning in ServerManager
> Since append-X will also get flushed, we should clear
> oldestUnflushedStoreSequenceIds again to make sure we won't disturb
> ServerManager#flushedSequenceIdByRegion.
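
The six steps above can be replayed with a small, single-threaded Java sketch. This is not HBase code: the class name, the `simulate` method and the plain `HashMap` standing in for FSHLog#oldestUnflushedStoreSequenceIds are all hypothetical, and the rule "a partial flush reports the oldest unflushed sequence id minus one" is an assumption taken from step 6. It only illustrates why a stale entry makes the reported id go backwards, and why clearing the entry again would avoid that.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Illustrative model only; names and rules here are assumptions, not HBase APIs.
class FlushSeqIdRaceSketch {

    // Replays steps 1-6 and returns the flushed sequence ids reported by
    // flush-A and flush-B, in that order.
    static long[] simulate(boolean clearAgainAfterFlush) {
        // Stand-in for the per-store oldest unflushed sequence ids of region-A.
        Map<String, Long> oldestUnflushed = new HashMap<>();
        long nextSeqId = 10;

        // Step 1: flush-A reaches startCacheFlush; entries for cfA and cfB are cleared.
        oldestUnflushed.clear();

        // Step 2: append-X on cfB is handled asynchronously and re-adds an entry (10).
        oldestUnflushed.put("cfB", nextSeqId++);

        // Steps 3-5: flush-A obtains its sequence id (11), reports it, and finishes.
        long flushA = nextSeqId++;

        // Proposed fix: append-X was flushed as well, so drop its stale entry.
        if (clearAgainAfterFlush) {
            oldestUnflushed.remove("cfB");
        }

        // Step 6: flush-B flushes only cfA.
        long flushBSeqId = nextSeqId++;
        oldestUnflushed.remove("cfA");
        // Assumed rule: with unflushed stores left, report (oldest unflushed - 1);
        // otherwise report the flush's own sequence id.
        long flushB = oldestUnflushed.isEmpty()
                ? flushBSeqId
                : Collections.min(oldestUnflushed.values()) - 1;

        return new long[] {flushA, flushB};
    }

    public static void main(String[] args) {
        long[] buggy = simulate(false);
        long[] fixed = simulate(true);
        System.out.println("without second clear: " + buggy[0] + " then " + buggy[1]);
        System.out.println("with second clear:    " + fixed[0] + " then " + fixed[1]);
    }
}
```

Without the second clear, the model reports 11 and then 9, which is the kind of regression ServerManager warns about; with it, the second reported id is not lower than the first.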