[ 
https://issues.apache.org/jira/browse/HBASE-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986942#comment-13986942
 ] 

stack commented on HBASE-11099:
-------------------------------

Looking, the ring blocks on a sequence until it is published, so the ring
buffer sequence reflects the order in which the items are taken off the ring
buffer on the other side by the single consumer thread.  We could exploit this
by keeping the current ring buffer sequence and then upping the region sequence
id by the difference.  But the region sequence id would then cover appended
edits only, not synced edits.  Wouldn't this make the region sequence unusable
for mvcc (presuming mvcc is about deciding when to make sync'd edits readable)?

Using the ringbuffer sequence as the region seqid won't work, though I suggest
it above; they need to be disassociated since their scopes are different
(HRegion and HRegionServer).  Tying them together as I suggest above will cause
headaches at region open time, trying to keep the ringbuffer ahead of all
current edits.
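
To make the scope mismatch concrete, here is a rough sketch, not HBase code.
Ring and Region are hypothetical stand-ins for the server-wide ring buffer
sequence and the per-region sequence id, and append() is the "up by the
difference" idea from the first paragraph.  It assumes a region can open with a
sequence id that is already far ahead of the ring.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SeqIdScopes {
    /** Hypothetical stand-in for the ring buffer sequence (one per HRegionServer). */
    static final class Ring {
        private final AtomicLong sequence = new AtomicLong(0);
        long publish() { return sequence.incrementAndGet(); }
        long current() { return sequence.get(); }
    }

    /** Hypothetical stand-in for a region's sequence id (one per HRegion). */
    static final class Region {
        private final Ring ring;
        private long seqId;        // region sequence id
        private long lastRingSeq;  // ring sequence at this region's previous append

        Region(Ring ring, long openSeqId) {
            this.ring = ring;
            this.seqId = openSeqId;
            this.lastRingSeq = ring.current();
        }

        /** Advance the region seqid by however far the ring moved since last time. */
        long append() {
            long ringSeq = ring.publish();
            seqId += ringSeq - lastRingSeq;
            lastRingSeq = ringSeq;
            return seqId;
        }
    }

    /** If the ring sequence were used directly as the seqid, it would have to be
     *  at or past the sequence id the region opens with. */
    static boolean ringCoversRegion(Ring ring, long openSeqId) {
        return ring.current() >= openSeqId;
    }

    public static void main(String[] args) {
        Ring ring = new Ring();
        Region a = new Region(ring, 100); // opens with a large seqid
        Region b = new Region(ring, 5);
        System.out.println("a: " + a.append());
        System.out.println("b: " + b.append());
        System.out.println("a: " + a.append());
        System.out.println("ring covers a's open seqid? " + ringCoversRegion(ring, 100));
    }
}
```

With the difference approach each region stays monotonic but picks up gaps from
other regions' appends.  With direct use of the ring sequence, the ring would
first have to be pushed past every opening region's sequence id.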

> Two situations where we could open a region with smaller sequence number
> ------------------------------------------------------------------------
>
>                 Key: HBASE-11099
>                 URL: https://issues.apache.org/jira/browse/HBASE-11099
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.99.0
>            Reporter: Jeffrey Zhong
>             Fix For: 0.99.0
>
>
> Recently I ran into code where we could potentially open a region with a
> smaller sequence number:
> 1) Inside function HRegion#internalFlushcache. This is because we changed the
> way WAL sync works to use late binding (the sequence number is assigned right
> before WAL sync).
> The flushSeqId may be less than the sequence numbers of changes included in
> the flush, which may cause later region-opening code to use a
> smaller-than-expected sequence number when we reopen the region.
> {code}
> flushSeqId = this.sequenceId.incrementAndGet();
> ...
> mvcc.waitForRead(w);
> {code}
> 2) HRegion#replayRecoveredEdits, where we have the following code:
> {code}
> ...
>           if (coprocessorHost != null) {
>             status.setStatus("Running pre-WAL-restore hook in coprocessors");
>             if (coprocessorHost.preWALRestore(this.getRegionInfo(), key, val)) {
>               // if bypass this log entry, ignore it ...
>               continue;
>             }
>           }
> ...
>           currentEditSeqId = key.getLogSeqNum();
> {code} 
> If a coprocessor skips some tail WALEdits, then the function will return a
> smaller currentEditSeqId. In the end, a region may also open with a smaller
> sequence number. This may cause data loss because the Master may record a
> larger flushed sequence id and some WALEdits may be skipped during recovery
> if the region fails again.
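
The second situation in the quoted description can be reproduced with a small
sketch.  This is not HBase code and not necessarily how the issue was fixed.
Entry and both replay methods are hypothetical; the bypassedByHook flag stands
in for preWALRestore returning true.  The first method mirrors the quoted loop,
and the second shows one possible alternative that records the sequence number
before the bypass check.

```java
import java.util.Arrays;
import java.util.List;

public class ReplaySeqIdSketch {
    /** Hypothetical WAL entry: its log sequence number and whether a hook bypasses it. */
    static final class Entry {
        final long logSeqNum;
        final boolean bypassedByHook;

        Entry(long logSeqNum, boolean bypassedByHook) {
            this.logSeqNum = logSeqNum;
            this.bypassedByHook = bypassedByHook;
        }
    }

    /** Mirrors the quoted loop: bypassed entries never update currentEditSeqId. */
    static long replayAsQuoted(List<Entry> entries) {
        long currentEditSeqId = -1;
        for (Entry e : entries) {
            if (e.bypassedByHook) {
                continue; // skipped before the sequence number is recorded
            }
            currentEditSeqId = e.logSeqNum;
        }
        return currentEditSeqId;
    }

    /** Alternative (an assumption): record the sequence number even for bypassed entries. */
    static long replayTrackingAll(List<Entry> entries) {
        long currentEditSeqId = -1;
        for (Entry e : entries) {
            currentEditSeqId = Math.max(currentEditSeqId, e.logSeqNum);
            if (e.bypassedByHook) {
                continue; // edit is not applied, but its sequence number still counts
            }
            // the edit would be applied to the region here
        }
        return currentEditSeqId;
    }

    public static void main(String[] args) {
        // The tail entry (seq 12) is bypassed by the hook.
        List<Entry> entries = Arrays.asList(
            new Entry(10, false), new Entry(11, false), new Entry(12, true));
        System.out.println("as quoted:    " + replayAsQuoted(entries));
        System.out.println("tracking all: " + replayTrackingAll(entries));
    }
}
```

With the tail entry bypassed, the quoted loop returns the sequence number of the
last applied edit, which is lower than the highest sequence number actually
present in the recovered edits.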



--
This message was sent by Atlassian JIRA
(v6.2#6252)
