[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778426#comment-13778426
 ] 

Sijie Guo commented on BOOKKEEPER-685:
--------------------------------------

{code}
Th1 - compaction thread
Th2 - SyncThread

1) Th1: addEntry and sets flushed.set(false); // Consider that added entry is 
the 'last entry' of the last ledger participated in compaction. After this, 
compaction would move to flush.
2) Th2: onRotateEntryLog and sets flushed.set(true);
3) Th1: scannerFactory.flush(); // since it sees flushed==true, it will iterate 
over the offsets and flush out
4) Th1: removeEntryLog
5) server crashed

In the above sequence, I could see a possible loss of 'last entry' which is not 
flushed into the entry logger. Any thoughts?
{code}

First, to clarify: Th2 could never be the SyncThread.

In step 2, when the entry logger rotates the entry log, it has already flushed 
the previous entry log, which means the entry added by Th1 is flushed. I don't 
see how the last entry would be lost. Again, I already explained your case in 
my previous comment (right after your first question).

{quote}
In the 4.2 branch, if addEntry happens after #onRotateEntryLog, the GCThread 
will set flushed to false again; if addEntry happens before #onRotateEntryLog, 
the entry is already flushed.
{quote}
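
To make the two orderings in the quote concrete, here is a minimal single-threaded sketch. It is a toy model, not the BookKeeper code: the class FlushFlagModel and its fields are hypothetical, and it assumes that the flag is cleared after the entry is appended and that rotation flushes the previous entry log before setting the flag.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical toy model of the flushed flag; not the actual BookKeeper classes.
class FlushFlagModel {
    private final AtomicBoolean flushed = new AtomicBoolean(true);
    // Entries written to the current entry log but not yet flushed.
    private final List<String> currentLog = new ArrayList<>();
    // Entries that have been flushed and would survive a crash.
    private final Set<String> durable = new HashSet<>();

    // Compaction thread: append the entry, then mark the state as unflushed.
    void addEntry(String entry) {
        currentLog.add(entry);
        flushed.set(false);
    }

    // Rotation (assumption): the previous entry log is flushed first,
    // and only then is the flag set to true.
    void onRotateEntryLog() {
        durable.addAll(currentLog);
        currentLog.clear();
        flushed.set(true);
    }

    boolean isFlushed() {
        return flushed.get();
    }

    boolean isDurable(String entry) {
        return durable.contains(entry);
    }

    public static void main(String[] args) {
        // Case 1: addEntry happens before the rotation.
        FlushFlagModel before = new FlushFlagModel();
        before.addEntry("L123-E456");
        before.onRotateEntryLog();
        System.out.println("add before rotate: flushed=" + before.isFlushed()
                + " durable=" + before.isDurable("L123-E456"));

        // Case 2: addEntry happens after the rotation.
        FlushFlagModel after = new FlushFlagModel();
        after.onRotateEntryLog();
        after.addEntry("L123-E456");
        System.out.println("add after rotate: flushed=" + after.isFlushed()
                + " durable=" + after.isDurable("L123-E456"));
    }
}
```

In this model the entry is either already durable when flushed is true, or flushed is false again and compaction still has to wait. The model does not cover an ordering in which the flag is cleared before the entry is appended.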
                
> Race in compaction algorithm from BOOKKEEPER-664
> ------------------------------------------------
>
>                 Key: BOOKKEEPER-685
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-685
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Ivan Kelly
>            Priority: Blocker
>             Fix For: 4.2.2
>
>
> I discovered a race in the algorithm when I was forward porting to trunk.
> 1) Thread1: flushed.set(false)
> 2) Thread2: onRotateEntryLog() // flushed.set(true)
> 3) Thread1: entryLogger addEntry L123-E456
> 4) Thread1: offsets > max, waits for flushed, flushed is true (as set in 2), 
> L123-E456 updated in ledger cache
> 5) Thread2: L123 flushed out of ledger cache
> 6) Crash
> This could lose one entry. I've only reasoned about this, not observed it, 
> but it can happen.
> The fix is pretty easy: EntryLoggerListener should notify with the offset 
> in the entry log up to which it has synced. 
>       
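
The fix proposed at the end of the issue description (notify with the synced offset instead of a boolean) could look roughly like the sketch below. All names here are hypothetical (SyncedPositionTracker, onSynced, isDurable); this is not the EntryLoggerListener interface. It assumes an entry is identified by the entry log id and the offset at which it ends, and that a newer log id implies all older logs are fully synced.

```java
// Hypothetical sketch of offset-based flush tracking; names are not from BookKeeper.
class SyncedPositionTracker {
    private long syncedLogId = -1;
    private long syncedOffset = -1;

    // Called with the position the entry log has synced up to.
    // Positions only move forward; stale notifications are ignored.
    synchronized void onSynced(long logId, long offset) {
        if (logId > syncedLogId || (logId == syncedLogId && offset > syncedOffset)) {
            syncedLogId = logId;
            syncedOffset = offset;
        }
    }

    // True if an entry ending at (logId, endOffset) is covered by the last sync.
    // Assumption: every log older than syncedLogId is fully synced.
    synchronized boolean isDurable(long logId, long endOffset) {
        return logId < syncedLogId
                || (logId == syncedLogId && endOffset <= syncedOffset);
    }

    public static void main(String[] args) {
        SyncedPositionTracker tracker = new SyncedPositionTracker();

        // A sync is reported before the entry is added (steps 2 and 3 of the race).
        tracker.onSynced(7, 1024);
        // The entry ends at offset 2048, beyond the synced position.
        System.out.println("covered by first sync: " + tracker.isDurable(7, 2048));

        // Only a later sync that covers the entry makes it safe to index.
        tracker.onSynced(7, 4096);
        System.out.println("covered by second sync: " + tracker.isDurable(7, 2048));
    }
}
```

With a position rather than a flag, a notification that arrives before the entry is written cannot be mistaken for one that covers it.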

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
