[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776159#comment-13776159
 ] 

Rakesh R commented on BOOKKEEPER-685:
-------------------------------------

Thanks [~ikelly] for the finding. If I understand the problem statement 
correctly, it's a race between the SyncThread flushing and the compaction logic.

bq.the problem here is you set flush to false ahead of adding entry to entry 
logger. the better way is to set flushed to false after you add something to 
entry logger, no?

[~hustlmsp], thanks for the simple proposal. I can see one possible race 
condition in this approach: #onRotateEntryLog can overwrite 'flushed' with 
true right after it was set to false following #addEntry. Could you please 
look at the execution sequence below? If you agree, just moving the 
flushed=false step would not really help us, no?

Th1: addEntry
Th2: onRotateEntryLog
Th1: setFlushed=false
Th2: setFlushed=true
Th1: offsets > max, waits for flushed; as flushed is true, will update in 
ledger cache
Th2: flushed out
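
To make the ordering concrete, here is a minimal single-threaded replay of the 
sequence above. It is only a sketch: the class name, the trace list and the 
plain AtomicBoolean are hypothetical stand-ins, not the actual BookKeeper code. 
It shows only that Th1's wait on 'flushed' is satisfied immediately, even though 
Th1 set the flag to false after #addEntry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical replay of the interleaving; not the BookKeeper classes.
class FlushedFlagInterleaving {
    static final List<String> trace = new ArrayList<>();

    // Returns true if Th1 would stop waiting and update the ledger cache.
    static boolean replay() {
        trace.clear();
        AtomicBoolean flushed = new AtomicBoolean(true);

        trace.add("Th1: addEntry");
        trace.add("Th2: onRotateEntryLog");

        flushed.set(false);                       // Th1: setFlushed=false
        trace.add("Th1: flushed=" + flushed.get());

        flushed.set(true);                        // Th2: setFlushed=true
        trace.add("Th2: flushed=" + flushed.get());

        // Th1: offsets > max, so it checks the flag before touching the cache.
        boolean proceeds = flushed.get();
        trace.add("Th1: proceeds=" + proceeds);
        return proceeds;
    }

    public static void main(String[] args) {
        replay();
        trace.forEach(System.out::println);
    }
}
```

The boolean alone cannot tell Th1 whether the rotation it observed covers the 
entry it just added, which is why reordering the set(false) call does not seem 
sufficient.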
                
> Race in compaction algorithm from BOOKKEEPER-664
> ------------------------------------------------
>
>                 Key: BOOKKEEPER-685
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-685
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Ivan Kelly
>            Priority: Blocker
>             Fix For: 4.2.2
>
>
> I discovered a race in the algorithm when I was forward porting to trunk.
> 1) Thread1: flushed.set(false)
> 2) Thread2: onRotateEntryLog() // flushed.set(true)
> 3) Thread1: entryLogger addEntry L123-E456
> 4) Thread1: offsets > max, waits for flushed, flushed is true(as set in 2), 
> L123-E456 updated in ledger cache
> 5) Thread2: L123 flushed out of ledger cache
> 6) Crash
> This can possibly lose 1 entry. I've only reasoned this, not observed it, 
> but it can happen.
> The fix is pretty easy. EntryLoggerListener should notify with the offset 
> in the entry log up to which it has synced. 
>       
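
The fix proposed in the issue description could be sketched roughly as below. 
This is an assumption-laden illustration, not the real EntryLoggerListener API: 
the class SyncedOffsetTracker and the methods onSynced and isDurable are 
hypothetical names, and a single entry log is assumed (log ids and rotation are 
ignored). The idea is that the waiter compares its own entry's offset against 
the synced offset instead of reading a shared boolean.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper; names are illustrative and not from BookKeeper.
class SyncedOffsetTracker {
    // Highest entry log offset reported as synced so far.
    private final AtomicLong syncedUpTo = new AtomicLong(0);

    // Would be called by the sync side with the offset it has synced up to.
    // Taking the max keeps the value monotonic if notifications arrive late.
    void onSynced(long offset) {
        syncedUpTo.accumulateAndGet(offset, Math::max);
    }

    // The compaction side asks whether the entry ending at this offset
    // is covered by a sync before updating the ledger cache.
    boolean isDurable(long entryEndOffset) {
        return entryEndOffset <= syncedUpTo.get();
    }
}
```

With this shape, a rotation that completed before the entry was added cannot 
satisfy the check, because its reported offset is smaller than the entry's 
end offset.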

