[
https://issues.apache.org/jira/browse/BOOKKEEPER-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776159#comment-13776159
]
Rakesh R commented on BOOKKEEPER-685:
-------------------------------------
Thanks [~ikelly] for the finding. If I understand the problem statement
correctly, it's a race between SyncThread flushing and the compaction logic.
bq. The problem here is that you set flushed to false ahead of adding the
entry to the entry logger. The better way is to set flushed to false after you
add something to the entry logger, no?
[~hustlmsp], thanks for the simple proposal. With that approach I can see one
possible race condition: #onRotateEntryLog can override 'flushed' to true
right after it was set to false following #addEntry. Could you please look at
the execution sequence below? If you agree, just moving the flushed=false
step would not really help us, would it?
Th1: addEntry
Th2: onRotateEntryLog
Th1: setFlushed=false
Th2: setFlushed=true
Th1: offsets > max, waits for flushed; as flushed is true, it updates the
ledger cache
Th2: flushed out
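
To make the ordering concrete, here is a minimal single-threaded replay of the sequence above. It is only a sketch: the class and method names are hypothetical, this is not BookKeeper code, and it assumes 'flushed' is an AtomicBoolean (as the flushed.set(...) calls in the issue description suggest). It shows only that Th1's later check sees true even though Th1 itself just set the flag to false.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical replay of the interleaving; not actual BookKeeper code. */
public class FlushedFlagRace {

    /** Replays the steps in order and returns the flag value Th1 observes. */
    public static boolean replay() {
        // Assumption: the shared flag starts out as "everything is flushed".
        AtomicBoolean flushed = new AtomicBoolean(true);

        // Th1: addEntry (the entry is written to the entry log, not yet synced).
        // Th2: onRotateEntryLog starts and is about to mark the log as flushed.

        flushed.set(false); // Th1: setFlushed=false, after addEntry
        flushed.set(true);  // Th2: setFlushed=true, overriding Th1's write

        // Th1: offsets > max, checks the flag before updating the ledger cache.
        return flushed.get();
    }

    public static void main(String[] args) {
        System.out.println("Th1 saw flushed=" + replay()); // prints "Th1 saw flushed=true"
    }
}
```

A single boolean cannot tell Th1 which entries the rotation actually covered, so reordering the two writes within Th1 does not close the window.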
> Race in compaction algorithm from BOOKKEEPER-664
> ------------------------------------------------
>
> Key: BOOKKEEPER-685
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-685
> Project: Bookkeeper
> Issue Type: Bug
> Reporter: Ivan Kelly
> Priority: Blocker
> Fix For: 4.2.2
>
>
> I discovered a race in the algorithm when I was forward porting to trunk.
> 1) Thread1: flushed.set(false)
> 2) Thread2: onRotateEntryLog() // flushed.set(true)
> 3) Thread1: entryLogger addEntry L123-E456
> 4) Thread1: offsets > max, waits for flushed, flushed is true (as set in 2),
> L123-E456 updated in ledger cache
> 5) Thread2: L123 flushed out of ledger cache
> 6) Crash
> This will possibly lose 1 entry. I've only reasoned this, not observed it,
> but it can happen.
> The fix is pretty easy. EntryLoggerListener should notify with the offset
> in the entry log up to which it has synced.
>
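
The fix proposed in the description (notify with the synced offset rather than a bare flag) could look roughly like the sketch below. Everything here is an assumption for illustration: SyncedOffsetTracker, onSynced and isDurable are hypothetical names, not the real EntryLoggerListener API, and an entry log position is simplified to a single long that is treated as inclusive.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical offset-based replacement for the boolean 'flushed' flag. */
public class SyncedOffsetTracker {

    // Highest entry log offset known to be synced; -1 means nothing synced yet.
    private final AtomicLong syncedUpTo = new AtomicLong(-1L);

    /** Hypothetical listener callback: the log is synced up to 'offset'. */
    public void onSynced(long offset) {
        // Keep the maximum so a late or stale notification cannot move it backwards.
        syncedUpTo.accumulateAndGet(offset, Math::max);
    }

    /** True only if the entry at 'entryOffset' is covered by a completed sync. */
    public boolean isDurable(long entryOffset) {
        return entryOffset <= syncedUpTo.get();
    }

    public static void main(String[] args) {
        SyncedOffsetTracker tracker = new SyncedOffsetTracker();
        tracker.onSynced(100L);        // Thread2: sync covered offsets up to 100
        long entryOffset = 150L;       // Thread1: entry added afterwards at 150
        System.out.println(tracker.isDurable(entryOffset)); // prints "false"
        tracker.onSynced(200L);        // a later sync covers the entry
        System.out.println(tracker.isDurable(entryOffset)); // prints "true"
    }
}
```

With this shape, the compaction thread would update the ledger cache for an entry only once isDurable returns true for that entry's offset, so an earlier rotation cannot vouch for an entry it did not cover.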