[
https://issues.apache.org/jira/browse/BOOKKEEPER-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777289#comment-13777289
]
Rakesh R commented on BOOKKEEPER-685:
-------------------------------------
I'm looking at the code below, where scannerFactory also flushes the
ledgerCache. In the worst case, when the compaction thread finishes compaction
and sees flushed=true, it will flush out the ledgerCache. An immediate bookie
crash after that can lead to trouble. Hope I'm not confusing you guys.
{code}
GarbageCollectorThread#doCompactEntryLogs() {
    // ...
    compactEntryLog(scannerFactory, meta);
    // ...
    // compaction finished, flush any outstanding offsets
    scannerFactory.flush();
    // ...
}
{code}
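To make the concern concrete, here is a minimal sketch (not the actual BookKeeper code) that replays steps 1-4 of the race from the issue description on a single thread. It compares a boolean flushed flag with the offset-based notification proposed there. The class name, method names, the single entry log, and the byte offsets are all hypothetical simplifications for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model, not BookKeeper code: replays the interleaving deterministically. */
public class CompactionRaceSketch {

    /** Flag-based check: returns true if the ledger cache update would be allowed. */
    static boolean flagBasedAllowsIndexUpdate() {
        AtomicBoolean flushed = new AtomicBoolean(true);
        flushed.set(false); // 1) compaction thread resets the flag
        flushed.set(true);  // 2) rotation callback reports a sync that covers older data only
        // 3) compaction thread appends the compacted entry; it is NOT synced yet
        // 4) the flag is still true from step 2, so the wait returns immediately
        return flushed.get();
    }

    /** Offset-based check: the listener reports how far the entry log has been synced. */
    static boolean offsetBasedAllowsIndexUpdate() {
        AtomicLong syncedUpTo = new AtomicLong(0);
        long logEnd = 0;                    // assumed end of the log before the append
        syncedUpTo.set(logEnd);             // 2) rotation callback reports "synced up to logEnd"
        long entryEndOffset = logEnd + 100; // 3) compacted entry appended (size is made up)
        // 4) only update the ledger cache once the sync point covers the new entry
        return syncedUpTo.get() >= entryEndOffset;
    }

    public static void main(String[] args) {
        System.out.println("flag-based allows index update:   " + flagBasedAllowsIndexUpdate());
        System.out.println("offset-based allows index update: " + offsetBasedAllowsIndexUpdate());
    }
}
```

In this model the flag-based check returns true even though the new entry was never synced, which is the window where a crash could lose it. The offset-based check returns false until a later sync notification reports an offset at or past the end of the entry.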
> Race in compaction algorithm from BOOKKEEPER-664
> ------------------------------------------------
>
> Key: BOOKKEEPER-685
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-685
> Project: Bookkeeper
> Issue Type: Bug
> Reporter: Ivan Kelly
> Priority: Blocker
> Fix For: 4.2.2
>
>
> I discovered a race in the algorithm when I was forward porting to trunk.
> 1) Thread1: flushed.set(false)
> 2) Thread2: onRotateEntryLog() // flushed.set(true)
> 3) Thread1: entryLogger addEntry L123-E456
> 4) Thread1: offsets > max, waits for flushed, flushed is true (as set in 2),
> L123-E456 updated in ledger cache
> 5) Thread2: L123 flushed out of ledger cache
> 6) Crash
> This will possibly lose 1 entry. I've only reasoned this, not observed it,
> but it can happen.
> The fix is pretty easy. EntryLoggerListener should notify with the offset
> in the entry log up to which it has synced.
>