[
https://issues.apache.org/jira/browse/BOOKKEEPER-636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ivan Kelly updated BOOKKEEPER-636:
----------------------------------
Attachment: 0001-BOOKKEEPER-636-Latest-txn-logs-might-be-deleted-in-a.patch
The createNewFile change is very small, and ensures that two processes don't
make a mess of each other's journals. I'm not sure two processes could even get
this far, but it's better to be safe than sorry. (I've attached a patch which
includes this change.)
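For anyone not familiar with it, here is a minimal sketch of the
create-if-absent behaviour that java.io.File.createNewFile provides. This is
not the code from the patch; the class name, the claim() helper and the
hex-id ".txn" file name are assumptions made purely for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class JournalFileClaim {
    /**
     * Tries to claim a journal file for the given log id.
     * Returns true only if this call created the file, false if a file
     * with that name already existed (for example, created by another process).
     * The "<hex id>.txn" naming is an assumption for this sketch.
     */
    static boolean claim(File journalDir, long logId) throws IOException {
        File f = new File(journalDir, Long.toHexString(logId) + ".txn");
        // createNewFile checks for existence and creates the file atomically.
        return f.createNewFile();
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("journal-demo").toFile();
        boolean first = claim(dir, 42L);   // creates the file
        boolean second = claim(dir, 42L);  // file already exists
        System.out.println("first=" + first + " second=" + second);
    }
}
```

A caller that gets false back can refuse to write to that file instead of
silently sharing it with whoever created it first.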
> Latest txn logs might be deleted in a race condition which is not recoverable
> if BK goes down before next txn log created.
> --------------------------------------------------------------------------------------------------------------------------
>
> Key: BOOKKEEPER-636
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-636
> Project: Bookkeeper
> Issue Type: Bug
> Components: bookkeeper-server
> Affects Versions: 4.2.1, 4.3.0
> Reporter: Vinay
> Assignee: Vinay
> Priority: Blocker
> Fix For: 4.2.2, 4.3.0
>
> Attachments:
> 0001-BOOKKEEPER-636-Latest-txn-logs-might-be-deleted-in-a.patch,
> BOOKKEEPER-636.diff, BOOKKEEPER-636.diff, BOOKKEEPER-636.patch,
> BOOKKEEPER-636.patch
>
>
> With the following scenario, the latest transaction log can be deleted.
> 1. More than {{journalMaxBackups}} txn logs are present in the journal dir.
> 2. The BK machine has been up for a long time, and the latest txn log id is a
> fairly large number.
> 3. The machine is rebooted.
> 4. After the reboot, BK is restarted.
> 5. Immediately after startup, one entry is added, which causes the SyncThread
> to roll the lastMark in the ledger dirs before the Journal thread has updated
> lastLogId. (This lastMark still holds the old logId from before the reboot.)
> 6. After the roll, old journal txn logs are gc'ed. *The most recently created
> txn log is deleted.*
> 7. After this, the Journal thread updates the lastLogMark, and some more rolls
> happen.
> 8. BK is restarted again, but it cannot start because it cannot find the
> latest txn log file in the journal directory.
> {noformat}java.io.IOException: Recovery log 264564 is missing
> at org.apache.bookkeeper.bookie.Journal.replay(Journal.java:424)
> at org.apache.bookkeeper.bookie.Bookie.readJournal(Bookie.java:547)
> at org.apache.bookkeeper.bookie.Bookie.start(Bookie.java:603)
> at org.apache.bookkeeper.proto.BookieServer.start(BookieServer.java:111){noformat}
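The scenario in the description can be modelled with a small standalone
sketch. This is not BookKeeper's actual GC code: the class, the
selectForDeletion() helper and the example ids are hypothetical. It also
assumes that the log id created after the reboot is numerically smaller than
the pre-reboot ids, which the description implies but does not state outright.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class JournalGcSketch {
    /**
     * Hypothetical model of a mark-based journal GC: every log id below the
     * mark is a candidate, and all but the newest maxBackups candidates
     * are selected for deletion.
     */
    static List<Long> selectForDeletion(List<Long> logIds, long markLogId, int maxBackups) {
        List<Long> older = new ArrayList<>();
        for (long id : logIds) {
            if (id < markLogId) {
                older.add(id);
            }
        }
        Collections.sort(older);
        int excess = Math.max(0, older.size() - maxBackups);
        return new ArrayList<>(older.subList(0, excess));
    }

    public static void main(String[] args) {
        // 500 is the log created after the reboot (assumed smaller than old ids).
        List<Long> logs = Arrays.asList(500L, 900000L, 900001L, 900002L, 900003L);

        // Stale mark from before the reboot: the live log 500 looks "old".
        System.out.println(selectForDeletion(logs, 900003L, 2));

        // Mark already updated to the new log: nothing is below it.
        System.out.println(selectForDeletion(logs, 500L, 2));
    }
}
```

With the stale mark the new log sorts first among the candidates and is
selected for deletion; once the mark points at the new log, nothing is
selected. That matches steps 5 to 8 above under the stated assumption.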
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira