[
https://issues.apache.org/jira/browse/BOOKKEEPER-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238075#comment-13238075
]
Sijie Guo commented on BOOKKEEPER-193:
--------------------------------------
This issue is a bug in the garbage collection logic. Currently, garbage
collection first fetches the list of all ledgers from ZooKeeper, then fetches
the active ledgers from the bookie, and finally garbage collects those active
ledgers that are not in the ZooKeeper list. There is a window between fetching
the list from ZooKeeper and fetching the list from the bookie; if a ledger is
created in this window, it is garbage collected by mistake.
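To make the window concrete, here is a minimal sketch of the race using plain
in-memory sets. The class and method names (GcRaceSketch, toCollect) are
hypothetical and are not BookKeeper code; the two sets only stand in for the
ZooKeeper ledger list and the bookie's active ledgers.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not BookKeeper code.
class GcRaceSketch {
    // Ledgers a naive gc pass would delete: active on the bookie but
    // missing from the ZooKeeper listing.
    static Set<Long> toCollect(Set<Long> zkListing, Set<Long> bookieActive) {
        Set<Long> garbage = new TreeSet<>(bookieActive);
        garbage.removeAll(zkListing);
        return garbage;
    }

    public static void main(String[] args) {
        Set<Long> zk = new TreeSet<>(Arrays.asList(1L, 2L));
        Set<Long> bookie = new TreeSet<>(Arrays.asList(1L, 2L));

        // Step 1: gc fetches the list of all ledgers from ZooKeeper.
        Set<Long> zkListing = new TreeSet<>(zk);

        // A client creates ledger 3 between the two fetches.
        zk.add(3L);
        bookie.add(3L);

        // Step 2: gc fetches the active ledgers from the bookie.
        Set<Long> bookieActive = new TreeSet<>(bookie);

        // Ledger 3 is live, but the stale ZooKeeper listing does not contain it.
        System.out.println(toCollect(zkListing, bookieActive)); // prints [3]
    }
}
```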
For FlatLedgerManager, this issue can be fixed easily. Since ledgers are
created in sequence, we can record the max ledger id when fetching the list of
all ledgers from ZooKeeper. During garbage collection, ledgers whose ids are
larger than that max ledger id are not garbage collected until the next
garbage collection runs.
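A rough sketch of that max ledger id guard, again over plain sets. The names
(MaxIdGuardSketch, toCollect) are hypothetical, and taking the max from the
listing itself is an assumption of this sketch; the real fix may obtain it
differently.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not BookKeeper code.
class MaxIdGuardSketch {
    // Collect only ledgers that are missing from the ZooKeeper listing AND
    // whose id is not larger than the max id seen in that listing.
    static Set<Long> toCollect(Set<Long> zkListing, Set<Long> bookieActive) {
        Set<Long> garbage = new TreeSet<>();
        if (zkListing.isEmpty()) {
            // No max id is known; this sketch conservatively collects nothing.
            return garbage;
        }
        long maxId = Collections.max(zkListing);
        for (long id : bookieActive) {
            // Ids above maxId may have been created after the listing was
            // taken, so they are left for the next gc pass.
            if (id <= maxId && !zkListing.contains(id)) {
                garbage.add(id);
            }
        }
        return garbage;
    }

    public static void main(String[] args) {
        Set<Long> zkListing = new TreeSet<>(Arrays.asList(1L, 3L));
        Set<Long> bookieActive = new TreeSet<>(Arrays.asList(1L, 2L, 3L, 4L));
        // Ledger 2 was deleted (below the max, not listed); ledger 4 is deferred.
        System.out.println(toCollect(zkListing, bookieActive)); // prints [2]
    }
}
```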
For HierarchicalLedgerManager, it is different, because id generation and
ledger creation are two separate operations that run asynchronously. One
possible solution is to fetch a copy of the active ledgers from the bookie first
(requests that come in after this fetch should not be added to the list of
active ledgers used for gc), and then fetch the list of all ledgers from
ZooKeeper, which ensures we get the right list of all ledgers from ZooKeeper.
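The reordering can be sketched as follows. The names (SnapshotFirstSketch,
toCollect, zkLister) are hypothetical, and the Supplier only stands in for the
ZooKeeper listing call; a real bookie would also need to synchronize the copy
against concurrent writers.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Supplier;

// Hypothetical illustration only; not BookKeeper code.
class SnapshotFirstSketch {
    static Set<Long> toCollect(Set<Long> liveBookieLedgers,
                               Supplier<Set<Long>> zkLister) {
        // 1. Copy the bookie's active ledgers first. Ledgers that arrive
        //    after this point are not part of the gc candidates.
        Set<Long> candidates = new TreeSet<>(liveBookieLedgers);
        // 2. Only then fetch the list of all ledgers from ZooKeeper.
        Set<Long> zkListing = zkLister.get();
        // 3. Collect candidates that ZooKeeper no longer knows about.
        candidates.removeAll(zkListing);
        return candidates;
    }

    public static void main(String[] args) {
        // Ledger 2 has been deleted from ZooKeeper but still exists on the bookie.
        Set<Long> zk = new TreeSet<>(Arrays.asList(1L));
        Set<Long> bookie = new TreeSet<>(Arrays.asList(1L, 2L));

        // Simulate ledger 3 being created while gc is listing ZooKeeper.
        Supplier<Set<Long>> zkLister = () -> {
            zk.add(3L);
            bookie.add(3L);
            return new TreeSet<>(zk);
        };

        // Ledger 3 is not in the earlier copy, so it cannot be collected.
        System.out.println(toCollect(bookie, zkLister)); // prints [2]
    }
}
```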
> Ledger is garbage collected by mistake.
> ---------------------------------------
>
> Key: BOOKKEEPER-193
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-193
> Project: Bookkeeper
> Issue Type: Bug
> Components: bookkeeper-server
> Reporter: Sijie Guo
> Fix For: 4.1.0
>
>
> Currently, we have encountered the following case: a ledger is garbage
> collected by mistake, and subsequent requests fail with NoLedgerException.
> {code}
> 2012-03-23 19:10:47,403 - INFO [GarbageCollectorThread:GarbageCollectorThread@234] - Garbage collecting deleted ledger index files.
> 2012-03-23 19:10:48,702 - INFO [GarbageCollectorThread:LedgerCache@544] - Deleting ledgerId: 89408
> 2012-03-23 19:10:48,703 - INFO [GarbageCollectorThread:LedgerCache@577] - Deleted ledger : 89408
> 2012-03-23 19:11:10,013 - ERROR [NIOServerFactory-3181:BookieServer@361] - Error writing 1@89408
> org.apache.bookkeeper.bookie.Bookie$NoLedgerException: Ledger 89408 not found
> at org.apache.bookkeeper.bookie.LedgerCache.getFileInfo(LedgerCache.java:228)
> at org.apache.bookkeeper.bookie.LedgerCache.updatePage(LedgerCache.java:260)
> at org.apache.bookkeeper.bookie.LedgerCache.putEntryOffset(LedgerCache.java:158)
> at org.apache.bookkeeper.bookie.LedgerDescriptor.addEntry(LedgerDescriptor.java:135)
> at org.apache.bookkeeper.bookie.Bookie.addEntryInternal(Bookie.java:1059)
> at org.apache.bookkeeper.bookie.Bookie.addEntry(Bookie.java:1099)
> at org.apache.bookkeeper.proto.BookieServer.processPacket(BookieServer.java:357)
> at org.apache.bookkeeper.proto.NIOServerFactory$Cnxn.readRequest(NIOServerFactory.java:315)
> at org.apache.bookkeeper.proto.NIOServerFactory$Cnxn.doIO(NIOServerFactory.java:213)
> at org.apache.bookkeeper.proto.NIOServerFactory.run(NIOServerFactory.java:124)
> {code}