[
https://issues.apache.org/jira/browse/BOOKKEEPER-152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ivan Kelly updated BOOKKEEPER-152:
----------------------------------
Attachment: BOOKKEEPER-152.diff
The proposed fix ensures that at least one bookie from each possible quorum
replies to ReadLastConfirmed.
It also refactors the code a bit so that recovery and the standalone read
last confirmed share a common ReadLastConfirmed implementation.
The bug here was that we were waiting for quorumSize responses from the
bookies, when all we really need is a response from one bookie in each
possible quorum. In the 2/2 case described in the issue, this means only 1
bookie needs to respond.
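
The coverage condition can be sketched roughly as follows. This is only an
illustration, not the code in the attached patch: the class QuorumCoverage
and the method coversAllQuorums are hypothetical names, and the sketch
assumes entries are striped round-robin, so that each possible quorum is
quorumSize consecutive bookies in the ensemble (wrapping around).

```java
/** Hypothetical helper for illustration; not part of the BookKeeper client. */
public final class QuorumCoverage {
    private QuorumCoverage() {}

    /**
     * Returns true if every possible quorum contains at least one bookie
     * that has responded.
     *
     * Assumption: round-robin striping, i.e. the quorum starting at index
     * "start" is bookies start, start+1, ..., start+quorumSize-1, taken
     * modulo ensembleSize.
     *
     * @param responded responded[i] is true if bookie i of the ensemble replied
     */
    public static boolean coversAllQuorums(int ensembleSize, int quorumSize,
                                           boolean[] responded) {
        if (quorumSize < 1 || quorumSize > ensembleSize
                || responded.length != ensembleSize) {
            throw new IllegalArgumentException("inconsistent ensemble/quorum sizes");
        }
        for (int start = 0; start < ensembleSize; start++) {
            boolean anyResponded = false;
            for (int k = 0; k < quorumSize; k++) {
                if (responded[(start + k) % ensembleSize]) {
                    anyResponded = true;
                    break;
                }
            }
            if (!anyResponded) {
                // Some quorum is made up entirely of silent bookies, so an
                // entry written only to that quorum could be missed.
                return false;
            }
        }
        return true;
    }
}
```

With ensemble size 2 and quorum size 2, the only possible quorum is
<bk1, bk2>, so a reply from bk1 alone satisfies the check even though bk2 is
down. Waiting for quorumSize (2) replies would never complete in that case.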
The patch also includes a fix for the timeouts and an improvement in fencing,
both of which address problems that fixing this bug uncovered.
> Can't recover a ledger whose current ensemble contain failed bookie.
> --------------------------------------------------------------------
>
> Key: BOOKKEEPER-152
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-152
> Project: Bookkeeper
> Issue Type: Bug
> Components: bookkeeper-client
> Affects Versions: 4.0.0
> Reporter: Sijie Guo
> Fix For: 4.1.0
>
> Attachments: BK-152.draft.patch, BOOKKEEPER-152.diff
>
>
> Suppose we have an unclosed ledger L whose ensemble size is 2 and quorum
> size is 2. The ledger's current ensemble is <bk1, bk2>.
> bk2 has crashed.
> We use the recovery tool to recover the entries on bk2:
> $ bookkeeper-server/bin/bookkeeper org.apache.bookkeeper.tools.BookKeeperTools bk2
> Recovery fails because the recovery tool can't open ledger L: the ledger
> doesn't have a large enough quorum to readLastConfirmed
> (asyncOpenLedgerNoRecovery).