[
https://issues.apache.org/jira/browse/BOOKKEEPER-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270345#comment-13270345
]
Ivan Kelly commented on BOOKKEEPER-237:
---------------------------------------
This doc is quite in line with what I had been thinking of. It needs to be broken
into smaller parts, though, because as it stands it's quite hard to digest.
Firstly, the change to replication is really a special case of what can be
done with BOOKKEEPER-208. With BK-208, we have the concept of a write and an
ack quorum. The write quorum is the set of bookies to which an entry will be
written. The ack quorum is the number of bookies that must acknowledge the
entry before it is acked to the client application. What is proposed in your
doc seems to be this, but with the write quorum always set to the ensemble size.
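To make the write/ack quorum relationship concrete, here is a minimal, self-contained sketch of BookKeeper-style round-robin striping (the class and method names are illustrative, not the real client API): with ensemble size E and write quorum W, entry e lands on the W bookies starting at index e mod E, and W == E is exactly the "write to every bookie" special case described above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of round-robin striping: entry e is written to
// writeQuorum bookies starting at ensemble index (e % ensembleSize).
// When writeQuorum == ensembleSize, every bookie stores every entry.
public class QuorumPlacement {
    static List<Integer> writeSet(long entryId, int ensembleSize, int writeQuorum) {
        List<Integer> bookies = new ArrayList<>();
        for (int i = 0; i < writeQuorum; i++) {
            bookies.add((int) ((entryId + i) % ensembleSize));
        }
        return bookies;
    }

    public static void main(String[] args) {
        // Ensemble of 5 bookies, write quorum 3: entry 0 goes to bookies 0,1,2.
        System.out.println(writeSet(0, 5, 3)); // [0, 1, 2]
        // Entry 4 wraps around the ensemble: bookies 4, 0, 1.
        System.out.println(writeSet(4, 5, 3)); // [4, 0, 1]
        // Write quorum == ensemble size: every bookie gets every entry.
        System.out.println(writeSet(7, 3, 3)); // [1, 2, 0]
    }
}
```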
For the "Accountant", I think a better name would be "Auditor", as accountants
have a habit of cooking the books ;). I think detection should be separated
into two subparts: one performed by the auditor, which is elected, the other
performed by the bookies themselves.
* The auditor's check should examine the ZooKeeper metadata only. For each
ledger, it should check whether all of its bookies are available. If not, the
ledger should be marked as underreplicated. As well as running under the
conditions you propose in 1.6, it should also run periodically.
* The bookie should check whether it can read the first and last entry of
each ledger fragment which it is supposed to hold[1]. If it finds that it
cannot read a fragment, it should mark the ledger as underreplicated. This is
basically an fsck. This should also run periodically.
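The auditor's metadata-only scan above could be sketched as follows (a hedged illustration; the names and data shapes are assumptions, not the real BookKeeper API): given each ledger's ensemble as recorded in ZooKeeper and the set of currently available bookies, flag any ledger whose ensemble references a missing bookie.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of the auditor's metadata-only check: a ledger is
// underreplicated if any bookie in its ensemble is not in the live set.
public class AuditorScan {
    static List<Long> findUnderReplicated(Map<Long, List<String>> ledgerEnsembles,
                                          Set<String> availableBookies) {
        List<Long> underReplicated = new ArrayList<>();
        for (Map.Entry<Long, List<String>> e : ledgerEnsembles.entrySet()) {
            // Any missing ensemble member means the ledger has lost a replica.
            if (!availableBookies.containsAll(e.getValue())) {
                underReplicated.add(e.getKey());
            }
        }
        Collections.sort(underReplicated);
        return underReplicated;
    }

    public static void main(String[] args) {
        Map<Long, List<String>> meta = Map.of(
            1L, List.of("bookie-a", "bookie-b"),
            2L, List.of("bookie-b", "bookie-c"));
        Set<String> live = Set.of("bookie-a", "bookie-b");
        // bookie-c is gone, so ledger 2 is flagged.
        System.out.println(findUnderReplicated(meta, live)); // [2]
    }
}
```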
I don't think it's necessary to maintain a mapping of bookies to ledgers. To
rereplicate a ledger, it is necessary to read that ledger's whole metadata
anyway, so a mapping of which bookies should contain which ledgers is
straightforward to build on the fly.
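Building that view on the fly amounts to inverting the per-ledger metadata. A minimal sketch (the types are illustrative; a real ledger's metadata holds one ensemble per fragment):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch: derive the bookie -> ledgers mapping on demand by
// inverting each ledger's fragment ensembles, instead of persisting a
// separate bookie-to-ledger index.
public class BookieLedgerIndex {
    static Map<String, Set<Long>> invert(Map<Long, List<List<String>>> ledgerFragments) {
        Map<String, Set<Long>> byBookie = new HashMap<>();
        for (Map.Entry<Long, List<List<String>>> ledger : ledgerFragments.entrySet()) {
            for (List<String> ensemble : ledger.getValue()) { // one ensemble per fragment
                for (String bookie : ensemble) {
                    byBookie.computeIfAbsent(bookie, b -> new TreeSet<>())
                            .add(ledger.getKey());
                }
            }
        }
        return byBookie;
    }

    public static void main(String[] args) {
        Map<Long, List<List<String>>> meta = Map.of(
            1L, List.of(List.of("a", "b")),
            2L, List.of(List.of("b", "c"), List.of("c", "d")));
        // Bookie "c" appears only in ledger 2's fragments.
        System.out.println(invert(meta).get("c")); // [2]
    }
}
```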
The rereplication process should be distributed across all bookies. Each bookie
should run a replication process which takes the first available underreplicated
ledger off the queue and rereplicates it. I don't think we should take load
balancing into consideration yet, as it complicates matters a lot.
[1] This won't necessarily be the first and last entry of the fragment, due to
striping.
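The per-bookie worker loop could be sketched as below (an illustration only, not the real protocol: in practice the queue would live in ZooKeeper with a per-ledger lock, but a local BlockingQueue stands in for it here):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch of the replication worker each bookie would run:
// repeatedly take the first available underreplicated ledger off the
// shared queue and rereplicate it.
public class ReplicationWorker {
    static List<Long> drainAndReplicate(BlockingQueue<Long> underReplicated) {
        List<Long> done = new ArrayList<>();
        Long ledgerId;
        while ((ledgerId = underReplicated.poll()) != null) {
            // Placeholder for the real work: read the ledger's metadata,
            // copy the missing fragments to new bookies, update the metadata.
            done.add(ledgerId);
        }
        return done;
    }

    public static void main(String[] args) {
        BlockingQueue<Long> queue = new LinkedBlockingQueue<>(List.of(5L, 9L));
        System.out.println(drainAndReplicate(queue)); // [5, 9]
    }
}
```

Because each worker simply takes whatever ledger is first in the queue, the work distributes itself across bookies without any explicit load balancing, matching the "don't consider load balancing yet" position above.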
> Automatic recovery of under-replicated ledgers and its entries
> --------------------------------------------------------------
>
> Key: BOOKKEEPER-237
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-237
> Project: Bookkeeper
> Issue Type: New Feature
> Components: bookkeeper-client, bookkeeper-server
> Affects Versions: 4.0.0
> Reporter: Rakesh R
> Assignee: Rakesh R
> Attachments: Auto Recovery and Bookie sync-ups.pdf
>
>
> As per the current design of BookKeeper, if one of the BookKeeper servers
> dies, there is no automatic mechanism to identify and recover the under
> replicated ledgers and their corresponding entries. This could lead to losing
> successfully written entries, which would be a critical problem in
> sensitive systems. This document describes a few proposals to
> overcome these limitations.