[
https://issues.apache.org/jira/browse/BOOKKEEPER-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270519#comment-13270519
]
Flavio Junqueira commented on BOOKKEEPER-237:
---------------------------------------------
There are a lot (really a lot) of good observations in this document, but I
feel that it will be difficult to converge with so much detail at a time. It
might be a good idea to agree on the high-level observations first, and here
are the two key ones I've been able to extract:
# The first part of the document focuses on the difficulty of guaranteeing that
all entries of a ledger are fully replicated;
# The second part proposes an accountant abstraction.
For the first part, I got stuck on two points. First, in the regular case, I
wouldn't expect many changes to the ensemble of a ledger. My feeling is that
the example on page 2 is a corner case, and I'm not sure we should optimize for
such cases. Second, the same example points out that entries can become
underreplicated after so many consecutive replacements, but at the same time the
same bookies pop up later in future ensembles. Are you considering that the
memory of a bookie is gone once it is removed from an ensemble? If not, then
there is no need to re-establish the degree of replication. If the memory of
the bookie is wiped out, then we should consider just reconstructing the ledger
fragments of the faulty bookie using the recovery tool. Why doesn't it work if
we operate at the bookie level?
About the second part, I like the idea of proposing a mechanism to make sure
that ledgers are properly replicated. However, I'm not entirely convinced that
we need a new entity in the system. Perhaps we can have each bookie run an
accountant thread instead and use a simpler mechanism. Here is one proposal.
Using ZK, we can create a chain of bookies, where each bookie watches the
previous bookie in the sequence of sequential znodes. Let's call the watcher
bookie the buddy of the watched bookie. If a bookie crashes, its buddy receives
a notification and the buddy is responsible for replicating the content of the
crashed bookie. After a crash, we of course need to restore the chain by
finding other buddies. Also, there are some corner cases related to multiple
failures that we would need to think about more carefully. The bottom line is
that a distributed solution might be more robust than a centralized one, and it
does not require a new independent entity or a specialized bookie.
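
To make the buddy chain concrete, here is a rough sketch of the bookkeeping
involved. It is only an in-memory model, not ZooKeeper client code: the znode
names and the functions watch_targets, buddy_of and handle_crash are
hypothetical, and the only ZooKeeper behavior assumed is that sequential znodes
end in a 10-digit zero-padded counter. In a plain chain the last bookie has no
buddy, which is one of the corner cases still to be worked out.

```python
from typing import Dict, List, Optional, Tuple


def sequence_number(znode: str) -> int:
    # Sequential znodes end in a 10-digit zero-padded counter.
    return int(znode[-10:])


def ordered_chain(znodes: List[str]) -> List[str]:
    # The chain order is the creation order of the sequential znodes.
    return sorted(znodes, key=sequence_number)


def watch_targets(znodes: List[str]) -> Dict[str, Optional[str]]:
    """Map each bookie znode to the znode it watches (its predecessor)."""
    chain = ordered_chain(znodes)
    targets: Dict[str, Optional[str]] = {}
    for i, znode in enumerate(chain):
        # The first bookie in the chain has nobody to watch.
        targets[znode] = chain[i - 1] if i > 0 else None
    return targets


def buddy_of(watched: str, znodes: List[str]) -> Optional[str]:
    """Return the buddy (watcher) of a bookie: the next znode in the chain."""
    chain = ordered_chain(znodes)
    i = chain.index(watched)
    # Assumption: plain chain, so the last bookie has no buddy.
    return chain[i + 1] if i + 1 < len(chain) else None


def handle_crash(
    crashed: str, znodes: List[str]
) -> Tuple[Optional[str], Dict[str, Optional[str]]]:
    """Return the buddy that must re-replicate, plus the repaired watch map."""
    buddy = buddy_of(crashed, znodes)
    survivors = [z for z in znodes if z != crashed]
    return buddy, watch_targets(survivors)


if __name__ == "__main__":
    bookies = ["bookie-0000000002", "bookie-0000000000", "bookie-0000000001"]
    responsible, repaired = handle_crash("bookie-0000000001", bookies)
    print("re-replicates:", responsible)
    print("new watches:", repaired)
```

With three bookies, a crash of the middle one makes the third bookie
responsible for re-replication, and restoring the chain means the third bookie
now watches the first.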
I also have some thoughts about the new suggested schedules. I like the idea in
general of having different schedules, especially the one that returns an error
for a ledger operation upon a crash instead of changing the ensemble
automatically. But I'll postpone my thoughts on them, if that is ok. This is
already long...
> Automatic recovery of under-replicated ledgers and its entries
> --------------------------------------------------------------
>
> Key: BOOKKEEPER-237
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-237
> Project: Bookkeeper
> Issue Type: New Feature
> Components: bookkeeper-client, bookkeeper-server
> Affects Versions: 4.0.0
> Reporter: Rakesh R
> Assignee: Rakesh R
> Attachments: Auto Recovery and Bookie sync-ups.pdf
>
>
> As per the current design of BookKeeper, if one of the BookKeeper servers
> dies, there is no automatic mechanism to identify and recover the
> under-replicated ledgers and their corresponding entries. This could lead to
> losing successfully written entries, which would be a critical problem in
> sensitive systems. This document describes a few proposals to overcome these
> limitations.