[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270519#comment-13270519
 ] 

Flavio Junqueira commented on BOOKKEEPER-237:
---------------------------------------------

There are a lot (really a lot) of good observations in this document, but I 
feel that it will be difficult to converge with so much detail at a time. It 
might be a good idea to agree on the high-level observations first, and here 
are the two key ones I've been able to extract: 

# The first part of the document focuses on the difficulty of guaranteeing that 
all entries of a ledger are fully replicated;
# The second part proposes an accountant abstraction.

For the first part, I got stuck on two points. First, in the regular case, I 
wouldn't expect many changes to the ensemble of a ledger, so my feeling is that 
the example on page 2 is a corner case, so I'm not sure we should optimize for 
such cases. Second, the same example points out that entries can become 
underreplicated with so many consecutive replacements, but at the same time the 
same bookies pop up later in future ensembles. Are you considering that the 
memory of a bookie is gone once it is removed from an ensemble? If not, then 
there is no need to re-establish the degree of replication. If the memory of 
the bookie is wiped out, then we should consider just reconstructing the ledger 
fragments of the faulty bookie using the recovery tool. Why doesn't it work if 
we operate at the bookie level?

About the second part, I like the idea of proposing a mechanism to make sure 
that ledgers are properly replicated. However, I'm not entirely convinced that 
we need a new entity in the system. Perhaps we can have each bookie run an 
accountant thread instead and use a simpler mechanism. Here is one proposal. 
Using ZK, we can create a chain of bookies, where each bookie watches the 
previous bookie in the sequence of sequential znodes. Let's call the watcher 
bookie the buddy of the watched bookie. If a bookie crashes, its buddy receives 
a notification and the buddy is responsible for replicating the content of the 
crashed bookie. After a crash, we of course need to restore the chain by 
finding other buddies. Also, there are some corner cases related to multiple 
failures that we would need to think about more carefully. The bottom line is 
that a distributed solution might be more robust than a centralized one, and it 
does not require a new independent entity or a specialized bookie.
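
To make the buddy chain concrete, here is a minimal in-memory sketch of the 
idea. It is not BookKeeper or ZooKeeper code: the class BuddyChain and its 
methods (register, watchedBy, crash) are hypothetical names, and a sorted map 
stands in for the sequential znodes and watches. It also assumes the chain 
wraps around into a ring so that the last bookie has a buddy as well, which 
the proposal above does not specify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/**
 * In-memory sketch of the buddy chain (hypothetical, for illustration only).
 * A real version would rely on ZooKeeper sequential znodes and watches.
 */
public class BuddyChain {
    // Sequence number -> bookie id, standing in for sequential znodes.
    private final TreeMap<Long, String> members = new TreeMap<>();
    private long nextSeq = 0;
    // Record of which buddy took over re-replication for which crashed bookie.
    private final List<String> replicationLog = new ArrayList<>();

    /** Adds a bookie at the end of the chain and returns its sequence number. */
    public long register(String bookieId) {
        long seq = nextSeq++;
        members.put(seq, bookieId);
        return seq;
    }

    /**
     * Returns the bookie watched by the bookie at seq: its predecessor in the
     * sequence, wrapping to the last bookie (assumption: the chain is a ring).
     * Returns null if seq is unknown or there is nobody else to watch.
     */
    public String watchedBy(long seq) {
        if (!members.containsKey(seq) || members.size() < 2) {
            return null;
        }
        Map.Entry<Long, String> prev = members.lowerEntry(seq);
        if (prev == null) {
            prev = members.lastEntry();
        }
        return prev.getValue();
    }

    /**
     * Simulates a crash of the bookie at seq. Its buddy (the next bookie in
     * the sequence, wrapping to the first) is notified and becomes responsible
     * for re-replication. Returns the buddy's id, or null if none is left.
     */
    public String crash(long seq) {
        String crashed = members.remove(seq);
        if (crashed == null || members.isEmpty()) {
            return null;
        }
        Map.Entry<Long, String> buddy = members.higherEntry(seq);
        if (buddy == null) {
            buddy = members.firstEntry();
        }
        replicationLog.add(buddy.getValue() + " re-replicates " + crashed);
        // The chain is restored implicitly: watchedBy() is derived from the
        // remaining members, so the buddy now watches the crashed bookie's
        // predecessor.
        return buddy.getValue();
    }

    public List<String> replicationLog() {
        return replicationLog;
    }

    public static void main(String[] args) {
        BuddyChain chain = new BuddyChain();
        long a = chain.register("bookie-A");
        long b = chain.register("bookie-B");
        chain.register("bookie-C");
        System.out.println("B watches " + chain.watchedBy(b));
        System.out.println("buddy after A crashes: " + chain.crash(a));
        System.out.println("B now watches " + chain.watchedBy(b));
        System.out.println(chain.replicationLog());
    }
}
```

The sketch handles one crash at a time. The multiple-failure corner cases 
mentioned above (for example, a bookie and its buddy crashing together) are 
deliberately left out.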

I also have some thoughts about the new suggested schedules. I like the idea in 
general of having different schedules, especially the one that fails an 
operation on the ledger upon a crash instead of changing the ensemble 
automatically. But, I'll postpone my thoughts on them, if it is ok. This is 
already long...

                
> Automatic recovery of under-replicated ledgers and its entries
> --------------------------------------------------------------
>
>                 Key: BOOKKEEPER-237
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-237
>             Project: Bookkeeper
>          Issue Type: New Feature
>          Components: bookkeeper-client, bookkeeper-server
>    Affects Versions: 4.0.0
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: Auto Recovery and Bookie sync-ups.pdf
>
>
> As per the current design of BookKeeper, if one of the BookKeeper servers 
> dies, there is no automatic mechanism to identify and recover the 
> under-replicated ledgers and their corresponding entries. This could lead to 
> losing successfully written entries, which would be a critical problem in 
> sensitive systems. This document tries to describe a few proposals to 
> overcome these limitations. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
