[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271457#comment-13271457
 ] 

Rakesh R commented on BOOKKEEPER-237:
-------------------------------------

bq. If I understand your scheme correctly, then it is not exactly centralized. 
An accountant could be any bookie and all bookies would bid for accountantship. 
It does put the burden of the accountant on a single machine at a time, and I 
wonder if we can spread the responsibility across the available machines to 
balance load. 

Here, the Accountant is lightweight: internally it is just one daemon inside 
the elected Bookie. It would use ZK watchers to learn about Bookie failures and 
about timeouts reported by clients (much like the ZK Leader does). Also, I feel 
the level of concurrency would be reduced this way.
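
To make the election part concrete, here is a minimal in-memory sketch of how 
bookies might bid for accountantship. It assumes the usual ZooKeeper recipe of 
ephemeral sequential nodes, where the lowest sequence number wins; that choice 
is my assumption and not something fixed in the proposal. The class 
AccountantElection and its methods are hypothetical names, and no real 
ZooKeeper client is used.

```java
import java.util.TreeMap;

/**
 * Hypothetical in-memory model of accountant election. Each bid stands in for
 * an ephemeral sequential znode; the lowest live sequence number is the
 * accountant. Not a real ZooKeeper client.
 */
class AccountantElection {
    // sequence number -> bookie id, kept sorted so the first entry is the winner
    private final TreeMap<Long, String> bids = new TreeMap<>();
    private long nextSeq = 0;

    /** A bookie bids for accountantship; returns its sequence number. */
    synchronized long bid(String bookieId) {
        long seq = nextSeq++;
        bids.put(seq, bookieId);
        return seq;
    }

    /** Models the ephemeral node vanishing when the bookie's session expires. */
    synchronized void sessionExpired(long seq) {
        bids.remove(seq);
    }

    /** Returns the current accountant, or null if no bookie is bidding. */
    synchronized String currentAccountant() {
        return bids.isEmpty() ? null : bids.firstEntry().getValue();
    }
}
```

If the current accountant's session expires, the next bidder in sequence takes 
over without any state hand-off, which only works because the Accountant keeps 
no state of its own.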


{quote} 
On a side note, I can't recall right now, but I think the accountant is 
stateless, correct?
{quote}

Yes, the Accountant is stateless. When it identifies any under-replicated 
ledgers, it puts them into the corresponding ZK node, and the watchers in turn 
deliver re-replication notifications to the peer Bookies. The scheme is also 
able to withstand Accountant failures and re-election.
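
A rough sketch of this flow, under stated assumptions: UnderReplicatedRegistry 
is a hypothetical in-memory stand-in for the ZK node, and its watchers are 
persistent callbacks (real ZooKeeper watches are one-time triggers that must be 
re-registered). Accountant and onBookieFailure are illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Consumer;

/** Hypothetical stand-in for the ZK node listing under-replicated ledgers. */
class UnderReplicatedRegistry {
    private final Set<Long> ledgers = new TreeSet<>();
    private final List<Consumer<Long>> watchers = new ArrayList<>();

    /** A peer bookie registers interest, roughly like setting a ZK watch. */
    synchronized void watch(Consumer<Long> watcher) {
        watchers.add(watcher);
    }

    /** Marks a ledger; notifies watchers only the first time it is marked. */
    synchronized boolean markUnderReplicated(long ledgerId) {
        if (!ledgers.add(ledgerId)) {
            return false; // already recorded, e.g. by a previous accountant
        }
        for (Consumer<Long> w : watchers) {
            w.accept(ledgerId);
        }
        return true;
    }

    synchronized Set<Long> pending() {
        return new TreeSet<>(ledgers);
    }
}

/** Stateless accountant: all durable information lives in the registry. */
class Accountant {
    private final UnderReplicatedRegistry registry;

    Accountant(UnderReplicatedRegistry registry) {
        this.registry = registry;
    }

    /** Marks every ledger stored on the failed bookie; returns how many were new. */
    int onBookieFailure(String failedBookie, Map<Long, Set<String>> ledgerToBookies) {
        int newlyMarked = 0;
        for (Map.Entry<Long, Set<String>> e : ledgerToBookies.entrySet()) {
            if (e.getValue().contains(failedBookie)
                    && registry.markUnderReplicated(e.getKey())) {
                newlyMarked++;
            }
        }
        return newlyMarked;
    }
}
```

Because marking is idempotent, a newly elected Accountant can simply rescan 
after a failure without causing duplicate notifications.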


bq.Bookies can have multiple pointers and watch multiple nodes.
Here, who will create the groups? We would also need to consider group 
reformation on failures, and to design multiple groups and pointers to 
withstand multiple crashes. 
Instead, can we keep it simple by choosing one node for monitoring?


bq.Some applications might not want to have entries spread across multiple 
bookies. They could for example turn off striping and prefer not to create 
another ledger instead of having an ensemble change.

If I understand correctly, you are suggesting an option to turn off striping 
and to prefer creating another ledger over having an ensemble change. Even so, 
the recovery logic should still consider ensemble reformation.

Here is why I am thinking of avoiding ensemble reformation each time a bookie 
goes down:

# When a slow replica goes down and the client reforms the ensemble, from which 
entry will the new ensemble be formed?
# When a bookie goes down, all the ledgers on that Bookie can be assigned to 
another Bookie if no reformation is allowed, with the ledger as the unit of 
replication. Otherwise, I would have to go one more level down and parse each 
ensemble within a ledger, and the ensemble would have to be considered the unit 
of replication. Would the tracking (re-replication) then also need to be at 
that level?
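
To illustrate the difference in the unit of replication, here is a small 
sketch. It assumes ledger metadata can be modelled as a sorted map from the 
first entry id of each ensemble to the bookies in that ensemble; the class 
ReplicationUnits, the Fragment type and the OPEN_END sentinel are hypothetical 
names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;

class ReplicationUnits {
    /** Sentinel: the fragment runs to the current end of the ledger. */
    static final long OPEN_END = -1L;

    /** Hypothetical unit: one entry range of a ledger served by one ensemble. */
    static final class Fragment {
        final long ledgerId;
        final long firstEntry;
        final long lastEntry;

        Fragment(long ledgerId, long firstEntry, long lastEntry) {
            this.ledgerId = ledgerId;
            this.firstEntry = firstEntry;
            this.lastEntry = lastEntry;
        }

        @Override
        public String toString() {
            String end = lastEntry == OPEN_END ? "open" : String.valueOf(lastEntry);
            return "L" + ledgerId + "[" + firstEntry + ".." + end + "]";
        }
    }

    /** Ledger as the unit: every ledger that has the bookie in any ensemble. */
    static List<Long> ledgersOn(String bookie,
            Map<Long, NavigableMap<Long, List<String>>> ledgers) {
        List<Long> result = new ArrayList<>();
        for (Map.Entry<Long, NavigableMap<Long, List<String>>> l : ledgers.entrySet()) {
            for (List<String> ensemble : l.getValue().values()) {
                if (ensemble.contains(bookie)) {
                    result.add(l.getKey());
                    break;
                }
            }
        }
        return result;
    }

    /** Ensemble as the unit: one fragment per ensemble containing the bookie. */
    static List<Fragment> fragmentsOn(String bookie,
            Map<Long, NavigableMap<Long, List<String>>> ledgers) {
        List<Fragment> result = new ArrayList<>();
        for (Map.Entry<Long, NavigableMap<Long, List<String>>> l : ledgers.entrySet()) {
            NavigableMap<Long, List<String>> ensembles = l.getValue();
            for (Map.Entry<Long, List<String>> e : ensembles.entrySet()) {
                if (!e.getValue().contains(bookie)) {
                    continue;
                }
                // The fragment ends just before the next ensemble starts.
                Long next = ensembles.higherKey(e.getKey());
                long last = (next == null) ? OPEN_END : next - 1;
                result.add(new Fragment(l.getKey(), e.getKey(), last));
            }
        }
        return result;
    }
}
```

With no ensemble changes, every ledger has exactly one fragment and the two 
methods describe the same work. Once reformation is allowed, tracking has to 
follow the finer-grained fragments.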
                
> Automatic recovery of under-replicated ledgers and its entries
> --------------------------------------------------------------
>
>                 Key: BOOKKEEPER-237
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-237
>             Project: Bookkeeper
>          Issue Type: New Feature
>          Components: bookkeeper-client, bookkeeper-server
>    Affects Versions: 4.0.0
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: Auto Recovery and Bookie sync-ups.pdf
>
>
> As per the current design of BookKeeper, if one of the BookKeeper servers 
> dies, there is no automatic mechanism to identify and recover the 
> under-replicated ledgers and their corresponding entries. This could lead to 
> losing successfully written entries, which would be a critical problem in 
> sensitive systems. This document tries to describe a few proposals to 
> overcome these limitations. 


        
