[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293634#comment-13293634
 ] 

Uma Maheswara Rao G commented on BOOKKEEPER-247:
------------------------------------------------

So far, we have been thinking of the sequence as follows:

1. A bookie fails.
2. The Auditor puts the list of affected ledgers in the 
suspected/underreplicated ledgers znode.
3. The replication worker takes ledgers one by one from the suspected ledgers 
znode and re-replicates them.
   If we are able to reuse the BookKeeperAdmin code to re-replicate, then 
BookKeeperAdmin#recoverLedger already finds the fragments and replicates 
them then and there. Am I missing something here?

Otherwise, the recovery/replication worker may need to watch two levels of 
data: (1) the suspected ledgers znode and (2) the underreplicated znode.
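
To make the three-step sequence above concrete, here is a minimal in-memory sketch. It is only an illustration under stated assumptions: the class and method names (ReReplicationSketch, auditBookieFailure, runWorker) are hypothetical, a plain queue stands in for the suspected ledgers znode, and swapping a bookie name in a set stands in for the actual copy that the BookKeeperAdmin recovery code would perform. No real ZooKeeper or BookKeeper API is used.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from BookKeeper itself.
class ReReplicationSketch {
    // ledgerId -> bookies holding it (stand-in for ledger metadata)
    private final Map<Long, Set<String>> ledgerToBookies = new TreeMap<>();
    // Stand-in for the suspected/underreplicated ledgers znode
    private final Queue<Long> suspectedLedgers = new ArrayDeque<>();

    public void addLedger(long ledgerId, String... bookies) {
        ledgerToBookies.put(ledgerId, new HashSet<>(Arrays.asList(bookies)));
    }

    // Step 2: the Auditor records every ledger that had data on the failed bookie.
    public void auditBookieFailure(String failedBookie) {
        for (Map.Entry<Long, Set<String>> e : ledgerToBookies.entrySet()) {
            if (e.getValue().contains(failedBookie)
                    && !suspectedLedgers.contains(e.getKey())) {
                suspectedLedgers.add(e.getKey());
            }
        }
    }

    // Step 3: the worker takes one suspected ledger at a time and repairs it.
    public List<Long> runWorker(String failedBookie, String replacementBookie) {
        List<Long> repaired = new ArrayList<>();
        Long ledgerId;
        while ((ledgerId = suspectedLedgers.poll()) != null) {
            Set<String> bookies = ledgerToBookies.get(ledgerId);
            bookies.remove(failedBookie);    // placeholder for "find the fragments"
            bookies.add(replacementBookie);  // placeholder for "replicate them"
            repaired.add(ledgerId);
        }
        return repaired;
    }

    public int pendingCount() {
        return suspectedLedgers.size();
    }

    public Set<String> bookiesOf(long ledgerId) {
        return ledgerToBookies.get(ledgerId);
    }

    public static void main(String[] args) {
        ReReplicationSketch s = new ReReplicationSketch();
        s.addLedger(1L, "b1", "b2");
        s.addLedger(2L, "b2", "b3");
        s.addLedger(3L, "b1", "b3");
        s.auditBookieFailure("b1");          // step 1 has happened: b1 failed
        System.out.println("repaired ledgers: " + s.runWorker("b1", "b4"));
    }
}
```

With a single queue like this, the worker only has one place to watch; the two-level alternative would need a second structure for the underreplicated fragments.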


{quote}
 Also, i think bookies should run this detection on all their ledgers, every 
few hours, to detect disk issues
{quote}
I agree. I think the check can be triggered on disk failures and will also 
run on an hourly basis by default.
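
As a rough sketch of that scheduling idea, the check could be driven by a standard ScheduledExecutorService. The class name PeriodicLedgerCheck, the onDiskFailure hook, and the DEFAULT_INTERVAL_HOURS constant are assumptions made for illustration, not existing BookKeeper code; the check itself is passed in as a Runnable.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names are illustrative only.
class PeriodicLedgerCheck {
    static final long DEFAULT_INTERVAL_HOURS = 1; // assumed default: hourly

    // A single thread ensures periodic and triggered checks never overlap.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicInteger runs = new AtomicInteger();
    private final Runnable check;

    PeriodicLedgerCheck(Runnable ledgerCheck) {
        this.check = () -> {
            ledgerCheck.run();
            runs.incrementAndGet();
        };
    }

    // Schedule the periodic check; the first run happens after one interval.
    public void start(long interval, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(check, interval, interval, unit);
    }

    // Run the check right away, e.g. when a disk failure is detected,
    // and wait for it to finish.
    public void onDiskFailure() {
        try {
            scheduler.submit(check).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("ledger check failed", e);
        }
    }

    public int completedRuns() {
        return runs.get();
    }

    public void stop() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) {
        PeriodicLedgerCheck c =
                new PeriodicLedgerCheck(() -> System.out.println("checking ledgers"));
        c.start(DEFAULT_INTERVAL_HOURS, TimeUnit.HOURS);
        c.onDiskFailure(); // immediate run, independent of the hourly schedule
        c.stop();
    }
}
```

Note that scheduleAtFixedRate stops rescheduling a task once it throws, so a real check would need to catch its own exceptions.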
                
> Detection of under replication
> ------------------------------
>
>                 Key: BOOKKEEPER-247
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-247
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-client, bookkeeper-server
>            Reporter: Ivan Kelly
>            Assignee: Rakesh R
>
> This JIRA discusses how the bookkeeper system will detect underreplication of 
> ledger entries.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
