[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440194#comment-13440194
 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

Thanks for the clarifications. I have sketched the fix I have in mind below. 
Could you have a look?

bq.The point of this code is, that if the node already exists, then there is 
already a missing replica. We loop through the missingReplicas, to see if the 
new missingReplica is already there or not. If so, then we can assume someone 
else has reported this replica missing, so we return.

So we have two scenarios:
# L0001 contains only BK1. While marking BK2 as a missing replica, we get a 
NodeExistsException (NEE). We assume a missing replica is already recorded and 
silently return, as you described.
# L0001 contains only BK2. While marking BK2 as a missing replica, we get an 
NEE. Assume there is only a single auditor and no one else is marking. So 
again we need to merge into the zkMetadata and update it in ZK.

On NEE, check whether the missing replica is already present in the ZK 
urLedger metadata. If it is, return; otherwise merge the missing replica into 
the urLedger missingReplicas and call setData().

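{code}
try {
    // Read the existing urLedger znode; the Stat s receives its current version.
    byte[] bytes = zkc.getData(znode, false, s);
    String existingMissingReplicas = new String(bytes, UTF8);
    // Someone has already reported this replica as missing: nothing to do.
    if (existingMissingReplicas.contains(missingReplica)) {
        return;
    }
    // Merge the existing entries into the builder and write the result back,
    // conditioned on the version that was just read.
    TextFormat.merge(existingMissingReplicas, builder);
    zkc.setData(znode,
                TextFormat.printToString(builder.build()).getBytes(UTF8),
                s.getVersion());
    return;
} catch (KeeperException.NoNodeException nne) {
{code}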
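
To make the check-then-merge flow easier to reason about outside ZooKeeper, here is a minimal, self-contained sketch. It is not the patch code: the class names (VersionedNode, MissingReplicaMergeSketch) and the method markMissing are hypothetical, and the znode is replaced by an in-memory reference updated with compare-and-set, which stands in for a version-conditioned setData(). The sketch also assumes exact set membership for the "already reported" check, since a substring check such as String.contains() would also match BK1 inside BK10.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical in-memory stand-in for the urLedger znode: data plus a version. */
final class VersionedNode {
    final Set<String> missingReplicas;
    final int version;

    VersionedNode(Set<String> missingReplicas, int version) {
        this.missingReplicas = Collections.unmodifiableSet(new LinkedHashSet<>(missingReplicas));
        this.version = version;
    }
}

/** Sketch of "create, or merge on node-exists" with an optimistic version check. */
final class MissingReplicaMergeSketch {
    // null means the node does not exist yet.
    private final AtomicReference<VersionedNode> node = new AtomicReference<>();

    /** Returns true if this call changed the stored data, false if already reported. */
    boolean markMissing(String replica) {
        while (true) {
            VersionedNode current = node.get();
            if (current == null) {
                // Node absent: try to create it with the single missing replica.
                Set<String> initial = new LinkedHashSet<>();
                initial.add(replica);
                if (node.compareAndSet(null, new VersionedNode(initial, 0))) {
                    return true;
                }
                continue; // Lost the race: the node now exists (the NEE case).
            }
            if (current.missingReplicas.contains(replica)) {
                return false; // Already reported by someone else: silently return.
            }
            // Merge the new replica into the existing entries.
            Set<String> merged = new LinkedHashSet<>(current.missingReplicas);
            merged.add(replica);
            VersionedNode updated = new VersionedNode(merged, current.version + 1);
            // Write back only if nobody changed the node since it was read.
            if (node.compareAndSet(current, updated)) {
                return true;
            }
            // Concurrent update detected: re-read and retry.
        }
    }

    Set<String> missingReplicas() {
        VersionedNode current = node.get();
        return current == null ? Collections.<String>emptySet() : current.missingReplicas;
    }

    int version() {
        VersionedNode current = node.get();
        return current == null ? -1 : current.version;
    }
}
```

With real ZooKeeper, the retry branch would correspond to handling KeeperException.BadVersionException from setData() and re-reading the znode; whether the patch should retry or give up there is left open.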
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-auto-recovery
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, 
> BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, 
> BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build an automatic mechanism to detect bookie failures and 
> to set up bookie failure notifications that start the re-replication process.
> There are multiple approaches to detecting bookie failures. Please refer to 
> the documents attached in BookKeeper-237.
