[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451787#comment-13451787
 ] 

Rakesh R commented on BOOKKEEPER-278:
-------------------------------------

Hi Ivan, thanks for the review.

bq. It would be better to disable at the auditor level.

I am a bit confused by your comment. The patch already does the disabling at 
the Auditor level as well as at the RW level. Could you please give more details?

The proposed patch delays the replication processes (Auditor and RW) by 
making them wait on a CountDownLatch. The logic I've followed is:

# The admin issues the disable call, which creates the 'disable' znode under the /underreplication root node.
# When the Auditor receives a failure notification, it collects the lost bookie's ledgers. If it sees the "disable" znode during "markLedgerUnderreplicated", it adds a znode watcher and enters the WAITING state.
# Likewise, if the RW sees the "disable" znode when it calls 'getLedgerToRereplicate', it adds a znode watcher and enters the WAITING state.

The CountDownLatch does a blocking wait, which suspends both the Auditor and 
the RW processes. Once re-replication is enabled again (the "disable" znode is 
deleted), both the Auditor and the RW continue with the previously populated data.


Entering the waiting state:
{code}
if (null != zkc.exists(basePath + '/' + DISABLE_NODE, w)) {
  LOG.info("Automatic ledger re-replication is disabled "
       + "by the administrator. Waiting until it is enabled.");
  changedLatch.await();
}
{code}


Leaving the wait, which happens only after the "disable" node is deleted:
{code}
if (e.getType() == Watcher.Event.EventType.NodeDeleted) {
  changedLatch.countDown();
}
{code}
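
To make the wait/release flow above easier to follow, here is a minimal, self-contained sketch of the same CountDownLatch pattern. It does not use ZooKeeper: {{FakeDisableNode}}, {{existsAndWatch}} and {{waitIfDisabled}} are hypothetical names made up for illustration and are not part of the patch. The stand-in is also simplified: it registers the "watcher" only when the node is present, and the wait uses a timeout so the example always terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class DisableLatchSketch {

    /** Hypothetical in-memory stand-in for the "disable" znode and its watcher. */
    static final class FakeDisableNode {
        private boolean present = true;
        private CountDownLatch watcherLatch;

        // Loosely mirrors zkc.exists(path, watcher): reports whether the node
        // exists and, if it does, remembers the latch to release on deletion.
        synchronized boolean existsAndWatch(CountDownLatch latch) {
            if (present) {
                watcherLatch = latch;
            }
            return present;
        }

        // Mirrors the NodeDeleted event: the admin re-enables replication.
        synchronized void delete() {
            present = false;
            if (watcherLatch != null) {
                watcherLatch.countDown();
                watcherLatch = null;
            }
        }
    }

    /**
     * Blocks while the "disable" node exists.
     * Returns true if the caller may proceed, false if the timeout expired.
     */
    static boolean waitIfDisabled(FakeDisableNode node, long timeoutMs)
            throws InterruptedException {
        CountDownLatch changedLatch = new CountDownLatch(1);
        if (node.existsAndWatch(changedLatch)) {
            // Disabled: wait until delete() counts the latch down.
            return changedLatch.await(timeoutMs, TimeUnit.MILLISECONDS);
        }
        return true; // Not disabled: continue immediately.
    }

    public static void main(String[] args) throws Exception {
        FakeDisableNode node = new FakeDisableNode();

        // Simulated admin: removes the "disable" node after a short delay.
        Thread admin = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            node.delete();
        });
        admin.start();

        // Simulated Auditor/RW: blocks here until the node is deleted.
        boolean resumed = waitIfDisabled(node, 5000);
        admin.join();
        System.out.println("resumed=" + resumed);
    }
}
```

The check and the watcher registration happen in one synchronized call here, so a deletion cannot slip in between them. With the real client the watch is set by the same {{exists}} call, which is what the snippet in the patch relies on.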

                
> Ability to disable auto recovery temporarily
> --------------------------------------------
>
>                 Key: BOOKKEEPER-278
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-278
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-auto-recovery
>    Affects Versions: 4.0.0
>            Reporter: Ivan Kelly
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-278.patch
>
>
> Administrators will need to do rolling upgrades of bookies. If auto recovery 
> is enabled during a rolling upgrade, there will be a lot of thrashing of 
> ledgers as recovery gets kicked off. Therefore we need a way to 
> temporarily disable it.

