[
https://issues.apache.org/jira/browse/BOOKKEEPER-278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459582#comment-13459582
]
Rakesh R commented on BOOKKEEPER-278:
-------------------------------------
Thanks Ivan and Uma for your time and responses. Could you please go through
the following? I would like to know your opinion.
@Ivan
bq.is that the markLedgerUnderreplicated is blocking
Yup, it's a blocking call, and the latch waits indefinitely if it sees a
'disable' znode.
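To illustrate the blocking behaviour, here is a minimal sketch using a plain
CountDownLatch. It is not the code from the attached patch: the class and
method names (DisableLatchSketch, onDisableNodeDeleted, awaitEnabled) are
hypothetical, and the timeout exists only so the demo terminates, whereas the
call described above waits without a timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; not taken from the BOOKKEEPER-278 patch. */
public class DisableLatchSketch {
    // Released once the (assumed) 'disable' znode has been deleted.
    private final CountDownLatch enabled = new CountDownLatch(1);

    /** Stand-in for the watcher callback that sees the 'disable' znode go away. */
    public void onDisableNodeDeleted() {
        enabled.countDown();
    }

    /** Blocks until enabled; returns false if the timeout expires first. */
    public boolean awaitEnabled(long timeoutMillis) {
        try {
            return enabled.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        DisableLatchSketch sketch = new DisableLatchSketch();
        // Still disabled: the wait times out instead of being released.
        System.out.println(sketch.awaitEnabled(100));
        sketch.onDisableNodeDeleted();
        // Enabled: the wait returns immediately.
        System.out.println(sketch.awaitEnabled(100));
    }
}
```

A caller standing in for markLedgerUnderreplicated() would sit in
awaitEnabled() until the 'enable' notification arrives.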
bq.but there will be a number of calls to it queued up once it is unblocked.
I hope you are pointing to the multiple bookie failure notifications that are
queued in the 'bookieNotifications' queue.
As we know, the Auditor receives bookie failure notifications only through
the getChildren() watcher. When the Auditor enters the waiting state, it is
blocked in markLedgerUnderreplicated(), and consequently the run() method
will not finish until it receives the 'enable' notification. Since the
Auditor registered only one getChildren() zk watcher before entering the
waiting state, it will receive at most one bookie failure notification and
will not see further failures (the watcher has already fired and is not
re-registered). After enabling, it fetches the available bookies anyway,
recalculates the lost bookies, and continues the cycle. Am I missing
anything?
It's a good scenario. I will add one more test case: "behaviour of multiple
bookie failures in disable mode".
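The one-shot watcher argument above can be sketched with a small in-memory
model. This does not use ZooKeeper at all; OneShotWatchSketch and its methods
are hypothetical names that only imitate the "a watch fires once and must be
re-registered" behaviour, under the assumption that the blocked Auditor never
re-registers.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory model of a one-shot getChildren() watch. */
public class OneShotWatchSketch {
    private final Set<String> children = new HashSet<>();
    private Runnable watcher; // at most one pending watch, cleared when it fires

    public OneShotWatchSketch(Set<String> initial) {
        children.addAll(initial);
    }

    /** Returns a copy of the current children and registers a one-shot watch. */
    public Set<String> getChildren(Runnable watch) {
        watcher = watch;
        return new HashSet<>(children);
    }

    /** Removes a bookie and fires the pending watch, if there is one. */
    public void removeBookie(String bookie) {
        children.remove(bookie);
        Runnable pending = watcher;
        watcher = null; // one-shot: it will not fire again until re-registered
        if (pending != null) {
            pending.run();
        }
    }

    public static void main(String[] args) {
        OneShotWatchSketch registry = new OneShotWatchSketch(
                new HashSet<>(Arrays.asList("b1", "b2", "b3", "b4")));
        int[] notifications = {0};
        Set<String> known = registry.getChildren(() -> notifications[0]++);

        // The auditor is blocked, so nothing re-registers the watch.
        registry.removeBookie("b1");
        registry.removeBookie("b2");
        registry.removeBookie("b3");
        System.out.println(notifications[0]); // 1

        // After 'enable': fetch the available bookies and diff against the old view.
        Set<String> lost = new TreeSet<>(known);
        lost.removeAll(registry.getChildren(() -> notifications[0]++));
        System.out.println(lost); // [b1, b2, b3]
    }
}
```

Only one notification is delivered while blocked, yet all three failures are
recovered by the diff after enabling.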
bq.It would be better for the auditor to check if auto recovery is enabled
after seeing a bookie drop, and only build the index, mark the ledgers, if it
is enabled.
I agree to place the disable check just before processing a bookie failure. In
that case, once the Auditor has started generating the index, it will finish
publishing the ledgers for that cycle. Only on the next bookie failure
notification will it enter the waiting state. Does this sound good to you?
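A rough sketch of that ordering, with hypothetical names throughout
(CheckBeforeProcessSketch, onBookieFailure, setEnabled). To stay
single-threaded, the sketch defers the notification and returns false where
the real Auditor would wait; the point shown is only that the flag is checked
once, before the cycle starts, and not again inside it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of "check enabled first, then run the whole cycle". */
public class CheckBeforeProcessSketch {
    private final AtomicBoolean enabled = new AtomicBoolean(true);
    private final List<String> log = new ArrayList<>();

    public void setEnabled(boolean value) {
        enabled.set(value);
    }

    public List<String> getLog() {
        return log;
    }

    /** Handles one failure notification; returns false if it was deferred. */
    public boolean onBookieFailure(String bookie, List<String> ledgers) {
        // Checked once, before any work for this notification begins.
        if (!enabled.get()) {
            log.add("deferred " + bookie);
            return false;
        }
        // The flag is not re-checked here, so a started cycle always completes.
        for (String ledger : ledgers) {
            log.add("marked " + ledger);
        }
        return true;
    }

    public static void main(String[] args) {
        CheckBeforeProcessSketch auditor = new CheckBeforeProcessSketch();
        System.out.println(auditor.onBookieFailure("bookie-1", Arrays.asList("L1", "L2")));
        auditor.setEnabled(false);
        System.out.println(auditor.onBookieFailure("bookie-2", Arrays.asList("L3")));
        System.out.println(auditor.getLog());
    }
}
```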
> Ability to disable auto recovery temporarily
> --------------------------------------------
>
> Key: BOOKKEEPER-278
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-278
> Project: Bookkeeper
> Issue Type: Sub-task
> Components: bookkeeper-auto-recovery
> Affects Versions: 4.0.0
> Reporter: Ivan Kelly
> Assignee: Rakesh R
> Fix For: 4.2.0
>
> Attachments: BOOKKEEPER-278.patch
>
>
> Administrators will need to do rolling upgrades of bookies. If auto recovery
> is enabled during a rolling upgrade, there will be a lot of thrashing of
> ledgers as their recovery gets kicked off. Therefore we need a way to
> temporarily disable it.