[
https://issues.apache.org/jira/browse/BOOKKEEPER-278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13458532#comment-13458532
]
Ivan Kelly commented on BOOKKEEPER-278:
---------------------------------------
[~rakeshr] sorry for taking so long to get back to you on this one.
Consider the following.
There are three bookies, A, B and C, with ledgers 1, 2, 3, 4 and 5 on all three.
# The disable ZNode is set.
# A is taken down for upgrade.
# The auditor sees that bookie A is down.
# The auditor builds a list of the ledgers on bookie A, which are 1, 2, 3, 4 & 5.
# The auditor starts marking these ledgers as underreplicated, and enters the
WAITING state.
Now the upgrade of all bookies continues as expected. Once it has finished:
# The disable ZNode is unset.
# The auditor exits the WAITING state and marks 1, 2, 3, 4 & 5 as underreplicated.
Now this isn't a huge problem, as the replication worker will see that there
are in fact no fragments unavailable, but it does induce an extra check
implicitly.
It would be better for the auditor to check whether auto recovery is enabled
after seeing a bookie drop, and only build the index and mark the ledgers if
it is enabled.
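Roughly what I have in mind, as a sketch in Java. The names here
(AuditorSketch, RecoveryFlag, onBookieLost) are made up for illustration and
are not the actual auditor code; RecoveryFlag is an assumed stand-in for
checking whether the disable ZNode exists, and the in-memory map stands in for
building the bookie-to-ledger index.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class AuditorSketch {
    /** Hypothetical stand-in for "is the disable ZNode absent?". */
    interface RecoveryFlag {
        boolean isAutoRecoveryEnabled();
    }

    private final RecoveryFlag flag;
    // Hypothetical stand-in for the index the auditor would build on demand.
    private final Map<String, Set<Long>> ledgersByBookie;
    private final Set<Long> underreplicated = new TreeSet<>();

    AuditorSketch(RecoveryFlag flag, Map<String, Set<Long>> ledgersByBookie) {
        this.flag = flag;
        this.ledgersByBookie = ledgersByBookie;
    }

    /**
     * Called when a bookie is seen to drop. Checks the flag first, so that
     * nothing is indexed or marked while auto recovery is disabled.
     * Returns true if ledgers were marked, false if the event was skipped.
     */
    boolean onBookieLost(String bookie) {
        if (!flag.isAutoRecoveryEnabled()) {
            return false; // disabled: do not build the index or mark anything
        }
        Set<Long> ledgers =
                ledgersByBookie.getOrDefault(bookie, Collections.<Long>emptySet());
        underreplicated.addAll(ledgers);
        return true;
    }

    Set<Long> underreplicated() {
        return underreplicated;
    }
}
```

With this ordering, a bookie that drops while the flag is disabled leaves
nothing queued, so nothing stale gets marked when the flag is enabled again.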
> Ability to disable auto recovery temporarily
> --------------------------------------------
>
> Key: BOOKKEEPER-278
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-278
> Project: Bookkeeper
> Issue Type: Sub-task
> Components: bookkeeper-auto-recovery
> Affects Versions: 4.0.0
> Reporter: Ivan Kelly
> Assignee: Rakesh R
> Fix For: 4.2.0
>
> Attachments: BOOKKEEPER-278.patch
>
>
> Administrators will need to do rolling upgrades of bookies. If auto recovery
> is enabled during a rolling upgrade, there will be a lot of thrashing of
> ledgers as the recovery gets kicked off. Therefore we need a way to
> temporarily disable it.