Nicklee007 opened a new pull request, #3339:
URL: https://github.com/apache/bookkeeper/pull/3339
Master Issue: #3338
### Motivation
When decommissioning a bookie, we always switch the bookie to read-only first
and then wait for some data to expire. Some ledgers (about 100 to 300) are
always left over and not cleaned up, and these remaining ledgers hold only a
little data. When we run `bin/bookkeeper shell decommissionbookie
-bookieid ` to decommission the bookie, the command always blocks in
`waitForLedgersToBeReplicated()` for about 10 minutes without printing any
log. Meanwhile, the znode `/ledgers/underreplication/ledgers` is cleaned
within only a few seconds, which means the ledgers have already been
re-replicated. The sleep time is defined as
`Min(ledgers.size() * sleepTimePerLedger(10s), maxSleepTimeInBetweenChecks(10 min))`,
so with this many ledgers the wait is always far too long, and the absence of
any log output confuses users.
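
To make the arithmetic concrete, here is a minimal Java sketch of that
formula. The class and method names are hypothetical, and the constants are
taken from the description above rather than copied from the BookKeeper
source.

```java
// Hypothetical helper illustrating the sleep-time formula quoted above.
class SleepEstimate {
    // 10 s per ledger, as stated in the description.
    static final long SLEEP_PER_LEDGER_MS = 10_000L;
    // 10 min cap between checks, as stated in the description.
    static final long MAX_SLEEP_BETWEEN_CHECKS_MS = 10 * 60 * 1_000L;

    // Min(ledgerCount * sleepTimePerLedger, maxSleepTimeInBetweenChecks)
    public static long sleepMillis(int ledgerCount) {
        return Math.min(ledgerCount * SLEEP_PER_LEDGER_MS, MAX_SLEEP_BETWEEN_CHECKS_MS);
    }
}
```

With 100 to 300 leftover ledgers the product is 1000 s to 3000 s, so the
10 min cap always applies. The cap stops mattering only below 60 ledgers.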
### Changes
When a bookie has a lot of data to re-replicate, we still need a backoff
policy to protect the ZooKeeper server, so:
1. Check `/ledgers/underreplication/ledgers` and
`/ledgers/underreplication/locks` every 10 seconds to see whether ledger
replication has completed.
2. If the auditor is running CheckAllLedgers or another time-consuming
operation when we trigger the bookie audit,
`/ledgers/underreplication/ledgers` and `/ledgers/underreplication/locks`
stay empty for a long time. In that case, apply a backoff policy to avoid
sending frequent ZooKeeper requests for `validateBookieIsNotPartOfEnsemble`.
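
The two points above could be combined roughly as in the following sketch.
It is not the code in this PR: the class, method, and parameter names are
hypothetical, the ZooKeeper reads are abstracted behind suppliers, and the
doubling backoff with a 10 min cap is an assumption about the shape of the
policy.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongConsumer;

// Hypothetical sketch: cheap 10 s polling plus a backoff on the expensive check.
class DecommissionWaitSketch {
    static final long POLL_INTERVAL_MS = 10_000L;     // 10 s, from the description
    static final long MAX_BACKOFF_MS = 10 * 60_000L;  // assumed cap of 10 min

    // Assumed policy: double the delay until it reaches the cap.
    public static long nextBackoff(long currentMs) {
        return Math.min(currentMs * 2, MAX_BACKOFF_MS);
    }

    // Returns how many times the expensive ensemble validation was run.
    public static int waitUntilDecommissioned(BooleanSupplier underReplicationEmpty,
                                              BooleanSupplier bookieNotInAnyEnsemble,
                                              LongConsumer sleeper) {
        long backoffMs = POLL_INTERVAL_MS;
        long sinceValidationMs = 0;
        int validations = 0;
        while (true) {
            sleeper.accept(POLL_INTERVAL_MS);
            sinceValidationMs += POLL_INTERVAL_MS;
            if (!underReplicationEmpty.getAsBoolean()) {
                continue; // ledgers or locks still present: replication in progress
            }
            if (sinceValidationMs < backoffMs) {
                continue; // znodes are empty, but the backoff window has not elapsed
            }
            validations++;
            if (bookieNotInAnyEnsemble.getAsBoolean()) {
                return validations; // bookie is no longer part of any ensemble
            }
            backoffMs = nextBackoff(backoffMs); // e.g. the auditor is still busy
            sinceValidationMs = 0;
        }
    }
}
```

Passing the sleeper in as a parameter keeps the loop testable without real
waiting; a caller would pass something that wraps `Thread.sleep`.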