hangc0276 opened a new pull request #2870:
URL: https://github.com/apache/bookkeeper/pull/2870
### Motivation
When we use the `bin/bookkeeper shell recover bookieId` command to recover a
specific bookie's ledgers, the recovery process exits as soon as recovering
any single ledger fails.
In our production BookKeeper cluster, we found some ledgers in the Open state
with no entries. When we run the `bin/bookkeeper shell recover bookieId`
command, it traverses all the ledgers level by level. In the end, for each
ledger, it calls the following code to perform the recovery.
```Java
Processor<Long> ledgerProcessor = new Processor<Long>() {
    @Override
    public void process(Long ledgerId, AsyncCallback.VoidCallback iterCallback) {
        recoverLedger(bookiesSrc, ledgerId, dryrun, skipOpenLedgers,
                skipUnrecoverableLedgers, iterCallback);
    }
};
```
The `recoverLedger` method calls `asyncOpenLedgerNoRecovery` to open the
ledger and get the LAC if the ledger is in the `OPEN` state. If the requested
ledger has no entries, the `getLAC` request returns entry = -1 and an ERROR
result.
https://github.com/apache/bookkeeper/blob/98ddf8149592572eebcfaf6bdd4916f295ffd9d7/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java#L756-L769
The `asyncOpenLedgerNoRecovery` callback then reports an error for this
ledger, which stops the recovery of all the ledgers that follow it.
As a result, the recover command fails, and the remaining ledgers can't
be recovered.
### Changes
We should expose a flag that lets the user decide whether to move on and
recover the remaining ledgers when some ledgers fail to recover.
So, I provide the parameter `sku` to handle this case.
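
The difference between aborting on the first failure and skipping past it can be sketched roughly as below. This is a simplified, synchronous model and not the actual `BookKeeperAdmin` code: `RecoverSketch`, `recoverAll`, `Result`, and `recoverOne` are hypothetical names, and `recoverOne` only stands in for `recoverLedger` by returning `true` on success and `false` on failure.

```Java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.LongPredicate;

public class RecoverSketch {

    /** Outcome of one recovery pass (hypothetical helper type). */
    static final class Result {
        final List<Long> recovered = new ArrayList<>();
        final List<Long> failed = new ArrayList<>();
    }

    /**
     * Walks the ledgers in order. recoverOne is a hypothetical stand-in for
     * recoverLedger: it returns true on success and false on failure.
     */
    static Result recoverAll(List<Long> ledgerIds, LongPredicate recoverOne,
                             boolean skipUnrecoverableLedgers) {
        Result result = new Result();
        for (long ledgerId : ledgerIds) {
            if (recoverOne.test(ledgerId)) {
                result.recovered.add(ledgerId);
                continue;
            }
            result.failed.add(ledgerId);
            if (!skipUnrecoverableLedgers) {
                // Without the flag, the first failure ends the whole run,
                // so later ledgers are never attempted.
                break;
            }
            // With the flag, the failure is recorded and the loop moves on.
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> ledgers = Arrays.asList(1L, 2L, 3L, 4L);
        // Assumption for the demo: ledger 2 is an empty OPEN ledger that fails.
        LongPredicate recoverOne = id -> id != 2L;

        Result strict = recoverAll(ledgers, recoverOne, false);
        Result lenient = recoverAll(ledgers, recoverOne, true);

        System.out.println("without flag: recovered=" + strict.recovered
                + " failed=" + strict.failed);
        System.out.println("with flag:    recovered=" + lenient.recovered
                + " failed=" + lenient.failed);
    }
}
```

In this model, ledgers 3 and 4 are only reached when the flag is set. The failed ledger is still recorded in both cases, so the operator can follow up on it separately.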