wolfstudy opened a new issue, #3405:
URL: https://github.com/apache/bookkeeper/issues/3405
**BUG REPORT**
***Describe the bug***
Initially, we found that a topic in Pulsar could not do normal message
production and sending operations. Therefore, querying the stats and
stats-internal of the current topic returns no results, as follows:
Status:
```
{
"msgRateIn" : 0.0,
"msgThroughputIn" : 0.0,
"msgRateOut" : 0.0,
"msgThroughputOut" : 0.0,
"bytesInCounter" : 0,
"msgInCounter" : 0,
"bytesOutCounter" : 0,
"msgOutCounter" : 0,
"averageMsgSize" : 0.0,
"msgChunkPublished" : false,
"storageSize" : 0,
"backlogSize" : 0,
"offloadedStorageSize" : 0,
"publishers" : [ ],
"subscriptions" : { },
"replication" : { },
"nonContiguousDeletedMessagesRanges" : 0,
"nonContiguousDeletedMessagesRangesSerializedSize" : 0,
"publishRateLimitedTimes" : 0,
"metadata" : {
"partitions" : 1
},
"partitions" : { }
}
```
Status-internal:
```
{
"metadata" : {
"partitions" : 0
},
"partitions" : { }
}
```
At this point, I went to query Bookie's log and found the following error
message:

And the source code is here:
https://github.com/apache/pulsar/blob/6704f12104/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerFactoryImpl.java#L366-L374
Here we can see that the current Ledger state has not been restored within
the time of the default configuration
managedLedgerMetadataOperationsTimeoutSeconds=60s.
So we tried to use the readledger command to find that the current Ledger
stores nearly 700w of Entries, so when the Broker tries to load the topic, it
will open the current ledger, but the Ledger is always in the state of
IN_RECOVERY, So it will enter the following code logic:
```
if (doRecovery) {
lh.recover(new
OrderedGenericCallback<Void>(bk.getMainWorkerPool(), ledgerId) {
@Override
public void safeOperationComplete(int rc, Void result) {
if (rc == BKException.Code.OK) {
openComplete(BKException.Code.OK, lh);
} else if (rc ==
BKException.Code.UnauthorizedAccessException) {
closeLedgerHandle();
openComplete(BKException.Code.UnauthorizedAccessException, null);
} else {
closeLedgerHandle();
openComplete(bk.getReturnRc(BKException.Code.LedgerRecoveryException), null);
}
}
@Override
public String toString() {
return String.format("Recover(%d)", ledgerId);
}
});
}
```
However, since there are many Entry that need to be restored, we can see
that the current Ledger has not been able to be restored, so the Topic on the
Broker side cannot be loaded correctly, so the entire service has been
unavailable.
***To Reproduce***
I don't know how this phenomenon came about, and I don't know how to
reproduce this scene
## Solution
We locally mocked a bookie client object, and then re-fetched the properties
of LedgerMetadata, forcing the current Ledger state to be reset from
IN_RECOVERY to CLOSED state, and the problem recovered
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]