BewareMyPower opened a new pull request, #390:
URL: https://github.com/apache/pulsar-client-cpp/pull/390
### Motivation
When the broker fails to acquire the ownership of a namespace bundle with
`LockBusyException`, it means that another broker has acquired the
metadata store path and has not released it yet. For example:
Broker 1:
```
2024-01-24T23:35:36,626+0000 [metadata-store-10-1] WARN
org.apache.pulsar.broker.lookup.TopicLookupBase - Failed to lookup <role> for
topic persistent://<tenant>/<ns>/<topic> with error
org.apache.pulsar.broker.PulsarServerException: Failed to acquire ownership for
namespace bundle <tenant>/<ns>/0x50000000_0x51000000
Caused by: java.util.concurrent.CompletionException:
org.apache.pulsar.metadata.api.MetadataStoreException$LockBusyException:
Resource at /namespace/<tenant>/<ns>/0x50000000_0x51000000 is already locked
```
Broker 2:
```
2024-01-24T23:35:36,650+0000 [broker-topic-workers-OrderedExecutor-3-0] INFO
org.apache.pulsar.broker.PulsarService - Loaded 1 topics on
<tenant>/<ns>/0x50000000_0x51000000 -- time taken: 0.044 seconds
```
After broker 2 released the lock at 23:35:36,650, the lookup request to
broker 1 should tell the client that namespace bundle 0x50000000_0x51000000 is
currently being unloaded, and on the next retry the client will connect to the
new owner broker.
Here is another typical error:
```
2024-01-24T23:57:57,264+0000 [pulsar-io-4-5] INFO
org.apache.pulsar.broker.lookup.TopicLookupBase - Failed to lookup <role> for
topic persistent://<tenant>/<ns>/<topic> with error Namespace bundle
<tenant>/<ns>/0x0d000000_0x0e000000 is being unloaded
```
After https://github.com/apache/pulsar/pull/21211, the server error
becomes `MetadataError` rather than `ServiceNotReady`.
However, when the `ServerError` is `ServiceNotReady`, the client closes
the connection. If there are many other producers or consumers on the same
connection, they all reestablish the connection to the broker, which is
unnecessary and puts a lot of pressure on the broker side.
### Modifications
In `checkServerError`, when the error code is `ServiceNotReady`, check the
error message as well. If it hits the case in `handleLookupError`, do not close
the connection.
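
The check described above can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: the `ServerError` enum here is a hypothetical stand-in for the protobuf type, `isRetryableLookupMessage` and `shouldCloseConnection` are illustrative names, and the matched substrings are simply taken from the broker logs quoted in the Motivation section.

```cpp
#include <string>

// Hypothetical stand-in for the protobuf ServerError enum; only the values
// mentioned in this description are listed.
enum class ServerError { UnknownError, MetadataError, ServiceNotReady };

// Returns true when the error message indicates that bundle ownership is
// changing, so the lookup can be retried on the same connection.
// Assumption: the substrings come from the logs above; the real client may
// match different or additional text.
inline bool isRetryableLookupMessage(const std::string& message) {
    return message.find("Failed to acquire ownership") != std::string::npos ||
           message.find("is being unloaded") != std::string::npos;
}

// Returns true if the connection should be closed for this error.
inline bool shouldCloseConnection(ServerError error, const std::string& message) {
    if (error != ServerError::ServiceNotReady) {
        return false;  // other error codes are out of scope for this sketch
    }
    // ServiceNotReady closes the connection unless the message shows that the
    // failure is only a transient ownership change.
    return !isRetryableLookupMessage(message);
}
```

With this shape, a `ServiceNotReady` error whose message contains "is being unloaded" leaves the connection open, so other producers and consumers sharing it are not forced to reconnect.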