BewareMyPower opened a new pull request, #390:
URL: https://github.com/apache/pulsar-client-cpp/pull/390

   ### Motivation
   
   When a broker fails to acquire the ownership of a namespace bundle because 
of `LockBusyException`, it means another broker has acquired the metadata 
store path and has not released it yet. For example:
   
   Broker 1:
   
   ```
   2024-01-24T23:35:36,626+0000 [metadata-store-10-1] WARN  
org.apache.pulsar.broker.lookup.TopicLookupBase - Failed to lookup <role> for 
topic persistent://<tenant>/<ns>/<topic> with error 
org.apache.pulsar.broker.PulsarServerException: Failed to acquire ownership for 
namespace bundle <tenant>/<ns>/0x50000000_0x51000000
      Caused by: java.util.concurrent.CompletionException: 
org.apache.pulsar.metadata.api.MetadataStoreException$LockBusyException: 
Resource at /namespace/<tenant>/<ns>/0x50000000_0x51000000 is already locked
   ```
   
   Broker 2:
   
   ```
   2024-01-24T23:35:36,650+0000 [broker-topic-workers-OrderedExecutor-3-0] INFO 
 org.apache.pulsar.broker.PulsarService - Loaded 1 topics on 
<tenant>/<ns>/0x50000000_0x51000000 -- time taken: 0.044 seconds
   ```
   
   After broker 2 released the lock at 23:35:36,650, the lookup request to 
broker 1 should tell the client that namespace bundle 0x50000000_0x51000000 is 
currently being unloaded, so that on the next retry the client connects to the 
new owner broker.
   
   Here is another typical error:
   
   ```
   2024-01-24T23:57:57,264+0000 [pulsar-io-4-5] INFO  
org.apache.pulsar.broker.lookup.TopicLookupBase - Failed to lookup <role> for 
topic persistent://<tenant>/<ns>/<topic> with error Namespace bundle 
<tenant>/<ns>/0x0d000000_0x0e000000 is being unloaded
   ```
   
   After https://github.com/apache/pulsar/pull/21211, the server error becomes 
`MetadataError` rather than `ServiceNotReady`.
   
   However, when the `ServerError` is still `ServiceNotReady`, the client 
closes the connection. If many other producers or consumers share the same 
connection, they all have to reestablish their connections to the broker, 
which is unnecessary and puts heavy pressure on the broker side.
   
   ### Modifications
   
   In `checkServerError`, when the error code is `ServiceNotReady`, check the 
error message as well. If the message matches one of the cases handled in 
`handleLookupError`, do not close the connection.
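
The check described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual patch: the `ServerError` enum here is a local stand-in, `isRetryableLookupMessage` and `shouldCloseConnection` are hypothetical helper names, and the matched substrings are taken only from the log messages quoted in the Motivation section.

```cpp
#include <string>

// Local stand-in for the protocol's ServerError codes (assumption: only the
// values discussed in this description are listed).
enum class ServerError { UnknownError, MetadataError, ServiceNotReady };

// Hypothetical helper: returns true when the message indicates a transient
// bundle-ownership change, based on the log lines quoted above.
inline bool isRetryableLookupMessage(const std::string& message) {
    return message.find("Failed to acquire ownership") != std::string::npos ||
           message.find("is being unloaded") != std::string::npos;
}

// Hypothetical helper: decides whether the connection should be closed.
// Only ServiceNotReady is considered here; other error codes are outside the
// scope of this sketch and never trigger a close in it.
inline bool shouldCloseConnection(ServerError error, const std::string& message) {
    if (error != ServerError::ServiceNotReady) {
        return false;
    }
    // Keep the connection open when the lookup simply needs to be retried.
    return !isRetryableLookupMessage(message);
}
```

With a check of this shape, a `ServiceNotReady` error carrying "is being unloaded" would leave the shared connection open, so the other producers and consumers on it would not need to reconnect, while any other `ServiceNotReady` message would still close it.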

