jiazhai opened a new issue #8533:
URL: https://github.com/apache/pulsar/issues/8533


   **Is your enhancement request related to a problem? Please describe.**
   In class `NamespaceService`, `getWebServiceUrl` is a synchronous method. It 
eventually reads data from the zkcache, and when ZooKeeper is under heavy 
load it waits in `CompletableFuture.get` and blocks for a while. 
Synchronous methods like this occupy the calling thread while they wait, 
which can keep other requests from being handled.
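
The blocking pattern described above can be sketched roughly as follows. This is only an illustration, not the actual `NamespaceService` code: the class `BlockingLookupSketch`, the helper `readFromCache`, and the 100 ms timeout are hypothetical stand-ins for the zkcache read and its timeout.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BlockingLookupSketch {

    // Hypothetical stand-in for a metadata read that is slow because ZooKeeper
    // is overloaded: the returned future is never completed.
    public static CompletableFuture<String> readFromCache(String bundle) {
        return new CompletableFuture<>();
    }

    // Sync-style method: the calling thread is parked inside get() until the
    // result arrives or the timeout fires.
    public static String getWebServiceUrlSync(String bundle, long timeoutMs)
            throws InterruptedException, ExecutionException, TimeoutException {
        return readFromCache(bundle).get(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        try {
            getWebServiceUrlSync("public/default/0xb0000000_0xc0000000", 100);
        } catch (TimeoutException e) {
            long blockedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            System.out.println("caller thread was blocked for about " + blockedMs + " ms");
        }
    }
}
```

While the caller sits in `get`, it cannot serve any other request, which is the behavior this issue is about.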
   
   A user saw this error on the broker side:
   ```
    [pulsar-web-48-6] ERROR org.apache.pulsar.broker.web.PulsarWebResource - [xxxx-broker] Failed to check whether namespace bundle is owned public/default/0xb0000000_0xc0000000
   java.util.concurrent.TimeoutException: null
          at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) ~[?:1.8.0_252]
          at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) ~[?:1.8.0_252]
          at org.apache.pulsar.broker.namespace.NamespaceService.getWebServiceUrl(NamespaceService.java:225) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
          at org.apache.pulsar.broker.web.PulsarWebResource.isBundleOwnedByAnyBroker(PulsarWebResource.java:500) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
          at org.apache.pulsar.broker.admin.v2.NonPersistentTopics.getListFromBundle(NonPersistentTopics.java:323) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_252]
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_252]
   ```
   Other threads, such as the topic-lookup and message-produce threads, were 
also found to hang occasionally on the client side.
   
   ```
   ERROR 105 [Task-Generation-Thread] --- c.m.i.p.t.service.TaskSender : The producer xxx-6-160453 can not send message to the topic persistent://public/default/xxx_topic-partition-0 within given timeout
   org.apache.pulsar.client.api.PulsarClientException$TimeoutException: The producer xxx-6-160453 can not send message to the topic persistent://public/default/xxx_topic-partition-0 within given timeout
   
   DEBUG 105 [pulsar-client-io-14-1] --- org.apache.pulsar.client.impl.ClientCnx : Received Broker lookup response: Failed
   WARN 105 [pulsar-client-io-14-1] --- o.a.p.c.impl.BinaryProtoLookupService : [persistent://public/default/xxxx_topic-partition-0] failed to send lookup request : org.apache.pulsar.client.api.PulsarClientException$TooManyRequestsException: Failed due to too many pending lookup requests
   WARN 105 [pulsar-client-io-14-1] --- o.a.p.c.impl.BinaryProtoLookupService : [persistent://public/default/xxxx_topic-partition-0] Lookup response exception: {}
   java.util.concurrent.CompletionException: org.apache.pulsar.client.api.PulsarClientException$TooManyRequestsException: Failed due to too many pending lookup requests
           at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
   ```
   
   **Describe the solution you'd like**
   We should convert `getWebServiceUrl`, and possibly other sync methods as 
well, to async mode to avoid thread hangs.  
   `NamespaceService` already has some async methods that can serve as 
examples, e.g. `getBrokerServiceUrlAsync`/`getBundleAsync`.
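
As a rough sketch of what the async style could look like, the example below returns a `CompletableFuture` and composes on it instead of calling `get`. It is not the real Pulsar API: the class `AsyncLookupSketch`, the `ownerLookup` function parameter, the method signatures, and the `http://host:8080` URL format are all assumptions made for illustration.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class AsyncLookupSketch {

    // Async variant (hypothetical signature): maps the pending metadata read
    // to a URL without ever calling get(), so no thread waits on the result.
    public static CompletableFuture<Optional<String>> getWebServiceUrlAsync(
            Function<String, CompletableFuture<Optional<String>>> ownerLookup, String bundle) {
        return ownerLookup.apply(bundle)
                .thenApply(owner -> owner.map(host -> "http://" + host + ":8080"));
    }

    // Callers compose on the returned future instead of blocking their own thread.
    public static CompletableFuture<Boolean> isBundleOwnedByAnyBrokerAsync(
            Function<String, CompletableFuture<Optional<String>>> ownerLookup, String bundle) {
        return getWebServiceUrlAsync(ownerLookup, bundle).thenApply(Optional::isPresent);
    }

    public static void main(String[] args) {
        // Simulate a slow metadata read: the future stays pending until completed below.
        CompletableFuture<Optional<String>> pendingRead = new CompletableFuture<>();
        CompletableFuture<Boolean> owned = isBundleOwnedByAnyBrokerAsync(
                b -> pendingRead, "public/default/0xb0000000_0xc0000000");

        // The call returned immediately even though the read has not finished.
        System.out.println("done before read completes: " + owned.isDone());

        pendingRead.complete(Optional.of("broker-1"));
        System.out.println("owned: " + owned.join());
    }
}
```

With this shape, a slow metadata read delays only the completion of the future. The thread that issued the call is free to handle other requests in the meantime.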
   
   

