jiazhai opened a new issue #8533:
URL: https://github.com/apache/pulsar/issues/8533
**Is your enhancement request related to a problem? Please describe.**
In class `NamespaceService`, the method `getWebServiceUrl` is synchronous.
It ends up reading data from the ZooKeeper cache (zkcache), and when
ZooKeeper is under high pressure, the call waits on `CompletableFuture.get`
and blocks for a while. Such synchronous methods tie up thread resources
and prevent other work from being handled.
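To make the blocking behavior concrete, here is a minimal sketch, not the actual broker code. `slowCacheRead` and `lookupSync` are hypothetical names; the never-completed future stands in for a ZooKeeper cache read that does not return in time.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BlockingLookupSketch {

    // Hypothetical stand-in for a cache read backed by an overloaded ZooKeeper:
    // the returned future is never completed.
    public static CompletableFuture<String> slowCacheRead() {
        return new CompletableFuture<>();
    }

    // Sync style: the calling thread is parked inside get() until the future
    // completes or the timeout expires, and can serve nothing else meanwhile.
    public static String lookupSync(long timeoutMs) throws Exception {
        return slowCacheRead().get(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        try {
            lookupSync(200);
        } catch (TimeoutException e) {
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            System.out.println("caller blocked for about " + elapsedMs + " ms, then timed out");
        }
    }
}
```

If every thread in a small request pool is parked like this, new requests queue up behind them until the timeouts fire.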
Users see this error on the broker side:
```
[pulsar-web-48-6] ERROR org.apache.pulsar.broker.web.PulsarWebResource -
[xxxx-broker] Failed to check whether namespace bundle is owned
public/default/0xb0000000_0xc0000000
java.util.concurrent.TimeoutException: null
at
java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784)
~[?:1.8.0_252]
at
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
~[?:1.8.0_252]
at
org.apache.pulsar.broker.namespace.NamespaceService.getWebServiceUrl(NamespaceService.java:225)
~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at
org.apache.pulsar.broker.web.PulsarWebResource.isBundleOwnedByAnyBroker(PulsarWebResource.java:500)
~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at
org.apache.pulsar.broker.admin.v2.NonPersistentTopics.getListFromBundle(NonPersistentTopics.java:323)
~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
~[?:1.8.0_252]
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
~[?:1.8.0_252]
```
We also found that other threads, such as the topic-lookup and
message-produce threads, sometimes hang on the client side:
```
ERROR 105 [Task-Generation-Thread] --- c.m.i.p.t.service.TaskSender
: The producer xxx-6-160453 can not send message to the topic
persistent://public/default/xxx_topic-partition-0 within given timeout
org.apache.pulsar.client.api.PulsarClientException$TimeoutException: The
producer xxx-6-160453 can not send message to the topic
persistent://public/default/xxx_topic-partition-0 within given timeout
DEBUG 105 [pulsar-client-io-14-1] ---
org.apache.pulsar.client.impl.ClientCnx : Received Broker lookup response:
Failed
WARN 105 [pulsar-client-io-14-1] --- o.a.p.c.impl.BinaryProtoLookupService
: [persistent://public/default/xxxx_topic-partition-0] failed to send lookup
request :
org.apache.pulsar.client.api.PulsarClientException$TooManyRequestsException:
Failed due to too many pending lookup requests
WARN 105 [pulsar-client-io-14-1] --- o.a.p.c.impl.BinaryProtoLookupService
: [persistent://public/default/xxxx_topic-partition-0] Lookup response
exception: {}
java.util.concurrent.CompletionException:
org.apache.pulsar.client.api.PulsarClientException$TooManyRequestsException:
Failed due to too many pending lookup requests
at
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
```
**Describe the solution you'd like**
We should convert `getWebServiceUrl`, and possibly other sync methods, to
async mode to avoid thread hangs.
`NamespaceService` already has some async methods that can serve as
examples, e.g. `getBrokerServiceUrlAsync` and `getBundleAsync`.
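
A rough sketch of what the async style could look like, under stated assumptions: `OwnerCache`, `getOwnerAsync`, `getWebServiceUrlAsync`, `isBundleOwnedAsync`, and the URL format are hypothetical stand-ins for illustration, not the real `NamespaceService` API. The point is only that callers chain on the future instead of calling `get`.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

public class AsyncLookupSketch {

    // Hypothetical abstraction over the cache read; returns immediately
    // with a future that completes when the data is available.
    public interface OwnerCache {
        CompletableFuture<Optional<String>> getOwnerAsync(String bundle);
    }

    private final OwnerCache cache;

    public AsyncLookupSketch(OwnerCache cache) {
        this.cache = cache;
    }

    // Async variant: no get()/join() here, so the calling thread is never parked.
    // The host-to-URL mapping is an assumption made for this example.
    public CompletableFuture<Optional<String>> getWebServiceUrlAsync(String bundle) {
        return cache.getOwnerAsync(bundle)
                .thenApply(owner -> owner.map(host -> "http://" + host + ":8080"));
    }

    // A caller composes on the future rather than blocking on it.
    public CompletableFuture<Boolean> isBundleOwnedAsync(String bundle) {
        return getWebServiceUrlAsync(bundle).thenApply(Optional::isPresent);
    }

    public static void main(String[] args) {
        CompletableFuture<Optional<String>> pending = new CompletableFuture<>();
        AsyncLookupSketch service = new AsyncLookupSketch(bundle -> pending);

        CompletableFuture<Boolean> owned =
                service.isBundleOwnedAsync("public/default/0xb0000000_0xc0000000");
        // The call has already returned even though the cache has not answered.
        System.out.println("done before cache answers: " + owned.isDone());

        pending.complete(Optional.of("broker-1")); // simulate the cache answering later
        System.out.println("owned: " + owned.join());
    }
}
```

With this shape, a slow cache delays only the completion of the returned future; the web thread that started the lookup is free to handle other requests in the meantime.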