rdhabalia opened a new issue, #23250: URL: https://github.com/apache/pulsar/issues/23250
### Search before asking

- [X] I searched in the [issues](https://github.com/apache/pulsar/issues) and found nothing similar.

### Read release policy

- [X] I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.

### Version

> 2.10

### Minimal reproduce step

1. Create a global topic with an unreachable remote replication cluster.
2. The topic tries to start a replicator producer, which keeps failing to start and logs a large amount of data, creating a large number of Exception objects along the way.

```
final String namespace = BrokerTestUtil.newUniqueName("pulsar/concurrent");
updateTenantInfo("pulsar", new TenantInfoImpl(Sets.newHashSet("appid1", "appid2", "appid3"),
        Sets.newHashSet("r1", "r2", "r3", "r10")));
admin1.namespaces().createNamespace(namespace);
admin1.namespaces().setNamespaceReplicationClusters(namespace, Sets.newHashSet("r1", "r10"));
final TopicName topicName = TopicName
        .get(BrokerTestUtil.newUniqueName("persistent://" + namespace + "/topic"));

@Cleanup
PulsarClient client1 = PulsarClient.builder().serviceUrl(url1.toString()).statsInterval(0, TimeUnit.SECONDS)
        .build();
Producer<byte[]> producer = client1.newProducer().topic(topicName.toString()).enableBatching(false)
        .messageRoutingMode(MessageRoutingMode.SinglePartition).create();
```

3. You can also reproduce the problem by creating a producer with an unreachable connection URL.

```
@Cleanup
PulsarClient newPulsarClient = PulsarClient.builder()
        .serviceUrl(lookupUrl.toString())
        .lookupTimeout(1, TimeUnit.MINUTES)
        .build();
ProducerBuilder<byte[]> producerBuilder = newPulsarClient.newProducer()
        .topic("persistent://my-property/my-ns/my-topic1");
Producer<byte[]> producer = producerBuilder.create();

// Stop the broker so that the sends below hit a closed connection.
stopBroker();
for (int i = 0; i < 10; i++) {
    String message = "my-message-" + i;
    producer.sendAsync(message.getBytes());
}
```

This causes the following issues in the broker:

1. 
Broker is busy in generating Exception toString() message ``` 2024-08-16T18:41:53,921+0000 [pulsar-io-4-2] WARN org.apache.pulsar.broker.service.AbstractReplicator - [persistent://tenant1/global/tenant1.ns/t1-partition-120][prod-use 1 -> prod-usw2] Failed to create remote producer (org.apache.pulsar.client.api.PulsarClientException: Connection already closed{"previous":[{"attempt":0,"error":"org.apache.pulsar.client.api .PulsarClientException: Connection already closed{\"previous\":[{\"attempt\":0,\"error\":\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\"},{\"attempt\":1,\"er ror\":\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed{\\"previous\\":[{\\"attempt\\":0,\\"error\\":\\"org.apache.pulsar.client.api.PulsarClientException: Conne ction already closed{\\\"previous\\\":[{\\\"attempt\\\":0,\\\"error\\\":\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\"},{\\\"attempt\\\":1,\\\"error\\\" :\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\"}]}\\"},{\\"attempt\\":1,\\"error\\":\\"org.apache.pulsar.client.api.PulsarClientException: Connection al ready closed\\"},{\\"attempt\\":2,\\"error\\":\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\"},{\\"attempt\\":3,\\"error\\":\\"org.apache.pulsar.client.api .PulsarClientException: Connection already closed\\"},{\\"attempt\\":4,\\"error\\":\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed{\\\"previous\\\":[{\\\"atte mpt\\\":0,\\\"error\\\":\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\"},{\\\"attempt\\\":1,\\\"error\\\":\\\"org.apache.pulsar.client.api.PulsarClientEx ception: Connection already closed\\\"},{\\\"attempt\\\":2,\\\"error\\\":\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\"},{\\\"attempt\\\":3,\\\"error\\\ 
":\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed{\\\\"previous\\\\":[{\\\\"attempt\\\\":0,\\\\"error\\\\":\\\\"org.apache.pulsar.client.api.PulsarClientExce ption: Connection already closed{\\\\\"previous\\\\\":[{\\\\\"attempt\\\\\":0,\\\\\"error\\\\\":\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\\\"}]}\\\ \"}]}\\\"},{\\\"attempt\\\":4,\\\"error\\\":\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed{\\\\"previous\\\\":[{\\\\"attempt\\\\":0,\\\\"error\\\\":\\\\"org .apache.pulsar.client.api.PulsarClientException: Connection already closed\\\\"},{\\\\"attempt\\\\":1,\\\\"error\\\\":\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection alrea dy closed\\\\"},{\\\\"attempt\\\\":2,\\\\"error\\\\":\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\\"}]}\\\"},{\\\"attempt\\\":5,\\\"error\\\":\\\"org.a pache.pulsar.client.api.PulsarClientException: Connection already closed{\\\\"previous\\\\":[{\\\\"attempt\\\\":0,\\\\"error\\\\":\\\\"org.apache.pulsar.client.api.PulsarClientException: Conn ection already closed{\\\\\"previous\\\\\":[{\\\\\"attempt\\\\\":0,\\\\\"error\\\\\":\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\\\"},{\\\\\"attempt\ \\\\":1,\\\\\"error\\\\\":\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\\\"}]}\\\\"},{\\\\"attempt\\\\":1,\\\\"error\\\\":\\\\"org.apache.pulsar.client .api.PulsarClientException: Connection already closed\\\\"}]}\\\"},{\\\"attempt\\\":6,\\\"error\\\":\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\"}]}\\" }]}\"},{\"attempt\":2,\"error\":\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\"},{\"attempt\":3,\"error\":\"org.apache.pulsar.client.api.PulsarClientExceptio n: Connection already 
closed\"}]}"},{"attempt":1,"error":"org.apache.pulsar.client.api.PulsarClientException: Connection already closed"},{"attempt":2,"error":"org.apache.pulsar.client.api.Pu lsarClientException: Connection already closed{\"previous\":[{\"attempt\":0,\"error\":\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\"},{\"attempt\":1,\"error \":\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed{\\"previous\\":[{\\"attempt\\":0,\\"error\\":\\"org.apache.pulsar.client.api.PulsarClientException: Connecti on already closed{\\\"previous\\\":[{\\\"attempt\\\":0,\\\"error\\\":\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed{\\\\"previous\\\\":[{\\\\"attempt\\\\":0 ,\\\\"error\\\\":\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed{\\\\\"previous\\\\\":[{\\\\\"attempt\\\\\":0,\\\\\"error\\\\\":\\\\\"org.apache.pulsar.clie nt.api.PulsarClientException: Connection already closed\\\\\"},{\\\\\"attempt\\\\\":1,\\\\\"error\\\\\":\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\\ \"},{\\\\\"attempt\\\\\":2,\\\\\"error\\\\\":\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed{\\\\\\"previous\\\\\\":[{\\\\\\"attempt\\\\\\":0,\\\\\\"error\ \\\\\":\\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\\\\"},{\\\\\\"attempt\\\\\\":1,\\\\\\"error\\\\\\":\\\\\\"org.apache.pulsar.client.api.PulsarClie ntException: Connection already closed{\\\\\\\"previous\\\\\\\":[{\\\\\\\"attempt\\\\\\\":0,\\\\\\\"error\\\\\\\":\\\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection alread y closed{\\\\\\\\"previous\\\\\\\\":[{\\\\\\\\"attempt\\\\\\\\":0,\\\\\\\\"error\\\\\\\\":\\\\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\\\\\\"},{\\\ 
\\\\\"attempt\\\\\\\\":1,\\\\\\\\"error\\\\\\\\":\\\\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed{\\\\\\\\\"previous\\\\\\\\\":[{\\\\\\\\\"attempt\\\\\\\ \\":0,\\\\\\\\\"error\\\\\\\\\":\\\\\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed{\\\\\\\\\\"previous\\\\\\\\\\":[{\\\\\\\\\\"attempt\\\\\\\\\\":0,\\\\\\ \\\\"error\\\\\\\\\\":\\\\\\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\\\\\\\\"},{\\\\\\\\\\"attempt\\\\\\\\\\":1,\\\\\\\\\\"error\\\\\\\\\\":\\\\\\\ \\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\\\\\\\\"},{\\\\\\\\\\"attempt\\\\\\\\\\":2,\\\\\\\\\\"error\\\\\\\\\\":\\\\\\\\\\"org.apache.pulsar.client. api.PulsarClientException: Connection already closed\\\\\\\\\\"},{\\\\\\\\\\"attempt\\\\\\\\\\":3,\\\\\\\\\\"error\\\\\\\\\\":\\\\\\\\\\"org.apache.pulsar.client.api.PulsarClientException: Co nnection already closed{\\\\\\\\\\\"previous\\\\\\\\\\\":[{\\\\\\\\\\\"attempt\\\\\\\\\\\":0,\\\\\\\\\\\"error\\\\\\\\\\\":\\\\\\\\\\\"org.apache.pulsar.client.api.PulsarClientException: Conn ection already closed\\\\\\\\\\\"},{\\\\\\\\\\\"attempt\\\\\\\\\\\":1,\\\\\\\\\\\"error\\\\\\\\\\\":\\\\\\\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed{\ \\\\\\\\\\\"previous\\\\\\\\\\\\":[{\\\\\\\\\\\\"attempt\\\\\\\\\\\\":0,\\\\\\\\\\\\"error\\\\\\\\\\\\":\\\\\\\\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already clo sed\\\\\\\\\\\\"},{\\\\\\\\\\\\"attempt\\\\\\\\\\\\":1,\\\\\\\\\\\\"error\\\\\\\\\\\\":\\\\\\\\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed{\\\\\\\\\\\\\ "previous\\\\\\\\\\\\\":[{\\\\\\\\\\\\\"attempt\\\\\\\\\\\\\":0,\\\\\\\\\\\\\"error\\\\\\\\\\\\\":\\\\\\\\\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\ 
\\\\\\\\\\\"},{\\\\\\\\\\\\\"attempt\\\\\\\\\\\\\":1,\\\\\\\\\\\\\"error\\\\\\\\\\\\\":\\\\\\\\\\\\\"org.apache.pulsar.client.api.PulsarClientException: Connection already closed\\\\\\\\\\\\\ "},{\\\\\\\\\\\\\"attempt\\\\\\\\\\\\\":2,\\\\\\\\\\\\\"error\\\\\\\\\\\\\":\\\\\\\\\\\\\"org.apache.pulsar.client.api.Pulsa ``` 3. Thread dump shows most of the IO threads are busy in logging ``` "pulsar-io-4-1" #72 prio=5 os_prio=0 cpu=141510.46ms elapsed=943.74s tid=0x0000556dad05f7c0 nid=0x200 runnable [0x00007f0998393000] java.lang.Thread.State: RUNNABLE at java.lang.StringLatin1.replace([email protected]/StringLatin1.java:357) at java.lang.String.replace([email protected]/String.java:2973) at org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at 
org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at org.apache.pulsar.client.api.PulsarClientException.toString(PulsarClientException.java:125) at java.lang.Throwable.<init>([email protected]/Throwable.java:317) at java.lang.Exception.<init>([email protected]/Exception.java:103) at java.lang.RuntimeException.<init>([email protected]/RuntimeException.java:97) at java.util.concurrent.CompletionException.<init>([email protected]/CompletionException.java:88) at java.util.concurrent.CompletableFuture.encodeThrowable([email protected]/CompletableFuture.java:332) at java.util.concurrent.CompletableFuture.completeThrowable([email protected]/CompletableFuture.java:347) at java.util.concurrent.CompletableFuture$UniAccept.tryFire([email protected]/CompletableFuture.java:708) at java.util.concurrent.CompletableFuture.postComplete([email protected]/CompletableFuture.java:510) at java.util.concurrent.CompletableFuture.completeExceptionally([email protected]/CompletableFuture.java:2162) at org.apache.pulsar.client.impl.BinaryProtoLookupService.lambda$findBroker$0(BinaryProtoLookupService.java:157) at org.apache.pulsar.client.impl.BinaryProtoLookupService$$Lambda$1137/0x00000008013fbc20.apply(Unknown Source) at java.util.concurrent.CompletableFuture.uniExceptionally([email protected]/CompletableFuture.java:990) at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire([email protected]/CompletableFuture.java:974) at java.util.concurrent.CompletableFuture.postComplete([email protected]/CompletableFuture.java:510) at java.util.concurrent.CompletableFuture.completeExceptionally([email protected]/CompletableFuture.java:2162) at org.apache.pulsar.client.impl.BinaryProtoLookupService.lambda$findBroker$3(BinaryProtoLookupService.java:181) at org.apache.pulsar.client.impl.BinaryProtoLookupService$$Lambda$937/0x000000080137a3c8.apply(Unknown Source) at java.util.concurrent.CompletableFuture.uniExceptionally([email 
protected]/CompletableFuture.java:990) at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire([email protected]/CompletableFuture.java:974) at java.util.concurrent.CompletableFuture.postComplete([email protected]/CompletableFuture.java:510) at java.util.concurrent.CompletableFuture.completeExceptionally([email protected]/CompletableFuture.java:2162) at org.apache.pulsar.client.impl.ConnectionPool.lambda$createConnection$8(ConnectionPool.java:230)
```

### What did you expect to see?

The broker should not be impacted by any connection or configuration issue affecting user topics.

### What did you see instead?

Broker graphs showed high publish latency, and GC pauses took more than a minute. The thread dump shows most of the IO threads busy logging, inside the recursive `PulsarClientException.toString()` calls (this is the same thread dump as in the reproduction section above).

### Anything else?

_No response_

### Are you willing to submit a PR?

- [X] I'm willing to submit a PR!

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
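
The growth pattern in the log output above (each nesting level re-rendering its children and adding another layer of quote escaping) can be reproduced in isolation. The following is a minimal sketch, not the actual `PulsarClientException` code: `RetryError`, `addPrevious`, and `nested` are hypothetical names, and the `toString()` body is only modeled on the shape of the logged message. It assumes every failed attempt carries its own list of earlier attempts.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an exception that records earlier failed attempts. */
final class RetryError {
    private final String message;
    private final List<RetryError> previous = new ArrayList<>();

    RetryError(String message) {
        this.message = message;
    }

    void addPrevious(RetryError e) {
        previous.add(e);
    }

    /** Renders this error plus all previous attempts, escaping quotes at every level. */
    @Override
    public String toString() {
        if (previous.isEmpty()) {
            return message;
        }
        StringBuilder sb = new StringBuilder(message).append("{\"previous\":[");
        for (int i = 0; i < previous.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            // Recursive call: the whole subtree is rendered again and then
            // copied once more by replace(), which is where the CPU time goes.
            String nested = previous.get(i).toString().replace("\"", "\\\"");
            sb.append("{\"attempt\":").append(i)
              .append(",\"error\":\"").append(nested).append("\"}");
        }
        return sb.append("]}").toString();
    }

    /** Builds an error whose previous attempts are themselves nested `depth` levels deep. */
    static RetryError nested(int depth, int attemptsPerLevel) {
        RetryError e = new RetryError("Connection already closed");
        if (depth > 0) {
            for (int i = 0; i < attemptsPerLevel; i++) {
                e.addPrevious(nested(depth - 1, attemptsPerLevel));
            }
        }
        return e;
    }

    public static void main(String[] args) {
        // Print how the rendered length grows as the nesting gets deeper.
        for (int depth = 0; depth <= 8; depth++) {
            int length = nested(depth, 2).toString().length();
            System.out.println("depth=" + depth + " length=" + length);
        }
    }
}
```

With two attempts per level, the rendered string more than doubles with each additional level, so a handful of retries on a failing connection is enough to produce very large strings on every log call. Possible mitigations, offered as suggestions rather than as the project's chosen fix, include capping the nesting depth or rendering previous attempts without recursing into their own history.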
