devinbost opened a new issue #7911: URL: https://github.com/apache/pulsar/issues/7911
I ran into a very strange issue after upgrading to Pulsar 2.6.1. When upgrading, I basically rebuilt this cluster from scratch, so there shouldn't be any remnant of the previous cluster. However, I still had a producer and a consumer writing to and consuming from the topic "persistent://public/default/canary" as a health check. I didn't remove these containers until I finished rolling out the new cluster, but then when I restarted the canary producer and canary consumer, they couldn't connect to their topic. Other producers and consumers started just fine.

I was able to reproduce the issue by manually attempting to produce to this topic. When I produce to this topic, I get a broker timeout exception. However, if I try producing to any other topic, I don't get any exception. See below.

`root@server26:/pulsar# bin/pulsar-client produce persistent://public/default/canary -m "TEST ME"`

> 2020-08-26T20:39:42,787 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.BinaryProtoLookupService - [persistent://public/default/canary] failed to send lookup request : org.apache.pulsar.client.api.PulsarClientException$TimeoutException: 2 lookup request timedout after ms 30000
> 2020-08-26T20:39:42,789 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.BinaryProtoLookupService - [persistent://public/default/canary] lookup failed : org.apache.pulsar.client.api.PulsarClientException$TimeoutException: 2 lookup request timedout after ms 30000
> java.util.concurrent.CompletionException: org.apache.pulsar.client.api.PulsarClientException$TimeoutException: 2 lookup request timedout after ms 30000
> at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_252]
> at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_252]
> at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:661) ~[?:1.8.0_252]
> at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:646) ~[?:1.8.0_252]
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_252]
> at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) ~[?:1.8.0_252]
> at org.apache.pulsar.client.impl.ClientCnx.lambda$addPendingLookupRequests$12(ClientCnx.java:541) ~[org.apache.pulsar-pulsar-client-original-2.6.1.jar:2.6.1]
> at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
> at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
> at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
> at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384) [io.netty-netty-transport-native-epoll-4.1.48.Final-linux-x86_64.jar:4.1.48.Final]
> at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
> at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
> Caused by: org.apache.pulsar.client.api.PulsarClientException$TimeoutException: 2 lookup request timedout after ms 30000
> ... 10 more
> 2020-08-26T20:39:42,794 [pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/canary] [null] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException$TimeoutException: 2 lookup request timedout after ms 30000
> 2020-08-26T20:39:42,794 [main] ERROR org.apache.pulsar.client.cli.PulsarClientTool - Error while producing messages
> 2020-08-26T20:39:42,795 [main] ERROR org.apache.pulsar.client.cli.PulsarClientTool - 2 lookup request timedout after ms 30000
> org.apache.pulsar.client.api.PulsarClientException$TimeoutException: 2 lookup request timedout after ms 30000
> at org.apache.pulsar.client.api.PulsarClientException.unwrap(PulsarClientException.java:856) ~[org.apache.pulsar-pulsar-client-api-2.6.1.jar:2.6.1]
> at org.apache.pulsar.client.impl.ProducerBuilderImpl.create(ProducerBuilderImpl.java:93) ~[org.apache.pulsar-pulsar-client-original-2.6.1.jar:2.6.1]
> at org.apache.pulsar.client.cli.CmdProduce.publish(CmdProduce.java:211) [org.apache.pulsar-pulsar-client-tools-2.6.1.jar:2.6.1]
> at org.apache.pulsar.client.cli.CmdProduce.run(CmdProduce.java:196) [org.apache.pulsar-pulsar-client-tools-2.6.1.jar:2.6.1]
> at org.apache.pulsar.client.cli.PulsarClientTool.run(PulsarClientTool.java:169) [org.apache.pulsar-pulsar-client-tools-2.6.1.jar:2.6.1]
> at org.apache.pulsar.client.cli.PulsarClientTool.main(PulsarClientTool.java:203) [org.apache.pulsar-pulsar-client-tools-2.6.1.jar:2.6.1]

`root@server26:/pulsar# bin/pulsar-client produce persistent://public/default/canary2 -m "TEST ME"`

[no exception]
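
For anyone who wants to repeat the comparison between the failing topic and a working one, here is a minimal sketch in Python that wraps the same `bin/pulsar-client produce` invocation shown in this report. The helper names (`produce_command`, `run_cli`, `probe_topics`) and the injectable `runner` parameter are hypothetical, introduced only for illustration. The sketch assumes that a failed produce either exits with a non-zero status or prints the "Error while producing messages" line seen in the log; neither behavior has been verified beyond what the output above shows.

```python
import subprocess
from typing import Callable, Dict, List, Tuple

# Marker copied from the ERROR line in the log output of this report.
ERROR_MARKER = "Error while producing messages"

# A runner takes a command and returns (exit code, combined output).
Runner = Callable[[List[str]], Tuple[int, str]]


def produce_command(topic: str, message: str = "TEST ME") -> List[str]:
    """Build the same pulsar-client invocation used in the report."""
    return ["bin/pulsar-client", "produce", topic, "-m", message]


def run_cli(command: List[str]) -> Tuple[int, str]:
    """Run a command and return its exit code with stdout and stderr merged."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.returncode, result.stdout


def probe_topics(topics: List[str], runner: Runner = run_cli) -> Dict[str, bool]:
    """Return a mapping of topic to True when the produce attempt looked successful."""
    results = {}
    for topic in topics:
        code, output = runner(produce_command(topic))
        # Assumption: a failure shows up as a non-zero exit code
        # or as ERROR_MARKER somewhere in the output.
        results[topic] = code == 0 and ERROR_MARKER not in output
    return results
```

Run from the Pulsar install directory (`/pulsar` in the commands above), `probe_topics(["persistent://public/default/canary", "persistent://public/default/canary2"])` would be expected to flag only the first topic if the behavior described here persists. Passing a custom `runner` makes it possible to exercise the logic without a live cluster.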
