xuesongxs opened a new issue, #15559:
URL: https://github.com/apache/pulsar/issues/15559
**Describe the bug**
Pulsar v2.8.1
Pulsar cluster: 3 brokers
Producer can‘t continue sending messages after all brokers are restarted.
**To Reproduce**
Steps to reproduce the behavior:
1. Create producer
```
public class PulsarProducerDemo3 {
// 连接集群 broker
private static String localClusterUrl =
"pulsar://127.0.0.1:6650,127.0.0.1:6651,127.0.0.1:6652";
public static void main(String[] args) {
try {
Producer<String> producer = getProducer();
Long start = System.currentTimeMillis();
int i = 0;
while (i < 50000) {
producer.send(i + "");
System.out.println("send msg:" + i + "");
i++;
}
} catch (Exception e) {
System.err.println("send fail:" + e);
}
}
public static Producer<String> getProducer() throws Exception {
PulsarClient client;
Map<String, Object> prop = new HashMap<>();
prop.put("topicName", "persistent://public/default/test-string3");
client = PulsarClient.builder().serviceUrl(localClusterUrl).build();
Producer<String> producer = client.newProducer(Schema.STRING)
.loadConf(prop)
.create();
return producer;
}
}
```
2. Run producer
3. After sending 100 messages, stop all brokers
4. See producer's log
```
[pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler -
[persistent://public/default/test-string3-partition-2] [pulsar-cluster-10-0]
Reconnecting after connection was closed
[pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool -
Failed to open connection to 127.0.0.1:6650 :
org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException:
Connection refused: /127.0.0.1:6650
[pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler
- [persistent://public/default/test-string3-partition-2] [pulsar-cluster-10-0]
Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException:
java.util.concurrent.CompletionException:
org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException:
Connection refused: /127.0.0.1:6650
[pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler
- [persistent://public/default/test-string3-partition-2] [pulsar-cluster-10-0]
Could not get connection to broker:
org.apache.pulsar.client.api.PulsarClientException:
java.util.concurrent.CompletionException:
org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException:
Connection refused: /127.0.0.1:6650 -- Will try again in 1.496 s
[pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler -
[persistent://public/default/test-string3-partition-2] [pulsar-cluster-10-0]
Reconnecting after connection was closed
[pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool -
Failed to open connection to 172.32.149.123:16650 :
org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException:
Connection refused: /172.32.149.123:16650
[pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler
- [persistent://public/default/test-string3-partition-2] [pulsar-cluster-10-0]
Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException:
java.util.concurrent.CompletionException:
org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException:
Connection refused: /127.0.0.1:6651
[pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler
- [persistent://public/default/test-string3-partition-2] [pulsar-cluster-10-0]
Could not get connection to broker:
org.apache.pulsar.client.api.PulsarClientException:
java.util.concurrent.CompletionException:
org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException:
Connection refused: /127.0.0.1:6651 -- Will try again in 3.193 s
[pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler -
[persistent://public/default/test-string3-partition-1] [pulsar-cluster-11-0]
Reconnecting after connection was closed
[pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionPool -
Failed to open connection to 127.0.0.1:6652 :
org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException:
Connection refused: /127.0.0.1:6652
[pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler
- [persistent://public/default/test-string3-partition-1] [pulsar-cluster-11-0]
Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException:
java.util.concurrent.CompletionException:
org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException:
Connection refused: /127.0.0.1:6652
[pulsar-client-io-1-1] WARN org.apache.pulsar.client.impl.ConnectionHandler
- [persistent://public/default/test-string3-partition-1] [pulsar-cluster-11-0]
Could not get connection to broker:
org.apache.pulsar.client.api.PulsarClientException:
java.util.concurrent.CompletionException:
org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException:
Connection refused: /127.0.0.1:6652 -- Will try again in 2.97 s
```
5. Start all brokers
6. See producer's log
```
[pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConnectionPool -
[[id: 0x7fa30072, L:/127.0.0.1:50232 - R:/127.0.0.1:6650]] Connected to server
[pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ProducerImpl -
[persistent://public/default/test-string3-partition-2] [pulsar-cluster-14-0]
Creating producer on cnx [id: 0x7fa30072, L:/127.0.0.1:50232 -
R:/127.0.0.1:16651]
[pulsar-timer-5-1] INFO org.apache.pulsar.client.impl.ConnectionHandler -
[persistent://public/default/test-string3-partition-0] [pulsar-cluster-14-1]
Reconnecting after connection was closed
[pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConnectionPool -
[[id: 0xa49808e4, L:/127.0.0.1:58935 - R:/127.0.0.1:6651]] Connected to server
[pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ProducerImpl -
[persistent://public/default/test-string3-partition-0] [pulsar-cluster-14-1]
Creating producer on cnx [id: 0xa49808e4, L:/127.0.0.1:58935 -
R:/127.0.0.1:6652]
```
Creating producer success, but producer can‘t continue sending messages
after all brokers are restarted.
7. Jstack producer's pid
```
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.251-b08 mixed mode):
"Attach Listener" #13 daemon prio=9 os_prio=0 tid=0x00007fc510001000
nid=0x4dc2 waiting on condition [0x00000000
00000000]
java.lang.Thread.State: RUNNABLE
"DestroyJavaVM" #12 prio=5 os_prio=0 tid=0x00007fc53c009800 nid=0x4c40
waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"pulsar-external-listener-3-1" #11 prio=5 os_prio=0 tid=0x00007fc4f40b3800
nid=0x4c95 waiting on condition [0x00
007fc5185f5000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000078f4085b8> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$Con
ditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronize
r.java:2039)
at
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.ja
va:1081)
at
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.ja
va:809)
at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at
org.apache.pulsar.shade.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.
java:30)
at java.lang.Thread.run(Thread.java:748)
"pulsar-timer-5-1" #10 prio=5 os_prio=0 tid=0x00007fc4f4065800 nid=0x4c4c
waiting on condition [0x00007fc5192170
00]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at
org.apache.pulsar.shade.io.netty.util.HashedWheelTimer$Worker.waitForNextTick(HashedWheelTimer.java:5
66)
at
org.apache.pulsar.shade.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:462)
at
org.apache.pulsar.shade.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.
java:30)
at java.lang.Thread.run(Thread.java:748)
"pulsar-client-io-1-1" #8 prio=5 os_prio=0 tid=0x00007fc53c40b800 nid=0x4c4b
runnable [0x00007fc52c18a000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x000000078f300d50> (a
org.apache.pulsar.shade.io.netty.channel.nio.SelectedSelectionKeySet)
- locked <0x000000078f300d68> (a
java.util.Collections$UnmodifiableSet)
- locked <0x000000078f300d08> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at
org.apache.pulsar.shade.io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelection
KeySetSelector.java:62)
at
org.apache.pulsar.shade.io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:814)
at
org.apache.pulsar.shade.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:457)
at
org.apache.pulsar.shade.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExe
cutor.java:986)
at
org.apache.pulsar.shade.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at
org.apache.pulsar.shade.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.
java:30)
at java.lang.Thread.run(Thread.java:748)
"Service Thread" #7 daemon prio=9 os_prio=0 tid=0x00007fc53c0d4000
nid=0x4c49 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007fc53c0b7800
nid=0x4c48 waiting on condition [0x000000
0000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007fc53c0b4800
nid=0x4c47 waiting on condition [0x000000
0000000000]
java.lang.Thread.State: RUNNABLE
```
**Expected behavior**
A clear and concise description of what you expected to happen.
**Screenshots**
If applicable, add screenshots to help explain your problem.
**Desktop (please complete the following information):**
- OS: [e.g. iOS]
**Additional context**
Add any other context about the problem here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]