zengguan opened a new issue, #1027:
URL: https://github.com/apache/pulsar-client-go/issues/1027

   #### Expected behavior
   
   When the `reconnectToBroker()` logic is triggered , the reconnection and 
resend-messages can be successful.
   
   #### Actual behavior
   
   After client reconnect to broker and resend-messages, it received an ack 
larger than expected,and trigger the `ConnectionClosed` logic。So it's stuck in 
a reconnection-resend-connectionClosed loop。
   
   #### Steps to reproduce
   we can use `tcpkill` command for stably reproduce this phenomenon。
   
   1.  use pulsar-perf to send a lot messages to broker
   
   `./bin/pulsar-perf produce persistent://pulsar/test/test01 -r 50000 
--profile  --max-connections 30 --profile --service-url pulsar://0.0.0.0:6650 
-q 500000 --batching-time 10`
   
   2.  in client side, use tcp kill to mock connection failed
   `tcpkill -i any -9 port 6650`
   
   3. after 3~5s, we stop tcpkill and recover the connection. But the client 
keeps trying to reconnect, but it always fails even when we restore the network。
   
   ```
   time="20:08:39.551" level=info msg="TCP connection established" 
local_addr="1.1.1.1:53112" remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.556" level=info msg="Connection is ready" 
local_addr="1.1.1.1:53112" remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.558" level=info msg="Connected producer" cnx="1.1.1.1:53112 
-> 0.0.0.0:6650" epoch=2 producerID=1 
producer_name=pulsar-bxde-110-pulsar-cluster-77-21 
topic="persistent://pulsar/test/test01-partition-1"
   time="20:08:39.558" level=info msg="Resending 273 pending batches" 
producerID=1 producer_name=pulsar-bxde-110-pulsar-cluster-77-21 
topic="persistent://pulsar/test/test01-partition-1"
   time="20:08:39.574" level=warning msg="Received ack for ledgerId:3898 
entryId:13758 on sequenceId 492378 - expected: 517774, closing connection" 
producerID=1 producer_name=pulsar-bxde-110-pulsar-cluster-77-21 
topic="persistent://pulsar/test/test01-partition-1"
   time="20:08:39.580" level=warning msg="Received ack for ledgerId:3898 
entryId:13759 on sequenceId 492490 - expected: 517902, closing connection" 
producerID=1 producer_name=pulsar-bxde-110-pulsar-cluster-77-21 
topic="persistent://pulsar/test/test01-partition-1"
   time="20:08:39.587" level=info msg="Reconnected producer to broker" 
cnx="1.1.1.1:53112 -> 0.0.0.0:6650" producerID=1 
producer_name=pulsar-bxde-110-pulsar-cluster-77-21 
topic="persistent://pulsar/test/test01-partition-1"
   time="20:08:39.587" level=warning msg="Received ack for ledgerId:3898 
entryId:13760 on sequenceId 492616 - expected: 492378, closing connection" 
producerID=1 producer_name=pulsar-bxde-110-pulsar-cluster-77-21 
topic="persistent://pulsar/test/test01-partition-1"
   time="20:08:39.587" level=warning msg="Connection was closed" 
cnx="1.1.1.1:53112 -> 0.0.0.0:6650" producerID=1 
producer_name=pulsar-bxde-110-pulsar-cluster-77-21 
topic="persistent://pulsar/test/test01-partition-1"
   time="20:08:39.587" level=warning msg="Connection Closed from listener" 
local_addr="1.1.1.1:53112" remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.587" level=info msg="runEventsLoop will reconnect in 
producer" producerID=1 producer_name=pulsar-bxde-110-pulsar-cluster-77-21 
topic="persistent://pulsar/test/test01-partition-1"
   time="20:08:39.587" level=info msg="Reconnecting to broker in 114.187664ms" 
producerID=1 producer_name=pulsar-bxde-110-pulsar-cluster-77-21 
topic="persistent://pulsar/test/test01-partition-1"
   time="20:08:39.608" level=info msg="Connecting to broker" 
remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.610" level=info msg="TCP connection established" 
local_addr="1.1.1.1:53118" remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.613" level=info msg="Connection is ready" 
local_addr="1.1.1.1:53118" remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.624" level=info msg="Connecting to broker" 
remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.625" level=info msg="TCP connection established" 
local_addr="1.1.1.1:53120" remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.627" level=info msg="Connection is ready" 
local_addr="1.1.1.1:53120" remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.627" level=info msg="Connected producer" cnx="1.1.1.1:53120 
-> 0.0.0.0:6650" epoch=2 producerID=2 
producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.627" level=info msg="Resending 282 pending batches" 
producerID=2 producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.632" level=warning msg="Received ack for ledgerId:3897 
entryId:14219 on sequenceId 492428 - expected: 517684, closing connection" 
producerID=2 producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.633" level=warning msg="Received ack for ledgerId:3897 
entryId:14220 on sequenceId 492554 - expected: 517685, closing connection" 
producerID=2 producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.633" level=warning msg="Received ack for ledgerId:3897 
entryId:14221 on sequenceId 492560 - expected: 517686, closing connection" 
producerID=2 producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.635" level=warning msg="Received ack for ledgerId:3897 
entryId:14222 on sequenceId 492686 - expected: 517815, closing connection" 
producerID=2 producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.637" level=warning msg="Received ack for ledgerId:3897 
entryId:14223 on sequenceId 492812 - expected: 518070, closing connection" 
producerID=2 producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.637" level=warning msg="Received ack for ledgerId:3897 
entryId:14224 on sequenceId 492938 - expected: 518070, closing connection" 
producerID=2 producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.637" level=warning msg="Received ack for ledgerId:3897 
entryId:14225 on sequenceId 492947 - expected: 518070, closing connection" 
producerID=2 producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.637" level=warning msg="Received ack for ledgerId:3897 
entryId:14226 on sequenceId 493049 - expected: 518070, closing connection" 
producerID=2 producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.637" level=info msg="Reconnected producer to broker" 
cnx="1.1.1.1:53120 -> 0.0.0.0:6650" producerID=2 
producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.639" level=warning msg="Received ack for ledgerId:3897 
entryId:14227 on sequenceId 493175 - expected: 492428, closing connection" 
producerID=2 producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.639" level=warning msg="Connection was closed" 
cnx="1.1.1.1:53120 -> 0.0.0.0:6650" producerID=2 
producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.639" level=warning msg="Connection Closed from listener" 
local_addr="1.1.1.1:53120" remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.639" level=warning msg="Failed to write on connection" 
error="write tcp 1.1.1.1:53120->0.0.0.0:6650: use of closed network connection" 
local_addr="1.1.1.1:53120" remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.639" level=warning msg="Failed to write on connection" 
error="write tcp 1.1.1.1:53120->0.0.0.0:6650: use of closed network connection" 
local_addr="1.1.1.1:53120" remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.639" level=info msg="runEventsLoop will reconnect in 
producer" producerID=2 producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.639" level=info msg="Reconnecting to broker in 104.324896ms" 
producerID=2 producer_name=pulsar-bxde-110-pulsar-cluster-78-22 
topic="persistent://pulsar/test/test01-partition-0"
   time="20:08:39.703" level=info msg="Connecting to broker" 
remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.704" level=info msg="TCP connection established" 
local_addr="1.1.1.1:53122" remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.707" level=info msg="Connection is ready" 
local_addr="1.1.1.1:53122" remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.713" level=info msg="Connecting to broker" 
remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.714" level=info msg="TCP connection established" 
local_addr="1.1.1.1:53124" remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.718" level=info msg="Connection is ready" 
local_addr="1.1.1.1:53124" remote_addr="pulsar://0.0.0.0:6650"
   time="20:08:39.720" level=info msg="Connected producer" cnx="1.1.1.1:53124 
-> 0.0.0.0:6650" epoch=3 producerID=1 
producer_name=pulsar-bxde-110-pulsar-cluster-77-21 
topic="persistent://pulsar/test/test01-partition-1"
   time="20:08:39.720" level=info msg="Resending 278 pending batches" 
producerID=1 producer_name=pulsar-bxde-110-pulsar-cluster-77-21 
topic="persistent://pulsar/test/test01-partition-1"
   ....
   ....
   ....
   ```
   
   And finally, the client enter the message send timeout error
   
   #### Related issue
   
   #697 
   
   
   #### System configuration
   **Pulsar version**: 2.11.0
   **Pulsar-client-go version**: 0.10.0、master
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to