fracasula opened a new pull request #366:
URL: https://github.com/apache/pulsar-client-go/pull/366


   ### Motivation
   
   This PR adds an integration test that reproduces the bug reported in [this 
issue](https://github.com/apache/pulsar/issues/7682#issuecomment-683914492). I 
opened it to gather feedback before working on a fix.
   
   A possible solution would be to move the `connectionClosed` events out of 
the `eventsCh` channel and into their own channel (e.g. 
`connectionEventsCh`): 
https://github.com/apache/pulsar-client-go/blob/master/pulsar/consumer_partition.go#L752
   
   This is because some of the events travelling through the `eventsCh` channel 
need an open connection to finish. If the `eventsCh` channel fills up with, 
say, in-flight `ackRequest` events, the two sides block each other: the 
`ackRequest` events cannot finish because the connection is closed, and the 
`connectionClosed` event (which should trigger a reconnection to the broker) 
is stuck waiting to be pushed onto the full `eventsCh` channel.
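
   To illustrate the blocking described above, here is a minimal toy model in 
Go. It is not the client's actual code: `simulateSharedChannel`, `trySend`, the 
buffer size of 3 and the `connected` channel are assumptions made up for this 
sketch; only the event names come from the description.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical stand-ins for the events that travel through eventsCh.
type ackRequest struct{ id int }
type connectionClosed struct{}

// trySend tries to push ev onto ch and gives up after timeout.
// It reports whether the send succeeded.
func trySend(ch chan interface{}, ev interface{}, timeout time.Duration) bool {
	select {
	case ch <- ev:
		return true
	case <-time.After(timeout):
		return false
	}
}

// simulateSharedChannel models one shared events channel and reports
// whether a connectionClosed event could still be enqueued.
func simulateSharedChannel() bool {
	eventsCh := make(chan interface{}, 3) // assumed small buffer
	connected := make(chan struct{})      // never closed: the connection is down

	// Events loop: every ackRequest waits for an open connection.
	go func() {
		for ev := range eventsCh {
			if _, ok := ev.(*ackRequest); ok {
				<-connected // blocks forever in this model
			}
		}
	}()

	// One ackRequest ends up stuck in the loop, three more fill the buffer.
	for i := 0; i < 4; i++ {
		eventsCh <- &ackRequest{id: i}
	}

	// The event that should trigger the reconnection cannot get in.
	return trySend(eventsCh, &connectionClosed{}, 100*time.Millisecond)
}

func main() {
	fmt.Println("connectionClosed enqueued:", simulateSharedChannel())
}
```

   In this model the send times out, because the loop never drains the buffer 
while it waits for a connection that only the blocked event could restore.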
   
   Given that the `connectionClosed` event is needed to trigger a reconnection 
to the broker, I think it makes sense to have it in a separate channel.
   
   If this approach sounds sensible to you I can start making modifications.
   
   ### Replicate the bug
   
   The test doesn't always fail; in my case it fails roughly 1 run out of 3. 
If you run it once and it completes instead of failing, please run it again.
   
   When the test fails, you should see that it timed out after five minutes 
and printed many log lines saying `Trying to ack {messageID}`.
   
   Please check the attached `deadlock.zip` which shows the logs of a failing 
test.
   
   ### Running the test
   
   * make sure `docker` is running and usable by the user that runs the test
   * run `docker pull apachepulsar/pulsar:2.6.1` first
   * then run: `go test -run TestDeadlock ./... -tags integration -v -count 1`
   
   
[deadlock.zip](https://github.com/apache/pulsar-client-go/files/5225547/deadlock.zip)



