zbentley opened a new issue, #116:
URL: https://github.com/apache/pulsar-client-python/issues/116

   We've observed full Python interpreter lockups when all of the conditions 
listed below are present. This is not ordinary blocking: the interpreter 
calling the client halts and can't be unblocked, time out, or raise an 
exception, even if the blocking operation is moved to a Python Thread and 
waited on with a timeout. The conditions are:
   - The 2.10.1 python client.
   - Python threading (using the Pulsar Client from a thread).
   - Python asyncio/event loop Future manipulation.
   - Consumers in the act of receiving messages (running the client's internal 
receive loop).
   - Many nacks of the same message.
   - Multiple consumers.
   - Using a Python `logger=` argument to Client. We must do this; otherwise 
the logs emitted by the client to STDOUT fill up our disks.
   
   
   All of those have to be present to trigger the issue. When multiple Shared 
consumers repeatedly nack messages with a 15-second delay on a topic with a 
few hundred messages (100% of them are nacked over and over), all but one of 
the consumers eventually (within a few minutes) lock up; that is, no Python 
code in that consumer can run. It's not just that it's blocked in a 
`negative_acknowledge` call: all threads, signal handlers, coroutines, etc. 
in that interpreter are stuck. This says GIL conflict to me.
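
   For readers who want to try the scenario, the following is a minimal 
sketch of one consumer process, not the reporter's actual program. The service 
URL, topic, and subscription names are placeholders, the 15-second delay is 
assumed to be the nack redelivery delay, and the asyncio Future handling 
mentioned above is left out. Several copies would need to run against the same 
subscription to match the "multiple consumers" condition.

```python
import logging
import threading

# All three values below are placeholders for illustration, not taken from the report.
SERVICE_URL = "pulsar://localhost:6650"
TOPIC = "nack-lockup-demo"
SUBSCRIPTION = "nack-lockup-demo-sub"


def nack_loop(consumer, stop, max_messages=None):
    """Receive messages and nack every one until `stop` is set.

    Returns the number of messages nacked.
    """
    nacked = 0
    while not stop.is_set() and (max_messages is None or nacked < max_messages):
        try:
            msg = consumer.receive(timeout_millis=1000)
        except Exception:
            # The client raises when the receive timeout expires; poll again.
            # Other errors are retried too, which is only OK for a throwaway repro.
            continue
        consumer.negative_acknowledge(msg)
        nacked += 1
    return nacked


def start_consumer():
    """Open one Shared consumer and nack from a background thread."""
    import pulsar  # deferred so nack_loop can be tried without the client installed

    # Route client logs through Python logging instead of STDOUT.
    client = pulsar.Client(SERVICE_URL, logger=logging.getLogger("pulsar"))
    consumer = client.subscribe(
        TOPIC,
        SUBSCRIPTION,
        consumer_type=pulsar.ConsumerType.Shared,
        # Assumption: the report's 15-second delay is the nack redelivery delay.
        negative_ack_redelivery_delay_ms=15_000,
    )
    stop = threading.Event()
    worker = threading.Thread(target=nack_loop, args=(consumer, stop), daemon=True)
    worker.start()
    return client, stop, worker
```

   Calling `start_consumer()` in each of several processes, with a few 
hundred messages already on the topic, approximates the setup described; 
whether it reproduces the lockup is not something this sketch guarantees.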
   
   While this program has many hundreds of threads, the stacktraces from the 
most relevant ones are included here:
   
[threads.txt](https://github.com/apache/pulsar-client-python/files/11406968/threads.txt)
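
   Since a Python thread with a timeout cannot observe this kind of lockup, 
one way to capture stack traces like the ones above is the standard library's 
`faulthandler` watchdog, which is designed for deadlocks and hangs. The sketch 
below is only a suggestion and is not how the attached traces were produced; 
the helper names and the timeout value are hypothetical.

```python
import faulthandler
import sys


def arm_watchdog(timeout_s=60.0, file=sys.stderr):
    """Dump the tracebacks of all threads every `timeout_s` seconds.

    `file` must stay open and have a real file descriptor.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=file)


def disarm_watchdog():
    """Cancel the pending dump once the risky section has finished."""
    faulthandler.cancel_dump_traceback_later()
```

   Arming the watchdog before the consumer loop starts and never disarming it 
would log every thread's Python stack periodically, which should continue 
even after the interpreter stops responding.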
   

