zbentley opened a new issue, #16009: URL: https://github.com/apache/pulsar/issues/16009
**Describe the bug**

Occasionally, in flaky CI tests that run Python code in a heavily threaded environment, we see segfaults when calling `connect`, `create_producer`, or `producer.close()` on Pulsar client objects.

I wish I had more debugging info or a full C system call trace, but the failures occur only in CI, where I can't use `gdb`, and Python's `faulthandler` unfortunately doesn't seem to provide a stack trace.

The errors always have these characteristics:

- They occur in an environment with many threads.
- Some of the threads have previously used a Pulsar client and still exist, but are not currently using that client.
- The current thread attempting to use the Pulsar client is performing a `connect`, `create_producer`, or `producer.close()` operation.
- The error is either an unhandled SIGSEGV or an abort with the description [__pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed](https://stackoverflow.com/questions/9239999/pthread-mutex-lock-c62-pthread-mutex-lock-assertion-mutex-data-owner).

I'm sorry I don't have more specific debugging or reproduction information. This only occurs on client 2.10.0; we're using that client with Python 3.7.13 on Linux, in Docker, on x86_64.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
