RuRo commented on issue #18090: URL: https://github.com/apache/incubator-mxnet/issues/18090#issuecomment-617628230
A little further investigation in GDB with debug symbols revealed the following:

Thread 1 (the main thread) is in `CustomOperator::Find` (it shows up as [`mxnet::op::custom::AttrParser`](https://github.com/apache/incubator-mxnet/blob/6a809aaca12ea4dd31d1c2d8121b881b25eb916b/src/operator/custom/custom.cc#L98) in the backtrace because of inlining). Thread 1 is [waiting for `CustomOperator::mutex_`](https://github.com/apache/incubator-mxnet/blob/6a809aaca12ea4dd31d1c2d8121b881b25eb916b/src/operator/custom/custom-inl.h#L63), which is currently held by Thread 13.

Thread 13 is in `_CallPythonObject` (Python internals called from `libffi`), trying to call `register.do_register.creator.create_operator_entry.delete_entry`, which is a callback for `CustomOp::del`. To do that, it tries to acquire the Global Interpreter Lock, which is currently held by Thread 1.

So each thread holds the lock the other one needs, which is a classic lock-order deadlock. I am really not sure how Thread 13 could arrive at the `CustomOp` deleter while still holding `CustomOperator::mutex_`.
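For illustration, the lock cycle described above can be reproduced in miniature. This is only a sketch under stated assumptions, not MXNet code: two plain `threading.Lock` objects stand in for the GIL and for `CustomOperator::mutex_`, the function name `demonstrate_lock_inversion` is hypothetical, and timeouts are used so the script reports the stall instead of hanging.

```python
import threading


def demonstrate_lock_inversion(timeout=0.5):
    """Reproduce the two-lock cycle with stand-in locks (hypothetical names)."""
    gil_stand_in = threading.Lock()  # stand-in for the Python GIL (assumption)
    op_mutex = threading.Lock()      # stand-in for CustomOperator::mutex_ (assumption)
    barrier = threading.Barrier(2)   # reusable sync point for the two threads
    results = {}

    def worker(name, first, second):
        with first:
            # Wait until both threads hold their first lock.
            barrier.wait()
            # Try to take the lock the other thread is holding.
            acquired = second.acquire(timeout=timeout)
            if acquired:
                second.release()
            results[name] = acquired
            # Keep holding `first` until both attempts have finished, so
            # neither thread can succeed just because the other gave up.
            barrier.wait()

    # "thread_1" mirrors the main thread: holds the GIL, wants the mutex.
    t1 = threading.Thread(target=worker, args=("thread_1", gil_stand_in, op_mutex))
    # "thread_13" mirrors the worker: holds the mutex, wants the GIL.
    t13 = threading.Thread(target=worker, args=("thread_13", op_mutex, gil_stand_in))
    t1.start()
    t13.start()
    t1.join()
    t13.join()
    return results


if __name__ == "__main__":
    # Each value is False when that thread could not get its second lock.
    print(demonstrate_lock_inversion())
```

Without the timeouts, both threads would block forever, which matches what the backtraces show. The usual ways out are to always take the two locks in the same order, or to release one before calling into code that needs the other.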
