Hi, Roland! This is not directly related to the last oops report that I sent.
I noticed that unregister_mad_agent in mad.c flushes the port work queue, so it can block until a queued work item has finished. However, one of the things done on that work queue is calling the handlers of existing agents.

Looking at user_mad.c, ib_umad_close calls ib_unregister_mad_agent with the port mutex held, while send_handler calls queue_packet, which in turn takes the port mutex. It seems, therefore, that we can deadlock inside user_mad: ib_umad_close calls ib_unregister_mad_agent, which blocks until send_handler has run, and send_handler is itself blocked waiting for the port mutex.

A possible solution would be to move the ib_unregister_mad_agent call outside the section protected by the mutex. Does this make sense?

--
MST
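
P.S. To make the proposed ordering concrete, here is a small userspace sketch using pthreads. It is only a model, not the kernel code: every name in it (model_port, model_send_handler, model_unregister_agent, model_close_fixed, run_fixed_close) is made up for illustration, the "work queue" is a single thread with one pending item, and the "flush" is a condition-variable wait. It assumes that the only thing the close path needs the mutex for is detaching per-file state, which I have not verified against user_mad.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_port {
	pthread_mutex_t port_mutex;   /* stands in for the user_mad port mutex */
	pthread_mutex_t wq_lock;      /* protects work_pending */
	pthread_cond_t  wq_idle;      /* signalled when the queued work is done */
	bool            work_pending; /* one queued "send completion" */
	int             packets_queued;
};

/* Models send_handler -> queue_packet: the handler needs the port mutex. */
static void model_send_handler(struct model_port *port)
{
	pthread_mutex_lock(&port->port_mutex);
	port->packets_queued++;
	pthread_mutex_unlock(&port->port_mutex);
}

/* Models the work queue thread running its single pending work item. */
static void *model_worker(void *arg)
{
	struct model_port *port = arg;

	model_send_handler(port);

	pthread_mutex_lock(&port->wq_lock);
	port->work_pending = false;
	pthread_cond_signal(&port->wq_idle);
	pthread_mutex_unlock(&port->wq_lock);
	return NULL;
}

/* Models the flush done on unregister: block until pending work is done. */
static void model_unregister_agent(struct model_port *port)
{
	pthread_mutex_lock(&port->wq_lock);
	while (port->work_pending)
		pthread_cond_wait(&port->wq_idle, &port->wq_lock);
	pthread_mutex_unlock(&port->wq_lock);
}

/*
 * Proposed ordering: drop the port mutex before the blocking unregister.
 * Calling model_unregister_agent() while still holding port_mutex could
 * hang here, because the worker's handler would wait on port_mutex while
 * we wait for the worker.
 */
static void model_close_fixed(struct model_port *port)
{
	pthread_mutex_lock(&port->port_mutex);
	/* (hypothetical) detach the agent from per-file state here */
	pthread_mutex_unlock(&port->port_mutex);

	model_unregister_agent(port);
}

/* Runs one close against one pending work item; returns packets queued. */
int run_fixed_close(void)
{
	struct model_port port = { .work_pending = true, .packets_queued = 0 };
	pthread_t worker;
	int queued;

	pthread_mutex_init(&port.port_mutex, NULL);
	pthread_mutex_init(&port.wq_lock, NULL);
	pthread_cond_init(&port.wq_idle, NULL);

	if (pthread_create(&worker, NULL, model_worker, &port) != 0)
		return -1;

	model_close_fixed(&port);
	pthread_join(worker, NULL);
	queued = port.packets_queued;

	pthread_cond_destroy(&port.wq_idle);
	pthread_mutex_destroy(&port.wq_lock);
	pthread_mutex_destroy(&port.port_mutex);
	return queued;
}
```

Calling run_fixed_close() from a main() (built with -pthread) returns 1: the handler always gets the mutex, and the close path only returns after the pending work has run. The sketch does not address whether something else could look up the agent between the unlock and the unregister; in the real code that would need checking.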
