I'm seeing hangs when killing opensm that look like a locking bug in
user_mad.c.  The problem appears to be:

ib_umad_close()
        downgrade_write(&file->port->mutex)
        ib_unregister_mad_agent(...)
        up_read(&file->port->mutex)

ib_unregister_mad_agent() flushes any outstanding MADs, resulting in calls to
send_handler() and recv_handler(), both of which call queue_packet(), which
tries to take the same mutex again while ib_umad_close() still holds it:

queue_packet()
        down_read(&file->port->mutex)
        ...
        up_read(&file->port->mutex)

ib_umad_kill_port() has a similar issue as ib_umad_close().
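
To make the suspected sequence concrete, here is a toy, single-threaded model
of it.  This is only a sketch under stated assumptions, not the kernel's rwsem
code: every toy_* name and close_path_deadlocks() are hypothetical, and the
model assumes a fair semaphore where a queued writer blocks any new reader.
Under that assumption the nested down_read() only hangs if another task asks
for the write side between the downgrade and the flush.

```c
#include <stdbool.h>

/* Toy model of a fair reader/writer semaphore (hypothetical, not rwsem). */
struct toy_rwsem {
        int  readers;          /* readers currently holding the lock */
        bool writer;           /* true while a writer holds the lock */
        int  waiting_writers;  /* writers queued behind the holders  */
};

/* A new reader gets in only if no writer holds or waits for the lock. */
static bool toy_try_down_read(struct toy_rwsem *s)
{
        if (s->writer || s->waiting_writers > 0)
                return false;
        s->readers++;
        return true;
}

static void toy_up_read(struct toy_rwsem *s)
{
        s->readers--;
}

/* A writer gets in only if the lock is free; otherwise it queues. */
static bool toy_try_down_write(struct toy_rwsem *s)
{
        if (s->writer || s->readers > 0) {
                s->waiting_writers++;
                return false;
        }
        s->writer = true;
        return true;
}

/* The write holder becomes the single read holder. */
static void toy_downgrade_write(struct toy_rwsem *s)
{
        s->writer = false;
        s->readers = 1;
}

/*
 * Replays the close path from the trace above.  Returns true if the
 * nested read acquisition (the queue_packet() step) could never succeed,
 * which means the close path never reaches its own up_read().
 */
static bool close_path_deadlocks(bool writer_arrives)
{
        struct toy_rwsem s = { 0, false, 0 };
        bool nested_ok;

        toy_try_down_write(&s);         /* close path takes the write side */
        toy_downgrade_write(&s);        /* downgrade_write()               */

        if (writer_arrives)
                toy_try_down_write(&s); /* another task queues for write   */

        nested_ok = toy_try_down_read(&s); /* flush -> queue_packet()      */
        if (!nested_ok)
                return true;            /* flush never completes           */

        toy_up_read(&s);                /* queue_packet() releases         */
        toy_up_read(&s);                /* close path releases             */
        return false;
}
```

In this model close_path_deadlocks(false) completes, while
close_path_deadlocks(true) gets stuck at the queue_packet() step, which would
be consistent with the hang being intermittent.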

Does anyone know the reasoning for holding the mutex around
ib_unregister_mad_agent()?

- Sean
