On Mon, 1 Nov 2004 16:38:03 -0800 (PST) Krishna Kumar <[EMAIL PROTECTED]> wrote:
> Hi Sean,
>
> I think it is reasonable to have current senders racing with
> unregister. The unregister is waiting for all references to drop to
> zero before freeing up the resources. It killed the ones waiting for
> responses (mad_cancel), killed the ones who are executing in callback
> handlers, and finally after dropping the loader's module refcnt, it
> waits for the refcnt to drop to zero. These can only be threads which
> are actively receiving mad packets and those threads in the process of
> sending mad packets while the unregister was going on (and the ones
> which fail is the only cause of the problem). Essentially I think the
> unregister will hang and not free up the resource.

The difference here is that a client is calling into the API at the same
time that they are trying to unregister. The code, even with this change,
cannot handle this condition. For example, if the thread calling
ib_unregister_mad_agent executes completely before the thread calling
ib_post_send_mad runs (or can take a reference on the mad_agent), the
mad_agent is no longer valid, and the structure will have been freed. The
thread executing ib_post_send_mad can crash the system at this point.

If we want to allow a client to call ib_unregister_mad_agent and
ib_post_send_mad simultaneously, then ib_post_send_mad would need to
perform some sort of lookup (likely in some global map) to validate the
mad_agent.

- Sean
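
For illustration only, here is a rough userspace sketch of the kind of lookup described above: the send path validates a handle against a global table and takes a reference under the table lock before touching the agent. This is not the ib_mad code. Every name here (agent_table, agent_register, agent_get, agent_put, agent_post_send, agent_unregister, MAX_AGENTS) is hypothetical, and a pthread mutex stands in for whatever locking the kernel code would actually use.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_AGENTS 16                 /* hypothetical table size */

struct agent {
    int refcount;                     /* protected by table_lock */
};

/* Hypothetical global map from integer handle to agent. */
static struct agent *agent_table[MAX_AGENTS];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Allocate an agent, publish it in the table, return its handle or -1. */
int agent_register(void)
{
    struct agent *a = calloc(1, sizeof(*a));
    int id = -1;

    if (!a)
        return -1;
    a->refcount = 1;                  /* reference held by the table */

    pthread_mutex_lock(&table_lock);
    for (int i = 0; i < MAX_AGENTS; i++) {
        if (!agent_table[i]) {
            agent_table[i] = a;
            id = i;
            break;
        }
    }
    pthread_mutex_unlock(&table_lock);

    if (id < 0)
        free(a);                      /* table full */
    return id;
}

/* Validate the handle and take a reference in one locked step. */
static struct agent *agent_get(int id)
{
    struct agent *a = NULL;

    pthread_mutex_lock(&table_lock);
    if (id >= 0 && id < MAX_AGENTS && agent_table[id]) {
        a = agent_table[id];
        a->refcount++;
    }
    pthread_mutex_unlock(&table_lock);
    return a;
}

/* Drop a reference; the last holder frees the agent. */
static void agent_put(struct agent *a)
{
    int last;

    pthread_mutex_lock(&table_lock);
    assert(a->refcount > 0);
    last = (--a->refcount == 0);
    pthread_mutex_unlock(&table_lock);

    if (last)
        free(a);
}

/* Send path: fails cleanly if the agent was already unregistered. */
int agent_post_send(int id)
{
    struct agent *a = agent_get(id);

    if (!a)
        return -1;
    /* ... the send would be queued here while the reference is held ... */
    agent_put(a);
    return 0;
}

/* Remove the agent from the table, then drop the table's reference. */
int agent_unregister(int id)
{
    struct agent *a = NULL;

    pthread_mutex_lock(&table_lock);
    if (id >= 0 && id < MAX_AGENTS) {
        a = agent_table[id];
        agent_table[id] = NULL;       /* later lookups now fail */
    }
    pthread_mutex_unlock(&table_lock);

    if (!a)
        return -1;
    agent_put(a);
    return 0;
}
```

With this shape, a call to agent_post_send that loses the race with agent_unregister gets -1 back instead of dereferencing freed memory, because the caller never holds a raw pointer that has not been validated under the lock. Two simplifications are worth noting: the sketch frees the agent on the last put rather than having unregister wait for the count to reach zero, and table slots can be reused, so a stale handle could match a newer agent unless something like a generation counter is added.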
