On Mon, Mar 2, 2009 at 7:57 PM, Ralph Campbell <[email protected]> wrote: > On Sun, 2009-03-01 at 17:23 -0800, Hal Rosenstock wrote: >> On Fri, Feb 27, 2009 at 4:45 PM, Ralph Campbell >> <[email protected]> wrote: >> >> Good catches. >> >> > If ib_post_send_mad() returns zero, it guarantees that there will be >> > a callback to the send_buf->mad_agent->send_handler() so that the >> > sender can call ib_free_send_mad(). Otherwise, the ib_mad_send_buf >> > will be leaked and the mad_agent reference count will never go to zero >> > and the IB device module cannot be unloaded. >> > The above can happen without this patch if process_mad() returns >> > (IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED). >> >> This looks right although I'm not sure why this wasn't seen a long time ago. > > mthca returns IB_MAD_RESULT_CONSUMED for TRAP repress.
Also, locally generated traps (which recent OpenSMs use for SMAs not generating SM priority change traps). > I verified that using libibumad to send a direct routed TRAP repress with > hop=1/cnt=0 > does hang on mthca. I think opensm only sends LID routed TRAP repress > so that would explain why this wasn't seen earlier. AFAIK OpenSM responds with trap repress based on the style of the incoming trap (e.g. LID or direct routed). > For ipath, it was fairly recently that we made SubnSet(Portinfo) to > disabled to not send a reply and we have a program that uses libibumad > to disable ports. Without the patch, the program hangs. > >> > If process_mad() returns IB_MAD_RESULT_SUCCESS and there is no agent >> > registered to receive the mad being sent, handle_outgoing_dr_smp() >> > returns zero which causes a MAD packet which is at the end of the >> > directed route to be incorrectly sent on the wire but doesn't cause >> > a hang since the HCA generates a send completion. >> >> This also looks right. >> >> Does the packet really get out on the IB wire ? Was that with a >> Mellanox HCA ? I don't recall ever seeing these extra packets but >> maybe they end up swallowed by certain HCAs. > > Yes, a packet is sent with either mthca or ipath. What does it look like ? Is it responded to ? -- Hal >> Unfortunately, I have no current way of regressing these changes. I'm >> presuming you've done this with Mellanox HCAs and with OpenSM running. > > opensm runs fine with the patch applied on either mthca or ipath. > >> One other question is whether this should be separated into two increments. > > I think the changes are small enough and related enough to justify a > single patch but I will split it into two if someone feels I should. > - Show quoted text - >> -- Hal >> >> > Signed-off-by: Ralph Campbell <[email protected]> >> > >> > diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c >> > index dbcd285..62a99dc 100644 >> > --- a/drivers/infiniband/core/mad.c >> > +++ b/drivers/infiniband/core/mad.c >> > @@ -742,9 +742,7 @@ static int handle_outgoing_dr_smp(struct >> > ib_mad_agent_private *mad_agent_priv, >> > break; >> > case IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED: >> > kmem_cache_free(ib_mad_cache, mad_priv); >> > - kfree(local); >> > - ret = 1; >> > - goto out; >> > + break; >> > case IB_MAD_RESULT_SUCCESS: >> > /* Treat like an incoming receive MAD */ >> > port_priv = ib_get_mad_port(mad_agent_priv->agent.device, >> > @@ -755,10 +753,12 @@ static int handle_outgoing_dr_smp(struct >> > ib_mad_agent_private *mad_agent_priv, >> > &mad_priv->mad.mad); >> > } >> > if (!port_priv || !recv_mad_agent) { >> > + /* >> > + * No receiving agent so drop packet and >> > + * generate send completion. >> > + */ >> > kmem_cache_free(ib_mad_cache, mad_priv); >> > - kfree(local); >> > - ret = 0; >> > - goto out; >> > + break; >> > } >> > local->mad_priv = mad_priv; >> > local->recv_mad_agent = recv_mad_agent; >> > >> > >> > _______________________________________________ >> > general mailing list >> > [email protected] >> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >> > >> > To unsubscribe, please visit >> > http://openib.org/mailman/listinfo/openib-general >> > > > _______________________________________________ general mailing list [email protected] http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
