On Fri, Feb 27, 2009 at 4:45 PM, Ralph Campbell <[email protected]> wrote:
Good catches. > If ib_post_send_mad() returns zero, it guarantees that there will be > a callback to the send_buf->mad_agent->send_handler() so that the > sender can call ib_free_send_mad(). Otherwise, the ib_mad_send_buf > will be leaked and the mad_agent reference count will never go to zero > and the IB device module cannot be unloaded. > The above can happen without this patch if process_mad() returns > (IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED). This looks right although I'm not sure why this wasn't seen a long time ago. > If process_mad() returns IB_MAD_RESULT_SUCCESS and there is no agent > registered to receive the mad being sent, handle_outgoing_dr_smp() > returns zero which causes a MAD packet which is at the end of the > directed route to be incorrectly sent on the wire but doesn't cause > a hang since the HCA generates a send completion. This also looks right. Does the packet really get out on the IB wire ? Was that with a Mellanox HCA ? I don't recall ever seeing these extra packets but maybe they end up swallowed by certain HCAs. Unfortunately, I have no current way of regressing these changes. I'm presuming you've done this with Mellanox HCAs and with OpenSM running. One other question is whether this should be separated into two increments. -- Hal > Signed-off-by: Ralph Campbell <[email protected]> > > diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c > index dbcd285..62a99dc 100644 > --- a/drivers/infiniband/core/mad.c > +++ b/drivers/infiniband/core/mad.c > @@ -742,9 +742,7 @@ static int handle_outgoing_dr_smp(struct > ib_mad_agent_private *mad_agent_priv, > break; > case IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED: > kmem_cache_free(ib_mad_cache, mad_priv); > - kfree(local); > - ret = 1; > - goto out; > + break; > case IB_MAD_RESULT_SUCCESS: > /* Treat like an incoming receive MAD */ > port_priv = ib_get_mad_port(mad_agent_priv->agent.device, > @@ -755,10 +753,12 @@ static int handle_outgoing_dr_smp(struct > ib_mad_agent_private *mad_agent_priv, > &mad_priv->mad.mad); > } > if (!port_priv || !recv_mad_agent) { > + /* > + * No receiving agent so drop packet and > + * generate send completion. > + */ > kmem_cache_free(ib_mad_cache, mad_priv); > - kfree(local); > - ret = 0; > - goto out; > + break; > } > local->mad_priv = mad_priv; > local->recv_mad_agent = recv_mad_agent; > > > _______________________________________________ > general mailing list > [email protected] > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > _______________________________________________ general mailing list [email protected] http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
