>  I'm seeing an unusual problem when both halves of a connection
>actively disconnect at the same time. Each connection peer issues
>a DREQ at the same time, then each receives the DREQ and responds
>with a DREP, and finally each connection gets a callback for the
>transition to the idle state. However, at this point it appears
>that each CM keeps retransmitting DREQ requests, which then seems
>to interfere with new connection establishment.

I think that I understand what's happening.  Receiving the DREQ
changed the state of the cm_id, but did not cancel the previous send,
so the outstanding DREQ keeps being retransmitted.
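
To illustrate the suspected failure mode, here is a minimal toy model in
plain C.  It is only a sketch: every toy_* name is hypothetical, none of
it is the real ib_cm or MAD API, and it assumes that a queued send keeps
being retransmitted on each timer expiry until it is explicitly cancelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not the ib_cm API. */
enum toy_state { TOY_ESTABLISHED, TOY_DREQ_SENT, TOY_DREQ_RCVD };

struct toy_cm_id {
	enum toy_state state;
	bool send_pending;	/* an unacknowledged DREQ is queued for retries */
	int retransmits;	/* retransmissions observed so far */
};

/* Actively disconnect: send a DREQ and leave it queued for retries. */
static void toy_send_dreq(struct toy_cm_id *id)
{
	id->state = TOY_DREQ_SENT;
	id->send_pending = true;
}

/*
 * Handle a DREQ from the peer.  The state always changes; the queued
 * send is dropped only when cancel_send is true, which stands in for
 * the cancel call added by the patch.
 */
static void toy_recv_dreq(struct toy_cm_id *id, bool cancel_send)
{
	if (id->state == TOY_DREQ_SENT && cancel_send)
		id->send_pending = false;
	id->state = TOY_DREQ_RCVD;
}

/* One retransmission timer expiry. */
static void toy_timer_tick(struct toy_cm_id *id)
{
	if (id->send_pending)
		id->retransmits++;
}

/* Run the simultaneous-disconnect case and count stray retransmissions. */
static int toy_simulate(bool cancel_send, int ticks)
{
	struct toy_cm_id id = { TOY_ESTABLISHED, false, 0 };

	assert(ticks >= 0);
	toy_send_dreq(&id);
	toy_recv_dreq(&id, cancel_send);
	for (int i = 0; i < ticks; i++)
		toy_timer_tick(&id);
	return id.retransmits;
}
```

In this model, toy_simulate(false, 3) counts one stray retransmission per
tick even though the state has already moved on, while
toy_simulate(true, 3) counts none.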

I'm actually out on vacation for a little over two weeks (and will
be totally away from e-mail after Friday), but something
like the patch below might fix the issue.  (Note that I didn't test /
compile this.)  If it does work for you, feel free to commit it.

Signed-off-by: Sean Hefty <[EMAIL PROTECTED]>


Index: cm.c
===================================================================
--- cm.c        (revision 2568)
+++ cm.c        (working copy)
@@ -1813,9 +1813,12 @@ static int cm_dreq_handler(struct cm_wor
 
        switch (cm_id_priv->id.state) {
        case IB_CM_REP_SENT:
-       case IB_CM_MRA_REP_RCVD:
-       case IB_CM_ESTABLISHED:
        case IB_CM_DREQ_SENT:
+               ib_cancel_mad(cm_id_priv->av.port->mad_agent,
+                             (unsigned long) cm_id_priv->msg);
+               break;
+       case IB_CM_ESTABLISHED:
+       case IB_CM_MRA_REP_RCVD:
                break;
        case IB_CM_TIMEWAIT:
                if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
