Michael S. Tsirkin wrote:
Could you please elaborate on why is an event driven model more reliable than
the state driven one?
It certainly seems to require more code: isnt cm_id->state set by CM to
a valid value? It seems that cm needs to track connection state
anyway as per "12.9.2 Invalid State Input Handling", to know which messages
are legal. Is this done? If so why isnt it a good idea to reuse that state
in SDP?

The state of the cm_id is controlled by the CM and can change at any time as a result of processing a received MAD. It's only exposed for debug purposes.

+       result = ib_cm_listen(hca->listen_id,
+                             cpu_to_be64(SDP_MSG_SERVICE_ID_VALUE),
+                             cpu_to_be64(SDP_MSG_SERVICE_ID_MASK));
+       if (result) {
+               sdp_warn("Error <%d> listening for SDP connections", result);
+               goto error;
+       }
+
        ib_set_client_data(device, &sdp_client, hca);

        return;


Can a listen event arrive before we call ib_set_client_data?

Yes.  We may need to swap those two statements.

error:
+       if (!IS_ERR(hca->listen_id))
+               ib_destroy_cm_id(hca->listen_id);
+
        list_for_each_entry_safe(port, tmp, &hca->port_list, list) {
                list_del(&port->list);
                kfree(port);
@@ -1838,6 +1856,9 @@ static void sdp_device_remove_one(struct
                return;
        }

+       if (!IS_ERR(hca->listen_id))
+               ib_destroy_cm_id(hca->listen_id);
+
        list_for_each_entry_safe(port, tmp, &hca->port_list, list) {
                list_del(&port->list);
                kfree(port);


I'd prefer separate labels for two errors, instead of testing IS_ERR twice.

So do I, but I was trying to match the coding style used throughout the SDP code. Fixing the error handling seems like another set of changes to me.

+       if (event->event != IB_CM_REQ_RECEIVED) {
+               conn = sdp_conn_table_lookup(hashent);
+               if (conn)
+                       sdp_conn_lock(conn);
+               else
                        return -EINVAL;
-               }
+               /* Can this fail?  Why not just set context = conn? */
+       }


Regarding the question: the first thing we do in sdp_conn_put
is sdp_conn_table_remove, so I think this lookup can fail.

I should have removed that comment. I'm not overly familiar with the SDP code, but it seems wrong to need to verify that a requested callback can be executed. By returning -EINVAL above, the cm_id associated with the callback will be destroyed. This indicates to me that either that cm_id will be destroyed twice by SDP (once when the SDP conn object is destroyed and again here), or that SDP does not usually destroy cm_id's as part of its normal cleanup. I'm also wondering about possible race conditions that could occur between calling table_lookup and conn_lock, but I don't know the code well enough to say if one exists.

-       switch (cm_id->state) {
-       case IB_CM_REQ_RCVD:
+       switch (event->event) {
+       case IB_CM_REQ_RECEIVED:
                result = sdp_cm_req_handler(cm_id, event);
                break;
-       case IB_CM_REP_RCVD:
+       case IB_CM_REP_RECEIVED:
                result = sdp_cm_rep_handler(cm_id, event, conn);
                break;
-       case IB_CM_IDLE:
+       case IB_CM_REQ_ERROR:
+       case IB_CM_REP_ERROR:
+       case IB_CM_REJ_RECEIVED:
+       case IB_CM_TIMEWAIT_EXIT:
                result = sdp_cm_idle(cm_id, event, conn);
                break;
-       case IB_CM_ESTABLISHED:
+       case IB_CM_RTU_RECEIVED:
+       case IB_CM_USER_ESTABLISHED:
                result = sdp_cm_established(cm_id, event, conn);
                break;
-       case IB_CM_DREQ_RCVD:
+       case IB_CM_DREQ_RECEIVED:
                result = sdp_cm_dreq_rcvd(cm_id, event, conn);
                if (result)
                        break;
                /* fall through on success to handle state transition */
-       case IB_CM_TIMEWAIT:
+       case IB_CM_DREQ_ERROR:
+       case IB_CM_DREP_RECEIVED:
                result = sdp_cm_timewait(cm_id, event, conn);
                break;
        default:
-               sdp_dbg_warn(conn, "Unexpected CM state <%d>", cm_id->state);
+               sdp_dbg_warn(conn, "Unhandled CM event <%d>", event->event);
                result = -EINVAL;
        }
        /*

Seems more code. Could you please elaborate on why is this more correct?

See comment above. Using the cm_id state is incorrect as it can change while a client's callback is running.

- Sean
_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Reply via email to