Sean Hefty wrote:

James Lentini wrote:

void dapli_destroy_conn(struct dapl_cm_id *conn)
{
    int in_callback;
+    struct rdma_cm_id *cm_id;

dapl_dbg_log(DAPL_DBG_TYPE_CM, " destroy_conn: conn %p id %d\n",
             conn,conn->cm_id);
-
    dapl_os_lock(&conn->lock);
    conn->destroy = 1;
    in_callback = conn->in_callback;
-    dapl_os_unlock(&conn->lock);
-
-    if (!in_callback) {
-        if (conn->ep)
-            conn->ep->cm_handle = IB_INVALID_HANDLE;
-        if (conn->cm_id) {
-            if (conn->cm_id->qp)
-                rdma_destroy_qp(conn->cm_id);
-            rdma_destroy_id(conn->cm_id);
+    do {
+        if (in_callback) {
+            dapl_os_unlock(&conn->lock);
+            usleep(10);
+            dapl_os_lock(&conn->lock);


In general this doesn't work. The calling thread may be the callback thread, which would lead to deadlock. This is why we don't just call rdma_destroy_id() directly, and let it wait for the callback to complete.

Sorry, the callback names should be changed since it is really a async event processing thread and not a direct callback from CMA. The async thread can destroy the cm_id if we no longer hold any cm_id event references, we destroy the associated QP, and we are syncronized with any other thread that could be destroying at the same time. This is how the code currently works.

I did not see the original thread/patch from Steve so I don't have the entire context of this issue but it sounds like we need to fix the code so that the destroy QP (dat_ep_free) blocks until the event processing is complete, always destroy the QP and cm_id from this call, and remove cleanup from any async event processing threads. Is this what Steve was attempting to do with his patch?

I seemed to have missed the posting of the patch so could someone point me to the original patch so I can review and test any changes.

thanks

-arlin

_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Reply via email to