RE: Possible event on cm_id after rdma_accept() fails?

Hefty, Sean Fri, 06 Jan 2012 14:23:56 -0800

> If I get a new cm_id from a RDMA_CM_EVENT_CONNECT_REQUEST
> event, and call rdma_accept(), and then rdma_accept() returns an error
> for whatever reason, is it safe to assume I won't get any other events
> for the cm_id I failed to accept?
> 
> ie there's no race where another thread might get a disconnect event
> on that cm_id too, right?  So it's safe to just start cleaning up after
> I get a failed status from rdma_accept().


I typed in the response below before it occurred to me to ask whether you're 
asking about calling rdma_accept from user space or the kernel.  I was assuming 
user space. 

It looks like there is a race.

In ucma.c : ucma_event_handler(), we have:

        if (!ctx->uid) {
                /* ... ignore events for new connections ... */
                ...
                goto out;
        }

But ucma_accept() does:

        ctx->uid = cmd.uid;
        ...
        ret = rdma_accept(ctx->cm_id ...)

I believe the remote side can issue a reject while ucma_accept() is running.  
The sequence would be: ucma_accept() sets ctx->uid; the rdma_cm processes the 
reject, changes the state of the cm_id, and calls ucma_event_handler(), which 
queues the event because ctx->uid is now set; rdma_accept() then fails.  The 
caller sees the failure, but an event for that cm_id is already queued.

I think restructuring ucma_accept() like this would fix this race:

        mutex_lock(&file->mut);
        ret = rdma_accept(...)
        if (!ret)
                ctx->uid = cmd.uid;
        mutex_unlock(&file->mut);
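
To make the two orderings easier to compare, here is a small single-threaded 
user-space model.  It is only a sketch, not the ucma code: struct model_ctx and 
every model_* / accept_* name below are hypothetical stand-ins, the reject is 
injected by hand at the point where the race would let it in, and the mutex is 
left out because the model assumes the event handler and the accept path are 
already serialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-connection context; not the real struct. */
struct model_ctx {
        unsigned long uid;      /* 0 until user space has claimed the cm_id */
        bool rejected;          /* set once the remote reject was processed */
        int queued_events;      /* events that user space would later read */
};

/* Models the quoted check: events are dropped while uid is still 0. */
static void model_event_handler(struct model_ctx *ctx)
{
        assert(ctx != NULL);
        if (!ctx->uid)
                return;         /* new connection, event ignored */
        ctx->queued_events++;
}

/* Models the remote reject: state changes, then the handler is called. */
static void model_remote_reject(struct model_ctx *ctx)
{
        ctx->rejected = true;
        model_event_handler(ctx);
}

/* Models rdma_accept(): fails once the reject has changed the state. */
static int model_rdma_accept(const struct model_ctx *ctx)
{
        return ctx->rejected ? -1 : 0;
}

/* Ordering described as racy: uid is published before the accept call. */
static int accept_uid_first(struct model_ctx *ctx, unsigned long uid,
                            bool reject_in_window)
{
        ctx->uid = uid;
        if (reject_in_window)
                model_remote_reject(ctx);   /* handler sees uid != 0 */
        return model_rdma_accept(ctx);
}

/* Proposed ordering: uid is published only if the accept call succeeded. */
static int accept_uid_after(struct model_ctx *ctx, unsigned long uid,
                            bool reject_first)
{
        int ret;

        if (reject_first)
                model_remote_reject(ctx);   /* handler sees uid == 0 */
        ret = model_rdma_accept(ctx);
        if (!ret)
                ctx->uid = uid;
        return ret;
}
```

With the reject injected, accept_uid_first() returns an error and still leaves 
one queued event, while accept_uid_after() returns an error with nothing queued 
and uid still 0.  The model says nothing about events that arrive after a 
successful accept.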

I'll give this problem more thought.

- Sean
