On Tue, 24 Nov 2009, Sean Hefty wrote:

>> Thank you.  This worked for me.  However, there seems to be some kind of
>> race when the connection is first set up.  On the client, if I call
>> ib_cm_send_lap() immediately after ib_cm_send_rtu() returns successfully, I
>> get an EINVAL error.  If I delay for one second it works just fine.

> ib_cm_send_lap() returning EINVAL should indicate an immediate error, so this
> should be an issue with the local side.  It sounds like a possible bug in the
> code, but I didn't see anything obvious from a quick look.

That's what I suspected. I wonder if the connection state isn't set properly until later? I'm really not sure. Without a kernel debugger it'll be hard to determine. I guess I can throw some printfs in to track this down unless there are better suggestions.


>> According to the spec the passive/server side cannot send the LAP, so I
>> can't send it in the RTU handler.  Presumably the call fails immediately
>> after send_rtu because the server hasn't received that message yet?  Is
>> this right?  Is there a way to do this cleanly without a delay?

> I don't know that the code enforces that the passive side not send a LAP (and
> I can't think of a reason why the protocol should have such a restriction).  It
> may work.  But, rather than sending a separate LAP immediately after connecting,
> why not include the alternate path in the original REQ?

This creates a race for me. We have a discovery process that finds nodes and paths to nodes. If it discovers a new path while the connection is still being created, it won't see an existing connection and we won't add the alternate path. To close this race I have to check for an alternate path when the connection is complete anyway.


>> I notice that if I create the initial attributes for the connection
>> request with an alternate path specified the alt_path_state is still
>> MIGRATED when I send rtu.  If I load a path after the connection is
>> established I can fail back and forth without issue.

> Can you clarify this a little more?  What specific field are you looking at and
> what state are you seeing it set to?


This turned out to be a bug in my code. I'm very confident in the post-connection alternate path code, however.

Thanks,
Jeff