1) When you listen for connections, the event includes a new cm_id
struct attached to the listen event channel. Attempts to change this
channel make the cm_id unusable (rdma_create_qp fails). This is
suboptimal in situations where you want the listen channel to produce
listen events only. A function such as rdma_modify_channel(cm_id,
new_channel) would solve this.
2) When you create a new cm_id with the intent of connecting to another
machine, it is again desirable to get the events related to establishing
the connection on a separate channel from the events related to already
established connections. Among other things, if you are sharing a channel
with a different thread that is responsible for tearing down connections
on error, then which thread gets the ADDR_RESOLVED or ROUTE_RESOLVED
events is up in the air. To make sure the event gets delivered properly,
I currently have the connecting thread pthread_mutex_lock the connection
construct, set connection->cm_waiting = 1, issue the rdma_resolve_route,
and then pthread_mutex_lock again so that it deadlocks. The other thread
then gets the event and checks connection->cm_waiting == 1; if true, it
places the event pointer in connection->event, clears
connection->cm_waiting, and pthread_mutex_unlock's the connection. How
gross is that? So, using a separate event channel up until the connection
is established, then calling rdma_modify_channel(), would also solve this
problem.
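
For reference, the hand-off described in item 2 can be sketched with a
condition variable in place of the second pthread_mutex_lock. This is not
the code from the application in question, only a minimal illustration
under stated assumptions: the struct layout, the cond field, and the
connection_* function names are hypothetical, and the event is carried as
a void * so the sketch builds without librdmacm. The cm_waiting and event
fields mirror the ones named above. One reason to prefer this shape is
that POSIX leaves unlocking a default mutex from a thread that does not
own it undefined.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the "connection construct" in the text. */
struct connection {
    pthread_mutex_t lock;
    pthread_cond_t  cond;       /* assumption: replaces the second lock call */
    int             cm_waiting; /* 1 while the connecting thread wants an event */
    void           *event;      /* would be a struct rdma_cm_event * in real code */
};

void connection_init(struct connection *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->cm_waiting = 0;
    c->event = NULL;
}

/* Connecting thread: call this before issuing the resolve request, so the
 * event thread already sees cm_waiting == 1 when the event arrives. */
void connection_expect_event(struct connection *c)
{
    pthread_mutex_lock(&c->lock);
    assert(!c->cm_waiting);     /* only one outstanding request at a time */
    c->cm_waiting = 1;
    c->event = NULL;
    pthread_mutex_unlock(&c->lock);
}

/* Connecting thread: block until the event thread has handed an event over. */
void *connection_wait_event(struct connection *c)
{
    void *ev;

    pthread_mutex_lock(&c->lock);
    while (c->cm_waiting)       /* loop guards against spurious wakeups */
        pthread_cond_wait(&c->cond, &c->lock);
    ev = c->event;
    c->event = NULL;
    pthread_mutex_unlock(&c->lock);
    return ev;
}

/* Event thread: hand the event to a waiting connecting thread.
 * Returns 1 if the event was handed over, 0 if nobody was waiting
 * and the caller should handle the event itself. */
int connection_deliver_event(struct connection *c, void *ev)
{
    int delivered = 0;

    pthread_mutex_lock(&c->lock);
    if (c->cm_waiting) {
        c->event = ev;
        c->cm_waiting = 0;
        delivered = 1;
        pthread_cond_signal(&c->cond);
    }
    pthread_mutex_unlock(&c->lock);
    return delivered;
}
```

In this sketch the connecting thread would call connection_expect_event(),
issue rdma_resolve_route, and then call connection_wait_event(); the thread
reading the shared channel would call connection_deliver_event() for each
event tied to that connection. It still shares one channel between the two
threads, so it only tidies the workaround and does not replace the
requested rdma_modify_channel().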
Thanks for the feedback. I'll give this some thought and see how
difficult it is to add an rdma_modify_channel() routine.
3) The man pages on rdma_connect() and rdma_accept() aren't really
clear on the role of the connection parameters struct that gets passed
in. Specifically, they don't say whether the initiator_depth and
responder_resources in the parm struct present in the listen event are
what the other side set, or if they are already swapped to indicate the
minimum/maximum that we can set on our side of the connection. Also,
the initial message pointer is not detailed. When we call
rdma_accept/rdma_reject, does our parm struct need to have that same
pointer? Do we need to free that memory? Can we supply a new initial
message and not leak the memory associated with the incoming initial
message?
I'll update the man pages to answer your questions.
- Sean