I am somewhat confused by your comment below that "CM needs to know the
connection model selected by the app". Reading your other comments, I see
two options here, depending on whether the implementation differentiates
between peer-to-peer SIDs and client/server SIDs:
If there's no difference, then in the peer-to-peer model too the
application must first tell the CM to listen on a SID, and it's up to the
CM to break the symmetry and decide who sends the REP and who ignores
the REQ.
If there is a difference, then peer-to-peer SIDs are in a different domain
than client/server SIDs.
I didn't follow this.
To add to my comments on the CM API, struct ib_cm_req_param, which is
used to send the REQ, includes service_id and peer_to_peer fields. The
latter is a boolean used by the CM to distinguish if incoming REQs can
be matched with the outgoing REQ.
Peer to peer SIDs are in a different domain than client/server SIDs, and
the peer_to_peer field is used to indicate which domain a SID is in.
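To make the "different domain" point concrete, here is a minimal sketch in plain C. It is a toy model, not the kernel code: struct toy_req and both helper functions are hypothetical names, and the struct only mirrors the two ib_cm_req_param fields mentioned above. The assumption is that a peer to peer REQ can only pair with another peer to peer REQ on the same SID, and never reaches a client/server listener on that SID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields discussed above.
 * This is NOT the kernel's struct ib_cm_req_param. */
struct toy_req {
    uint64_t service_id;
    bool     peer_to_peer;
};

/* Can an incoming REQ be paired with our own outstanding REQ?
 * Assumption: both must be flagged peer to peer and carry the same SID. */
static inline bool toy_peer_match(const struct toy_req *out,
                                  const struct toy_req *in)
{
    assert(out != NULL && in != NULL);
    return out->peer_to_peer && in->peer_to_peer &&
           out->service_id == in->service_id;
}

/* Can an incoming REQ be delivered to a client/server listener?
 * Assumption: peer to peer REQs live in a separate SID domain, so they
 * never match a listener even when the SID value is the same. */
static inline bool toy_listen_match(uint64_t listen_sid,
                                    const struct toy_req *in)
{
    assert(in != NULL);
    return !in->peer_to_peer && in->service_id == listen_sid;
}
```

With this model, the same numeric SID can be used once as a listener and once for a peer to peer connect without the two interfering, which is what "different domain" means here.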
Why should there be a difference between the rdma-cm and the cm? If the
cm supports this model without an API change, wouldn't that also apply to
the rdma-cm?
The rdma_cm does not know how to set the peer_to_peer field in the
ib_cm_req_param. It sets this field to 0 today.
I think that in the MPI world each rank gets a SID from the local CM, the
ranks exchange their SIDs out-of-band, and then connections are opened. In
a connection-on-demand scheme, whenever a rank process calls mpi_send() to
a peer for which the local MPI library does not have a connection, it
tries to connect. So if this happens "at once" between some pair of ranks,
there should be a way to form one connection out of the two connection
requests. My thinking/motivation is that support for this scheme should be
at the IB stack (cm and rdma-cm) level and not at the level of a specific
MPI implementation.
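Forming one connection out of two crossing requests needs a rule that both sides evaluate identically, so that exactly one side keeps its REQ outstanding and the other answers with a REP. A minimal sketch in plain C follows. The names (struct toy_endpoint, toy_stays_active) and the choice of "larger identifier stays active" are assumptions for illustration only. Any strict total order over identifiers that both ends can see would work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical endpoint identity. Both sides must see the same values. */
struct toy_endpoint {
    uint64_t node_id;   /* assumed unique per adapter */
    uint32_t conn_id;   /* assumed unique per connection on that adapter */
};

/* Returns true if `local` should stay the active side, i.e. keep its own
 * REQ outstanding and ignore the peer's crossing REQ. The side for which
 * this returns false becomes passive and sends the REP.
 * Tie-break order (an assumption): node_id first, then conn_id. */
static inline bool toy_stays_active(struct toy_endpoint local,
                                    struct toy_endpoint remote)
{
    /* Identical identities would leave the symmetry unbroken. */
    assert(local.node_id != remote.node_id ||
           local.conn_id != remote.conn_id);

    if (local.node_id != remote.node_id)
        return local.node_id > remote.node_id;
    return local.conn_id > remote.conn_id;
}
```

Because the comparison is strict and antisymmetric, toy_stays_active(a, b) and toy_stays_active(b, a) can never both be true or both be false, so the two crossing requests always collapse into a single connection.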
Are the out of band connections used by MPI formed using client/server
or peer to peer? I believe that Intel MPI has each rank listen for
connections from the ranks below it using client/server.
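For comparison, the rank-ordered client/server scheme described above avoids the crossing-request problem entirely, because each pair of ranks agrees on its roles ahead of time. A minimal sketch in plain C, with a hypothetical helper name, and assuming the rule really is "the higher rank listens, the lower rank connects":

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if my_rank should initiate the connection to peer_rank.
 * Assumption: each rank listens for connections from the ranks below it,
 * so for any pair the lower rank is the client and the higher rank is
 * the server. */
static inline bool toy_rank_should_connect(int my_rank, int peer_rank)
{
    assert(my_rank != peer_rank);   /* no connection to self */
    return my_rank < peer_rank;
}
```

Rank 2 talking to rank 5 would call connect, while rank 5 would only ever accept from rank 2, so the two never send REQs to each other at the same time.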
There are a couple of problems with the peer to peer model. First,
unless the connections occur at exactly the same time, they miss
connecting (rejected with invalid SID). Second, if multiple peer to
peer connections need to form between the same pair of nodes, things can
go screwy (that's the technical term) trying to match up the peer requests.
- Sean