Thanks.
CD

Quoting "Hefty, Sean" <[email protected]>:

The original post was to determine whether it is possible to have a server
app that manages the data required to establish multicast IB
communications between 2 or more nodes.  Each node would initialize
itself as needed wrt IB, and each node would request from the server,
as I now understand, the qpn, qkey and address handle for the
multicast group it desired to communicate with.  The server, having
created the multicast group dynamically through the SA, would return
said data, and then the node would be able to begin posting multicast
sends or receives.  Alternatively, if I understand correctly, I can
create the multicast group on startup of opensmd.

Every node that wants to participate in a multicast group must join the group. This is usually done by having the node that wishes to join send a multicast join request directly to the SA. (Note that the join request can also create the group.) There are 2 interfaces available to applications that result in sending join requests: umad and the rdma_cm.

Node X cannot join a multicast group and pass its multicast address to Node Y. Each join request must be for a specific node. I believe it's architecturally possible for node X to send the join request on behalf of node Y, but I don't know if anyone has ever tried that or whether the existing implementations support it.

Also, be aware that the SA manages multicast joins per node and not per request. I.e., if node X joins a group twice and then sends 1 leave request, the node will be removed from the group. The SA does not perform reference counting. (It cannot distinguish between 2 separate requests and a single request that may have been retried.)
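The per-node behavior described above can be sketched with a small model. This is not SA or rdma_cm code: every name here (sa_group, sa_join, sa_leave, local_membership, MAX_NODES) is hypothetical and exists only to illustrate the semantics. The second half shows one possible way for an application to count its own users so that only the last leave reaches the SA.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8  /* hypothetical limit, for this model only */

/* Model of the SA's view: one membership flag per node, no counter. */
struct sa_group {
    bool member[MAX_NODES];
};

/* A join (or a retried join) simply sets the flag. */
static void sa_join(struct sa_group *g, int node)
{
    assert(node >= 0 && node < MAX_NODES);
    g->member[node] = true;
}

/* A single leave clears the flag, however many joins came before it. */
static void sa_leave(struct sa_group *g, int node)
{
    assert(node >= 0 && node < MAX_NODES);
    g->member[node] = false;
}

/* Application-side wrapper: count local users and contact the SA only
 * on the first join and on the last leave. */
struct local_membership {
    struct sa_group *sa;
    int node;
    int users;
};

static void local_join(struct local_membership *m)
{
    if (m->users++ == 0)
        sa_join(m->sa, m->node);
}

static void local_leave(struct local_membership *m)
{
    assert(m->users > 0);
    if (--m->users == 0)
        sa_leave(m->sa, m->node);
}
```

With the raw model, two joins followed by one leave leave the node out of the group. With the wrapper, the node stays in the group until the second local_leave.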

In the rdma and multicast examples I have seen, each node sets up an
rdma cm event channel.  The node then polls for events.

The rdma_cm also supports synchronous operation. If an rdma_cm_id is created without an event channel, all calls will block until they complete. Any results (e.g. communication parameters) are returned in the rdma_cm_id. See rdma_client and rdma_server for examples of synchronous operation. (Those establish a connection, but the same principles apply.)

I had hoped to be able to avoid using the rdma_cm and avoid having to
monitor an rdma_cm event channel. What I think I would like to do is
have each node of my sim initialize its side of the communication,
which I think should include:

rdma_bind_addr
rdma_resolve_addr
ibv_create_ah
rdma_join_multicast

You should be able to eliminate rdma_bind_addr and pass the source address to rdma_resolve_addr instead. If your IP routing tables will resolve a multicast address to an IPoIB device, you can eliminate the source address completely. (rdma_resolve_addr calls rdma_bind_addr internally.)

The ibv_create_ah call must come after rdma_join_multicast, once you have the join response.
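As a small illustration of the destination address mentioned above, the sketch below builds an IPv4 multicast sockaddr using only standard socket headers. The helper name make_mcast_dst is hypothetical, and IPv4 is an assumption. The result, cast to struct sockaddr *, is the kind of value that would be handed to rdma_resolve_addr as the destination and later to rdma_join_multicast; no rdma_cm calls are made here.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *out with an IPv4 multicast destination.
 * Returns 0 on success, -1 if the text is not an IPv4 multicast address. */
static int make_mcast_dst(const char *text, struct sockaddr_in *out)
{
    assert(text != NULL && out != NULL);

    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;

    /* inet_pton returns 1 only when the text parses as an IPv4 address. */
    if (inet_pton(AF_INET, text, &out->sin_addr) != 1)
        return -1;

    /* IN_MULTICAST expects the address in host byte order. */
    if (!IN_MULTICAST(ntohl(out->sin_addr.s_addr)))
        return -1;

    return 0;
}
```

Rejecting a non-multicast address up front keeps a typo in the group address from turning into a confusing failure later in the join.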

then ibv_post_send/ibv_post_recv as required.

However, the rdma_* calls require an rdma_cm_id which I won't have if
I don't use the rdma cm.

Correct.

Can I bypass using the rdma cm and the polling of the event channel?
Or perhaps am I going to have to establish an event channel between my
management server and each individual node?  On the other hand, if I
can terminate the polling of the event channel once initialization is
done, maybe I don't mind the rdma cm....

See above to use synchronous operation and avoid polling the event channel.

Can I bypass the polling of the completion queue?... which would imply
I am simply trusting the data arrived at its destination?

You must poll the CQ to avoid overrunning the send queue. Multicast is unreliable, so a successful completion simply means that the data was transmitted without error. It does not guarantee that the receiver has it. Using QP based communication isn't trivial...
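The point about overrunning the send queue can be sketched with a simple counter model. This is not the verbs API: send_queue_model, model_post_send and model_reap_completions are hypothetical names, and the model assumes every send generates a completion. It only shows why slots come back when completions are reaped from the CQ and not before.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of a send queue with a fixed number of work-request slots. */
struct send_queue_model {
    int depth;        /* slots available when the queue was created */
    int outstanding;  /* posted sends whose completions are not yet reaped */
};

/* Returns false in the case where a real post would fail because
 * the queue is full. */
static bool model_post_send(struct send_queue_model *sq)
{
    assert(sq->depth > 0);
    if (sq->outstanding == sq->depth)
        return false;
    sq->outstanding++;
    return true;
}

/* Reaping n completions from the CQ frees n slots. */
static void model_reap_completions(struct send_queue_model *sq, int n)
{
    assert(n >= 0 && n <= sq->outstanding);
    sq->outstanding -= n;
}
```

With a depth of 2, a third post fails until at least one completion has been reaped. Note that the freed slot says nothing about delivery: as stated above, a successful multicast send completion only means the data was transmitted.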

Sorry to ask so many questions.  Are there any good books on
programming InfiniBand?

I'm not aware of any books, let alone good ones...
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html



