Thanks.
CD
Quoting "Hefty, Sean" <[email protected]>:
The original post was to determine whether it is possible to have a server
app that manages the data required to establish multicast IB
communications between 2 or more nodes. Each node would initialize
itself as needed wrt IB, and each node would request from the server,
as I now understand, the qpn, qkey and address handle for the
multicast group it desires to communicate with. The server, having
created the multicast group dynamically through the SA, would return
said data, and then the node would be able to begin posting multicast
sends or receives. Alternatively, if I understand correctly, I can
create the multicast group on startup of opensmd.
Every node that wants to participate in a multicast group must join
the group. This is usually done by having the node that wishes to
join send a multicast join request directly to the SA. (Note that
the join request can also create the group.) There are 2 interfaces
available to applications that result in sending join requests: umad
and the rdma_cm.
Node X cannot join a multicast group and pass its multicast address
to Node Y. Each join request must be for a specific node. I
believe it's architecturally possible for node X to send the join
request on behalf of node Y, but I don't know if anyone has ever tried
that or if the existing implementations support it.
Also, be aware that the SA manages multicast joins per node and not
per request. I.e., if node X joins a group twice, followed by 1
leave request, the node will be removed from the group. The SA does
not perform reference counting. (It cannot distinguish between 2
separate requests, versus a single request that may have been
retried.)
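
Since the SA does not count joins, an application with several local
users of one group would need to count them itself. Below is a minimal
sketch of that idea in C, not code from any library: struct mcast_ref,
mcast_get, mcast_put, send_join and send_leave are all hypothetical
names, and the two send_* stubs only count calls where a real program
would issue the actual join and leave requests.

```c
#include <assert.h>

/* Hypothetical per-group bookkeeping; not part of libibverbs or librdmacm. */
struct mcast_ref {
    int refs;       /* local users currently holding the group */
    int sa_joins;   /* join requests sent (counted for illustration) */
    int sa_leaves;  /* leave requests sent (counted for illustration) */
};

/* Stubs standing in for the real join/leave requests (assumption). */
static int send_join(struct mcast_ref *g)  { g->sa_joins++;  return 0; }
static int send_leave(struct mcast_ref *g) { g->sa_leaves++; return 0; }

/* Take a local reference; only the first user triggers a join request. */
int mcast_get(struct mcast_ref *g)
{
    assert(g->refs >= 0);
    if (g->refs == 0) {
        int rc = send_join(g);
        if (rc)
            return rc;
    }
    g->refs++;
    return 0;
}

/* Drop a local reference; only the last user triggers a leave request.
 * Returns -1 if there is no reference to drop. */
int mcast_put(struct mcast_ref *g)
{
    if (g->refs <= 0)
        return -1;
    if (--g->refs == 0)
        return send_leave(g);
    return 0;
}
```

With this wrapper, two mcast_get calls followed by one mcast_put send a
single join and no leave, so the node stays in the group until the
second mcast_put.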
In the rdma and multicast examples I have seen, each node sets up an
rdma cm event channel. The node then polls for events.
The rdma_cm also supports synchronous operation. If an rdma_cm_id
is created without an event channel, all calls will block until they
complete. Any results (e.g. communication parameters) are returned
in the rdma_cm_id. See rdma_client and rdma_server for examples of
synchronous operation. (Those establish a connection, but the same
principles apply.)
I had hoped to be able to avoid using the rdma_cm and avoid having to
monitor an rdma_cm event channel. What I think I would like to do is
have each node of my sim initialize its side of the communication,
which I think should include
rdma_bind_addr
rdma_resolve_addr
ibv_create_ah
rdma_join_multicast
You should be able to eliminate rdma_bind_addr and pass the
source address to rdma_resolve_addr. If your IP routing tables
will resolve a multicast address to an IPoIB device, you can
eliminate the source address completely. (rdma_resolve_addr calls
rdma_bind_addr internally.)
The ibv_create_ah must come after rdma_join_multicast, after you
have the join response.
then ibv_post_send/ibv_post_recv as required.
However, the rdma_* calls require an rdma_cm_id which I won't have if
I don't use the rdma cm.
correct
Can I bypass using the rdma cm and the polling of the event channel?
Or perhaps am I going to have to establish an event channel between my
management server and each individual node? On the other hand, if I
can terminate the polling of the event channel once initialization is
done, maybe I don't mind the rdma cm....
See above to use synchronous operation and avoid polling the event channel.
Can I bypass the polling of the completion queue?... which would imply
I am simply trusting the data arrived at its destination?
You must poll the CQ to avoid overrunning the send queue. Multicast
is unreliable, so a successful completion simply means that the data
was transmitted without error. It does not guarantee that the
receiver has it. Using QP based communication isn't trivial...
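
One way to picture the send queue constraint is as a credit count: each
posted send uses a slot, and the slot only comes back once its
completion has been reaped from the CQ. The following is a rough sketch
in C under stated assumptions, not a verbs implementation: struct
send_credits, credits_acquire, reap_fn and fake_reap are hypothetical
names, every send is assumed to be signaled, and in a real program the
reap callback would wrap ibv_poll_cq.

```c
#include <assert.h>

/* Hypothetical send-credit tracker; not part of libibverbs. */
struct send_credits {
    int depth;        /* send queue depth requested when the QP was created */
    int outstanding;  /* posted sends whose completions are not yet reaped */
};

/* Callback that reaps up to max completions and returns how many it found.
 * In a real program this would wrap ibv_poll_cq (assumption). */
typedef int (*reap_fn)(void *ctx, int max);

/* Reserve one send slot, reaping completions first if the queue is full.
 * Returns 0 when the caller may post a send, -1 if no slot could be freed. */
int credits_acquire(struct send_credits *c, reap_fn reap, void *ctx)
{
    if (c->outstanding == c->depth) {
        int n = reap(ctx, c->outstanding);
        if (n <= 0)
            return -1;
        assert(n <= c->outstanding);
        c->outstanding -= n;
    }
    c->outstanding++;
    return 0;
}

/* Stand-in reaper for illustration only: ctx points at an int holding the
 * number of completions that are ready to be reaped. */
int fake_reap(void *ctx, int max)
{
    int *ready = ctx;
    int n = *ready < max ? *ready : max;
    *ready -= n;
    return n;
}
```

The caller would invoke credits_acquire before each ibv_post_send and
retry later when it returns -1. Note that this only protects the local
send queue; it says nothing about whether any receiver got the data.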
Sorry to ask so many questions. Are there any good books on
programming InfiniBand?
I'm not aware of any books, let alone good ones...
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html