> The original post was to determine if it were possible to have a server
> app that managed the data required to establish multicast IB
> communications between 2 or more nodes.  Each node would initialize
> itself as needed wrt IB and each node would request from the server,
> as I now understand, the qpn, qkey and address handle for the
> multicast group it desired to communicate with.  The server, having
> created dynamically through the SA, the multicast group, would return
> said data and then the node would be able to begin posting multicast
> sends, or receives.  Alternatively, if I understand correctly, I can
> create the multicast group on start up of the opensmd.

Every node that wants to participate in a multicast group must join the group.  
This is usually done by having the node that wishes to join send a multicast 
join request directly to the SA.  (Note that the join request can also create 
the group.)  There are 2 interfaces available to applications that result in 
sending join requests: umad and the rdma_cm.

Node X cannot join a multicast group and pass its multicast address to Node Y.
Each join request must be for a specific node.  I believe it's architecturally
possible for node X to send the join request on behalf of node Y, but I don't
know if anyone has ever tried that or if the existing implementations support
it.

Also, be aware that the SA manages multicast joins per node and not per 
request.  That is, if node X joins a group twice and then sends 1 leave request,
the node will be removed from the group.  The SA does not perform reference
counting.  (It cannot distinguish between 2 separate requests, versus a single 
request that may have been retried.)
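
Since the SA won't count joins for you, an application with several local
users of one group could count them itself and only talk to the SA on the
first join and the last leave.  A minimal sketch of that bookkeeping, assuming
one counter per group; the struct and function names are hypothetical and not
part of umad or the rdma_cm:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-group counter kept by the application, not by the SA. */
struct mcast_ref {
    unsigned int users;   /* local users that currently need the group */
};

/* Register one more local user.
 * Returns 1 if this is the first user, i.e. the caller should now send
 * the real join request; returns 0 if the node has already joined. */
int mcast_ref_get(struct mcast_ref *r)
{
    assert(r != NULL);
    return r->users++ == 0;
}

/* Drop one local user.
 * Returns 1 if this was the last user, i.e. the caller should now send
 * the real leave request; returns 0 if other users still need the group. */
int mcast_ref_put(struct mcast_ref *r)
{
    assert(r != NULL && r->users > 0);
    return --r->users == 0;
}
```

The actual join and leave calls are left to the caller, so the same counter
works with either interface.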

> In the rdma and multicast examples I have seen, each node sets up an
> rdma cm event channel.  The node then polls for events.

The rdma_cm also supports synchronous operation.  If an rdma_cm_id is created 
without an event channel, all calls will block until they complete.  Any 
results (e.g. communication parameters) are returned in the rdma_cm_id.  See 
rdma_client and rdma_server for examples of synchronous operation.  (Those 
establish a connection, but the same principles apply.)

> I had hoped to be able to avoid using the rdma_cm and avoid having to
> monitor an rdma_cm event channel. What I think I would like to do is
> have each node of my sim initialize its side of the communication,
> which I think should include
> 
> rdma_bind_addr
> rdma_resolve_addr
> ibv_create_ah
> rdma_join_multicast

You should be able to eliminate rdma_bind_addr and pass the source address
to rdma_resolve_addr.  If your IP routing tables will resolve a multicast
address to an IPoIB device, you can eliminate the source address completely.  
(rdma_resolve_addr calls rdma_bind_addr internally.)

The ibv_create_ah call must come after rdma_join_multicast, once you have the
join response, because the response supplies the address handle attributes.

> then ibv_post_send/ibv_post_recv as required.
> 
> However, the rdma_* calls require an rdma_cm_id which I won't have if
> I don't use the rdma cm.

Correct.
 
> Can I bypass using the rdma cm and the polling of the event channel?
> Or perhaps am I going to have to establish an event channel between my
> management server and each individual node?  On the other hand, if I
> can terminate the polling of the event channel once initialization is
> done, maybe I don't mind the rdma cm....

See above to use synchronous operation and avoid polling the event channel.
 
> Can I bypass the polling of the completion queue?... which would imply
> I am simply trusting the data arrived at its destination?

You must poll the CQ to avoid overrunning the send queue.  Multicast is 
unreliable, so a successful completion simply means that the data was 
transmitted without error.  It does not guarantee that the receiver has it.  
Using QP based communication isn't trivial...
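
To make the send queue point concrete, one common approach is to track how
many sends are outstanding and refuse to post once the queue depth is reached,
giving slots back as completions are polled from the CQ.  A rough sketch of
just the accounting, assuming every send is posted signaled so that each one
produces a completion; the names are hypothetical, and the real
ibv_post_send/ibv_poll_cq calls are left out:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical send queue accounting kept alongside the QP. */
struct sq_credits {
    unsigned int depth;        /* send queue depth requested at QP creation */
    unsigned int outstanding;  /* sends posted but not yet completed */
};

void sq_init(struct sq_credits *c, unsigned int depth)
{
    assert(c != NULL && depth > 0);
    c->depth = depth;
    c->outstanding = 0;
}

/* Call before posting a send.
 * Returns 1 and reserves a slot if there is room; returns 0 if the queue
 * is full, in which case the caller must poll the CQ before posting. */
int sq_try_post(struct sq_credits *c)
{
    assert(c != NULL);
    if (c->outstanding == c->depth)
        return 0;
    c->outstanding++;
    return 1;
}

/* Call with the number of send completions just polled from the CQ. */
void sq_completed(struct sq_credits *c, unsigned int n)
{
    assert(c != NULL && n <= c->outstanding);
    c->outstanding -= n;
}
```

Note that the completions counted here only free up send queue slots; they
say nothing about whether any receiver got the data.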
 
> Sorry to ask so many questions.  Are there any good books on
> programming InfiniBand?

I'm not aware of any books, let alone good ones...