Steve Wise wrote:
Sean Hefty wrote:

What do you think about adding an API to the rdma-cma allowing
applications to get a list of ip addresses associated with a particular
rdma device/port?

It seems kind of backwards from the design of the rdma_cm, but if it's useful to end users, I don't have any objections to having it.

Consider OMPI, which looks at each device on the system that can be used to connect to the other nodes. Each device is analyzed to see what method of communication should be used (tcp, ib, iwarp, whatever). Then these interfaces and their attributes are conveyed to all the nodes and the desired communication mesh is determined.

Steve, Sean, this approach assumes the MPI job scheduler associates a rank with HW using a <node, device, port, pkey, sl, etc> scheme, whereas I would like a <node, ip address/es> scheme to be supported as well, since I believe that what you suggest is covered by such a design.

How about a different approach that fits the nature of the rdma-cm better and still seems to meet the requirement: have an API that lets apps get a list of {interface name, ip addresses, device-attributes} covering all the "RDMA" interfaces, that is, those whose hardware (ARP) type is ARPHRD_INFINIBAND, plus whatever key identifies iwarp interfaces.

This can easily be implemented in user space using netlink calls, as the ip(8) command does. For example, for the following device

$ ip addr show ib0
25: ib0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 128
    link/[32] 80:00:04:04:fe:80:00:00:00:00:00:00:00:08:f1:04:03:97:08:dd
    brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 192.168.3.61/24 brd 192.168.3.255 scope global ib0
    inet 192.168.3.71/32 scope global ib0
    inet6 fe80::208:f104:397:8dd/64 scope link valid_lft forever preferred_lft forever

the user space script/code would note that it's an IPoIB device, with the IPv4 addresses being 192.168.3.61/24 and 192.168.3.71/32.
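
To make the idea concrete, here is a rough sketch of such a user space script. It does not talk netlink itself; it only parses the text that ip(8) prints, which is an assumption made to keep the example short. The function name parse_ip_addr and the returned dictionary layout are made up for illustration. It treats the interface as IPoIB when the link type is printed as "[32]" (the numeric value of ARPHRD_INFINIBAND in <linux/if_arp.h>, as in the output above) or as "infiniband" (as newer iproute2 versions print it).

```python
import re

ARPHRD_INFINIBAND = 32  # numeric hardware type from <linux/if_arp.h>


def parse_ip_addr(text):
    """Parse `ip addr show <dev>` output for a single interface.

    Hypothetical helper: returns the interface name, whether the link
    type marks it as IPoIB, and the IPv4/IPv6 addresses with prefix.
    """
    info = {"name": None, "ipoib": False, "ipv4": [], "ipv6": []}
    for line in text.splitlines():
        line = line.strip()
        header = re.match(r"\d+:\s+([^:@\s]+)", line)
        if header:
            # e.g. "25: ib0: <BROADCAST,MULTICAST,UP> ..."
            info["name"] = header.group(1)
        elif line.startswith("link/"):
            # e.g. "link/[32] 80:00:..." or "link/infiniband 80:00:..."
            kind = line.split()[0][len("link/"):]
            info["ipoib"] = kind in ("infiniband", "[%d]" % ARPHRD_INFINIBAND)
        elif line.startswith("inet6 "):
            info["ipv6"].append(line.split()[1])
        elif line.startswith("inet "):
            info["ipv4"].append(line.split()[1])
    return info
```

A caller would feed it the captured output of `ip addr show ib0` and look at info["ipoib"] and info["ipv4"]. A real implementation would more likely query netlink directly instead of scraping text.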

Now, if you want to go deeper and expose the <device, port, pkey, sl, etc> to the job scheduler or to the rank, you can implement this netlink-based code in librdmacm (no need for kernel changes), and it would reuse the code present now in rdma_resolve_addr for the local resolution, that is, resolving the <device,port,pkey> from a local ip address.
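
As a side illustration (this is not the rdma_resolve_addr code path mentioned above, just a sketch), part of that information is already visible in the link-layer addresses ip(8) prints. Assuming the IPoIB address layout from RFC 4391, the 20-octet hardware address is one flags octet, a 24-bit QPN and the 128-bit port GID, and the broadcast address carries the P_Key in octets 4-5 of its multicast GID. The function names below are made up for the example; matching the GID to a concrete device and port is left out.

```python
import ipaddress


def decode_ipoib_hwaddr(hwaddr):
    """Split a 20-octet IPoIB link-layer address into (QPN, GID).

    Hypothetical helper; layout assumed from RFC 4391:
    octet 0 = flags, octets 1-3 = QPN, octets 4-19 = GID.
    """
    octets = bytes(int(part, 16) for part in hwaddr.split(":"))
    if len(octets) != 20:
        raise ValueError("expected a 20-octet IPoIB address")
    qpn = int.from_bytes(octets[1:4], "big")
    gid = ipaddress.IPv6Address(octets[4:])
    return qpn, gid


def pkey_from_broadcast(brd):
    """Extract the P_Key embedded in the IPoIB broadcast address."""
    _, mgid = decode_ipoib_hwaddr(brd)
    # The multicast GID is ff1x:401b:<pkey>:..., so the P_Key sits in
    # octets 4-5 of the GID.
    return int.from_bytes(mgid.packed[4:6], "big")
```

For the ib0 output above, the link address decodes to a port GID whose low 64 bits are 0008:f104:0397:08dd, and the broadcast address yields the P_Key that the interface is using.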

Or.

