Posting to openib-general list...

>RDMA CM has multicast of course, though it seems no means of preventing
>address collisions (to me, that means two separate MPI jobs using the
>same multicast address). I know that part of the new multicast support
>you had developed a few months ago was the ability to specify a '0'
>MGID/MLID to indicate that an unused multicast address should be used
>and returned.
>
>How hard would it be to add this functionality to RDMA CM?
I looked into this, and it seems doable. I hacked the kernel rdma_cm to
join a multicast group with an MGID of 0, and it seemed to work as far as
I could test it without more extensive changes. (My test didn't actually
transfer data, but the join succeeded, the MGID/MLID was exported to
userspace, and different applications joined different groups.)

What would be needed is a way for the user to indicate that they need a
unique address. An obvious way to accomplish this is for the user to
specify an IP address of 0.0.0.0 when calling rdma_join_multicast(). The
user would first need to bind to a specific device by calling
rdma_bind_addr() with a local IP address.

If more than one group is joined this way, then rdma_leave_multicast()
would need some way to distinguish between the different groups joined by
a single user. (rdma_leave_multicast() takes the IP address of the group
to leave.) Providing a "port number" with the sockaddr would work. The
port number would need to match when joining and leaving, but it is not
part of the multicast address, so it is essentially a join index
specified by the user.

Your code would look something like this:

    rdma_bind_addr(local IP address)
    rdma_join_multicast(0.0.0.0, port 0)
        <- exchange group info out of band
    rdma_join_multicast(0.0.0.0, port 1)
        <- exchange group info out of band
    send data to a lot of nodes at once
    rdma_leave_multicast(0.0.0.0, port 0)
    rdma_leave_multicast(0.0.0.0, port 1)

If this sounds like it would work for you, let me know, and I can create
a patch to test this idea more.

- Sean
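
To illustrate the "port number as join index" idea, here is a minimal, self-contained C sketch of the bookkeeping it implies. It does not call rdma_cm at all. The names join_table, join_unique() and leave_unique() are hypothetical, and the MGID values are placeholders standing in for whatever unused address would really be assigned when joining with an MGID of 0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_JOINS 8   /* arbitrary limit for this sketch */

/* Hypothetical record of one join made with address 0.0.0.0. */
struct join_entry {
    int      in_use;
    uint16_t port;      /* user-chosen join index, not part of the address */
    uint8_t  mgid[16];  /* group address handed back at join time */
};

/* Hypothetical per-user table; zero-initialize before first use. */
struct join_table {
    struct join_entry entries[MAX_JOINS];
    uint8_t next_id;    /* placeholder for "pick an unused MGID" */
};

/*
 * Join a new unique group identified locally by 'port'.
 * Returns 0 and copies the assigned MGID to mgid_out, or -1 if the
 * port is already in use for a join or the table is full.
 */
int join_unique(struct join_table *t, uint16_t port, uint8_t mgid_out[16])
{
    struct join_entry *slot = NULL;

    assert(t != NULL && mgid_out != NULL);

    for (int i = 0; i < MAX_JOINS; i++) {
        struct join_entry *e = &t->entries[i];
        if (e->in_use && e->port == port)
            return -1;              /* join index must be unique per user */
        if (!e->in_use && slot == NULL)
            slot = e;               /* remember the first free slot */
    }
    if (slot == NULL)
        return -1;

    /* Placeholder address: a real join would return the assigned MGID. */
    memset(slot->mgid, 0, sizeof(slot->mgid));
    slot->mgid[0]  = 0xff;
    slot->mgid[15] = ++t->next_id;

    slot->port   = port;
    slot->in_use = 1;
    memcpy(mgid_out, slot->mgid, sizeof(slot->mgid));
    return 0;
}

/*
 * Leave the group previously joined with the same 'port'.
 * Returns 0 on success, -1 if no such join exists.
 */
int leave_unique(struct join_table *t, uint16_t port)
{
    assert(t != NULL);

    for (int i = 0; i < MAX_JOINS; i++) {
        struct join_entry *e = &t->entries[i];
        if (e->in_use && e->port == port) {
            e->in_use = 0;
            return 0;
        }
    }
    return -1;
}
```

In this sketch, two joins with ports 0 and 1 get different placeholder addresses, and each leave is matched purely on the port. That is the property the proposal relies on, since every such join would pass the same 0.0.0.0 address.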
