>I'm not suggesting that you implement RDMA CM IP semantics in
>userspace using the IB CM, I'm suggesting you expose the IB CM GID
>semantics through the RDMA CM API exactly as they are. Your IBACM
>would then become an enhanced path resolution module to the RDMA CM,
>much like getaddrinfo is to socket()/bind()/connect().

There are 3 interfaces of interest here: the librdmacm API, the rdma_ucm
user-to-kernel interface, and the rdma_cm interface. These patches are
looking to change the rdma_ucm interface. I want to avoid changing the API
or behavior of the librdmacm in a way that requires changes to existing
applications in order to run on larger clusters.

Adding support for AF_GID at the librdmacm level is fine, but it doesn't
help existing apps. Adding support for AF_GID at the rdma_ucm and rdma_cm
interfaces may help, provided that the behavior of the librdmacm calls
rdma_resolve_addr, rdma_bind_addr, rdma_resolve_route, and rdma_connect is
maintained. If you have a specific idea on a better way to change the
rdma_ucm interface than 'set_option' calls, let me know.

I will look at the details of what would happen if the librdmacm converted
an AF_INET address into an AF_GID address and used that in the down call to
the rdma_ucm:rdma_resolve_addr. I suspect that since the original AF_INET
address is not carried in the IB CM REQ, there will be issues matching the
REQ with listens on specific addresses. This may not be a big deal in
practice. The rdma_cm would still need the path record data.

>So the output from IBACM would specify an AF_GID address family and
>include opaque data blobs that are passed through the RDMA CM API that
>contain all the PR records, service ID, etc. If used on non-IB then
>IBACM could just return AF_IP/AF_IPV6 and related blobs. Thus the
>consumer of the API gets transparency and network protocol agility,
>and all the mess can be hid in the address resolution API.

This is just debating where the transport abstraction occurs, but IMO the
IB ACM should be IB centric. Transport abstraction should occur somewhere
above it. This has been the role of the librdmacm. Adding a new call
similar to getaddrinfo to the librdmacm should be possible, and could
actually take advantage of the IB ACM resolution that converts a host name
directly into IB path data.
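
For reference, the getaddrinfo pattern that such a call would mirror looks
roughly like the sketch below. This is plain sockets code, not librdmacm,
and it is only meant to show the shape of the idea: the resolver hands back
an address family plus an address blob, and the caller forwards them
without interpreting them. The function name open_from_resolver and the
port number used in the usage note are made up for illustration.

```c
#define _POSIX_C_SOURCE 200112L

#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: resolve a numeric host/port with getaddrinfo() and
 * open a socket from the first usable result. The caller never looks
 * inside ai_addr; it only passes along what the resolver returned.
 * Returns the address family that was used, or -1 on failure.
 */
int open_from_resolver(const char *host, const char *port)
{
    struct addrinfo hints, *res, *ai;
    int family = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;          /* let the resolver pick */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV;  /* no DNS lookup */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        /* A real client would connect(fd, ai->ai_addr, ai->ai_addrlen) here. */
        family = ai->ai_family;
        close(fd);
        break;
    }

    freeaddrinfo(res);
    return family;
}
```

Calling open_from_resolver("127.0.0.1", "7471") returns AF_INET without the
caller ever naming that family. An RDMA equivalent would presumably return
path data alongside the address in the same opaque way, but that part is
speculation on top of the analogy.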

This still leaves open the issue of how to communicate that data to the
kernel so that the rdma_cm can format the IB CM REQ correctly and send it
on its merry little way.

>Yes, I see that, but the ARP request is an absolutely critical part of
>the IP world, to eliminate it, but still pretend to be IP really is
>cheating too much, IMHO. :)

We aren't completely getting rid of ARP; we just support an alternate,
non-standard, proprietary address resolution mechanism instead.

>Another topic, but yes, ip route get just does a netlink
>query. I can give you all the details if you want to try it.

Yes, please - see below.

>However as I explained in the thread, I'm highly skeptical about all of
>this. That query needs to be done exactly once and the connection must
>be bound to that result from then on. Currently too many route lookups
>are done, and adding more to userspace does not seem to be the right
>direction - unless the userspace one replaces all the kernel lookups..

The librdmacm rdma_resolve_addr() call allows a user to specify a
destination address only. A suitable source address will be selected, and
the rdma_cm_id will be bound to the corresponding RDMA device. If the
librdmacm can efficiently determine the source address, it can call the IB
ACM to resolve the addresses and obtain the path data. Otherwise, the call
to librdmacm rdma_resolve_addr() drops into the kernel and operates as it
does today, which can involve sending an ARP.

I haven't been overly concerned about this yet, because the application I'm
most concerned with always calls rdma_bind_addr().

- Sean
