Note: even though this patch resolved the openmpi failure on my iwarp nodes, ucmatose -b 127.0.0.1 still doesn't fail there. I haven't looked at the source, but something funny must be happening.
So we still have a regression issue with ofed-1.5.1/upstream kernels and openmpi over IB with rdmacm.

Steve.

Steve Wise wrote:
>
>> rdma/cm: disallow loopback address for iwarp devices
>>
>> From: Sean Hefty <[email protected]>
>>
>> The current RDMA iWarp devices cannot be used to establish
>> connections using the loopback address. Prevent rdma_bind_addr
>> from associating the loopback address with an iWarp device.
>>
>> This fixes an issue with openmpi, where it tries to identify which
>> IP addresses map to RDMA devices by calling rdma_bind_addr on
>> each address and seeing if the bind succeeds. Prior to patch
>> 6f8372b6 "RDMA/cm: fix loopback address support", this process
>> worked. But the rdma_cm now allows rdma_bind_addr to bind to an
>> RDMA device using the loopback address, and attaches the rdma_cm_id
>> to the RDMA device as part of the bind.
>>
>> Signed-off-by: Sean Hefty <[email protected]>
>> ---
>>
>>  drivers/infiniband/core/cma.c | 14 ++++++++++----
>>  1 files changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>> index cc9b594..5850411 100644
>> --- a/drivers/infiniband/core/cma.c
>> +++ b/drivers/infiniband/core/cma.c
>> @@ -1739,6 +1739,9 @@ err:
>>  }
>>  EXPORT_SYMBOL(rdma_resolve_route);
>>
>> +/*
>> + * Only IB devices support loopback connections.
>> + */
>>  static int cma_bind_loopback(struct rdma_id_private *id_priv)
>>  {
>>  	struct cma_device *cma_dev;
>> @@ -1753,11 +1756,16 @@ static int cma_bind_loopback(struct rdma_id_private *id_priv)
>>  		ret = -ENODEV;
>>  		goto out;
>>  	}
>> -	list_for_each_entry(cma_dev, &dev_list, list)
>> +	list_for_each_entry(cma_dev, &dev_list, list) {
>> +		if (rdma_node_get_transport(cma_dev->device->node_type) !=
>> +		    RDMA_TRANSPORT_IB)
>> +			continue;
>> +
>>  		for (p = 1; p <= cma_dev->device->phys_port_cnt; ++p)
>>  			if (!ib_query_port(cma_dev->device, p, &port_attr) &&
>>  			    port_attr.state == IB_PORT_ACTIVE)
>>  				goto port_found;
>> +	}
>>
>
> Here you need to:
>     ret = -ENODEV;
>     goto out;
>
> instead of:
>>
>>  	p = 1;
>>  	cma_dev = list_entry(dev_list.next, struct cma_device, list);
>>
>
> Otherwise it will still bind to the first device even if its iwarp...
>
> With this mod, it works.
>
>> @@ -1771,9 +1779,7 @@ port_found:
>>  	if (ret)
>>  		goto out;
>>
>> -	id_priv->id.route.addr.dev_addr.dev_type =
>> -		(rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB) ?
>> -		ARPHRD_INFINIBAND : ARPHRD_ETHER;
>> +	id_priv->id.route.addr.dev_addr.dev_type = ARPHRD_INFINIBAND;
>>
>>  	rdma_addr_set_sgid(&id_priv->id.route.addr.dev_addr, &gid);
>>  	ib_addr_set_pkey(&id_priv->id.route.addr.dev_addr, pkey);
>>
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
>

_______________________________________________
ewg mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
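
For anyone following the thread, here is a minimal userspace sketch of the device selection in cma_bind_loopback with the quoted patch plus the -ENODEV change applied. It only models the control flow. The names model_device, model_transport and pick_loopback_port are made up for illustration and are not the cma.c types, and the port_active array is an assumed stand-in for the ib_query_port / IB_PORT_ACTIVE check.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real cma.c types. */
enum model_transport { MODEL_TRANSPORT_IB, MODEL_TRANSPORT_IWARP };

struct model_device {
	enum model_transport transport;
	int phys_port_cnt;
	const int *port_active;	/* port_active[p - 1] != 0 means port p is active */
};

/*
 * Choose the device and port a loopback bind would attach to.
 * Returns 0 and fills *dev_idx and *port on success, or -ENODEV if no
 * IB device has an active port.
 */
static int pick_loopback_port(const struct model_device *devs, size_t ndevs,
			      size_t *dev_idx, int *port)
{
	size_t i;
	int p;

	for (i = 0; i < ndevs; i++) {
		/* Skip iWarp devices: they cannot do loopback connections. */
		if (devs[i].transport != MODEL_TRANSPORT_IB)
			continue;

		for (p = 1; p <= devs[i].phys_port_cnt; p++) {
			if (devs[i].port_active[p - 1]) {
				*dev_idx = i;
				*port = p;
				return 0;
			}
		}
	}

	/*
	 * Fail here rather than falling back to port 1 of the first device
	 * in the list, which could be an iWarp device.
	 */
	return -ENODEV;
}
```

With only iWarp devices in the list this returns -ENODEV, which is the result the openmpi probing relies on. One side effect of the change as modelled here: an IB device with no active port also yields -ENODEV, where the fallback would previously have picked port 1 of the first device.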
