Re: [ewg] OFED-1.5.1 failure over iWarp
Sean Hefty wrote:

If I look at what's there today, we're trying to find some way to match the net_device src_dev_addr with some sort of address associated with an ib_device. In the case of actual IB, the net_device src_dev_addr contains the SGID, which provides the mapping.

Steve, can you please clarify the iWarp case for me? For iWarp, doesn't the src_dev_addr contain the MAC? So, the 'GID' reported for an iWarp device is really just the MAC. Is this correct?

If this is the case, then couldn't rocee (I hate that name) report its MAC as one of its GIDs? This would ensure that the mapping between net_device and ib_device was correct.

Sean,

AFAIK, reporting the MAC as one of the GIDs was part of the IBoE (feel free not to use names which you don't like) design presented a couple of times, wasn't it, Eli, Liran?

Or.

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
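A note on the MAC-as-GID idea discussed above: one plausible way for an Ethernet device to expose its MAC as a GID is the IPv6 link-local (modified EUI-64) form, which is also how the inet6 link-local addresses in the ifconfig output posted in this thread relate to their HWaddr values. The sketch below is an assumption for illustration only. mac_to_ll_gid() is a hypothetical name, not the OFED rocee_addr_get_sgid() implementation, and the thread does not confirm the exact layout rocee uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not an OFED function): build a 16-byte GID in
 * IPv6 link-local form from a 6-byte MAC. The prefix is fe80::/64 and
 * the interface ID is the modified EUI-64 of the MAC: flip the
 * universal/local bit of the first byte and insert ff:fe in the middle.
 */
static void mac_to_ll_gid(const uint8_t mac[6], uint8_t gid[16])
{
    assert(mac != NULL && gid != NULL);

    memset(gid, 0, 16);
    gid[0]  = 0xfe;            /* fe80:: link-local prefix */
    gid[1]  = 0x80;
    gid[8]  = mac[0] ^ 0x02;   /* flip the universal/local bit */
    gid[9]  = mac[1];
    gid[10] = mac[2];
    gid[11] = 0xff;            /* EUI-64 filler bytes */
    gid[12] = 0xfe;
    gid[13] = mac[3];
    gid[14] = mac[4];
    gid[15] = mac[5];
}
```

With the eth2 HWaddr 00:12:55:02:AE:CC reported later in the thread, this yields the interface ID 0212:55ff:fe02:aecc, matching the fe80::212:55ff:fe02:aecc address ifconfig shows for that interface.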
Re: [ewg] OFED-1.5.1 failure over iWarp
Never mind. I see you already committed the change. I just pulled the latest and rping works over iwarp.

Thanks,

Steve.

Steve Wise wrote:

Hey Eli,

This patch doesn't apply. If you give me one that applies and builds against RH5.3, I'll test it.

Thanks,

Steve.

Eli Cohen wrote:

Oops, you're right. Please try this one:

commit 483fe703b03b1db99fa4a968fc3a918aa43f856f
Author: Eli Cohen e...@mellanox.co.il
Date: Wed Feb 3 13:10:14 2010 +0200

    CMA: Fix iWarp failures to bind to a device

    rdma_addr_get_sgid() relies on dev_addr->transport to retrieve the
    correct GID based on the hardware address. However, when called from
    cma_acquire_dev(), the transport field is not yet valid. The solution
    is to avoid calling rdma_addr_get_sgid() from cma_acquire_dev() and
    find the device based on its GID: for ethernet, assume first it is
    rocee and search the GID table; if not found, generate the GID by
    copying it from the hardware address.

    Signed-off-by: Eli Cohen e...@mellanox.co.il

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index a2d5aad..3c5c59f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -348,15 +348,29 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
         union ib_gid gid;
         int ret = -ENODEV;

-        rdma_addr_get_sgid(dev_addr, &gid);
+        if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
+                rocee_addr_get_sgid(dev_addr, &gid);
+                list_for_each_entry(cma_dev, &dev_list, list) {
+                        ret = ib_find_cached_gid(cma_dev->device, &gid,
+                                                 &id_priv->id.port_num, NULL);
+                        if (!ret)
+                                goto out;
+                }
+        }
+
+        memcpy(&gid, dev_addr->src_dev_addr +
+               rdma_addr_gid_offset(dev_addr), sizeof gid);
         list_for_each_entry(cma_dev, &dev_list, list) {
                 ret = ib_find_cached_gid(cma_dev->device, &gid,
                                          &id_priv->id.port_num, NULL);
-                if (!ret) {
-                        cma_attach_to_dev(id_priv, cma_dev);
+                if (!ret)
                         break;
-                }
         }
+
+out:
+        if (!ret)
+                cma_attach_to_dev(id_priv, cma_dev);
+
         return ret;
 }

                memcpy(&gid, dev_addr->src_dev_addr +
                       rdma_addr_gid_offset(dev_addr), sizeof gid);
                list_for_each_entry(cma_dev, &dev_list, list) {
                        ret = ib_find_cached_gid(cma_dev->device, &gid,
                                                 &id_priv->id.port_num, NULL);
                        if (!ret)
                                break;
                }
        }

        if (!ret)
                cma_attach_to_dev(id_priv, cma_dev);

        return ret;
}

Eli Cohen wrote:

On Wed, Feb 03, 2010 at 09:20:05AM -0600, Steve Wise wrote:

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index a2d5aad..76dce2b 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -348,15 +348,28 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
         union ib_gid gid;
         int ret = -ENODEV;

-        rdma_addr_get_sgid(dev_addr, &gid);
-        list_for_each_entry(cma_dev, &dev_list, list) {
-                ret = ib_find_cached_gid(cma_dev->device, &gid,
-                                         &id_priv->id.port_num, NULL);
-                if (!ret) {
-                        cma_attach_to_dev(id_priv, cma_dev);
-                        break;
+        if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
+                rocee_addr_get_sgid(dev_addr, &gid);
+                list_for_each_entry(cma_dev, &dev_list, list) {
+                        ret = ib_find_cached_gid(cma_dev->device, &gid,
+                                                 &id_priv->id.port_num, NULL);
+                        if (!ret)
+                                break;
+                }

The above if statement is true for iwarp devices, so this patch is just wrong. rocee_addr_get_sgid() should only be used for ROCEE interfaces, correct?

No, the idea is this: for non-ARPHRD_INFINIBAND devices (e.g. rocee or iwarp) I first assume this is rocee, get the rocee GID, and check whether this GID appears in any device's GID table. If the MAC address belongs to a rocee device then it will be found; if it belongs to an iwarp device then it won't be found. In the latter case I build the GID in the pre-rocee-patches fashion and search again.

+        } else {
+                memcpy(&gid, dev_addr->src_dev_addr +
+                       rdma_addr_gid_offset(dev_addr), sizeof gid);
+                list_for_each_entry(cma_dev, &dev_list, list) {
+
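The two-pass lookup Eli describes above (try the rocee-style GID first, then fall back to the raw hardware address) can be sketched in user space as follows. This is a minimal illustration under stated assumptions, not the kernel code: struct fake_dev, rocee_style_gid(), iwarp_style_gid(), find_by_gid() and acquire_dev() are all hypothetical names, each device has a single-entry "GID table", the rocee GID is assumed to be the link-local form of the MAC, and the pre-rocee GID is assumed to be the MAC followed by zero bytes. The InfiniBand branch is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { GID_LEN = 16, MAC_LEN = 6 };

/* Hypothetical stand-in for an RDMA device with a one-entry GID table. */
struct fake_dev {
    const char *name;
    uint8_t gid[GID_LEN];
};

/* Assumed rocee form: fe80::/64 prefix plus modified EUI-64 of the MAC. */
static void rocee_style_gid(const uint8_t mac[MAC_LEN], uint8_t gid[GID_LEN])
{
    memset(gid, 0, GID_LEN);
    gid[0] = 0xfe;
    gid[1] = 0x80;
    gid[8] = mac[0] ^ 0x02;
    gid[9] = mac[1];
    gid[10] = mac[2];
    gid[11] = 0xff;
    gid[12] = 0xfe;
    gid[13] = mac[3];
    gid[14] = mac[4];
    gid[15] = mac[5];
}

/* Assumed pre-rocee form: the hardware address copied in, zero padded. */
static void iwarp_style_gid(const uint8_t mac[MAC_LEN], uint8_t gid[GID_LEN])
{
    memset(gid, 0, GID_LEN);
    memcpy(gid, mac, MAC_LEN);
}

/* Linear search of the device list, like the list_for_each_entry() loop. */
static const struct fake_dev *find_by_gid(const struct fake_dev *devs,
                                          size_t n, const uint8_t gid[GID_LEN])
{
    for (size_t i = 0; i < n; i++)
        if (memcmp(devs[i].gid, gid, GID_LEN) == 0)
            return &devs[i];
    return NULL;
}

/* Pass 1: assume rocee. Pass 2: fall back to the raw hardware address. */
static const struct fake_dev *acquire_dev(const struct fake_dev *devs,
                                          size_t n, const uint8_t mac[MAC_LEN])
{
    uint8_t gid[GID_LEN];
    const struct fake_dev *dev;

    rocee_style_gid(mac, gid);
    dev = find_by_gid(devs, n, gid);
    if (dev)
        return dev;

    iwarp_style_gid(mac, gid);
    return find_by_gid(devs, n, gid);
}
```

The ordering matters only in that the first pass must not produce a false match for an iwarp device. As noted elsewhere in the thread, even an unlikely wrong match is a concern, which is one argument for making the netdev-to-ibdev association explicit rather than inferring it from GID tables.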
Re: [ewg] OFED-1.5.1 failure over iWarp
On Thu, Feb 04, 2010 at 09:46:58AM -0600, Steve Wise wrote:

Never mind. I see you already committed the change. I just pulled the latest and rping works over iwarp.

Thanks for checking this.
Re: [ewg] OFED-1.5.1 failure over iWarp
We should work to get this 'correct' when merging upstream. Following the spirit of the current code, it is probably cma_acquire_dev()'s job to fill in the missing ibdev type information after matching the netdev to an ibdev.

This makes sense to me.

P.S. - I really wish that we had a cleaner way to match an ibdev to a netdev without overloading the gid table entries. Basically, it should be the job of the entity that created the netdev to make this association, and stuff a pointer in the netdev.

Do you have a specific idea here? So far, we've tried to keep the mapping the responsibility of the rdma_cm module. With rocee, we may need to re-architect the solution and have the ib_device driver make this association. Even if it's unlikely, we need to make sure that we don't make the wrong match.
Re: [ewg] OFED-1.5.1 failure over iWarp
This patch didn't work. I still get an address resolution error event with status -2.

Steve.

Eli Cohen wrote:

On Tue, Jan 19, 2010 at 04:42:16PM -0800, Woodruff, Robert J wrote:

I am getting the following error when trying to run Intel MPI over nes iwarp cards on today's daily build of OFED-1.5.1. OFED-1.5 does not show this problem.

mpdtrace
det-17-eth2
det-16-eth2
[0] dapl fabric is not available and fallback fabric is not enabled
det-17:cd2: open_hca: rdma_bind ERR No such file or directory. Is eth2 configured?

All,

Since I do not have iwarp cards, I can't check the following patch. Please try it and let me know if it solves your problem. If it does, I'll push it to tomorrow's build.

commit 7490e1cce1a295219e23e90d09f78bcdba0977dd
Author: Eli Cohen e...@mellanox.co.il
Date: Wed Feb 3 13:10:14 2010 +0200

    CMA: Fix iWarp failures to bind to a device

    rdma_addr_get_sgid() relies on dev_addr->transport to retrieve the
    correct GID based on the hardware address. However, when called from
    cma_acquire_dev(), the transport field is not yet valid. The solution
    is to avoid calling rdma_addr_get_sgid() from cma_acquire_dev() and
    find the device based on its GID: for ethernet, assume first it is
    rocee and search the GID table; if not found, generate the GID by
    copying it from the hardware address.

    Signed-off-by: Eli Cohen e...@mellanox.co.il

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index a2d5aad..76dce2b 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -348,15 +348,28 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
         union ib_gid gid;
         int ret = -ENODEV;

-        rdma_addr_get_sgid(dev_addr, &gid);
-        list_for_each_entry(cma_dev, &dev_list, list) {
-                ret = ib_find_cached_gid(cma_dev->device, &gid,
-                                         &id_priv->id.port_num, NULL);
-                if (!ret) {
-                        cma_attach_to_dev(id_priv, cma_dev);
-                        break;
+        if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
+                rocee_addr_get_sgid(dev_addr, &gid);
+                list_for_each_entry(cma_dev, &dev_list, list) {
+                        ret = ib_find_cached_gid(cma_dev->device, &gid,
+                                                 &id_priv->id.port_num, NULL);
+                        if (!ret)
+                                break;
+                }
+        } else {
+                memcpy(&gid, dev_addr->src_dev_addr +
+                       rdma_addr_gid_offset(dev_addr), sizeof gid);
+                list_for_each_entry(cma_dev, &dev_list, list) {
+                        ret = ib_find_cached_gid(cma_dev->device, &gid,
+                                                 &id_priv->id.port_num, NULL);
+                        if (!ret)
+                                break;
                 }
         }
+
+        if (!ret)
+                cma_attach_to_dev(id_priv, cma_dev);
+
         return ret;
 }
Re: [ewg] OFED-1.5.1 failure over iWarp
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index a2d5aad..76dce2b 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -348,15 +348,28 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
         union ib_gid gid;
         int ret = -ENODEV;

-        rdma_addr_get_sgid(dev_addr, &gid);
-        list_for_each_entry(cma_dev, &dev_list, list) {
-                ret = ib_find_cached_gid(cma_dev->device, &gid,
-                                         &id_priv->id.port_num, NULL);
-                if (!ret) {
-                        cma_attach_to_dev(id_priv, cma_dev);
-                        break;
+        if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
+                rocee_addr_get_sgid(dev_addr, &gid);
+                list_for_each_entry(cma_dev, &dev_list, list) {
+                        ret = ib_find_cached_gid(cma_dev->device, &gid,
+                                                 &id_priv->id.port_num, NULL);
+                        if (!ret)
+                                break;
+                }

The above if statement is true for iwarp devices, so this patch is just wrong. rocee_addr_get_sgid() should only be used for ROCEE interfaces, correct?

+        } else {
+                memcpy(&gid, dev_addr->src_dev_addr +
+                       rdma_addr_gid_offset(dev_addr), sizeof gid);
+                list_for_each_entry(cma_dev, &dev_list, list) {
+                        ret = ib_find_cached_gid(cma_dev->device, &gid,
+                                                 &id_priv->id.port_num, NULL);
+                        if (!ret)
+                                break;
                 }
         }
+
+        if (!ret)
+                cma_attach_to_dev(id_priv, cma_dev);
+
         return ret;
 }
Re: [ewg] OFED-1.5.1 failure over iWarp
On Wed, Feb 03, 2010 at 09:17:57AM -0600, Steve Wise wrote:

This patch didn't work. I still get an address resolution error event with status -2.

Can you tell whether the bind succeeded? Can you give some more details about the failure?
Re: [ewg] OFED-1.5.1 failure over iWarp
I'm just now getting around to this, and iwarp seems to be 100% dead on ofed-1.5.1. I see the same error running mvapich2 and rping. Has anybody looked into why? I will be looking at this today.

Has anyone tested iwarp against the upstream kernel lately? 2.6.32 or 2.6.33? Woody is suspecting that the IPv6 patches may be a contributing factor, which were pulled into 1.5.1.
Re: [ewg] OFED-1.5.1 failure over iWarp
Hefty, Sean wrote:

Are these changes upstream? I didn't see a problem on 2.6.33-rc4...

Yes - the ipv6 fixes should be in 2.6.33-rc4. I'm not sure if what's upstream is the same as what's in OFED though, or if that's even the issue.

I'm debugging this too. So far, it looks like addr_resolve() is failing...
Re: [ewg] OFED-1.5.1 failure over iWarp
On the server side, if I do not specify an address, the server starts

[wo...@det-17 src]$
[wo...@det-17 src]$ /sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 00:04:23:AF:8E:CE
          inet addr:192.168.0.17  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::204:23ff:feaf:8ece/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1544 errors:0 dropped:0 overruns:0 frame:0
          TX packets:991 errors:0 dropped:0 overruns:0 carrier:0
          collisions:41 txqueuelen:100
          RX bytes:360766 (352.3 KiB)  TX bytes:131310 (128.2 KiB)
          Base address:0xdc00 Memory:f9fa-f9fc

eth1      Link encap:Ethernet  HWaddr 00:04:23:AF:8E:CF
          inet addr:192.168.1.17  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Base address:0xdc80 Memory:f9fe-fa00

eth2      Link encap:Ethernet  HWaddr 00:12:55:02:AE:CC
          inet addr:192.168.2.17  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::212:55ff:fe02:aecc/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:34136 errors:0 dropped:0 overruns:0 frame:0
          TX packets:35012 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2863188 (2.7 MiB)  TX bytes:2907831 (2.7 MiB)
          Interrupt:169

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:930 errors:0 dropped:0 overruns:0 frame:0
          TX packets:930 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:70925 (69.2 KiB)  TX bytes:70925 (69.2 KiB)

[wo...@det-17 src]$
[wo...@det-17 src]$
[wo...@det-17 src]$ ucmatose
cmatose: starting server

But on the client, it fails on any of the IP addresses, but differently on 192.168.1.17.
ucmatose -s 192.168.0.17
cmatose: starting client
cmatose: connecting
cmatose: event: RDMA_CM_EVENT_ADDR_ERROR, error: -2
receiving data transfers
sending replies
data transfers complete
test complete
return status 0

ucmatose -s 192.168.1.17
cmatose: starting client
cmatose: connecting
cmatose: event: RDMA_CM_EVENT_ADDR_ERROR, error: -110
receiving data transfers
sending replies
data transfers complete
test complete
return status 0

ucmatose -s 192.168.2.17
cmatose: starting client
cmatose: connecting
cmatose: event: RDMA_CM_EVENT_ADDR_ERROR, error: -2
receiving data transfers
sending replies
data transfers complete
test complete
return status 0
[wo...@det-16 ~]$

-----Original Message-----
From: Hefty, Sean
Sent: Wednesday, January 20, 2010 9:30 PM
To: Woodruff, Robert J; tzipo...@dev.mellanox.co.il; Tung, Chien Tin; Davis, Arlin R
Cc: OpenFabrics EWG; Steve Wise; Eli Cohen; Vladimir Sokolovsky
Subject: RE: [ewg] OFED-1.5.1 failure over iWarp

[wo...@det-16 ~]$ ucmatose -s 192.168.2.17
cmatose: starting client

Btw - this should be 'ucmatose -s 192.168.0.17', and needs to start after the server is running. But, this isn't going to work since...

[wo...@det-17 src]$ ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:04:23:AF:8E:CE
          inet addr:192.168.0.17  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::204:23ff:feaf:8ece/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1272 errors:0 dropped:0 overruns:0 frame:0
          TX packets:909 errors:0 dropped:0 overruns:0 carrier:0
          collisions:41 txqueuelen:100
          RX bytes:309724 (302.4 KiB)  TX bytes:123706 (120.8 KiB)
          Base address:0xdc00 Memory:f9fa-f9fc

[wo...@det-17 src]$ ucmatose -b 192.168.0.17
cmatose: starting server
cmatose: bind address failed: No such file or directory

the server is failing. Can anyone test ucmatose over iwarp on the latest kernel.org kernel?

- Sean
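For readers decoding the numbers in the ucmatose output above: the event status appears to be a negated errno value, which is consistent with the "No such file or directory" text printed by the failing bind. The sketch below shows one way to turn such a status into text. status_to_text() is a hypothetical helper, not part of librdmacm, and the mapping of -2 and -110 to ENOENT and ETIMEDOUT assumes Linux errno numbering.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: translate a status such as -2 or -110, as printed
 * by ucmatose, into the C library's message for that errno value.
 * Accepts either sign, since tools differ in how they report it.
 */
static const char *status_to_text(int status)
{
    return strerror(status < 0 ? -status : status);
}

/* Print one status line in a form convenient for bug reports. */
static void print_status(int status)
{
    printf("status %d: %s\n", status, status_to_text(status));
}
```

On a Linux system, print_status(-2) reports ENOENT and print_status(-110) reports ETIMEDOUT. That would fit the thread: the -2 cases fail at device lookup, while 192.168.1.17 sits on eth1, which the ifconfig output shows without the RUNNING flag.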
Re: [ewg] OFED-1.5.1 failure over iWarp
I am getting the following error when trying to run Intel MPI over nes iwarp cards on today's daily build of OFED-1.5.1. OFED-1.5 does not show this problem.

I will pull down 1-19 build and see if I get the same issue.

Chien
Re: [ewg] OFED-1.5.1 failure over iWarp
I am getting the following error when trying to run Intel MPI over nes iwarp cards on today's daily build of OFED-1.5.1. OFED-1.5 does not show this problem.

I will pull down 1-19 build and see if I get the same issue.

I got a similar issue using OpenMPI and the 1/19 daily (non-RDMAoE).

--
No OpenFabrics connection schemes reported that they were able to be used on a specific port. As such, the openib BTL (OpenFabrics support) will be disabled for this port.

  Local host: sw30
  Local device: nes0
  Local port: 1
  CPCs attempted: oob, xoob
--

Chien
Re: [ewg] OFED-1.5.1 failure over iWarp
On 1/20/2010 6:50 PM, Tung, Chien Tin wrote:

I am getting the following error when trying to run Intel MPI over nes iwarp cards on today's daily build of OFED-1.5.1. OFED-1.5 does not show this problem.

I will pull down 1-19 build and see if I get the same issue.

I got a similar issue using OpenMPI and the 1/19 daily (non-RDMAoE).

Please work with Vlad and Eli to resolve this.

Thanks,
Tziporet
Re: [ewg] OFED-1.5.1 failure over iWarp
Sean, could this be a result of the new IPv6 rdma_cm patches that were added to OFED-1.5.1?

I guess it's possible. I thought DAPL only used ipv4 addresses though. Can you try running some simpler tests, like ucmatose or rping?

on server side run:
ucmatose [-b optional_local_ip_address]

on client side run:
ucmatose -s server_ip_address

If you can, run on the server side with and without the -b option.

- Sean
Re: [ewg] OFED-1.5.1 failure over iWarp
[wo...@det-16 ~]$ ucmatose -s 192.168.2.17
cmatose: starting client

[wo...@det-17 src]$ ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:04:23:AF:8E:CE
          inet addr:192.168.0.17  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::204:23ff:feaf:8ece/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1272 errors:0 dropped:0 overruns:0 frame:0
          TX packets:909 errors:0 dropped:0 overruns:0 carrier:0
          collisions:41 txqueuelen:100
          RX bytes:309724 (302.4 KiB)  TX bytes:123706 (120.8 KiB)
          Base address:0xdc00 Memory:f9fa-f9fc

[wo...@det-17 src]$ ucmatose -b 192.168.0.17
cmatose: starting server
cmatose: bind address failed: No such file or directory
test complete
return status -1

-----Original Message-----
From: Hefty, Sean
Sent: Wednesday, January 20, 2010 4:19 PM
To: Woodruff, Robert J; tzipo...@dev.mellanox.co.il; Tung, Chien Tin; Davis, Arlin R
Cc: OpenFabrics EWG; Steve Wise; Eli Cohen; Vladimir Sokolovsky
Subject: RE: [ewg] OFED-1.5.1 failure over iWarp

Sean, could this be a result of the new IPv6 rdma_cm patches that were added to OFED-1.5.1?

I guess it's possible. I thought DAPL only used ipv4 addresses though. Can you try running some simpler tests, like ucmatose or rping?

on server side run:
ucmatose [-b optional_local_ip_address]

on client side run:
ucmatose -s server_ip_address

If you can, run on the server side with and without the -b option.

- Sean
Re: [ewg] OFED-1.5.1 failure over iWarp
[wo...@det-16 ~]$ ucmatose -s 192.168.2.17
cmatose: starting client

Btw - this should be 'ucmatose -s 192.168.0.17', and needs to start after the server is running. But, this isn't going to work since...

[wo...@det-17 src]$ ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:04:23:AF:8E:CE
          inet addr:192.168.0.17  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::204:23ff:feaf:8ece/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1272 errors:0 dropped:0 overruns:0 frame:0
          TX packets:909 errors:0 dropped:0 overruns:0 carrier:0
          collisions:41 txqueuelen:100
          RX bytes:309724 (302.4 KiB)  TX bytes:123706 (120.8 KiB)
          Base address:0xdc00 Memory:f9fa-f9fc

[wo...@det-17 src]$ ucmatose -b 192.168.0.17
cmatose: starting server
cmatose: bind address failed: No such file or directory

the server is failing. Can anyone test ucmatose over iwarp on the latest kernel.org kernel?

- Sean
[ewg] OFED-1.5.1 failure over iWarp
I am getting the following error when trying to run Intel MPI over nes iwarp cards on today's daily build of OFED-1.5.1. OFED-1.5 does not show this problem.

mpdtrace
det-17-eth2
det-16-eth2
[0] dapl fabric is not available and fallback fabric is not enabled
det-17:cd2: open_hca: rdma_bind ERR No such file or directory. Is eth2 configured?