[ewg] ofa_1_5_kernel 20091019-0200 daily build status

2009-10-19 Thread Vladimir Sokolovsky (Mellanox)
This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git
git_branch: ofed_kernel_1_5

Common build parameters: 

Passed:
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.26
Passed on i686 with linux-2.6.24
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.27
Passed on x86_64 with linux-2.6.16.60-0.21-smp
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-164.el5
Passed on x86_64 with linux-2.6.18-128.el5
Passed on x86_64 with linux-2.6.18-93.el5
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.27
Passed on x86_64 with linux-2.6.9-67.ELsmp
Passed on x86_64 with linux-2.6.9-78.ELsmp
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.26
Passed on ia64 with linux-2.6.25
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19

Failed:
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] Agenda for EWG/OFED meeting on next Monday

2009-10-19 Thread Moni Shoua
Voltaire intends to test with the OFED-RDMAoE release.

The following setups are used in our QA; each setup consists of
2 machines.


Arch    | OS / Kernel | HCA               | SM
--------+-------------+-------------------+------------
x86_64  | RH4 up6 *   | ConnectX QDR      | OpenSM
x86_64  | RH4 up7     | ConnectX DDR      | VoltaireSM
x86_64  | RH4 up8     | ConnectX QDR      | OpenSM
x86_64  | RH5 up2     | ConnectX DDR      | VoltaireSM
x86_64  | RH5 up3     | ConnectX QDR      | OpenSM
x86_64  | RH5 up4     | ConnectX DDR gen2 | VoltaireSM
x86_64  | SLES10 sp2  | Arbel mem free    | VoltaireSM
x86_64  | SLES10 sp3  | ConnectX QDR      | OpenSM
x86_64  | SLES11      | ConnectX DDR      | VoltaireSM
x86_64  | Centos5.3   | ConnectX QDR      | OpenSM
ppc     | SLES10 sp3  | ConnectX DDR      | VoltaireSM
ppc     | RH5 up4     | ConnectX DDR      | VoltaireSM
i386    | SLES11      | ConnectX DDR      | VoltaireSM
i386    | RH5 up4     | ConnectX DDR      | VoltaireSM
ia64    | SLES10 sp3  | ConnectX DDR      | VoltaireSM
ia64    | RH4 up8     | ConnectX DDR      | VoltaireSM

* With kernel 2.6.30

We will perform all our tests with the latest available HCA firmware.



[ewg] mvapich2 srpm uploaded

2009-10-19 Thread Jonathan Perkins
Hi all:
I've recently uploaded an mvapich2 srpm to the openfabrics
server.  The version uploaded is not the 1.4 release but can be used to
start the Q/A process.  The source rpm can be found at
~perkinjo/ofed_1_5/mvapich2-r3510-1.src.rpm and future uploads can be
identified by ~perkinjo/ofed_1_5/latest.txt.

You can take a look at our changelog at
http://mvapich.cse.ohio-state.edu/download/mvapich2/changes.shtml to see
what has changed since mvapich2-1.2p1.

Vlad:
Can you update the magic script to start pulling from our ofed_1_5
directory now?  Thanks in advance.

-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo



[ewg] RE: Agenda for EWG/OFED meeting on next Monday

2009-10-19 Thread Tung, Chien Tin
We test nes content in OFED 1.5 with the following:

OS:
-  RHEL4: up6, up7, up8
-  RHEL5: up2, up3, up4
-  FC9
-  SLES10 SP2
-  SLES11
-  OEL4 up7
-  OEL5 up2
-  CentOS5: up2, up3
-  Kernel.org: 2.6.29, 2.6.30

Arch:
-  ia32
-  x86_64

ULPs:
-  uDAPL (2.0.23.0)
-  Mvapich2 (1.2p1) over OFA
-  MPI-Pallas:
   o  IntelMPI (3.2.1) over DAPL
   o  OpenMPI (1.3.2) over OFA
   o  HPMPI (2.02.07.00) over DAPL
-  RDS
-  VLAN
-  NFS

Chien
--
Chien Tung | chien.tin.t...@intel.com

[ewg] RE: Agenda for EWG/OFED meeting on next Monday

2009-10-19 Thread Tziporet Koren
Mellanox testing for OFED 1.5
=============================

Mellanox tests the OFED-RDMA package on most systems, and OFED on only a
few machines.

We test all Mellanox HCAs, with the main focus on ConnectX and ConnectX-2
with QDR.

OS:
-  RHEL4: up6, up7, up8
-  RHEL5: up2, up3, up4
-  SLES10 SP2
-  SLES10 SP3 (not started)
-  SLES11
-  OEL5 up2
-  CentOS5: up2, up3
-  Kernel.org: 2.6.29, 2.6.30

Arch:
-  X64
-  x86_64
-  ppc64
-  ia64 - partial testing only

ULPs:
-  mvapich
-  Open MPI
-  IPoIB (with bonding too)
-  SDP
-  SRP
-  RDS
-  NFS/RDMA
-  Performance tests

Management:
-  OpenSM on the host
-  Management utilities
-  ibutils


Re: [ewg] mvapich2 srpm uploaded

2009-10-19 Thread Vladimir Sokolovsky

Jonathan Perkins wrote:
> Hi all:
> I've recently uploaded an mvapich2 srpm to the openfabrics
> server.  The version uploaded is not the 1.4 release but can be used to
> start the Q/A process.  The source rpm can be found at
> ~perkinjo/ofed_1_5/mvapich2-r3510-1.src.rpm and future uploads can be
> identified by ~perkinjo/ofed_1_5/latest.txt.
>
> You can take a look at our changelog at
> http://mvapich.cse.ohio-state.edu/download/mvapich2/changes.shtml to see
> what has changed since mvapich2-1.2p1.
>
> Vlad:
> Can you update the magic script to start pulling from our ofed_1_5
> directory now?  Thanks in advance.

Done,
Included in today's daily build (latest).

Regards,
Vladimir


Re: [ewg] mvapich2 srpm uploaded

2009-10-19 Thread Steve Wise
I'm getting a build failure with today's latest ofed build on centos 5.3/x64:


Failed to build mvapich2 RPM
See /tmp/OFED.17691.logs/mvapich2.rpmbuild.log
[r...@hpc-cn3 OFED-1.5-20091019-0811]# more /tmp/OFED.17691.logs/mvapich2.rpmbuild.log
Running  rpmbuild --rebuild  --define '_topdir /var/tmp/OFED_topdir' --define 'dist %{nil}' --target x86_64 --define '_name mvapich2_gcc' --define 'impl ofa' --define 'rdma --with-rdma=gen2' --define 'ib_include --with-ib-include=/usr/include' --define 'ib_libpath --with-ib-libpath=/usr/lib64' --define 'shared_libs 1' --define 'romio 1' --define 'comp_env CC=gcc CXX=g++ F77=gfortran F90=gfortran' --define 'auto_req 0' --define 'mpi_selector /usr/bin/mpi-selector' --define '_prefix /usr/mpi/gcc/mvapich2-r3510' /usr/local/src/OFED-1.5-20091019-0811/SRPMS/mvapich2-r3510-1.src.rpm
warning: user jperkins does not exist - using root
warning: group jperkins does not exist - using root
error: unpacking of archive failed on file /var/tmp/OFED_topdir/SOURCES/mvapich2-1.4.0rc2.tgz;4adc9869: cpio: MD5 sum mismatch
error: /usr/local/src/OFED-1.5-20091019-0811/SRPMS/mvapich2-r3510-1.src.rpm cannot be installed
Installing /usr/local/src/OFED-1.5-20091019-0811/SRPMS/mvapich2-r3510-1.src.rpm
[r...@hpc-cn3 OFED-1.5-20091019-0811]#





[ewg] RE: Agenda for EWG/OFED meeting on next Monday

2009-10-19 Thread Woodruff, Robert J
For my team,
we have been testing the following on
small clusters, 16 nodes or less.

OS
 - RHEL 5.3 and 5.4

Arch:
   - X86_64, ia64

ULPs

   OpenSM, Intel MPI over IPoIB, Intel MPI over uDAPL,  ibutils and 
management tools

IHVs

  Mellanox, mthca and mlx4
  Intel (NetEffect) iWarp


For uDAPL, we are testing the latest package on a cluster of 338 nodes with
Intel MPI, but that cluster is still running the older base OFED.



Re: [ewg] mvapich2 srpm uploaded

2009-10-19 Thread Jonathan Perkins
On Mon, Oct 19, 2009 at 06:17:04PM +0200, Vladimir Sokolovsky wrote:
> Jonathan Perkins wrote:
> > Hi all:
> > I've recently uploaded an mvapich2 srpm to the openfabrics
> > server.  The version uploaded is not the 1.4 release but can be used to
> > start the Q/A process.  The source rpm can be found at
> > ~perkinjo/ofed_1_5/mvapich2-r3510-1.src.rpm and future uploads can be
> > identified by ~perkinjo/ofed_1_5/latest.txt.
> >
> > You can take a look at our changelog at
> > http://mvapich.cse.ohio-state.edu/download/mvapich2/changes.shtml to see
> > what has changed since mvapich2-1.2p1.
> >
> > Vlad:
> > Can you update the magic script to start pulling from our ofed_1_5
> > directory now?  Thanks in advance.
>
> Done,
> Included in today's daily build (latest).

Thanks!

-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo



Re: [ewg] mvapich2 srpm uploaded

2009-10-19 Thread Jonathan Perkins
On Mon, Oct 19, 2009 at 11:56:10AM -0500, Steve Wise wrote:
> I'm getting a build failure with today's latest ofed build on centos 5.3/x64:
>
> Failed to build mvapich2 RPM
> See /tmp/OFED.17691.logs/mvapich2.rpmbuild.log
> [r...@hpc-cn3 OFED-1.5-20091019-0811]# more /tmp/OFED.17691.logs/mvapich2.rpmbuild.log
> Running  rpmbuild --rebuild  --define '_topdir /var/tmp/OFED_topdir' --define 'dist %{nil}' --target x86_64 --define '_name mvapich2_gcc' --define 'impl ofa' --define 'rdma --with-rdma=gen2' --define 'ib_include --with-ib-include=/usr/include' --define 'ib_libpath --with-ib-libpath=/usr/lib64' --define 'shared_libs 1' --define 'romio 1' --define 'comp_env CC=gcc CXX=g++ F77=gfortran F90=gfortran' --define 'auto_req 0' --define 'mpi_selector /usr/bin/mpi-selector' --define '_prefix /usr/mpi/gcc/mvapich2-r3510' /usr/local/src/OFED-1.5-20091019-0811/SRPMS/mvapich2-r3510-1.src.rpm
> warning: user jperkins does not exist - using root
> warning: group jperkins does not exist - using root
> error: unpacking of archive failed on file /var/tmp/OFED_topdir/SOURCES/mvapich2-1.4.0rc2.tgz;4adc9869: cpio: MD5 sum mismatch
> error: /usr/local/src/OFED-1.5-20091019-0811/SRPMS/mvapich2-r3510-1.src.rpm cannot be installed
> Installing /usr/local/src/OFED-1.5-20091019-0811/SRPMS/mvapich2-r3510-1.src.rpm
> [r...@hpc-cn3 OFED-1.5-20091019-0811]#

Thanks for the quick feedback.  I packaged this using a Fedora machine
which used a newer version of rpmlib.  I'll upload a new srpm that uses
the more compatible md5 hash algorithm shortly.

FYI, https://bugzilla.redhat.com/show_bug.cgi?id=490613.

-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo



Re: [ewg] mvapich2 srpm uploaded

2009-10-19 Thread Jonathan Perkins
On Mon, Oct 19, 2009 at 02:28:07PM -0400, Jonathan Perkins wrote:
> Thanks for the quick feedback.  I packaged this using a Fedora machine
> which used a newer version of rpmlib.  I'll upload a new srpm that uses
> the more compatible md5 hash algorithm shortly.
>
> FYI, https://bugzilla.redhat.com/show_bug.cgi?id=490613.

The new srpm is uploaded.

-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo



RE: [ewg] Re: Possible process deadlock in RMPP flow

2009-10-19 Thread Sean Hefty
 Thanks Or. This one is already in OFED 1.4.2 but apparently this is a
 different problem. Once I have information whether the patch Roland
 posted fixed it I will update the list.
 Eli, did you find a commit that fixes the problem you reported on?

 Or.


Not yet :-(

I can't find anything off in the code for this.  It's odd, since
unregister_mad_agent() does:

	flush_workqueue(port_priv->wq);
	ib_cancel_rmpp_recvs(mad_agent_priv);

and ib_cancel_rmpp_recvs() does:

	spin_lock_irqsave(&agent->lock, flags);
	list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) {
		cancel_delayed_work(&rmpp_recv->timeout_work);
		cancel_delayed_work(&rmpp_recv->cleanup_work);
	}
	spin_unlock_irqrestore(&agent->lock, flags);

	flush_workqueue(agent->qp_info->port_priv->wq);

which basically just flushes the same work queue.

I haven't been able to reproduce the problem, but I'm running the latest kernel
- not sure that matters in this case.  Does ibnetdiscover just hang forever at
the end of the test when this occurs?  Is there any more information available?

- Sean 



[ewg] [PATCH] link-local address fix for rdma_resolve_addr

2009-10-19 Thread David J. Wilder
Sean, Roland

Here is the updated patch that Jason and I discussed last week.

rdma_resolve_addr() returns an error when attempting to resolve an IPv6
link-local address.  This patch fixes the handling of link-local addresses.

The patch was tested using rping run as such:

Link-local with scope:
# /usr/bin/rping -c -a remote-system-link-local%ib0

Link-local w/out scope: (expect failure)
# /usr/bin/rping -c -a remote-system-link-local
rdma_resolve_addr error -1

Own interface link local:
# /usr/bin/rping -c -a remote-system-link-local%ib0

Other ipv6 address:
# /usr/bin/rping -c -a 2001:db8:1234::2

(server side started with rping -s -P -v -a ::0)

Tested against ofed build OFED-1.5-20091019-0811 and kernel 2.6.30.
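
The scope handling that the patch adds to addr6_resolve_remote() can be
sketched in userspace as follows. This is only an illustration of the
decision logic, not kernel code: pick_oif() and make_addr() are hypothetical
helper names, and IN6_IS_ADDR_LINKLOCAL() from <netinet/in.h> stands in for
the kernel's ipv6_addr_type() test.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical userspace mirror of the checks in the patch.
 * Returns the interface index to route through, 0 when neither address
 * is link-local, or -EINVAL when a link-local address has no usable
 * scope or the two scopes disagree.
 */
static int pick_oif(const struct sockaddr_in6 *src,
                    const struct sockaddr_in6 *dst)
{
    int oif = 0;

    /* A link-local source must name its interface. */
    if (IN6_IS_ADDR_LINKLOCAL(&src->sin6_addr)) {
        if (!src->sin6_scope_id)
            return -EINVAL;
        oif = (int)src->sin6_scope_id;
    }

    /* A link-local destination may inherit the source scope,
     * but must not contradict it. */
    if (IN6_IS_ADDR_LINKLOCAL(&dst->sin6_addr)) {
        if (dst->sin6_scope_id) {
            if (oif && oif != (int)dst->sin6_scope_id)
                return -EINVAL;
            oif = (int)dst->sin6_scope_id;
        }
        if (!oif)
            return -EINVAL;
    }
    return oif;
}

/* Hypothetical convenience constructor for the examples below. */
static struct sockaddr_in6 make_addr(const char *text, unsigned int scope_id)
{
    struct sockaddr_in6 sa;

    memset(&sa, 0, sizeof(sa));
    sa.sin6_family = AF_INET6;
    inet_pton(AF_INET6, text, &sa.sin6_addr);
    sa.sin6_scope_id = scope_id;
    return sa;
}
```

With an unspecified source (::), a destination of fe80::1 with scope 3
yields 3, the same destination with scope 0 yields -EINVAL (the "w/out
scope" rping case above), and a global destination such as 2001:db8:1234::2
yields 0 because no interface has to be forced.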

Signed-off-by: David Wilder dwil...@us.ibm.com

---
 drivers/infiniband/core/addr.c |   25 ++++++++++++++++++++++---
 1 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index bd07803..3442256 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -278,6 +278,21 @@ static int addr6_resolve_remote(struct sockaddr_in6 *src_in,
 	fl.nl_u.ip6_u.daddr = dst_in->sin6_addr;
 	fl.nl_u.ip6_u.saddr = src_in->sin6_addr;
 
+	if (ipv6_addr_type(&src_in->sin6_addr) & IPV6_ADDR_LINKLOCAL) {
+		if (!src_in->sin6_scope_id)
+			return -EINVAL;
+		fl.oif = src_in->sin6_scope_id;
+	}
+	if (ipv6_addr_type(&dst_in->sin6_addr) & IPV6_ADDR_LINKLOCAL) {
+		if (dst_in->sin6_scope_id) {
+			if (fl.oif && fl.oif != dst_in->sin6_scope_id)
+				return -EINVAL;
+			fl.oif = dst_in->sin6_scope_id;
+		}
+		if (!fl.oif)
+			return -EINVAL;
+	}
+
 	dst = ip6_route_output(&init_net, NULL, &fl);
 	if (!dst)
 		return ret;
@@ -390,14 +405,16 @@ static int addr_resolve_local(struct sockaddr *src_in,
 	case AF_INET6:
 	{
 		struct in6_addr *a;
+		int found = 0;
 
 		for_each_netdev(&init_net, dev)
 			if (ipv6_chk_addr(&init_net,
 					  &((struct sockaddr_in6 *) dst_in)->sin6_addr,
-					  dev, 1))
+					  dev, 1)) {
+				found = 1;
 				break;
-
-		if (!dev)
+			}
+		if (!found)
 			return -EADDRNOTAVAIL;
 
 		a = &((struct sockaddr_in6 *) src_in)->sin6_addr;
@@ -406,6 +423,8 @@ static int addr_resolve_local(struct sockaddr *src_in,
 			src_in->sa_family = dst_in->sa_family;
 			((struct sockaddr_in6 *) src_in)->sin6_addr =
 				((struct sockaddr_in6 *) dst_in)->sin6_addr;
+			((struct sockaddr_in6 *) src_in)->sin6_scope_id =
+				((struct sockaddr_in6 *) dst_in)->sin6_scope_id;
 			ret = rdma_copy_addr(addr, dev, dev->dev_addr);
 		} else if (ipv6_addr_loopback(a)) {
 			ret = rdma_translate_ip(dst_in, addr);
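
The second hunk swaps the "if (!dev)" test for a found flag. The patch
description does not say why, but presumably it is because a
list_for_each_entry() style cursor is never NULL once the loop runs to
completion: it ends up pointing at the container of the list head. A minimal
userspace sketch of that behaviour, assuming a simplified re-implementation
of the kernel list primitives (struct netdev_stub and find_dev() are
hypothetical names used only here):

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel's circular doubly linked list. */
struct list_head {
    struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct net_device. */
struct netdev_stub {
    const char *name;
    struct list_head node;
};

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/*
 * Walks the list the way list_for_each_entry() does and reports whether
 * a device called `name` exists. The final cursor value is handed back
 * through `cursor_out` to show that it is not NULL after a full,
 * unsuccessful walk, which is why an explicit flag is needed.
 */
static int find_dev(struct list_head *head, const char *name,
                    struct netdev_stub **cursor_out)
{
    struct netdev_stub *dev;
    int found = 0;

    for (dev = container_of(head->next, struct netdev_stub, node);
         &dev->node != head;
         dev = container_of(dev->node.next, struct netdev_stub, node)) {
        if (strcmp(dev->name, name) == 0) {
            found = 1;
            break;
        }
    }
    if (cursor_out)
        *cursor_out = dev;
    return found;
}
```

Searching a list that holds "lo" and "ib0" for "eth9" returns 0 while the
cursor is left non-NULL, so a "!dev" test after the loop would not have
detected the miss.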




[ewg] Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-19 Thread Jason Gunthorpe
On Mon, Oct 19, 2009 at 03:47:09PM -0700, David J. Wilder wrote:

> +++ b/drivers/infiniband/core/addr.c
> @@ -278,6 +278,21 @@ static int addr6_resolve_remote(struct sockaddr_in6 *src_in,
>  	fl.nl_u.ip6_u.daddr = dst_in->sin6_addr;
>  	fl.nl_u.ip6_u.saddr = src_in->sin6_addr;
>
> +	if (ipv6_addr_type(&src_in->sin6_addr) & IPV6_ADDR_LINKLOCAL) {
> +		if (!src_in->sin6_scope_id)
> +			return -EINVAL;
> +		fl.oif = src_in->sin6_scope_id;
> +	}

Seeing it all together like this makes it clear this test needs to move
up the call chain and test the sockaddr passed from userspace, not
the one created by addr_resolve_local. Probably somewhere along the
rdma_resolve_addr -> cma_bind_addr -> rdma_bind_addr ->
rdma_translate_ip path. Maybe rdma_translate_ip should use and check
the scope as a temporary hack?

BTW, while researching the above comment, I'm not certain your last
patch is at all correct:

commit 85f20b39fd44310a163a9b33708fea57f08a4e40
RDMA/addr: Fix resolution of local IPv6 addresses

This patch allows a local IPv6 address to be resolved by rdma_cm.

To reproduce the problem:

 $ rping -s -v -a ::0 &
 $ rping -c -v -a <IPv6 address local to this system>
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -393,7 +393,7 @@ static int addr_resolve_local(struct sockaddr *src_in,
 
 		for_each_netdev(&init_net, dev)
 			if (ipv6_chk_addr(&init_net,
-					  &((struct sockaddr_in6 *) addr)->sin6_addr,
+					  &((struct sockaddr_in6 *) dst_in)->sin6_addr,
 					  dev, 1))
 				break;

I can believe it fixes the case you describe (ie loopback) but
matching the *dest* IP against the local interface's IP list cannot
possibly be right.

The primary problem is that for_each_netdev/ipv6_chk_addr is NOT the
same as ip_dev_find. ip_dev_find is a routing lookup, ipv6_chk_addr
compares the local address list. Not at all the same. I don't see a
route lookup helper for ipv6, so you have to code full flowi lookup.
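
To make the distinction concrete, here is a userspace analogue of the
address-list comparison, a rough sketch only. It uses getifaddrs() to ask
"is this exact IPv6 address assigned to one of my interfaces?", which is the
kind of question ipv6_chk_addr answers, and it never consults the routing
tables. The function name in6_is_assigned_locally() is hypothetical.

```c
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Returns 1 if `text` is an IPv6 address assigned to a local interface,
 * 0 if it is not, and -1 if `text` does not parse or the interface list
 * cannot be read. This is an exact-match compare against the address
 * list; it says nothing about how a packet would be routed.
 */
static int in6_is_assigned_locally(const char *text)
{
    struct in6_addr want;
    struct ifaddrs *list, *ifa;
    int found = 0;

    if (inet_pton(AF_INET6, text, &want) != 1)
        return -1;
    if (getifaddrs(&list) != 0)
        return -1;

    for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        const struct sockaddr_in6 *sin6;

        /* Entries without an address, or with another family, are skipped. */
        if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET6)
            continue;
        sin6 = (const struct sockaddr_in6 *)ifa->ifa_addr;
        if (memcmp(&sin6->sin6_addr, &want, sizeof(want)) == 0) {
            found = 1;
            break;
        }
    }
    freeifaddrs(list);
    return found;
}
```

A routing-based check would instead ask the stack which route it would pick
for the destination, which is the approach argued for here.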

With your change I expect ipv6 is 100% broken now for non loop cases?

Regards,
Jason


[ewg] RE: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-19 Thread Sean Hefty
> @@ -393,7 +393,7 @@ static int addr_resolve_local(struct sockaddr *src_in,
>
> 		for_each_netdev(&init_net, dev)
> 			if (ipv6_chk_addr(&init_net,
> -					  &((struct sockaddr_in6 *) addr)->sin6_addr,
> +					  &((struct sockaddr_in6 *) dst_in)->sin6_addr,
> 					  dev, 1))
> 				break;
>
> I can believe it fixes the case you describe (ie loopback) but
> matching the *dest* IP against the local interface's IP list cannot
> possibly be right.

The intent is to see if the destination address is local.  A source address may
not be given.

- Sean



[ewg] Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-19 Thread Jason Gunthorpe
On Mon, Oct 19, 2009 at 04:47:59PM -0700, Sean Hefty wrote:
> > @@ -393,7 +393,7 @@ static int addr_resolve_local(struct sockaddr *src_in,
> >
> > 		for_each_netdev(&init_net, dev)
> > 			if (ipv6_chk_addr(&init_net,
> > -					  &((struct sockaddr_in6 *) addr)->sin6_addr,
> > +					  &((struct sockaddr_in6 *) dst_in)->sin6_addr,
> > 					  dev, 1))
> > 				break;
> >
> > I can believe it fixes the case you describe (ie loopback) but
> > matching the *dest* IP against the local interface's IP list cannot
> > possibly be right.
>
> The intent is to see if the destination address is local.  A source
> address may not be given.

Well, that makes more sense, but it is still pretty strange to match the
IP list like that; the proper thing is to query RT6_TABLE_LOCAL, like
the IPv4 case does.

Anyhow, couldn't the whole addr_resolve_local routine be replaced with
something like this in addr_resolve_remote:

	if (rt->idev == init_net->loopback_dev)
		rdma_translate_ip(rt->rt_src, dev_addr, NULL);

for IPv4 and similar for IPv6?

That does query the proper RT_TABLEs to determine if the IP is local
and then we get the searching and ip_dev_find only for the case where
the address is definitely looped back. Much closer to how the IP stack
works normally.

Jason