[ewg] ofa_1_5_kernel 20100310-0200 daily build status

2010-03-10 Thread Vladimir Sokolovsky (Mellanox)
This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git
git_branch: ofed_kernel_1_5

Common build parameters: 

Passed:
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.26
Passed on i686 with linux-2.6.24
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.27
Passed on x86_64 with linux-2.6.16.60-0.54.5-smp
Passed on x86_64 with linux-2.6.16.60-0.21-smp
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-128.el5
Passed on x86_64 with linux-2.6.18-186.el5
Passed on x86_64 with linux-2.6.18-164.el5
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18-93.el5
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.27
Passed on x86_64 with linux-2.6.27.19-5-smp
Passed on x86_64 with linux-2.6.9-67.ELsmp
Passed on x86_64 with linux-2.6.9-78.ELsmp
Passed on x86_64 with linux-2.6.9-89.ELsmp
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.26
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.25
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19

Failed:
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] Which IPv6 multicast group to be used for mckey ?

2010-03-10 Thread Vivek Satpute
Hi,

Could someone please help me with this issue?

Thanks in advance,
Vivek

On Sat, Mar 6, 2010 at 10:30 PM, Vivek Satpute vivekonlin...@gmail.com wrote:

 Hi,

 I searched Google for this and found the following link:
 http://www.mail-archive.com/linux-r...@vger.kernel.org/msg00953.html

 As per the above link:

 RDMA CM treats AF_INET6 addresses that are either 0 or prefixed with
 FF1x:A01B::/32 as MGIDs

 1) So, does it mean that mckey works with multicast addresses starting with
 FF1x:A01B only ?

 2) Again, I did some testing and found that if I use the multicast address
 *FF12:A01B:0:0:0:0:0:A* with mckey, the multicast join fails with the
 following error:

 #mckey -M FF12:A01B:0:0:0:0:0:A -b fe80::202:c903:0:d1e1
 mckey: starting server
 mckey: joining
 mckey: event: RDMA_CM_EVENT_MULTICAST_ERROR, error: -22
 test complete
 return status 0

 mckey fails if the X digit in FF1X:A01B: is 2. For any value of X
 other than 2, mckey works fine. Can anyone please tell me the reason for
 this?

 Thanks in advance,
 Vivek
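
The rule quoted above (an AF_INET6 address is treated as an MGID if it is 0 or starts with FF1x:A01B) can be sketched as a small user-space check. This is only an illustration of that one sentence, not code taken from mckey or librdmacm; the function name looks_like_mgid is hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of mckey or librdmacm.
 * Returns 1 if the textual IPv6 address is all-zero or starts with
 * FF1x:A01B, 0 if it does not, and -1 if it cannot be parsed.
 */
int looks_like_mgid(const char *text)
{
    static const uint8_t zero[16];   /* zero-initialised: the "0" address */
    struct in6_addr a;

    assert(text != NULL);
    if (inet_pton(AF_INET6, text, &a) != 1)
        return -1;
    if (memcmp(a.s6_addr, zero, sizeof zero) == 0)
        return 1;
    return a.s6_addr[0] == 0xff &&
           (a.s6_addr[1] & 0xf0) == 0x10 &&   /* high nibble 1, any x */
           a.s6_addr[2] == 0xa0 &&
           a.s6_addr[3] == 0x1b;
}
```

With this check, FF12:A01B:0:0:0:0:0:A falls under the quoted rule, while FF10:0:0:0:0:0:0:B and FF10:0:0:0:0:0:0:C do not. It does not explain the -22 error reported above.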

 On Sat, Mar 6, 2010 at 9:34 PM, Vivek Satpute vivekonlin...@gmail.com wrote:

 Hi,

 I am new to InfiniBand technology, so I do not have much exposure to
 it.

 I have installed OFED-1.5 on my machine. I was trying to run the mckey
 application
 with the following *two different multicast groups*.

 mckey -M *FF10:0:0:0:0:0:0:B* -b 10.10.10.1 (receiver)
 mckey -M *FF10:0:0:0:0:0:0:C* -b 10.10.10.2 -s (sender)

 The two multicast groups above are different, yet data sent by the sender
 is still received by the receiver on another machine. Why does this happen?

 Is there a special format for IPv6 multicast groups on InfiniBand?


 Thanks in advance,
 Vivek.




Re: [ewg] problem with ipoib_mcast_fix_ip_ib_mc_map_to_2_6_24.patch

2010-03-10 Thread Jason Gunthorpe
On Wed, Mar 10, 2010 at 02:17:44PM +0200, Eli Cohen wrote:
 This text is from https://bugs.openfabrics.org/show_bug.cgi?id=1926
 
 The source of this problem is this commit by Jason Gunthorpe:
 c12481586c4ba09cb88dc2090c67fdce7c856cde
 
 This commit fixes a deficiency in ip_ib_mc_map() by changing the
 multicast address in dev->mc_list. However, doing so causes a
 subsequent call to dev_mc_delete() to fail to decrease dmi->dmi_users
 and leads to the appearance of the following messages:
 Feb 11 10:27:17 sw226 kernel: dev_mc_discard: multicast leakage! dmi_users=1
 
 Jason,
 can you look at this?

Ah, this was just a suggested approach, I didn't really write it,
never compiled it..

I guess the problem is that the delete operation is a key lookup still
based on the broken ip_ib_mc_map and by changing dmi_addr we miss it.

And, I suppose this points to a larger problem than module unload, all
group unsubscribe is probably broken.

So you can't change the dmi_addr, which means that the ip maddr output
will be wrong on systems running this patch set.

I guess, the best fix is to revert c12481586c4ba09cb88dc2090c67fdce7c856cde,
alter ipoib_mcast_addr_is_valid to not compare bytes 5, 8 and 9, and
fixup the 'Add in the P_Key' hunk to also fixup the scope byte too.
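
A relaxed validity check along those lines might look like the user-space sketch below. It is an illustration only: the function name is hypothetical, and the choice of which remaining bytes to compare (0 through 4, plus 6 and 7) is an assumption about what ipoib_mcast_addr_is_valid checks, not a copy of the kernel function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INFINIBAND_ALEN 20  /* length of an IPoIB hardware address */

/*
 * Hypothetical sketch: compare a multicast hardware address against the
 * broadcast address while skipping bytes 5, 8 and 9.
 * Assumption: the bytes still worth comparing are 0..4 and 6..7.
 */
int mcast_addr_is_valid_relaxed(const uint8_t *addr, unsigned int addrlen,
                                const uint8_t *broadcast)
{
    assert(addr != NULL && broadcast != NULL);
    if (addrlen != INFINIBAND_ALEN)
        return 0;
    if (memcmp(addr, broadcast, 5) != 0)          /* bytes 0..4 */
        return 0;
    if (memcmp(addr + 6, broadcast + 6, 2) != 0)  /* bytes 6..7 */
        return 0;
    return 1;                                     /* 5, 8, 9 ignored */
}
```

An address that differs from the broadcast address only in bytes 5, 8 and 9 passes this check, so entries left untouched in dmi_addr would still be accepted.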

Jason


Re: [ewg] problem with ipoib_mcast_fix_ip_ib_mc_map_to_2_6_24.patch

2010-03-10 Thread Eli Cohen
On Wed, Mar 10, 2010 at 10:42:22AM -0700, Jason Gunthorpe wrote:
 
 I guess, the best fix is to revert c12481586c4ba09cb88dc2090c67fdce7c856cde,
 alter ipoib_mcast_addr_is_valid to not compare bytes 5, 8 and 9,


 and
 fixup the 'Add in the P_Key' hunk to also fixup the scope byte too.

Can you elaborate on this? 


Re: [ewg] problem with ipoib_mcast_fix_ip_ib_mc_map_to_2_6_24.patch

2010-03-10 Thread Jason Gunthorpe
On Wed, Mar 10, 2010 at 08:57:17PM +0200, Eli Cohen wrote:
 On Wed, Mar 10, 2010 at 10:42:22AM -0700, Jason Gunthorpe wrote:
  
  I guess, the best fix is to revert c12481586c4ba09cb88dc2090c67fdce7c856cde,
  alter ipoib_mcast_addr_is_valid to not compare bytes 5, 8 and 9,
 
 
  and
  fixup the 'Add in the P_Key' hunk to also fixup the scope byte too.
 
 Can you elaborate on this? 

+ 
++  /* Work around broken ip_ib_mc_map */
++  if (mclist->dmi_addrlen == INFINIBAND_ALEN) {
++          mclist->dmi_addr[5] = 0x10 | (dev->broadcast[5] & 0xF);
++          mclist->dmi_addr[8] = dev->broadcast[8];
++          mclist->dmi_addr[9] = dev->broadcast[9];
++  }

5 in the dmi_addr is the scope byte. The old patch:

-+  /* Add in the P_Key */
-+  mgid.raw[4] = (priv->pkey >> 8) & 0xff;
-+  mgid.raw[5] = priv->pkey & 0xff;
-+

Only includes the dmi_addr bytes 8 and 9. This is also a small bug.

The above should read something like:

mgid.raw[1] = 0x10 | (dev->broadcast[5] & 0xF);
mgid.raw[4] = dev->broadcast[8];
mgid.raw[5] = dev->broadcast[9];

Jason


Re: [ewg] possible bug in rds?

2010-03-10 Thread Andy Grover
Eli Cohen wrote:
 Hi Andy,
 
 in our regression tests we've encountered a kernel oops with the
 following stack dump:

<snip>

 Examining the dump I see the failure results in trying to call
 hlist_del() twice on the same pointer (I can see that by the poisoned
 pointer RCX: 00200200).
 Could it be that rds will call rdma_destroy_id() which will result in
 the described behaviour?

I've opened a bug:

https://bugs.openfabrics.org/show_bug.cgi?id=1983

Did this just start happening?  What is the test doing when this
occurred? Please add to the bug if possible, and I'll try to diagnose
further.

Thanks -- Regards -- Andy



Re: [ewg] possible bug in rds?

2010-03-10 Thread Eli Cohen
On Wed, Mar 10, 2010 at 03:51:36PM -0800, Andy Grover wrote:
 
 I've opened a bug:
 
 https://bugs.openfabrics.org/show_bug.cgi?id=1983
 
 Did this just start happening?  What is the test doing when this
 occurred? Please add to the bug if possible, and I'll try to diagnose
 further.
 

Follow up response in bugzilla.


Re: [ewg] problem with ipoib_mcast_fix_ip_ib_mc_map_to_2_6_24.patch

2010-03-10 Thread Tziporet Koren
On 3/10/2010 9:05 PM, Jason Gunthorpe wrote:
 On Wed, Mar 10, 2010 at 08:57:17PM +0200, Eli Cohen wrote:

 On Wed, Mar 10, 2010 at 10:42:22AM -0700, Jason Gunthorpe wrote:
  
 I guess, the best fix is to revert c12481586c4ba09cb88dc2090c67fdce7c856cde,
 alter ipoib_mcast_addr_is_valid to not compare bytes 5, 8 and 9,


  
 and
 fixup the 'Add in the P_Key' hunk to also fixup the scope byte too.

 Can you elaborate on this?
  
 +
 ++  /* Work around broken ip_ib_mc_map */
 ++  if (mclist->dmi_addrlen == INFINIBAND_ALEN) {
 ++          mclist->dmi_addr[5] = 0x10 | (dev->broadcast[5] & 0xF);
 ++          mclist->dmi_addr[8] = dev->broadcast[8];
 ++          mclist->dmi_addr[9] = dev->broadcast[9];
 ++  }

 5 in the dmi_addr is the scope byte. The old patch:

 -+  /* Add in the P_Key */
 -+  mgid.raw[4] = (priv->pkey >> 8) & 0xff;
 -+  mgid.raw[5] = priv->pkey & 0xff;
 -+

 Only includes the dmi_addr bytes 8 and 9. This is also a small bug.

 The above should read something like:

 mgid.raw[1] = 0x10 | (dev->broadcast[5] & 0xF);
 mgid.raw[4] = dev->broadcast[8];
 mgid.raw[5] = dev->broadcast[9];

 Jason

Eli
Can you take care of it now, or do you need the complete patch from Jason?

Vlad
Please revert the patch that is causing the problem.

Tziporet