[ewg] ofa_1_5_kernel 20100310-0200 daily build status
This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git
git_branch: ofed_kernel_1_5

Common build parameters:

Passed:
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.26
Passed on i686 with linux-2.6.24
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.27
Passed on x86_64 with linux-2.6.16.60-0.54.5-smp
Passed on x86_64 with linux-2.6.16.60-0.21-smp
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-128.el5
Passed on x86_64 with linux-2.6.18-186.el5
Passed on x86_64 with linux-2.6.18-164.el5
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18-93.el5
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.27
Passed on x86_64 with linux-2.6.27.19-5-smp
Passed on x86_64 with linux-2.6.9-67.ELsmp
Passed on x86_64 with linux-2.6.9-78.ELsmp
Passed on x86_64 with linux-2.6.9-89.ELsmp
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.26
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.25
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19

Failed:

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] Which IPv6 multicast group to be used for mckey ?
Hi,

Could someone please help me with this issue?

Thanks in advance,
Vivek

On Sat, Mar 6, 2010 at 10:30 PM, Vivek Satpute vivekonlin...@gmail.com wrote:

> Hi,
>
> I googled this and found the following link:
> http://www.mail-archive.com/linux-r...@vger.kernel.org/msg00953.html
>
> As per the above link:
>
>     RDMA CM treats AF_INET6 addresses that are either 0 or prefixed with FF1x:A01B::/32 as MGIDs
>
> 1) So, does it mean that mckey works only with multicast addresses starting with FF1x:A01B?
>
> 2) Again, I did some testing and found that if I use the multicast address FF12:A01B:0:0:0:0:0:A with mckey, the multicast join fails with the following error:
>
>     # mckey -M FF12:A01B:0:0:0:0:0:A -b fe80::202:c903:0:d1e1
>     mckey: starting server
>     mckey: joining
>     mckey: event: RDMA_CM_EVENT_MULTICAST_ERROR, error: -22
>     test complete
>     return status 0
>
> mckey fails if the X digit in FF1X:A01B: is 2. For any value of X other than 2, mckey works fine.
> Can anyone please tell me the reason for this?
>
> Thanks in advance,
> Vivek
>
> On Sat, Mar 6, 2010 at 9:34 PM, Vivek Satpute vivekonlin...@gmail.com wrote:
>
>> Hi,
>>
>> I am new to InfiniBand technology, so I do not have much exposure to it. I have installed OFED-1.5 on my machine. I was trying to run the mckey application with the following two different multicast groups:
>>
>>     mckey -M FF10:0:0:0:0:0:0:B -b 10.10.10.1      (receiver)
>>     mckey -M FF10:0:0:0:0:0:0:C -b 10.10.10.2 -s   (sender)
>>
>> The two multicast groups above are different, yet data sent by the sender is still received by the receiver on the other machine. Why does this happen? Is there any special format of IPv6 multicast groups for InfiniBand?
>>
>> Thanks in advance,
>> Vivek.
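The rule quoted from the linked message ("either 0 or prefixed with FF1x:A01B::/32") can be checked mechanically. The following is a minimal sketch in userspace C, assuming only that quoted sentence; the function name `looks_like_rdma_cm_mgid` is hypothetical and is not part of librdmacm or the kernel, and the actual RDMA CM logic may differ.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: test an IPv6 address string against the rule quoted
 * in the thread, "either 0 or prefixed with FF1x:A01B::/32".
 * Returns 1 on a match, 0 on no match, -1 if the string is not valid IPv6.
 */
int looks_like_rdma_cm_mgid(const char *text)
{
    struct in6_addr addr;

    if (inet_pton(AF_INET6, text, &addr) != 1)
        return -1;

    /* The all-zero address is accepted by the quoted rule. */
    if (IN6_IS_ADDR_UNSPECIFIED(&addr))
        return 1;

    const unsigned char *b = addr.s6_addr;

    return b[0] == 0xFF &&
           (b[1] & 0xF0) == 0x10 &&   /* high nibble is 1; low nibble "x" is free */
           b[2] == 0xA0 &&
           b[3] == 0x1B;
}
```

By this check, FF12:A01B:0:0:0:0:0:A carries the prefix, while FF10:0:0:0:0:0:0:B and FF10:0:0:0:0:0:0:C do not. Whether that explains the behaviour reported above is not confirmed anywhere in the thread.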
Re: [ewg] problem with ipoib_mcast_fix_ip_ib_mc_map_to_2_6_24.patch
On Wed, Mar 10, 2010 at 02:17:44PM +0200, Eli Cohen wrote:
> This text is from https://bugs.openfabrics.org/show_bug.cgi?id=1926
>
> The source of this problem is this commit by Jason Gunthorpe:
> c12481586c4ba09cb88dc2090c67fdce7c856cde
>
> This commit fixes a deficiency in ip_ib_mc_map() by changing the multicast address in dev->mc_list. However, doing so causes a subsequent call to dev_mc_delete() to fail to decrease dmi->dmi_users and leads to the appearance of the following messages:
>
>     Feb 11 10:27:17 sw226 kernel: dev_mc_discard: multicast leakage! dmi_users=1
>
> Jason, can you look at this?

Ah, this was just a suggested approach, I didn't really write it, never compiled it..

I guess the problem is that the delete operation is a key lookup still based on the broken ip_ib_mc_map and by changing dmi_addr we miss it.

And, I suppose this points to a larger problem than module unload, all group unsubscribe is probably broken.

So you can't change the dmi_addr, which means that the ip maddr output will be wrong on systems running this patch set.

I guess, the best fix is to revert c12481586c4ba09cb88dc2090c67fdce7c856cde, alter ipoib_mcast_addr_is_valid to not compare bytes 5, 8 and 9, and fixup the 'Add in the P_Key' hunk to also fixup the scope byte too.

Jason
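The failure mode Jason describes (a delete that looks entries up by address, run after the stored address has been rewritten) can be reproduced with a toy model. This is a sketch only: `mc_table`, `mc_add`, `mc_delete` and `mc_leaked` are hypothetical names, and the structure is a simplification of a reference-counted address list, not the kernel's dev_mc_* code.

```c
#include <stddef.h>
#include <string.h>

#define ADDR_LEN    20   /* length of an IPoIB hardware address */
#define MAX_ENTRIES 8

struct mc_entry {
    unsigned char addr[ADDR_LEN]; /* lookup key */
    int users;                    /* reference count, in the spirit of dmi_users */
    int in_use;
};

struct mc_table {
    struct mc_entry e[MAX_ENTRIES];
};

/* Find an entry whose stored address matches addr exactly. */
struct mc_entry *mc_find(struct mc_table *t, const unsigned char *addr)
{
    for (size_t i = 0; i < MAX_ENTRIES; i++)
        if (t->e[i].in_use && memcmp(t->e[i].addr, addr, ADDR_LEN) == 0)
            return &t->e[i];
    return NULL;
}

/* Take a reference on addr, creating the entry if needed. */
int mc_add(struct mc_table *t, const unsigned char *addr)
{
    struct mc_entry *m = mc_find(t, addr);

    if (m) {
        m->users++;
        return 0;
    }
    for (size_t i = 0; i < MAX_ENTRIES; i++) {
        if (!t->e[i].in_use) {
            memcpy(t->e[i].addr, addr, ADDR_LEN);
            t->e[i].users = 1;
            t->e[i].in_use = 1;
            return 0;
        }
    }
    return -1; /* table full */
}

/* Drop a reference on addr; fails if no entry matches the key. */
int mc_delete(struct mc_table *t, const unsigned char *addr)
{
    struct mc_entry *m = mc_find(t, addr);

    if (!m)
        return -1;
    if (--m->users == 0)
        m->in_use = 0;
    return 0;
}

/* Number of entries still holding references. */
int mc_leaked(const struct mc_table *t)
{
    int n = 0;

    for (size_t i = 0; i < MAX_ENTRIES; i++)
        n += t->e[i].in_use;
    return n;
}
```

If a caller adds a key, overwrites byte 5 of the stored `addr` in place, and then calls `mc_delete()` with the original key, the lookup misses and the entry keeps its reference, which is the kind of leak the "multicast leakage! dmi_users=1" message reports.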
Re: [ewg] problem with ipoib_mcast_fix_ip_ib_mc_map_to_2_6_24.patch
On Wed, Mar 10, 2010 at 10:42:22AM -0700, Jason Gunthorpe wrote:
> I guess, the best fix is to revert c12481586c4ba09cb88dc2090c67fdce7c856cde, alter ipoib_mcast_addr_is_valid to not compare bytes 5, 8 and 9, and fixup the 'Add in the P_Key' hunk to also fixup the scope byte too.

Can you elaborate on this?
Re: [ewg] problem with ipoib_mcast_fix_ip_ib_mc_map_to_2_6_24.patch
On Wed, Mar 10, 2010 at 08:57:17PM +0200, Eli Cohen wrote:
> On Wed, Mar 10, 2010 at 10:42:22AM -0700, Jason Gunthorpe wrote:
>> I guess, the best fix is to revert c12481586c4ba09cb88dc2090c67fdce7c856cde, alter ipoib_mcast_addr_is_valid to not compare bytes 5, 8 and 9, and fixup the 'Add in the P_Key' hunk to also fixup the scope byte too.
>
> Can you elaborate on this?

    +
    ++	/* Work around broken ip_ib_mc_map */
    ++	if (mclist->dmi_addrlen == INFINIBAND_ALEN) {
    ++		mclist->dmi_addr[5] = 0x10 | (dev->broadcast[5] & 0xF);
    ++		mclist->dmi_addr[8] = dev->broadcast[8];
    ++		mclist->dmi_addr[9] = dev->broadcast[9];
    ++	}

5 in the dmi_addr is the scope byte.

The old patch:

    -+	/* Add in the P_Key */
    -+	mgid.raw[4] = (priv->pkey >> 8) & 0xff;
    -+	mgid.raw[5] = priv->pkey & 0xff;
    -+

Only includes the dmi_addr bytes 8 and 9. This is also a small bug. The above should read something like:

    mgid.raw[1] = 0x10 | (dev->broadcast[5] & 0xF);
    mgid.raw[4] = dev->broadcast[8];
    mgid.raw[5] = dev->broadcast[9];

Jason
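The three assignments Jason suggests can be tried outside the kernel. The sketch below assumes what the message implies: the 20-byte hardware address holds 4 leading bytes followed by the 16-byte GID, so byte 5 of the address corresponds to mgid.raw[1] and bytes 8 and 9 to mgid.raw[4] and mgid.raw[5]. The function name `mgid_fixup_from_broadcast` and the plain byte arrays are hypothetical stand-ins for the kernel structures.

```c
#include <stdint.h>

#define IPOIB_HW_ALEN 20  /* 4 leading bytes followed by a 16-byte GID */
#define MGID_LEN      16

/*
 * Hypothetical userspace stand-in for the suggested hunk: copy the scope
 * nibble and the two P_Key bytes from the broadcast hardware address into
 * the MGID, leaving every other MGID byte untouched.
 */
void mgid_fixup_from_broadcast(uint8_t mgid[MGID_LEN],
                               const uint8_t broadcast[IPOIB_HW_ALEN])
{
    mgid[1] = 0x10 | (broadcast[5] & 0x0F); /* keep flags nibble 1, take scope */
    mgid[4] = broadcast[8];                 /* P_Key high byte */
    mgid[5] = broadcast[9];                 /* P_Key low byte */
}
```

For example, with an illustrative broadcast address whose bytes 5, 8 and 9 are 0x15, 0x80 and 0x01, an MGID that starts as ff:12:40:1b:ff:ff would become ff:15:40:1b:80:01 in its first six bytes.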
Re: [ewg] possible bug in rds?
Eli Cohen wrote:
> Hi Andy,
> in our regression tests we've encountered a kernel oops with the following stack dump:
>
> <snip>
>
> Examining the dump I see the failure results in trying to call hlist_del() twice on the same pointer (I can see that by the poisoned pointer RCX: 00200200). Could it be that rds will call rdma_destroy_id() which will result in the described behaviour?
>
> I've opened a bug:
> https://bugs.openfabrics.org/show_bug.cgi?id=1983

Did this just start happening? What is the test doing when this occurred? Please add to the bug if possible, and I'll try to diagnose further.

Thanks -- Regards -- Andy
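Eli's reading of RCX relies on how the kernel poisons a node when it is unlinked from an hlist. The sketch below is a userspace model written after the kernel's hlist helpers, assuming the poison constants of kernels from that period (0x00100100 and 0x00200200, with no pointer delta configured); `hlist_node_is_poisoned` is a hypothetical helper added for illustration.

```c
#include <stddef.h>

/* Poison values as assumed for kernels of that period. */
#define LIST_POISON1 ((void *)0x00100100)
#define LIST_POISON2 ((void *)0x00200200)

struct hlist_node {
    struct hlist_node *next;
    struct hlist_node **pprev; /* address of the pointer that points at us */
};

struct hlist_head {
    struct hlist_node *first;
};

/* Insert n at the front of the list. */
void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    struct hlist_node *first = h->first;

    n->next = first;
    if (first)
        first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unlink n, then poison its pointers so that reuse is detectable. */
void hlist_del(struct hlist_node *n)
{
    struct hlist_node *next = n->next;
    struct hlist_node **pprev = n->pprev;

    *pprev = next;  /* on a second call, pprev is LIST_POISON2 and this write faults */
    if (next)
        next->pprev = pprev;
    n->next = LIST_POISON1;
    n->pprev = LIST_POISON2;
}

/* Hypothetical check: has this node already been through hlist_del()? */
int hlist_node_is_poisoned(const struct hlist_node *n)
{
    return n->pprev == (struct hlist_node **)LIST_POISON2;
}
```

After one `hlist_del()` the node's `pprev` holds 0x00200200, so a second `hlist_del()` on the same node would write through that address. A register holding 00200200 at the time of the oops is consistent with the double delete Eli suspects.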
Re: [ewg] possible bug in rds?
On Wed, Mar 10, 2010 at 03:51:36PM -0800, Andy Grover wrote:
>> I've opened a bug:
>> https://bugs.openfabrics.org/show_bug.cgi?id=1983
>
> Did this just start happening? What is the test doing when this occurred? Please add to the bug if possible, and I'll try to diagnose further.

Follow up response in bugzilla.
Re: [ewg] problem with ipoib_mcast_fix_ip_ib_mc_map_to_2_6_24.patch
On 3/10/2010 9:05 PM, Jason Gunthorpe wrote:
> On Wed, Mar 10, 2010 at 08:57:17PM +0200, Eli Cohen wrote:
>> On Wed, Mar 10, 2010 at 10:42:22AM -0700, Jason Gunthorpe wrote:
>>> I guess, the best fix is to revert c12481586c4ba09cb88dc2090c67fdce7c856cde, alter ipoib_mcast_addr_is_valid to not compare bytes 5, 8 and 9, and fixup the 'Add in the P_Key' hunk to also fixup the scope byte too.
>>
>> Can you elaborate on this?
>
>     +
>     ++	/* Work around broken ip_ib_mc_map */
>     ++	if (mclist->dmi_addrlen == INFINIBAND_ALEN) {
>     ++		mclist->dmi_addr[5] = 0x10 | (dev->broadcast[5] & 0xF);
>     ++		mclist->dmi_addr[8] = dev->broadcast[8];
>     ++		mclist->dmi_addr[9] = dev->broadcast[9];
>     ++	}
>
> 5 in the dmi_addr is the scope byte.
>
> The old patch:
>
>     -+	/* Add in the P_Key */
>     -+	mgid.raw[4] = (priv->pkey >> 8) & 0xff;
>     -+	mgid.raw[5] = priv->pkey & 0xff;
>     -+
>
> Only includes the dmi_addr bytes 8 and 9. This is also a small bug. The above should read something like:
>
>     mgid.raw[1] = 0x10 | (dev->broadcast[5] & 0xF);
>     mgid.raw[4] = dev->broadcast[8];
>     mgid.raw[5] = dev->broadcast[9];
>
> Jason

Eli, can you take care of it now, or do you need the complete patch from Jason?

Vlad, please revert the patch that is causing the problem.

Tziporet