This is https://bugs.openfabrics.org/show_bug.cgi?id=795; it appears to be fixed in the 1128 build.

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Vu Pham
> Sent: Wednesday, December 05, 2007 10:01 AM
> To: Moni Shoua
> Cc: OpenFabrics General
> Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
>
> Hi Moni,
>
> My systems are RHEL 5.1 x86-64, 2 Sinai hcas, fw 1.2.0
>
> I set up bonding as follows:
> IPOIBBOND_ENABLE=yes
> IPOIB_BONDS=bond0
> bond0_IP=11.1.1.1
> bond0_SLAVEs=ib0,ib1
> in /etc/infiniband/openib.conf in order to start ib-bond automatically
>
> Testing with OFED-1.3-beta2, I got the following crash while the
> system was booting up:
>
> Stack: ffffffff883429d0 fff810428519d30 ................
> Call Trace:
> [<ffffffff883429d0>] :bonding:bond_get_stats+0x4a/0x131
> [< 8020e9cd>] rtnetlink_fill_ifinfo+0x4ba/0x5c4
> ee19>] rtmsg_ifinfo+0x44/0x8d
> eea2>] rtnetlink_event+0x40/0x44
> 8006492a>] notifier_call_chain+0x20/0x32
> 80208b5e>] dev_open+0x68/0x6e
> 72e8>] dev_change_flags+0x5a/0x119
> 80239762>] devinet_ioctl+0x235/0x59c
> 801ffcf6>] sock_ioctl+0x1c1/0x1e5
> 8003fc3f>] do_ioctl+0x21/0x6b
> 8002fa45>] vfs_ioctl+0x248/0261
> 8004a24b>] sys_ioctl+0x59/0x78
> 8005b14e>] system_call+0x7e/0x83
>
> Code: Bad RIP value
> RIP [0000000000000000000000] _stext+0x7ffff000/0x1000
> RSP <ffff10428519cc0>
> CR2: 000000000000000000000
> <0>Kernel panic - not syncing: Fatal exception
>
> I opened bug #812 for this issue.
>
> I moved our systems back to ofed-1.2.5.4 and tested ib-bond again. We
> tested it with ib0 and ib1 (connected to different switch/fabric) being
> on the same subnet (10.2.1.x, 255.255.255.0) and on different subnets
> (10.2.1.x and 10.3.1.x, 255.255.255.0). In both cases there is the issue
> of losing communication between the servers if the nodes are not on
> the same primary ib interface.
>
> Example:
> 1. original state: ib0's are the primary on both servers -
>    pinging bond0 between the servers is fine
> 2. fail ib0 on one of the servers (ib1 becomes primary on this server) -
>    pinging bond0 between the servers fails
> 3. fail ib0 on the second server (ib1 becomes primary) -
>    pinging bond0 between the servers is fine again
>
> Is this behavior expected?
>
> thanks,
> -vu
> _______________________________________________
> general mailing list
> [email protected]
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit
> http://openib.org/mailman/listinfo/openib-general
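
When repeating the three-step failover test from the quoted message, it may help to record which slave is actually active on each server at every step. The following is only a minimal sketch, not something from the original report: it assumes the bonding driver exposes its state in /proc/net/bonding/bond0 with a "Currently Active Slave:" line, and the function names are made up for illustration.

```python
import re
from typing import Optional


def active_slave(status_text: str) -> Optional[str]:
    """Return the interface named on the 'Currently Active Slave:' line.

    status_text is assumed to be the contents of /proc/net/bonding/<bond>.
    Returns None if the line is missing or the driver reports 'None'.
    """
    match = re.search(r"^Currently Active Slave:\s*(\S+)", status_text, re.MULTILINE)
    if match is None:
        return None
    name = match.group(1)
    return None if name == "None" else name


def read_active_slave(bond: str = "bond0") -> Optional[str]:
    """Hypothetical helper: read the status file for one bond and parse it."""
    with open(f"/proc/net/bonding/{bond}") as status_file:
        return active_slave(status_file.read())
```

Calling read_active_slave("bond0") on both servers before each ping would show whether a failing step coincides with one server being on ib0 while the other is on ib1.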
