Sure, let me give you more detail. I'm looping a script that does this:
shutdown ib0 port of host #1 (via switch CLI), sleep 10
bring up ib0 port of host #1, sleep 10
shutdown ib1 port of host #1, sleep 10
bring up ib1 port of host #1, sleep 10
shutdown ib0 port of host #2, sleep 10
bring up ib0 port of host #2, sleep 10
shutdown ib1 port of host #2, sleep 10
bring up ib1 port of host #2, sleep 10

(A rough sketch of this loop is included after the dmesg excerpt below.)

While this port failover script is running, I'm running netperf over IPoIB between the 2 hosts. Because of bug 455 (https://bugs.openfabrics.org/show_bug.cgi?id=455), there is output in dmesg every time there is an IPoIB HA failover, so that gives a rough sense of the failover rate. Right now I am having a hard time getting failures to happen; I'll keep trying. Here's an example of several minutes of dmesg output:

ib0: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: enabling connected mode will cause multicast packet drops
ib0: enabling connected mode will cause multicast packet drops
ib0: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: enabling connected mode will cause multicast packet drops
ib0: enabling connected mode will cause multicast packet drops
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib0: enabling connected mode will cause multicast packet drops
ib0: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib0: enabling connected mode will cause multicast packet drops
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib0: enabling connected mode will cause multicast packet drops
ib0: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib0: enabling connected mode will cause multicast packet drops
ib0: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib0: enabling connected mode will cause multicast packet drops
ib0: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib0: enabling connected mode will cause multicast packet drops
ib0: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
ib1: enabling connected mode will cause multicast packet drops
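For reference, here is a minimal sketch of the kind of loop the script runs, plus roughly how netperf is driven on the hosts. The switch login, the switch port IDs, and the CLI commands inside port_down/port_up are placeholders (the real syntax depends on the switch), and the netperf line is just the generic -D "interim results" form, not the exact command line used here.

    #!/bin/sh
    # Placeholder sketch of the port-flip loop.  The ssh target and the
    # "port disable"/"port enable" commands are illustrative only; substitute
    # the real switch CLI commands and the real switch port IDs.
    SWITCH=admin@ib-switch                              # hypothetical switch login

    port_down() { ssh "$SWITCH" "port disable $1"; }    # placeholder CLI command
    port_up()   { ssh "$SWITCH" "port enable $1"; }     # placeholder CLI command

    while true; do
        # host1-ib0 etc. stand for the switch ports cabled to each HCA port
        for p in host1-ib0 host1-ib1 host2-ib0 host2-ib1; do
            port_down "$p"
            sleep 10
            port_up "$p"
            sleep 10
        done
    done

    # Meanwhile, on host #1, something like the following drives traffic to
    # host #2 and prints the "Interim result:" lines shown further below
    # (address and options are illustrative):
    #   netperf -H <host2 IPoIB address> -l 3600 -D 1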
The IPoIB failover is sometimes very slow; shown below is netperf -D output. IPoIB failover should ideally take only a second or two. I'll be filing a bug for that.

Interim result: 4355.09 10^6bits/s over 1.00 seconds
Interim result: 4371.07 10^6bits/s over 1.00 seconds
Interim result: 4370.95 10^6bits/s over 1.00 seconds
Interim result: 162.41 10^6bits/s over 26.91 seconds
Interim result: 4360.14 10^6bits/s over 1.00 seconds
Interim result: 4354.94 10^6bits/s over 1.00 seconds
Interim result: 4353.08 10^6bits/s over 1.00 seconds
Interim result: 4343.94 10^6bits/s over 1.00 seconds
Interim result: 4356.98 10^6bits/s over 1.00 seconds
Interim result: 4357.00 10^6bits/s over 1.00 seconds
Interim result: 1735.68 10^6bits/s over 2.51 seconds
Interim result: 4357.86 10^6bits/s over 1.00 seconds
Interim result: 4358.63 10^6bits/s over 1.00 seconds
Interim result: 4352.05 10^6bits/s over 1.00 seconds
Interim result: 4355.14 10^6bits/s over 1.00 seconds
Interim result: 4350.74 10^6bits/s over 1.00 seconds
Interim result: 4363.25 10^6bits/s over 1.00 seconds
Interim result: 41.46 10^6bits/s over 105.24 seconds
Interim result: 297.83 10^6bits/s over 14.50 seconds
Interim result: 4332.43 10^6bits/s over 1.00 seconds
Interim result: 4345.48 10^6bits/s over 1.00 seconds
Interim result: 4365.19 10^6bits/s over 1.00 seconds
Interim result: 4354.96 10^6bits/s over 1.00 seconds
Interim result: 4346.54 10^6bits/s over 1.00 seconds
Interim result: 4339.78 10^6bits/s over 1.00 seconds
Interim result: 1730.77 10^6bits/s over 2.51 seconds
Interim result: 4346.55 10^6bits/s over 1.00 seconds
Interim result: 4358.37 10^6bits/s over 1.00 seconds
Interim result: 4357.15 10^6bits/s over 1.00 seconds
Interim result: 4362.43 10^6bits/s over 1.00 seconds
Interim result: 4342.37 10^6bits/s over 1.00 seconds
Interim result: 4339.25 10^6bits/s over 1.00 seconds
Interim result: 4337.89 10^6bits/s over 1.00 seconds
Interim result: 4328.02 10^6bits/s over 1.00 seconds
Interim result: 4352.09 10^6bits/s over 1.00 seconds
Interim result: 4344.81 10^6bits/s over 1.00 seconds
Interim result: 4354.92 10^6bits/s over 1.00 seconds
Interim result: 4354.71 10^6bits/s over 1.00 seconds
Interim result: 1732.11 10^6bits/s over 2.51 seconds
Interim result: 4334.02 10^6bits/s over 1.00 seconds
Interim result: 4340.94 10^6bits/s over 1.00 seconds

Scott

> -----Original Message-----
> From: Sean Hefty [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, March 13, 2007 5:06 PM
> To: Scott Weitzenkamp (sweitzen); [email protected]
> Subject: bug 400: ipoib error messages
>
> {snippet from bug 400 report because I don't want to try to have a
> discussion on this inside a bug report...}
>
> IPoIB CM HA is working much better in OFED-1.2-20070311-0600. I have been
> running for a few hours flipping an IB port every 10 seconds.
>
> I do still see some junk in dmesg, let me know if I should open a new bug
> or reopen this bug.
>
> ib1: dev_queue_xmit failed to requeue packet
> ib_mthca 0000:04:00.0: QP 000404 not found in MGM
> ib0: ib_detach_mcast failed (result = -22)
> ib0: ipoib_mcast_detach failed (result = -22)
> ib1: dev_queue_xmit failed to requeue packet
> ib1: dev_queue_xmit failed to requeue packet
> ib1: dev_queue_xmit failed to requeue packet
> ib0: dev_queue_xmit failed to requeue packet
> ib0: dev_queue_xmit failed to requeue packet
> ib0: dev_queue_xmit failed to requeue packet
> ib0: dev_queue_xmit failed to requeue packet
> ib0: dev_queue_xmit failed to requeue packet
> ib0: dev_queue_xmit failed to requeue packet
>
> Scott, is this the start of the message log, or just a snapshot?
> Specifically, do you see ib_detach_mcast failures for ib1? Is the
> dev_queue_xmit the first error?
>
> - Sean
