[ofa-general] Re: [Bug 465] IPoIB CM HA fails after several hours of failovers
> I've tried this with RHEL4 U3 x86_64 LionMini SDR, SLES10 x86_64 LionCub DDR,
> and RHEL4 U3 x86_64 LionMini DDR so far.
You reported an oops. On which OS/HW/FW did you observe it?
--
MST
___
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[ofa-general] Re: [Bug 465] IPoIB CM HA fails after several hours of failovers
Scott, pls provide data about the crash as requested by previous comments (note you don't have to reproduce it to provide that data).
--
MST
[ofa-general] RE: [Bug 465] IPoIB CM HA fails after several hours of failovers
Yes, I'm toggling one port at a time, among 4 ports (2 for each host). I've only seen a crash once; every other time IPoIB CM just stops working.
[ofa-general] Re: [Bug 465] IPoIB CM HA fails after several hours of failovers
Is that what you are doing? I can't easily do this (it requires a 3rd system), and I can't see how toggling ports connected to host A would crash host B.
--
MST
[ofa-general] RE: [Bug 465] IPoIB CM HA fails after several hours of failovers
You have:
ports="7 8";
This is only toggling ports on one host; try adding the ports for the other host to the list too.
Scott
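A minimal sketch of that suggestion, assuming the port list is extended to "lid:port" pairs. Only 5:7 and 5:8 come from this thread; 5:1 and 5:2 are placeholders for the other host's switch ports, and the names targets, RUN, DELAY and toggle_once are hypothetical. By default it is a dry run that only prints the ibportstate commands.

```shell
#!/bin/bash
# Sketch only: toggle every port in a list that covers both hosts.
# 5:7 and 5:8 are the ports from the thread; 5:1 and 5:2 are PLACEHOLDERS
# for the second host's switch ports and must be replaced with real values.
targets="5:7 5:8 5:1 5:2"

# RUN=echo (default) prints the commands; set RUN= (empty) to execute them.
RUN=${RUN-echo}

toggle_once() {
  local t lid port
  for t in $targets; do
    lid=${t%%:*}     # part before the colon
    port=${t##*:}    # part after the colon
    $RUN ibportstate "$lid" "$port" disable
    sleep "${DELAY:-5}"
    $RUN ibportstate "$lid" "$port" enable
    sleep "${DELAY:-5}"
  done
}
```

To loop the way the original script does, something like `RUN= ; while true; do toggle_once; done` would work once the placeholder ports are replaced.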
[ofa-general] Re: [Bug 465] IPoIB CM HA fails after several hours of failovers
> Regarding comment #34, can you add details on how you are doing port failover
> (using ibportstate?)
> and what traffic you are running (what is the netperf
> command line?)?
I have opensm running on host 11.4.3.175.
Host 11.4.3.178 is connected to switch lid 5 ports 7 and 8.
I run on 11.4.3.175:

#!/bin/bash
while ./netperf -D -H 11.4.3.178
do
    date
    sleep 5
done

and at the same time, also on 11.4.3.175:

#!/bin/bash
lid=5; ports="7 8";
while true
do
    for port in $ports
    do
        echo ibportstate $lid $port disable
        ibportstate $lid $port disable
        sleep 5
        echo ibportstate $lid $port enable
        ibportstate $lid $port enable
        sleep 5
    done
done
[ofa-general] Re: [Bug 465] IPoIB CM HA fails after several hours of failovers
BTW, could you pls add more detail? What OS/HW/FW does this happen with?
[ofa-general] Re: [Bug 465] IPoIB CM HA fails after several hours of failovers
> Unable to handle kernel NULL pointer dereference at 0020 RIP:
> {:ib_ipoib:ipoib_mcast_join_finish+69}
...
> Code: 8b 48 20 89 c8 89 ca 25 00 ff 00 00 c1 e2 18 c1 e0 08 09 c2
> RIP {:ib_ipoib:ipoib_mcast_join_finish+69} RSP <0101bc83bc38>
> CR2: 0020
> <0>Kernel panic - not syncing: Oops
Can you please check which line of code ipoib_mcast_join_finish+69 points at?
You can either use plain objdump for this, or use the script from
http://www.openfabrics.org/~mst/oops
Give it the path to the ib_ipoib.ko file and ipoib_mcast_join_finish+69.
--
MST
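For the plain objdump route, a rough sketch follows. It is not the script from the URL above. The helper name oops_addr is hypothetical, and it assumes the "+69" offset in the oops is decimal; if your kernel prints hex offsets, change the base noted in the comment. It turns symbol+offset into an address that objdump can start from.

```shell
#!/bin/bash
# Hypothetical helper: convert "symbol+offset" from an oops into an address.
# Usage: oops_addr <symbol+offset> <symbol start in hex, as printed by nm>
oops_addr() {
  local spec=$1 base=$2
  local sym=${spec%%+*}    # text before the first '+'
  local off=${spec##*+}    # text after the last '+'
  # ASSUMPTION: the offset is decimal. Use 16#$off if it is hex.
  printf '%s 0x%x\n' "$sym" $(( 16#$base + 10#$off ))
}

# Example against the module (needs binutils; -l needs debug info in the .ko):
#   base=$(nm ib_ipoib.ko | awk '$3 == "ipoib_mcast_join_finish" {print $1}')
#   set -- $(oops_addr ipoib_mcast_join_finish+69 "$base")
#   objdump -d -l --start-address="$2" ib_ipoib.ko | head -20
```

The first bytes of the "Code:" line, 8b 48 20, decode to mov 0x20(%rax),%ecx, which would be consistent with CR2 ending in 20 if %rax were NULL. The disassembly at the computed address should show whether that is the faulting instruction.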
