Re: FBSD 10.3 + ZFS + Sun x4500 = utter lock up.
In what was arguably a silly attempt, I changed all IRQs to go to CPU0 and ... the host has stayed up through multiple attempts to crash it. I'm not calling it fixed yet, but there appears to be hope.

Right now I have a script, /usr/local/etc/rc.d/cpuset.sh, that's doing the work. This seems a sub-optimal place to do it, as the host could crash before the script is executed on boot.

Is there any option in the bootloader or similar for setting these, or is cpuset(1) my only option?

thanks,
nomad

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
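For readers wondering what such an rc.d script might look like, here is a minimal sketch. It is not the script from the message above (that one was not posted). The service name irqpin, the function names, and the IRQPIN_DRYRUN variable are all made up for illustration. It only uses vmstat -i and the cpuset -l/-x form already shown in this thread, and it does nothing about the window between kernel start and rc(8) running.

```shell
#!/bin/sh
# PROVIDE: irqpin          (hypothetical service name)
# REQUIRE: FILESYSTEMS
# KEYWORD: nojail

# List every "irqNN:" number from `vmstat -i` output read on stdin.
# The cpuN:timer and Total lines do not match and are skipped.
list_irqs() {
    sed -n 's/^irq\([0-9][0-9]*\):.*/\1/p'
}

# Bind each IRQ to CPU 0. Set IRQPIN_DRYRUN=1 (hypothetical knob) to
# print the cpuset commands instead of running them.
irqpin_start() {
    vmstat -i | list_irqs | while read -r irq; do
        if [ -n "${IRQPIN_DRYRUN:-}" ]; then
            echo "cpuset -l 0 -x $irq"
        else
            cpuset -l 0 -x "$irq"
        fi
    done
}

# Hook into rc(8) only where rc.subr is available (i.e. on FreeBSD).
if [ -r /etc/rc.subr ]; then
    . /etc/rc.subr
    name="irqpin"
    rcvar="irqpin_enable"
    start_cmd="irqpin_start"
    load_rc_config "$name"
    run_rc_command "$1"
fi
```

With this sketch the script would be enabled by irqpin_enable="YES" in /etc/rc.conf. Running it by hand with IRQPIN_DRYRUN=1 shows which IRQs it would touch before anything is changed.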
Re: FBSD 10.3 + ZFS + Sun x4500 = utter lock up.
On 2/8/17 05:53, Ronald Klop wrote:
> ...
> Any reason not to try 11? I don't know if it fixes anything, but it
> would be a nice data point for comparison.

I'll probably give that a try shortly, but given that the problem was seen in previous releases I'm not optimistic. I'm only at that location on Mondays and Tuesdays, so I can't try anything until next week.
Re: FBSD 10.3 + ZFS + Sun x4500 = utter lock up.
At work we used to run such a machine with 9.1 and 10.2. My colleague tells me 10.3 gave errors, but he does not remember what they were. The machine is no longer in use because of other upgrades, so I can't verify for you.

Any reason not to try 11? I don't know if it fixes anything, but it would be a nice data point for comparison.

Ronald.

On Tue, 07 Feb 2017 22:44:01 +0100, Lee Damon wrote:
> I'm trying to get FBSD 10.3 working on a Sun x4500 (don't ask) for use
> as a ZFS-based backup server. [...]
FBSD 10.3 + ZFS + Sun x4500 = utter lock up.
FreeBSD [redacted] 10.3-STABLE FreeBSD 10.3-STABLE #2 r313008: Tue Jan 31 01:50:49 PST 2017 lvd@[redacted]:/usr/obj/usr/src/sys/GENERIC amd64

I'm trying to get FBSD 10.3 working on a Sun x4500 (don't ask) for use as a ZFS-based backup server. However, whenever any amount of data is put into a zpool and then zpool scrub is run, the host locks up hard. On reboot it complains that a "Hyper transport sync flood occurred".

I found
https://lists.freebsd.org/pipermail/freebsd-stable/2012-January/065542.html

which seems to match, but when I try the cpuset command mentioned there I get an error:

    ; sudo cpuset -c -l 0 -x 58
    cpuset: setaffinity: Invalid argument

Looks like the -c was invalid. After removing that I was informed -x 58 wasn't valid. Sure enough, there's no mpt0 or IRQ 58 on the host:

    ; vmstat -i
    interrupt                          total       rate
    irq17: ohci2                        8578          2
    irq18: ohci3                         473          0
    irq19: ohci0 ohci1+                 4924          1
    irq24: mvs0                          457          0
    irq32: mvs1                          453          0
    irq38: mvs2                          451          0
    irq46: mvs3                         8063          1
    irq52: em0                        152354         35
    irq53: em1                           140          0
    irq68: mvs4                          450          0
    irq76: mvs5                          454          0
    cpu0:timer                        208311         48
    cpu1:timer                         98318         23
    cpu2:timer                        105704         24
    cpu3:timer                        106202         24
    Total                             695332        162

Looking around with some help from #freebsd on efnet, I found mvs0-5, which are connected to the Marvell drive controllers on the host. I then used

    ; sudo cpuset -l 0 -x ##

where I replaced ## with 24, 32, 38, 46, 68, and 76.

After rebuilding the zpool I started writing to it. It took a lot less time to crash (I didn't even need to run zpool scrub), but instead of completely locking up it just rebooted. I did not see a reference to the hyper transport problem while watching it boot, but given the poor performance of the serial console I can't be 100% sure it wasn't there.

So now I turn here to ask for guidance. Is anyone currently successfully running 10.x on an x4500 and, if so, how are you doing it? If not, how can I get this working?

thanks,
nomad
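The manual step above (replacing ## with each mvs IRQ number) could be scripted roughly as follows. This is only a sketch: the function names mvs_irqs and pin_cmds are made up for illustration, and it assumes the vmstat -i layout shown in this message, with "irqNN:" as the first field. The commands are printed rather than executed so they can be reviewed first.

```shell
#!/bin/sh
# Print the IRQ number of every vmstat -i line that mentions an mvs
# device. Reads the vmstat text on stdin, so saved output works too.
mvs_irqs() {
    awk '/^irq[0-9]+:/ && /mvs[0-9]+/ {
        sub(/^irq/, "", $1)   # "irq24:" -> "24:"
        sub(/:$/, "", $1)     # "24:"    -> "24"
        print $1
    }'
}

# Turn a list of IRQ numbers (one per line on stdin) into cpuset
# commands that bind each IRQ to the CPU given as the first argument.
pin_cmds() {
    cpu=$1
    while read -r irq; do
        echo "cpuset -l $cpu -x $irq"
    done
}
```

On the host, `vmstat -i | mvs_irqs | pin_cmds 0` would print one cpuset line per mvs controller; piping that into sh as root would apply them.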