Re: FBSD 10.3 + ZFS + Sun x4500 = utter lock up.

2017-02-13 Thread Lee Damon
In what was arguably a silly attempt, I bound all IRQs to CPU0 and ...
the host has stayed up through multiple attempts to crash it. I'm not
calling it fixed yet, but there appears to be hope.

Right now I have a script, /usr/local/etc/rc.d/cpuset.sh, that's doing
the work. This seems a sub-optimal place to do it, as the host could
crash before the script is executed on boot. Is there any option in the
bootloader or similar for setting IRQ affinity, or is cpuset(1) my only
option?
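
For anyone wanting to reproduce the workaround, here is a minimal sketch
of what such a script could do; it is not the actual cpuset.sh. It
assumes the `vmstat -i` row format shown elsewhere in this thread. The
function names and the APPLY variable are made up for illustration. By
default it only prints the cpuset(1) commands so they can be reviewed.

```shell
#!/bin/sh
# List every device IRQ number found in `vmstat -i` output (read from stdin).
# Per-CPU timer rows such as "cpu0:timer" and the "Total" row do not start
# with "irq<number>:" and are therefore skipped.
list_irqs() {
    awk '/^irq[0-9]+:/ {
        sub(/^irq/, "", $1)   # "irq17:" -> "17:"
        sub(/:$/, "", $1)     # "17:"    -> "17"
        print $1
    }'
}

# Bind each IRQ number read from stdin to a single CPU (default: CPU 0).
# APPLY is a hypothetical switch for this sketch: with APPLY=1 cpuset(1) is
# actually invoked, otherwise the commands are only printed.
bind_irqs() {
    cpu="${1:-0}"
    while read -r irq; do
        if [ "${APPLY:-0}" = 1 ]; then
            cpuset -l "$cpu" -x "$irq" || echo "failed to bind irq $irq" >&2
        else
            echo "cpuset -l $cpu -x $irq"
        fi
    done
}
```

On the host this would be run as root, for example
`vmstat -i | list_irqs | APPLY=1 bind_irqs 0`. It does nothing about the
boot-ordering concern: the binding still happens only once the script runs.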

thanks,
nomad

>> FreeBSD [redacted] 10.3-STABLE FreeBSD 10.3-STABLE #2 r313008: Tue Jan
>> 31 01:50:49 PST 2017 lvd@[redacted]:/usr/obj/usr/src/sys/GENERIC 
>> amd64
>>
>> I'm trying to get FBSD 10.3 working on a Sun x4500 (don't ask) for use
>> as a ZFS-based backup server. However, whenever any amount of data is
>> put into a zpool and then zpool scrub is run the host locks up hard. On
>> reboot it complains that a "Hyper transport sync flood occurred".
>>
>> I found
>> https://lists.freebsd.org/pipermail/freebsd-stable/2012-January/065542.html
>>
>> which seems to match but when I try the cpuset command mentioned there I
>> get an error:
>>
>> ; sudo cpuset -c -l 0 -x 58
>> cpuset: setaffinity: Invalid argument
>>
>> Looks like the -c was invalid. After removing that I was informed -x 58
>> wasn't valid. Sure enough, there's no mpt0 or IRQ 58 on the host:
>>
>> ; vmstat -i
>> interrupt                  total   rate
>> irq17: ohci2                8578      2
>> irq18: ohci3                 473      0
>> irq19: ohci0 ohci1+         4924      1
>> irq24: mvs0                  457      0
>> irq32: mvs1                  453      0
>> irq38: mvs2                  451      0
>> irq46: mvs3                 8063      1
>> irq52: em0                152354     35
>> irq53: em1                   140      0
>> irq68: mvs4                  450      0
>> irq76: mvs5                  454      0
>> cpu0:timer                208311     48
>> cpu1:timer                 98318     23
>> cpu2:timer                105704     24
>> cpu3:timer                106202     24
>> Total                     695332    162
>>
>> Looking around with some help from #freebsd on efnet I found mvs0-5
>> which are connected to the Marvell drive controllers on the host. I then
>> used
>>   ; sudo cpuset -l 0 -x ##
>> where I replaced ## with 24, 32, 38, 46, 68, and 76.
>>
>> After rebuilding the zpool I started writing to it. It took a lot less
>> time to crash - I didn't even need to run zpool scrub - but instead of
>> completely locking up it just rebooted. I did not see reference to the
>> hyper transport problem while watching it boot but given the poor
>> performance of the serial console I can't be 100% sure it wasn't there.
>>
>> So now I turn here to ask for guidance. Is anyone currently successfully
>> running 10.x on an x4500 and if so, how are you doing it? If not, how can
>> I get this working?
>>
>> thanks,
>> nomad
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: FBSD 10.3 + ZFS + Sun x4500 = utter lock up.

2017-02-08 Thread Lee Damon
On 2/8/17 05:53 , Ronald Klop wrote:
> ...  
> Any reason not to try 11? I don't know if it fixes anything, but it
> would be a nice data point for comparison.

I'll probably give that a try shortly but given the problem was seen in
previous releases I'm not optimistic. I'm only in that location on
Monday and Tuesday so can't try anything until next week.

> Ronald.
> 
> 
> On Tue, 07 Feb 2017 22:44:01 +0100, Lee Damon  wrote:
> 
>> FreeBSD [redacted] 10.3-STABLE FreeBSD 10.3-STABLE #2 r313008: Tue Jan
>> 31 01:50:49 PST 2017 lvd@[redacted]:/usr/obj/usr/src/sys/GENERIC 
>> amd64
>>
>> I'm trying to get FBSD 10.3 working on a Sun x4500 (don't ask) for use
>> as a ZFS-based backup server. However, whenever any amount of data is
>> put into a zpool and then zpool scrub is run the host locks up hard. On
>> reboot it complains that a "Hyper transport sync flood occurred".
>>
>> I found
>> https://lists.freebsd.org/pipermail/freebsd-stable/2012-January/065542.html
>>
>> which seems to match but when I try the cpuset command mentioned there I
>> get an error:
>>
>> ; sudo cpuset -c -l 0 -x 58
>> cpuset: setaffinity: Invalid argument
>>
>> Looks like the -c was invalid. After removing that I was informed -x 58
>> wasn't valid. Sure enough, there's no mpt0 or IRQ 58 on the host:
>>
>> ; vmstat -i
>> interrupt                  total   rate
>> irq17: ohci2                8578      2
>> irq18: ohci3                 473      0
>> irq19: ohci0 ohci1+         4924      1
>> irq24: mvs0                  457      0
>> irq32: mvs1                  453      0
>> irq38: mvs2                  451      0
>> irq46: mvs3                 8063      1
>> irq52: em0                152354     35
>> irq53: em1                   140      0
>> irq68: mvs4                  450      0
>> irq76: mvs5                  454      0
>> cpu0:timer                208311     48
>> cpu1:timer                 98318     23
>> cpu2:timer                105704     24
>> cpu3:timer                106202     24
>> Total                     695332    162
>>
>> Looking around with some help from #freebsd on efnet I found mvs0-5
>> which are connected to the Marvell drive controllers on the host. I then
>> used
>>   ; sudo cpuset -l 0 -x ##
>> where I replaced ## with 24, 32, 38, 46, 68, and 76.
>>
>> After rebuilding the zpool I started writing to it. It took a lot less
>> time to crash - I didn't even need to run zpool scrub - but instead of
>> completely locking up it just rebooted. I did not see reference to the
>> hyper transport problem while watching it boot but given the poor
>> performance of the serial console I can't be 100% sure it wasn't there.
>>
>> So now I turn here to ask for guidance. Is anyone currently successfully
>> running 10.x on an x4500 and if so, how are you doing it? If not, how can
>> I get this working?
>>
>> thanks,
>> nomad


Re: FBSD 10.3 + ZFS + Sun x4500 = utter lock up.

2017-02-08 Thread Ronald Klop
At work we used to run such a machine with 9.1 and 10.2. My colleague
tells me 10.3 gave errors, but he does not remember which ones.
The machine is no longer in use because of other upgrades, so I can't
verify this for you.


Any reason not to try 11? I don't know if it fixes anything, but it would  
be a nice data point for comparison.


Ronald.


On Tue, 07 Feb 2017 22:44:01 +0100, Lee Damon  wrote:


FreeBSD [redacted] 10.3-STABLE FreeBSD 10.3-STABLE #2 r313008: Tue Jan
31 01:50:49 PST 2017 lvd@[redacted]:/usr/obj/usr/src/sys/GENERIC   
amd64


I'm trying to get FBSD 10.3 working on a Sun x4500 (don't ask) for use
as a ZFS-based backup server. However, whenever any amount of data is
put into a zpool and then zpool scrub is run the host locks up hard. On
reboot it complains that a "Hyper transport sync flood occurred".

I found
https://lists.freebsd.org/pipermail/freebsd-stable/2012-January/065542.html
which seems to match but when I try the cpuset command mentioned there I
get an error:

; sudo cpuset -c -l 0 -x 58
cpuset: setaffinity: Invalid argument

Looks like the -c was invalid. After removing that I was informed -x 58
wasn't valid. Sure enough, there's no mpt0 or IRQ 58 on the host:

; vmstat -i
interrupt                  total   rate
irq17: ohci2                8578      2
irq18: ohci3                 473      0
irq19: ohci0 ohci1+         4924      1
irq24: mvs0                  457      0
irq32: mvs1                  453      0
irq38: mvs2                  451      0
irq46: mvs3                 8063      1
irq52: em0                152354     35
irq53: em1                   140      0
irq68: mvs4                  450      0
irq76: mvs5                  454      0
cpu0:timer                208311     48
cpu1:timer                 98318     23
cpu2:timer                105704     24
cpu3:timer                106202     24
Total                     695332    162

Looking around with some help from #freebsd on efnet I found mvs0-5
which are connected to the Marvell drive controllers on the host. I then
used
  ; sudo cpuset -l 0 -x ##
where I replaced ## with 24, 32, 38, 46, 68, and 76.

After rebuilding the zpool I started writing to it. It took a lot less
time to crash - I didn't even need to run zpool scrub - but instead of
completely locking up it just rebooted. I did not see reference to the
hyper transport problem while watching it boot but given the poor
performance of the serial console I can't be 100% sure it wasn't there.

So now I turn here to ask for guidance. Is anyone currently successfully
running 10.x on an x4500 and if so, how are you doing it? If not, how can
I get this working?

thanks,
nomad


FBSD 10.3 + ZFS + Sun x4500 = utter lock up.

2017-02-07 Thread Lee Damon
FreeBSD [redacted] 10.3-STABLE FreeBSD 10.3-STABLE #2 r313008: Tue Jan
31 01:50:49 PST 2017 lvd@[redacted]:/usr/obj/usr/src/sys/GENERIC  amd64

I'm trying to get FBSD 10.3 working on a Sun x4500 (don't ask) for use
as a ZFS-based backup server. However, whenever any amount of data is
put into a zpool and then zpool scrub is run the host locks up hard. On
reboot it complains that a "Hyper transport sync flood occurred".

I found
https://lists.freebsd.org/pipermail/freebsd-stable/2012-January/065542.html
which seems to match but when I try the cpuset command mentioned there I
get an error:

; sudo cpuset -c -l 0 -x 58
cpuset: setaffinity: Invalid argument

Looks like the -c was invalid. After removing that I was informed -x 58
wasn't valid. Sure enough, there's no mpt0 or IRQ 58 on the host:

; vmstat -i
interrupt                  total   rate
irq17: ohci2                8578      2
irq18: ohci3                 473      0
irq19: ohci0 ohci1+         4924      1
irq24: mvs0                  457      0
irq32: mvs1                  453      0
irq38: mvs2                  451      0
irq46: mvs3                 8063      1
irq52: em0                152354     35
irq53: em1                   140      0
irq68: mvs4                  450      0
irq76: mvs5                  454      0
cpu0:timer                208311     48
cpu1:timer                 98318     23
cpu2:timer                105704     24
cpu3:timer                106202     24
Total                     695332    162

Looking around with some help from #freebsd on efnet I found mvs0-5
which are connected to the Marvell drive controllers on the host. I then
used
  ; sudo cpuset -l 0 -x ##
where I replaced ## with 24, 32, 38, 46, 68, and 76.
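
The per-IRQ step above could be scripted roughly as follows. This is a
sketch only, assuming the `vmstat -i` row format shown earlier in this
message; the function names and the DRY_RUN variable are invented for
illustration and are not part of cpuset(1).

```shell
#!/bin/sh
# Print the IRQ numbers of all mvs controllers found in `vmstat -i` output.
# Reads from stdin so it can be checked against saved output.
mvs_irqs() {
    # Rows look like "irq24: mvs0  457  0"; keep the digits between
    # "irq" and ":" for rows whose device name starts with "mvs".
    sed -n 's/^irq\([0-9][0-9]*\): mvs[0-9].*/\1/p'
}

# Bind each IRQ number read from stdin to CPU 0. With DRY_RUN=1 (a
# hypothetical switch for this sketch) the commands are printed, not run.
pin_irqs_to_cpu0() {
    while read -r irq; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "cpuset -l 0 -x $irq"
        else
            cpuset -l 0 -x "$irq"
        fi
    done
}
```

Run as root on the host: `vmstat -i | mvs_irqs | pin_irqs_to_cpu0`.
Parsing the interrupt list avoids hard-coding IRQ numbers, which may
differ between machines or after a hardware change.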

After rebuilding the zpool I started writing to it. It took a lot less
time to crash - I didn't even need to run zpool scrub - but instead of
completely locking up it just rebooted. I did not see reference to the
hyper transport problem while watching it boot but given the poor
performance of the serial console I can't be 100% sure it wasn't there.

So now I turn here to ask for guidance. Is anyone currently successfully
running 10.x on an x4500 and if so, how are you doing it? If not, how can
I get this working?

thanks,
nomad