Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-12-04 Thread Kurt Jaeger
Hi!

> Well, another lockup. This time after ~ 10 days of uptime. Box was idle
> at the time and just a solid freeze.

Here's another datapoint: No lockups/problems with:

AMD Ryzen Threadripper 2950X
Board: PRIME X399-A (not the -PRO)
BIOS from 10/12/2018
running: 13.0-CURRENT r340445
up 19 days, running as a package builder

-- 
p...@freebsd.org +49 171 3101372  2 years to go !
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-12-04 Thread Pete French



On 04/12/2018 18:26, Mike Tancsa wrote:
> I think I will downgrade the box that was having issues to RELENG11 to
> see if the problem is there too.  Unfortunately, the issue took ~ 10
> days to show itself

My issues showed up on releng 11 as well as releng 12 - I didn't try the
last 11 stable for long though. Will see how the new BIOS goes, and then
underclock the RAM if that locks up.


-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-12-04 Thread Mike Tancsa
On 12/4/2018 12:39 PM, Nimrod Levy wrote:
> Just to throw another datapoint out there, I've been running on my
> Asus prime b350+ ryzen 5 and I've been stable since the microcode
> updates over the summer. I believe I returned all the bios settings
> (cstates, memory timing, etc.) back to stock. I've had to shut down
> and relocate (and reconfigure some network bits) since then. I've
> also shifted to RELENG 11.2 from the STABLE code that I'd been
> running. It's been very stable for me with a current uptime of 60 days
> and before that, it was rebooted for an update. So whatever was going
> on seems to be fixed for me.


I think I will downgrade the box that was having issues to RELENG11 to
see if the problem is there too.  Unfortunately, the issue took ~ 10
days to show itself

    ---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   



Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-12-04 Thread Nimrod Levy
Just to throw another datapoint out there, I've been running on my Asus
prime b350+ ryzen 5 and I've been stable since the microcode updates over
the summer. I believe I returned all the bios settings (cstates, memory
timing, etc.) back to stock. I've had to shut down and relocate (and
reconfigure some network bits) since then. I've also shifted to RELENG
11.2 from the STABLE code that I'd been running. It's been very stable for
me with a current uptime of 60 days and before that, it was rebooted for an
update. So whatever was going on seems to be fixed for me.

On Tue, Dec 4, 2018 at 10:44 AM Pete French wrote:

>
>
> On 04/12/2018 15:04, Mike Tancsa wrote:
> > Well, another lockup. This time after ~ 10 days of uptime. Box was idle
> > at the time and just a solid freeze. This is an ASUS PRIME X370-PRO
> > running BIOS from 09/07/2018 (latest).  Not sure if BIOS related or OS
> > related. I have another motherboard (MSI) that I will install 12.0 on
> > and see if it's any different.  There is also a microcode update from
> > AMD last week. However, there are no release notes as to what it
> > changes. I will hold off on that for now and see if I can get lockups on
> > the MSI in the next week or two.
>
> I did the opposite yesterday - I updated my microcode to the latest one
> from MSI (though the AMD update it contains is from the summer I believe,
> despite the BIOS only coming out a couple of weeks ago). Am running
> 12-STABLE on there, and will see how long it lasts.
>
> Only got to do this yesterday as haven't been near the machine for weeks.
> I didn't adjust any RAM timings and left everything at the default after
> the BIOS update.
>
> But until yesterday it has locked up about every couple of days :-(
>
> -pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-12-04 Thread Pete French



On 04/12/2018 15:04, Mike Tancsa wrote:
> Well, another lockup. This time after ~ 10 days of uptime. Box was idle
> at the time and just a solid freeze. This is an ASUS PRIME X370-PRO
> running BIOS from 09/07/2018 (latest).  Not sure if BIOS related or OS
> related. I have another motherboard (MSI) that I will install 12.0 on
> and see if it's any different.  There is also a microcode update from
> AMD last week. However, there are no release notes as to what it
> changes. I will hold off on that for now and see if I can get lockups on
> the MSI in the next week or two.

I did the opposite yesterday - I updated my microcode to the latest one
from MSI (though the AMD update it contains is from the summer I believe,
despite the BIOS only coming out a couple of weeks ago). Am running
12-STABLE on there, and will see how long it lasts.

Only got to do this yesterday as haven't been near the machine for weeks.
I didn't adjust any RAM timings and left everything at the default after
the BIOS update.

But until yesterday it has locked up about every couple of days :-(

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-12-04 Thread Mike Tancsa
On 11/26/2018 11:30 AM, Mike Tancsa wrote:
> On 11/26/2018 11:25 AM, Pete French wrote:
>> Following up an old thread I know, but my system has been pretty
>> stable until recently, when it started locking up about once a week at
>> least. This coincided with me doing two things to it:
>>
> Interesting, I too have noticed my one test box have the odd lockup.  I
> think it started around the BETA series. Is it possible something
> "undid" one of the fixes ? I brought the box in question up to
> 12.0-PRERELEASE FreeBSD 12.0-PRERELEASE r340724 and it's been fine for 5
> days. Sadly, when it was locking up (no panic, just a solid lockup) it
> took some random amount of time and generally did not matter if the box
> was loaded or not.

Well, another lockup. This time after ~ 10 days of uptime. Box was idle
at the time and just a solid freeze. This is an ASUS PRIME X370-PRO
running BIOS from 09/07/2018 (latest).  Not sure if BIOS related or OS
related. I have another motherboard (MSI) that I will install 12.0 on
and see if it's any different.  There is also a microcode update from
AMD last week. However, there are no release notes as to what it
changes. I will hold off on that for now and see if I can get lockups on
the MSI in the next week or two.

    ---Mike



> -- 
> ---
> Mike Tancsa, tel +1 519 651 3400 x203
> Sentex Communications, m...@sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada   



Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-11-26 Thread Pete French



On 26/11/2018 16:30, Mike Tancsa wrote:
> Interesting, I too have noticed my one test box have the odd lockup.  I
> think it started around the BETA series. Is it possible something
> "undid" one of the fixes ? I brought the box in question up to
> 12.0-PRERELEASE FreeBSD 12.0-PRERELEASE r340724 and it's been fine for 5
> days. Sadly, when it was locking up (no panic, just a solid lockup) it
> took some random amount of time and generally did not matter if the box
> was loaded or not.

Same with mine, loading seems immaterial - I am running r340430 on mine.
Will update when I can, though am slightly reluctant to do it remotely as
if the reboot fails I then need to get on a train and go fix it ;)


thanks,

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-11-26 Thread Pete French




On 26/11/2018 16:28, Eric van Gyzen wrote:
> My Ryzen has never run 11, but I have never seen a single problem on 12.
> As you suggest, it's probably due to the particular hardware
> combination.  If updating the BIOS doesn't help, I agree that lowering
> the memory clock is the best next step.

I was going to say that I am on the latest BIOS, but then I noticed a
new one out on the 15th, so will patch that when I get back to the
office. Am working remotely a lot at the moment, so can't tweak much. I
ended up putting an IP controlled power supply under my desk so I can hard
reboot it remotely if it locks up. Which works nicely, but would rather
it was stable :-)

Will see about downclocking the RAM too when I am physically in front of
it. Nice to see that someone has had no problems at all with 12 on this.
I did run some Epyc machines in the cloud under both 11 and 12 for a
time and never got those to lock up, so I know it's not that Zen is
unstable in general. Just have had a few issues with this machine
specifically.


-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-11-26 Thread Mike Tancsa
On 11/26/2018 11:25 AM, Pete French wrote:
> Following up an old thread I know, but my system has been pretty
> stable until recently, when it started locking up about once a week at
> least. This coincided with me doing two things to it:
>
Interesting, I too have noticed my one test box have the odd lockup.  I
think it started around the BETA series. Is it possible something
"undid" one of the fixes ? I brought the box in question up to
12.0-PRERELEASE FreeBSD 12.0-PRERELEASE r340724 and it's been fine for 5
days. Sadly, when it was locking up (no panic, just a solid lockup) it
took some random amount of time and generally did not matter if the box
was loaded or not.

    ---Mike



-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   



Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-11-26 Thread Eric van Gyzen
On 11/26/18 10:25 AM, Pete French wrote:
> Following up an old thread I know, but my system has been pretty stable
> until recently, when it started locking up about once a week at least.
> This coincided with me doing two things to it:
> 
> 1) Doubling the amount of RAM in it to 16 gig, using RAM which runs a
> bit faster than the original sticks (2667 instead of 2400)
> 
> 2) Upgrading to FreeBSD 12 BETA4
> 
> Now, I am really hoping the lockups are down to the RAM and that I just
> need to underclock it a bit, but I am a bit worried by the fact it
> happened the same time as I went to 12. Has anyone else seen any issues
> under 12 when stable under 11?

My Ryzen has never run 11, but I have never seen a single problem on 12.
 As you suggest, it's probably due to the particular hardware
combination.  If updating the BIOS doesn't help, I agree that lowering
the memory clock is the best next step.

Eric


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-11-26 Thread Pete French
Following up an old thread I know, but my system has been pretty stable
until recently, when it started locking up about once a week at least.
This coincided with me doing two things to it:

1) Doubling the amount of RAM in it to 16 gig, using RAM which runs a
bit faster than the original sticks (2667 instead of 2400)

2) Upgrading to FreeBSD 12 BETA4

Now, I am really hoping the lockups are down to the RAM and that I just
need to underclock it a bit, but I am a bit worried by the fact it
happened the same time as I went to 12. Has anyone else seen any issues
under 12 when stable under 11?


cheers,

-pete.

PS: I went to 12 to get Linux emulation back, as this is still broken on 
Ryzen under 11 I believe.



Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-08-10 Thread Pete French



So, having been stable for quite some time, I have had three lockups on
my Ryzen in the last week. Two were whilst I was in X11, but one
happened over the weekend whilst I was logged out, and thus I could get
a screenshot of what was on the console when I got in this morning:

https://www.twisted.org.uk/~pete/ryzen_crash_img_20180810_104231.jpg

Don't know if that's of any help. I noticed a new BIOS came out very
recently, and I have updated my motherboard to that today, but these
lockups did happen after quite a long period of stability, which
surprises me.


-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-30 Thread Pete French




> I just brought everything up to date and re-tested and same issue.
> FreeBSD 11.2-STABLE #1 r336761
>
> # mount | grep -v zfs
> devfs on /dev (devfs, local, multilabel)
> linprocfs on /compat/linux/proc (linprocfs, local)
> linsysfs on /compat/linux/sys (linsysfs, local)
> tmpfs on /compat/linux/dev/shm (tmpfs, local)
>
> # /compat/linux/bin/bash
> Segmentation fault (core dumped)
> #

I opened a bug report for this here:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230144

It is definitely processor dependent (booting the same drive on two
different CPUs shows it up) and it does not happen on CURRENT. Very odd
indeed!


-pete.




Re: Ryzen issues on FreeBSD ?

2018-07-29 Thread Lukasz
I had the same issues - hard lockups after a couple of hours of work
(different loads).

My platform is:
OS: 11.1 and 11.2 (amd64)
CPU: AMD Ryzen 5 1400 Quad-Core
Motherboard: AX370-Gaming K5
ZFS pool: 10TB
BIOS: default settings (HT on)

After just upgrading to the newest BIOS everything works as expected - no
lockups at all. Now the machine has 47 hours uptime with a zfs snapshot
received and a zfs scrub completed.

Regards,
Lukasz


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-27 Thread Mike Tancsa
On 7/27/2018 10:38 AM, Konstantin Belousov wrote:
>>
>> This is stock FreeBSD image r335560.
> By stock you mean that no patches were applied, right ?
> 

Correct, just the regular tree, no patches and nothing in
/etc/sysctl.conf and loader.conf just has

# cat /boot/loader.conf
kern.geom.label.disk_ident.enable="0"
kern.geom.label.gptid.enable="0"
vfs.zfs.min_auto_ashift=12
zfs_load="YES"
amdtemp_load="YES"
vmm_load="YES"
aesni_load="YES"
pf_load="YES"




I just brought everything up to date and re-tested and same issue.
FreeBSD 11.2-STABLE #1 r336761

# mount | grep -v zfs
devfs on /dev (devfs, local, multilabel)
linprocfs on /compat/linux/proc (linprocfs, local)
linsysfs on /compat/linux/sys (linsysfs, local)
tmpfs on /compat/linux/dev/shm (tmpfs, local)

# /compat/linux/bin/bash
Segmentation fault (core dumped)
#


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-27 Thread Konstantin Belousov
On Fri, Jul 27, 2018 at 10:09:35AM -0400, Mike Tancsa wrote:
> On 7/27/2018 10:00 AM, Pete French wrote:
> > pkg install linux_base-c7
> 
> Same deal here
> 
> 0{ryzenbsd11}# /compat/linux/bin/bash
> Segmentation fault (core dumped)
> 139{ryzenbsd11}#
> 
> 
> This is stock FreeBSD image r335560.
By stock you mean that no patches were applied, right ?

> 
> 0{ryzenbsd11}# /compat/linux/bin/bash
> Segmentation fault (core dumped)
> 139{ryzenbsd11}#
> 
> 
> pid 58901 (gio-querymodules-64), uid 0: exited on signal 11 (core dumped)
> pid 58915 (bash), uid 0: exited on signal 11 (core dumped)
> pid 58997 (gio-querymodules-64), uid 0: exited on signal 11 (core dumped)
> pid 59027 (bash), uid 0: exited on signal 11 (core dumped)
> 
> 
>   ---Mike
> 
> 
> -- 
> ---
> Mike Tancsa, tel +1 519 651 3400 x203
> Sentex Communications, m...@sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-27 Thread Mike Tancsa
On 7/27/2018 10:00 AM, Pete French wrote:
> pkg install linux_base-c7

Same deal here

0{ryzenbsd11}# /compat/linux/bin/bash
Segmentation fault (core dumped)
139{ryzenbsd11}#


This is stock FreeBSD image r335560.

0{ryzenbsd11}# /compat/linux/bin/bash
Segmentation fault (core dumped)
139{ryzenbsd11}#


pid 58901 (gio-querymodules-64), uid 0: exited on signal 11 (core dumped)
pid 58915 (bash), uid 0: exited on signal 11 (core dumped)
pid 58997 (gio-querymodules-64), uid 0: exited on signal 11 (core dumped)
pid 59027 (bash), uid 0: exited on signal 11 (core dumped)


---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-27 Thread Pete French




On 27/07/2018 14:10, Mike Tancsa wrote:
> I haven't used the linux emulator in ages. What is the easiest way to try
> this out ?

In rc.conf have

linux_enable="YES"

when you boot. Then

pkg install linux_base-c7

/compat/linux/bin/bash

and you should see a bash prompt from the Linux shell.

-pete.





Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-27 Thread Pete French




> I highly doubt that this can be related.

Well, yes, it surprised me too :-) Admittedly I don't have a very big
sample set - I only have one Ryzen box and a pair of Epyc boxes in Azure,
but I can't make it run on any of them. I take the same OS and run it on
Intel and it's fine (literally I am cross-mounting /usr/src and /usr/obj
and doing an installworld on the Intel machine - it's the same build).

Will try and get this onto a USB key and test further though, so that
I know the environment is the same.

> BTW, I forgot about the patch.  If nothing happens, I will commit it
> today.

Ah, thank you, that would be great.

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-27 Thread Mike Tancsa
On 7/27/2018 7:01 AM, Pete French wrote:
> So, I have been running the patched kernel for quite a while now, and it
> works fine for me, but last night I did hit a surprising issue - the
> Linux emulator does not work on Ryzen / Epyc. I tried this on two
> machines (both with the patches) and it coredumps when simply running
> bash on both of them. I copied the OS over to an Intel machine, and that
> works fine.
> 
> I have not tried running with an unpatched kernel on the Ryzen machine
> (I don't have one to hand) but I did try applying the sysctls to the
> Intel box to see if that would cause the Linux binaries to crash. It didn't.

I haven't used the linux emulator in ages. What is the easiest way to try
this out ?

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-27 Thread Konstantin Belousov
On Fri, Jul 27, 2018 at 12:01:09PM +0100, Pete French wrote:
> So, I have been running the patched kernel for quite a while now, and it
> works fine for me, but last night I did hit a surprising issue - the
> Linux emulator does not work on Ryzen / Epyc. I tried this on two
> machines (both with the patches) and it coredumps when simply running
> bash on both of them. I copied the OS over to an Intel machine, and that
> works fine.
> 
> I have not tried running with an unpatched kernel on the Ryzen machine
> (I don't have one to hand) but I did try applying the sysctls to the
> Intel box to see if that would cause the Linux binaries to crash. It didn't.

I highly doubt that this can be related.

BTW, I forgot about the patch.  If nothing happens, I will commit it
today.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-27 Thread Pete French
So, I have been running the patched kernel for quite a while now, and it
works fine for me, but last night I did hit a surprising issue - the
Linux emulator does not work on Ryzen / Epyc. I tried this on two
machines (both with the patches) and it coredumps when simply running
bash on both of them. I copied the OS over to an Intel machine, and that
works fine.

I have not tried running with an unpatched kernel on the Ryzen machine
(I don't have one to hand) but I did try applying the sysctls to the
Intel box to see if that would cause the Linux binaries to crash. It didn't.


-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-05 Thread Konstantin Belousov
On Thu, Jul 05, 2018 at 02:58:29PM +0100, Pete French wrote:
> > Which other files ?
> 
> sys/x86/include/specialreg.h and sys/x86/x86/cpu_machdep.c
> 
> Those are in your original patch as well as the change
> to sys/amd64/amd64/initcpu.c, but your email earlier only
> patches sys/amd64/amd64/initcpu.c and not the others.
> 
> So I assumed I would keep the changes to the other two files ?


Right, I forgot about mwait. specialreg.h is cosmetics which I already
committed.

diff --git a/sys/amd64/amd64/initcpu.c b/sys/amd64/amd64/initcpu.c
index ccc5e64d0c4..bb342f42dec 100644
--- a/sys/amd64/amd64/initcpu.c
+++ b/sys/amd64/amd64/initcpu.c
@@ -130,6 +130,30 @@ init_amd(void)
 		}
 	}
 
+	/* Ryzen erratas. */
+	if (CPUID_TO_FAMILY(cpu_id) == 0x17 && CPUID_TO_MODEL(cpu_id) == 0x1 &&
+	    (cpu_feature2 & CPUID2_HV) == 0) {
+		/* 1021 */
+		msr = rdmsr(0xc0011029);
+		msr |= 0x2000;
+		wrmsr(0xc0011029, msr);
+
+		/* 1033 */
+		msr = rdmsr(0xc0011020);
+		msr |= 0x10;
+		wrmsr(0xc0011020, msr);
+
+		/* 1049 */
+		msr = rdmsr(0xc0011028);
+		msr |= 0x10;
+		wrmsr(0xc0011028, msr);
+
+		/* 1095 */
+		msr = rdmsr(0xc0011020);
+		msr |= 0x200;
+		wrmsr(0xc0011020, msr);
+	}
+
 	/*
 	 * Work around a problem on Ryzen that is triggered by executing
 	 * code near the top of user memory, in our case the signal
diff --git a/sys/x86/x86/cpu_machdep.c b/sys/x86/x86/cpu_machdep.c
index d897d518cbc..3416f949686 100644
--- a/sys/x86/x86/cpu_machdep.c
+++ b/sys/x86/x86/cpu_machdep.c
@@ -709,6 +709,13 @@ cpu_idle_tun(void *unused __unused)
 
 	if (TUNABLE_STR_FETCH("machdep.idle", tunvar, sizeof(tunvar)))
 		cpu_idle_selector(tunvar);
+	else if (cpu_vendor_id == CPU_VENDOR_AMD &&
+	    CPUID_TO_FAMILY(cpu_id) == 0x17 && CPUID_TO_MODEL(cpu_id) == 0x1) {
+		/* Ryzen erratas 1057, 1109. */
+		cpu_idle_selector("hlt");
+		idle_mwait = 0;
+	}
+
 	if (cpu_vendor_id == CPU_VENDOR_INTEL && cpu_id == 0x506c9) {
 		/*
 		 * Apollo Lake errata APL31 (public errata APL30).
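
The read-modify-write steps in the initcpu.c hunk can be tried out in
userland with a small sketch like the one below. It is only an
illustration under assumptions: the table and the helper names
(ryzen_bits, ryzen_mask_for, ryzen_apply, ryzen_bits_set) are invented
here, and nothing reads or writes a real MSR. Only the MSR numbers, the
masks and the erratum numbers are taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* One workaround bit: MSR number, bit mask, erratum number. */
struct chicken_bit {
	uint32_t msr;
	uint64_t mask;
	int erratum;
};

/* Values copied from the init_amd() hunk; the table itself is hypothetical. */
static const struct chicken_bit ryzen_bits[] = {
	{ 0xc0011029, 0x2000, 1021 },
	{ 0xc0011020, 0x10, 1033 },
	{ 0xc0011028, 0x10, 1049 },
	{ 0xc0011020, 0x200, 1095 },
};

/* OR together every mask the patch sets in the given MSR (0 if none). */
uint64_t
ryzen_mask_for(uint32_t msr)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < sizeof(ryzen_bits) / sizeof(ryzen_bits[0]); i++)
		if (ryzen_bits[i].msr == msr)
			mask |= ryzen_bits[i].mask;
	return (mask);
}

/* What the MSR would hold after the patch's "msr |= mask" steps. */
uint64_t
ryzen_apply(uint32_t msr, uint64_t old)
{
	return (old | ryzen_mask_for(msr));
}

/*
 * 1 if a value read back already has every workaround bit set, else 0.
 * An MSR the patch does not touch has an empty mask and so reports 1.
 */
int
ryzen_bits_set(uint32_t msr, uint64_t value)
{
	uint64_t mask = ryzen_mask_for(msr);

	return ((value & mask) == mask);
}
```

Feeding ryzen_bits_set() a value read back with cpucontrol would be one
way to see whether the BIOS or microcode has already set a given bit.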


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-05 Thread Pete French
> This is true, but absolutely irrelevant.
>
> Modern CPUs have hundreds, if not thousands, of MSR registers.  Only some of
> them define architectural state, and are saved/restored on context switches.
> Chicken bits are global knobs not relevant to the vmm entry.

That actually makes far more sense. I was kind of puzzled as to how it
would work if they were per VM :-)

> Which other files ?

sys/x86/include/specialreg.h and sys/x86/x86/cpu_machdep.c

Those are in your original patch as well as the change
to sys/amd64/amd64/initcpu.c, but your email earlier only
patches sys/amd64/amd64/initcpu.c and not the others.

So I assumed I would keep the changes to the other two files ?

-pete.



Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-05 Thread Konstantin Belousov
On Thu, Jul 05, 2018 at 02:23:15PM +0100, Pete French wrote:
> 
> 
> On 05/07/2018 11:47, Konstantin Belousov wrote:
> > Why do you state that they are saved/restored ?  What is the evidence ?
> 
> 
> https://software.intel.com/en-us/blogs/2009/06/25/virtualization-and-performance-understanding-vm-exits
> 
> specifically...
> 
> 3) "Save MSRs in the VM-exit MSR-store area."
> 
> and
> 
> 5) "Load MSRs from the VM-exit MSR-load area."
> 
> but maybe that's not actually true, I assumed it was given it's an Intel
> document, but admittedly it's not an actual specification.
This is true, but absolutely irrelevant.

Modern CPUs have hundreds, if not thousands, of MSR registers.  Only some of
them define architectural state, and are saved/restored on context switches.
Chicken bits are global knobs not relevant to the vmm entry.

> 
> 
> > On VM the patch should be NOP, testing it is a waste of time IMO.
> 
> 
> OK, will ignore that then. I am running the new patch on my workstation 
> now - I still need the old patch for the other files, yes ?
Which other files ?


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-05 Thread Pete French




On 05/07/2018 11:47, Konstantin Belousov wrote:
> Why do you state that they are saved/restored ?  What is the evidence ?

https://software.intel.com/en-us/blogs/2009/06/25/virtualization-and-performance-understanding-vm-exits

specifically...

3) "Save MSRs in the VM-exit MSR-store area."

and

5) "Load MSRs from the VM-exit MSR-load area."

but maybe that's not actually true, I assumed it was given it's an Intel
document, but admittedly it's not an actual specification.

> On VM the patch should be NOP, testing it is a waste of time IMO.

OK, will ignore that then. I am running the new patch on my workstation
now - I still need the old patch for the other files, yes ?


-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-05 Thread Konstantin Belousov
On Thu, Jul 05, 2018 at 11:43:29AM +0100, Pete French wrote:
> > It does not make any sense to even try to access the chicken bits
> > MSRs when running under virtualization.  It is the duty of the
> > hypervisor to configure hardware. 
> 
> I would tend to agree with you :-) I was kind of surprised to read that they
> are actually saved and restored as part of a VM context switch in fact.
Why do you state that they are saved/restored ?  What is the evidence ?

> 
> > I updated the patch.
> 
> Thanks I shall try this now on my workstation and the Epyc virtual machine

On VM the patch should be NOP, testing it is a waste of time IMO.
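
A rough sketch of why the updated patch should do nothing in a guest:
the block is gated on the "hypervisor present" bit. The assumptions here
are that CPUID2_HV in the patch is bit 31 of CPUID leaf 1 ECX and that
the hypervisor actually sets it; the function names are made up for
illustration and are not kernel code.

```c
#include <stdint.h>

/* Bit 31 of CPUID leaf 1 ECX: conventionally set when running as a guest. */
#define HV_BIT	0x80000000u

/* 1 if the hypervisor-present bit is set in the given ECX value. */
int
hypervisor_present(uint32_t cpuid1_ecx)
{
	return ((cpuid1_ecx & HV_BIT) != 0);
}

/*
 * 1 when the MSR writes would be attempted (family 0x17, model 0x1,
 * bare metal), 0 when the whole block is skipped.
 */
int
should_set_chicken_bits(unsigned family, unsigned model, uint32_t cpuid1_ecx)
{
	return (family == 0x17 && model == 0x1 &&
	    !hypervisor_present(cpuid1_ecx));
}
```

With the bit set the condition is false no matter which CPU model the
guest sees, so none of the rdmsr/wrmsr calls run.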


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-05 Thread Pete French
> It does not make any sense to even try to access the chicken bits
> MSRs when running under virtualization.  It is the duty of the
> hypervisor to configure hardware. 

I would tend to agree with you :-) I was kind of surprised to read that they
are actually saved and restored as part of a VM context switch in fact.

> I updated the patch.

Thanks I shall try this now on my workstation and the Epyc virtual machine

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-05 Thread Konstantin Belousov
On Thu, Jul 05, 2018 at 11:13:10AM +0100, Pete French wrote:
> So, I got my first lockup in weeks, testing with the latest stable
> and the patch which sets the kernel bits. But I can't say if it's
> Ryzen related or not.
> 
> Meanwhile I also got access to an Epyc server in Azure. Am also
> running the latest STABLE on that to see how it goes. Interesting
> thing there is that there appears to be no access to the MSRs.
> They all appear as zero using cpucontrol. I am not entirely surprised
> by this as they are very low level, but I did think they were saved
> and restored during context switches between virtual machines so I
> was hoping to be able to set them. Is this normal ?

It does not make any sense to even try to access the chicken bits
MSRs when running under virtualization.  It is the duty of the
hypervisor to configure hardware.  I updated the patch.

diff --git a/sys/amd64/amd64/initcpu.c b/sys/amd64/amd64/initcpu.c
index ccc5e64d0c4..bb342f42dec 100644
--- a/sys/amd64/amd64/initcpu.c
+++ b/sys/amd64/amd64/initcpu.c
@@ -130,6 +130,30 @@ init_amd(void)
 		}
 	}
 
+	/* Ryzen erratas. */
+	if (CPUID_TO_FAMILY(cpu_id) == 0x17 && CPUID_TO_MODEL(cpu_id) == 0x1 &&
+	    (cpu_feature2 & CPUID2_HV) == 0) {
+		/* 1021 */
+		msr = rdmsr(0xc0011029);
+		msr |= 0x2000;
+		wrmsr(0xc0011029, msr);
+
+		/* 1033 */
+		msr = rdmsr(0xc0011020);
+		msr |= 0x10;
+		wrmsr(0xc0011020, msr);
+
+		/* 1049 */
+		msr = rdmsr(0xc0011028);
+		msr |= 0x10;
+		wrmsr(0xc0011028, msr);
+
+		/* 1095 */
+		msr = rdmsr(0xc0011020);
+		msr |= 0x200;
+		wrmsr(0xc0011020, msr);
+	}
+
 	/*
 	 * Work around a problem on Ryzen that is triggered by executing
 	 * code near the top of user memory, in our case the signal
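
For anyone following the condition in the patch above, here is a minimal
userland sketch of how the family/model/hypervisor test works out. It is
not the kernel code: the helper names and HV_PRESENT_BIT are made up for
illustration, and it assumes the usual AMD CPUID layout (family = base
family + extended family, model = extended model as the high nibble,
hypervisor flag in bit 31 of the leaf 1 ECX features word).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed: bit 31 of CPUID leaf 1 ECX is set when running under a hypervisor. */
#define	HV_PRESENT_BIT	0x80000000u

/* Display family: base family (bits 11:8) plus extended family (bits 27:20). */
static uint32_t
sig_family(uint32_t sig)
{
	return (((sig >> 8) & 0xf) + ((sig >> 20) & 0xff));
}

/* Display model: extended model (bits 19:16) is the high nibble. */
static uint32_t
sig_model(uint32_t sig)
{
	return ((((sig >> 16) & 0xf) << 4) | ((sig >> 4) & 0xf));
}

/* Same shape as the patch's test: family 0x17, model 0x1, bare metal only. */
static bool
ryzen_errata_applicable(uint32_t sig, uint32_t features2)
{
	return (sig_family(sig) == 0x17 && sig_model(sig) == 0x1 &&
	    (features2 & HV_PRESENT_BIT) == 0);
}
```

Feeding it the Id=0x800f11 and Features2=0x7ed8320b values from the
Threadripper 1950X dmesg posted in this thread gives family 0x17, model
0x1 and no hypervisor bit, so the workaround block would run there. On
a guest such as the Azure Epyc VM the hypervisor bit would normally be
set and the block would be skipped.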


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-05 Thread Pete French
So, I got my first lockup in weeks, testing with the latest stable
and the patch which sets the kernel bits. But I can't say if it's
Ryzen related or not.

Meanwhile I also got access to an Epyc server in Azure. Am also
running the latest STABLE on that to see how it goes. Interesting
thing there is that there appears to be no access to the MSRs.
They all appear as zero using cpucontrol. I am not entirely surprised
by this as they are very low level, but I did think they were saved
and restored during context switches between virtual machines so I
was hoping to be able to set them. Is this normal ?

-pete.



Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-03 Thread Pete French




On 03/07/2018 11:09, Konstantin Belousov wrote:
> On Tue, Jul 03, 2018 at 10:27:06AM +0100, Pete French wrote:
>>> It is very likely that the latest microcode sets the chicken bits for the
>>> known erratas already.  AFAIK, this is the best that a ucode update
>>> can typically do anyway.
>>
>> I just did some testing - it does do these bits:
>
> By 'it' you mean the microcode update/BIOS on your board ?

Yes, sorry. As I was testing without the patch I looked to see what the
values were that it had set.

I may have got my hex wrong on the last two though I have to say - it
may be setting all four bits.


-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-03 Thread Konstantin Belousov
On Tue, Jul 03, 2018 at 10:27:06AM +0100, Pete French wrote:
> 
> > It is very likely that the latest microcode sets the chicken bits for the
> > known erratas already.  AFAIK, this is the best that a ucode update
> > can typically do anyway.
> >
> 
> I just did some testing - it does do these bits:
By 'it' you mean the microcode update/BIOS on your board ?

> 
> 
>  cpucontrol -m '0xc0011029|=0x2000' $x
>  cpucontrol -m '0xc0011020|=0x10' $x
> 
> but it does not do these bits:
> 
>  cpucontrol -m '0xc0011028|=0x10' $x
>  cpucontrol -m '0xc0011020|=0x200' $x
> 
> (though someone else might want to double check that as I may have
> miscounted the bits!)
> 
> am going to try your patch today
> 
> -pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-03 Thread Pete French




> It is very likely that the latest microcode sets the chicken bits for the
> known erratas already.  AFAIK, this is the best that a ucode update
> can typically do anyway.

I just did some testing - it does do these bits:


cpucontrol -m '0xc0011029|=0x2000' $x
cpucontrol -m '0xc0011020|=0x10' $x

but it does not do these bits:

cpucontrol -m '0xc0011028|=0x10' $x
cpucontrol -m '0xc0011020|=0x200' $x

(though someone else might want to double check that as I may have
miscounted the bits!)
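
For double-checking the bit counting, a small sketch along these lines
may help. It only does the arithmetic; the helper names are invented
for illustration, and the value to test would be whatever you read back
from the MSR yourself.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when every bit in mask is already set in value. */
static bool
mask_is_set(uint64_t value, uint64_t mask)
{
	return ((value & mask) == mask);
}

/* Position of the single set bit in mask, or -1 if mask is not one bit. */
static int
mask_bit_index(uint64_t mask)
{
	int i;

	if (mask == 0 || (mask & (mask - 1)) != 0)
		return (-1);
	for (i = 0; (mask >> i) != 1; i++)
		;
	return (i);
}
```

With the masks used in the commands above, 0x2000 is bit 13, 0x10 is
bit 4 and 0x200 is bit 9, so mask_is_set(value, 0x200) on the value
read from 0xc0011020 tells you whether the last workaround is already
in place.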


am going to try your patch today

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-01 Thread Konstantin Belousov
On Sun, Jul 01, 2018 at 11:15:56AM +0100, Pete French wrote:
> > This should be the kernel patch equivalent to the script.
> 
> Ah, thank you. I shall give this a try on Tuesday when I am
> physically back in front of the machine. I have been trying without
> the patch as you asked by the way, and with the latest microcode
> update (0x8001137) it also seems stable, without these tweaks. But I
> haven't stressed it too much - if the errata says to set the bits then
> we should set the bits.

It is very likely that the latest microcode sets the chicken bits for the
known erratas already.  AFAIK, this is the best that a ucode update
can typically do anyway.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-01 Thread Pete French
> This should be the kernel patch equivalent to the script.

Ah, thank you. I shall give this a try on Tuesday when I am
physically back in front of the machine. I have been trying without
the patch as you asked by the way, and with the latest microcode
update (0x8001137) it also seems stable, without these tweaks. But I
haven't stressed it too much - if the errata says to set the bits then
we should set the bits.

> According to the revision document, some of the erratas are applicable
> to the Ryzen 2, but I do not want to do the bit tweaking without a
> confirmation.

I was wondering about that - but I don't have a Ryzen 2 to hand to test
with unfortunately.

Will let you know how the patch goes next week, thanks,

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-07-01 Thread Pete French
> % sudo x86info -a | grep Microcode
> Microcode patch level: 0x8001136
>
> Without that script, the system would lock up as often as 5-6 times a day.
> Now running without any lockup at all for 3 days, with all kinds
> of workload from idle to torture tests. Too early to tell, but it
> looks good for now.

This is interesting to me because (as per previous email) I
have tried *without* the script, and it seems stable to me with that
too!

root@dilbert:/home/petefrench # x86info -a | grep Microcode
Microcode patch level: 0x8001137

So, I am running a slightly later microcode than you are - want to give
that a try without the patch maybe ? Though I see it's going into the kernel
which makes me very happy, as I prefer stability over pretty much anything
else on my machines :-)

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-30 Thread Konstantin Belousov
On Tue, Jun 26, 2018 at 01:32:01PM +0100, Pete French wrote:
> the dmesg wraps around if I boot verbosely, but here's the contents of
> /var/log/messages from the time it starts to where it stops
> talking about CPU specific stuff... if you need something else then
> let me know - this is an easy machine to reboot and play about with.

This should be the kernel patch equivalent to the script.
According to the revision document, some of the erratas are applicable
to the Ryzen 2, but I do not want to do the bit tweaking without a
confirmation.

diff --git a/sys/amd64/amd64/initcpu.c b/sys/amd64/amd64/initcpu.c
index ccc5e64d0c4..aac3ccb7c73 100644
--- a/sys/amd64/amd64/initcpu.c
+++ b/sys/amd64/amd64/initcpu.c
@@ -130,6 +130,29 @@ init_amd(void)
 		}
 	}
 
+	/* Ryzen erratas. */
+	if (CPUID_TO_FAMILY(cpu_id) == 0x17 && CPUID_TO_MODEL(cpu_id) == 0x1) {
+		/* 1021 */
+		msr = rdmsr(0xc0011029);
+		msr |= 0x2000;
+		wrmsr(0xc0011029, msr);
+
+		/* 1033 */
+		msr = rdmsr(0xc0011020);
+		msr |= 0x10;
+		wrmsr(0xc0011020, msr);
+
+		/* 1049 */
+		msr = rdmsr(0xc0011028);
+		msr |= 0x10;
+		wrmsr(0xc0011028, msr);
+
+		/* 1095 */
+		msr = rdmsr(0xc0011020);
+		msr |= 0x200;
+		wrmsr(0xc0011020, msr);
+	}
+
 	/*
 	 * Work around a problem on Ryzen that is triggered by executing
 	 * code near the top of user memory, in our case the signal
diff --git a/sys/x86/include/specialreg.h b/sys/x86/include/specialreg.h
index 0ea6e61e652..c3900dadf05 100644
--- a/sys/x86/include/specialreg.h
+++ b/sys/x86/include/specialreg.h
@@ -998,18 +998,18 @@
 #define	MSR_TOP_MEM	0xc001001a	/* boundary for ram below 4G */
 #define	MSR_TOP_MEM2	0xc001001d	/* boundary for ram above 4G */
 #define	MSR_NB_CFG1	0xc001001f	/* NB configuration 1 */
+#define	MSR_K8_UCODE_UPDATE	0xc0010020	/* update microcode */
+#define	MSR_MC0_CTL_MASK	0xc0010044
 #define	MSR_P_STATE_LIMIT	0xc0010061	/* P-state Current Limit Register */
 #define	MSR_P_STATE_CONTROL	0xc0010062	/* P-state Control Register */
 #define	MSR_P_STATE_STATUS	0xc0010063	/* P-state Status Register */
 #define	MSR_P_STATE_CONFIG(n)	(0xc0010064 + (n)) /* P-state Config */
 #define	MSR_SMM_ADDR	0xc0010112	/* SMM TSEG base address */
 #define	MSR_SMM_MASK	0xc0010113	/* SMM TSEG address mask */
+#define	MSR_VM_CR	0xc0010114	/* SVM: feature control */
+#define	MSR_VM_HSAVE_PA	0xc0010117	/* SVM: host save area address */
 #define	MSR_EXTFEATURES	0xc0011005	/* Extended CPUID Features override */
 #define	MSR_IC_CFG	0xc0011021	/* Instruction Cache Configuration */
-#define	MSR_K8_UCODE_UPDATE	0xc0010020	/* update microcode */
-#define	MSR_MC0_CTL_MASK	0xc0010044
-#define	MSR_VM_CR	0xc0010114	/* SVM: feature control */
-#define	MSR_VM_HSAVE_PA	0xc0010117	/* SVM: host save area address */
 
 /* MSR_VM_CR related */
 #define	VM_CR_SVMDIS	0x10	/* SVM: disabled by BIOS */
diff --git a/sys/x86/x86/cpu_machdep.c b/sys/x86/x86/cpu_machdep.c
index d897d518cbc..3416f949686 100644
--- a/sys/x86/x86/cpu_machdep.c
+++ b/sys/x86/x86/cpu_machdep.c
@@ -709,6 +709,13 @@ cpu_idle_tun(void *unused __unused)
 
 	if (TUNABLE_STR_FETCH("machdep.idle", tunvar, sizeof(tunvar)))
 		cpu_idle_selector(tunvar);
+	else if (cpu_vendor_id == CPU_VENDOR_AMD &&
+	    CPUID_TO_FAMILY(cpu_id) == 0x17 && CPUID_TO_MODEL(cpu_id) == 0x1) {
+		/* Ryzen erratas 1057, 1109. */
+		cpu_idle_selector("hlt");
+		idle_mwait = 0;
+	}
+
 	if (cpu_vendor_id == CPU_VENDOR_INTEL && cpu_id == 0x506c9) {
 		/*
 		 * Apollo Lake errata APL31 (public errata APL30).
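
The ordering in the cpu_machdep.c hunk above matters: an explicit
machdep.idle tunable still wins, and the Ryzen default only applies
when nothing was set. A rough standalone sketch of that decision, with
a hypothetical helper name and plain parameters instead of the kernel
globals, might look like this:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the idle method name to select, or NULL to keep the kernel's
 * own default. tunable is the machdep.idle value, or NULL if unset.
 * Illustrative only; this is not a kernel interface.
 */
static const char *
pick_idle_method(const char *tunable, bool is_amd, uint32_t family,
    uint32_t model)
{
	if (tunable != NULL)
		return (tunable);	/* an explicit setting always wins */
	if (is_amd && family == 0x17 && model == 0x1)
		return ("hlt");		/* errata 1057, 1109 */
	return (NULL);
}
```

So someone who has set machdep.idle in loader.conf keeps their choice
even on an affected part, which is worth remembering when comparing
results between machines.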


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-29 Thread cpghost
On 06/26/18 13:29, Konstantin Belousov wrote:
> On Tue, Jun 26, 2018 at 11:31:26AM +0100, Pete French wrote:
>>> On 06/18/2018 09:34, Pete French wrote:
>>>> Presumably in the slightly longer term these workarounds go into the
>>>> actual kernel if it detects Ryzen ?
>>>
>>> Yes, Kostik said he would code this into the kernel after he gets enough
>>> feedback.
>>
>> So, I've been running with the sysctl and cpuctl fixes from
>> https://lists.freebsd.org/pipermail/freebsd-current/2018-June/069799.html
>> for a couple of weeks now, with all the default settings back on (including
>> SMT) and it is now completely stable, so consider this one more point of
>> feedback

Same here on

FreeBSD monster 11.2-BETA2 FreeBSD 11.2-BETA2 #1 r334062: Tue May 22 23:46:29 CEST 2018 root@monster:/usr/obj/usr/src/sys/GENERIC  amd64

with a Threadripper 1950X and

% sudo x86info -a | grep Microcode
Microcode patch level: 0x8001136

Without that script, the system would lock up as often as 5-6 times a day.
Now running without any lockup at all for 3 days, with all kinds
of workload from idle to torture tests. Too early to tell, but it
looks good for now.

CPU: AMD Ryzen Threadripper 1950X 16-Core Processor  (3393.71-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
  Features=0x178bfbff
  Features2=0x7ed8320b
  AMD Features=0x2e500800
  AMD Features2=0x35c233ff
  Structured Extended Features=0x209c01a9
  XSAVE Features=0xf
  AMD Extended Feature Extensions ID EBX=0x1007
  SVM: (disabled in BIOS) NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
  TSC: P-state invariant, performance statistics

Thanks,
-cpghost.





Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-26 Thread Pete French

Gah! My memory said it was /var/boot - so close :-) Thanks!

On 26/06/2018 15:22, Freddie Cash wrote:
> On Tue, Jun 26, 2018, 5:33 AM Pete French wrote:
>
>>> Also, please show the 100 first lines of the verbose boot dmesg on this
>>> machine.
>>
>> the dmesg wraps around if I boot verbosely, but here's the contents of
>> /var/log/messages from the time it starts to where it stops
>> talking about CPU specific stuff... if you need something else then
>> let me know - this is an easy machine to reboot and play about with.
>
> /var/run/dmesg.boot is there for this very reason (dmesg buffer
> rolling over). :) It's the dmesg output for the current boot.
>
> Cheers,
> Freddie




Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-26 Thread Freddie Cash
On Tue, Jun 26, 2018, 5:33 AM Pete French wrote:

>
> > Also, please show the 100 first lines of the verbose boot dmesg on this
> > machine.
>
> the dmesg wraps around if I boot verbosely, but here's the contents of
> /var/log/messages from the time it starts to where it stops
> talking about CPU specific stuff... if you need something else then
> let me know - this is an easy machine to reboot and play about with.
>

/var/run/dmesg.boot is there for this very reason (dmesg buffer rolling
over). :) It's the dmesg output for the current boot.

Cheers,
Freddie



Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-26 Thread Pete French
> If you run without the script, with the same settings, do you experience
> problems ?

I have not tried this since the last BIOS update which brought with it
the latest AGESA version. Previously the machine would lock up
if I enabled SMT though, which is why looking at those settings
and cross referencing them to the errata was very interesting.

I just rebooted the machine to get you the log, and am going to
run without applying those fixes for the rest of the week to see how
it goes though. Will let you know one way or the other on Friday,
and will do my best to lock it up before then.

> Also, please show the 100 first lines of the verbose boot dmesg on this
> machine.

the dmesg wraps around if I boot verbosely, but here's the contents of
/var/log/messages from the time it starts to where it stops
talking about CPU specific stuff... if you need something else then
let me know - this is an easy machine to reboot and play about with.

-pete.

Jun 26 13:13:26 dilbert kernel: Table 'FACP' at 0xdd0fb6b8
Jun 26 13:13:26 dilbert kernel: Table 'APIC' at 0xdd0fb7d0
Jun 26 13:13:26 dilbert kernel: APIC: Found table at 0xdd0fb7d0
Jun 26 13:13:26 dilbert kernel: APIC: Using the MADT enumerator.
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 0 ACPI ID 1: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 0 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 1 ACPI ID 2: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 1 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 2 ACPI ID 3: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 2 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 3 ACPI ID 4: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 3 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 4 ACPI ID 5: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 4 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 5 ACPI ID 6: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 5 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 6 ACPI ID 7: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 6 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 7 ACPI ID 8: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 7 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 8 ACPI ID 9: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 8 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 9 ACPI ID 10: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 9 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 10 ACPI ID 11: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 10 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 11 ACPI ID 12: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 11 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 12 ACPI ID 13: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 12 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 13 ACPI ID 14: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 13 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 14 ACPI ID 15: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 14 (AP)
Jun 26 13:13:26 dilbert kernel: MADT: Found CPU APIC ID 15 ACPI ID 16: enabled
Jun 26 13:13:26 dilbert kernel: SMP: Added CPU 15 (AP)
Jun 26 13:13:26 dilbert kernel: Copyright (c) 1992-2018 The FreeBSD Project.
Jun 26 13:13:26 dilbert kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Jun 26 13:13:26 dilbert kernel: The Regents of the University of California. All rights reserved.
Jun 26 13:13:26 dilbert kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
Jun 26 13:13:26 dilbert kernel: FreeBSD 11.2-STABLE #0 r335659: Tue Jun 26 10:47:47 BST 2018
Jun 26 13:13:26 dilbert kernel: petefre...@dilbert.london-internal.ingresso.co.uk:/usr/obj/usr/src/sys/GENERIC amd64
Jun 26 13:13:26 dilbert kernel: FreeBSD clang version 6.0.0 (tags/RELEASE_600/final 326565) (based on LLVM 6.0.0)
Jun 26 13:13:26 dilbert kernel: Table 'FACP' at 0xdd0fb6b8
Jun 26 13:13:26 dilbert kernel: Table 'APIC' at 0xdd0fb7d0
Jun 26 13:13:26 dilbert kernel: Table 'FPDT' at 0xdd0fb8b0
Jun 26 13:13:26 dilbert kernel: Table 'FIDT' at 0xdd0fb8f8
Jun 26 13:13:26 dilbert kernel: Table 'SSDT' at 0xdd0fb998
Jun 26 13:13:26 dilbert kernel: Table 'SSDT' at 0xdd104630
Jun 26 13:13:26 dilbert kernel: Table 'CRAT' at 0xdd106948
Jun 26 13:13:26 dilbert kernel: Table 'CDIT' at 0xdd107898
Jun 26 13:13:26 dilbert kernel: Table 'SSDT' at 0xdd1078c8
Jun 26 13:13:26 dilbert kernel: Table 'MCFG' at 0xdd10a660
Jun 26 13:13:26 dilbert kernel: Table 'HPET' at 0xdd10a6a0
Jun 26 13:13:26 dilbert kernel: Table 'SSDT' at 0xdd10a6d8
Jun 26 13:13:26 dilbert kernel: Table 'UEFI' at 0xdd10a700
Jun 26 13:13:26 dilbert kernel: Table 'IVRS' at 0xdd10a748
Jun 26 13:13:26 dilbert kernel: Table 'SSDT' at 0xdd10a818
Jun 26 13:13:26 dilbert kernel: Table 'SSDT' at 0xdd10a910
Jun 26 13:13:26 

Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-26 Thread Konstantin Belousov
On Tue, Jun 26, 2018 at 11:31:26AM +0100, Pete French wrote:
> > On 06/18/2018 09:34, Pete French wrote:
> > > Presumably in the slightly longer term these workarounds go into the
> > > actual kernel if it detects Ryzen ?
> >
> > Yes, Kostik said he would code this into the kernel after he gets enough
> > feedback.
> 
> So, I've been running with the sysctl and cpuctl fixes from
> https://lists.freebsd.org/pipermail/freebsd-current/2018-June/069799.html
> for a couple of weeks now, with all the default settings back on (including
> SMT) and it is now completely stable, so consider this one more point of feedback

If you run without the script, with the same settings, do you experience
problems ?

Also, please show the 100 first lines of the verbose boot dmesg on this
machine.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-26 Thread Pete French
> On 06/18/2018 09:34, Pete French wrote:
> > Presumably in the slightly longer term these workarounds go into the
> > actual kernel if it detects Ryzen ?
>
> Yes, Kostik said he would code this into the kernel after he gets enough
> feedback.

So, I've been running with the sysctl and cpuctl fixes from
https://lists.freebsd.org/pipermail/freebsd-current/2018-June/069799.html
for a couple of weeks now, with all the default settings back on (including
SMT) and it is now completely stable, so consider this one more point of feedback
for incorporating this in the kernel :-)

cheers,

-pete.



Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-18 Thread Eric van Gyzen
On 06/18/2018 09:34, Pete French wrote:
> Presumably in the slightly longer term these workarounds go into the
> actual kernel if it detects Ryzen ?

Yes, Kostik said he would code this into the kernel after he gets enough
feedback.

Eric


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-18 Thread Pete French




On 18/06/2018 14:59, Mike Tancsa wrote:
> FYI, both my Epyc and Ryzen system have been running 2+ days with the
> tests that would normally hard lock the system in 5-60 min. The combo of
> Microcode updates and system settings

Thanks for the update - I turned all the default motherboard settings
back on when I got into work this morning, and it has been running fine
so far. Will run like this all week.

> https://lists.freebsd.org/pipermail/freebsd-current/2018-June/069799.html
>
> seem to have fixed the last issue I was seeing

Yes, those are the changes I have applied. I went through that list over
the weekend cross-referencing them to the errata document, and there are
a couple of things in there which made me think 'aha!' - particularly
the MWAIT causing lockups on SMT, as it was disabling SMT that made it
stable for me.

I also have access to a couple of (virtualised) Epyc machines which I am
going to run up later in the week and try the workarounds there. But it
does look very much like it's fixed, which is excellent.

Presumably in the slightly longer term these workarounds go into the
actual kernel if it detects Ryzen ?


-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-18 Thread Mike Tancsa
On 6/14/2018 6:30 AM, Pete French wrote:
>> Check out this thread on current. I re-ran the tests I did in Feb to
>> lockup the Ryzen box, and they are ok now with the latest micro code
>> updates from AMD.  I will let the tests run a good 48hrs, but in the
>> past it would only take 5-10 min to cause a hard lockup
>>
>> https://lists.freebsd.org/pipermail/freebsd-current/2018-June/069799.html
> 
> Ah, excellent, thanks for the pointer. I was already running the latest
> microcode, but have also now flashed my BIOS and applied the shell
> script fixes from that thread. Will try running with SMT re-enabled
> next week, as I need the machine not to lockup the next few days
> when I am not in the office, but it looks very promising.

FYI, both my Epyc and Ryzen systems have been running 2+ days with the
tests that would normally hard lock the system in 5-60 min. The combo of
Microcode updates and system settings

https://lists.freebsd.org/pipermail/freebsd-current/2018-June/069799.html

seem to have fixed the last issue I was seeing

---Mike

-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-14 Thread Pete French
> Check out this thread on current. I re-ran the tests I did in Feb to
> lockup the Ryzen box, and they are ok now with the latest micro code
> updates from AMD.  I will let the tests run a good 48hrs, but in the
> past it would only take 5-10 min to cause a hard lockup
>
> https://lists.freebsd.org/pipermail/freebsd-current/2018-June/069799.html

Ah, excellent, thanks for the pointer. I was already running the latest
microcode, but have also now flashed my BIOS and applied the shell
script fixes from that thread. Will try running with SMT re-enabled
next week, as I need the machine not to lockup the next few days
when I am not in the office, but it looks very promising.

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-13 Thread Nimrod Levy
Any idea when that microcode was published? I updated to the latest BIOS
last time I saw a lockup and I've been stable for 30 days which I think is
a new record for this system. I installed the port to update the microcode
and it showed that I already had the latest version, so I assume it came
with the BIOS update.

In any case, this is encouraging information.

On Wed, Jun 13, 2018 at 4:48 PM Mike Tancsa wrote:

> On 6/2/2018 4:53 AM, Pete French wrote:
> > So, I notice that https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225584
> > was closed as fixed. I can't remember if there was another bug report for
> > ongoing Ryzen issues at all - I have been experimenting and have re-enabled
> > most of the BIOS settings and tweaks fine, but I still need SMP disabled or
> > it locks up. Haven't tried for a week or so with that, and maybe some
> > of the latest changes in STABLE will help. Is this fixed for
> > everyone else, or are you all still running with SMP off ?
>
> Check out this thread on current. I re-ran the tests I did in Feb to
> lockup the Ryzen box, and they are ok now with the latest micro code
> updates from AMD.  I will let the tests run a good 48hrs, but in the
> past it would only take 5-10 min to cause a hard lockup
>
> https://lists.freebsd.org/pipermail/freebsd-current/2018-June/069799.html
>
> ---Mike
>
>
> --
> ---
> Mike Tancsa, tel +1 519 651 3400 x203
> Sentex Communications, m...@sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-13 Thread Mike Tancsa
On 6/2/2018 4:53 AM, Pete French wrote:
> So, I notice that https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225584 was
> closed as fixed. I can't remember if there was another bug report for ongoing
> Ryzen issues at all - I have been experimenting and have re-enabled most of
> the BIOS settings and tweaks fine, but I still need SMP disabled or it
> locks up. Haven't tried for a week or so with that, and maybe some
> of the latest changes in STABLE will help. Is this fixed for
> everyone else, or are you all still running with SMP off ?

Check out this thread on current. I re-ran the tests I did in Feb to
lockup the Ryzen box, and they are ok now with the latest micro code
updates from AMD.  I will let the tests run a good 48hrs, but in the
past it would only take 5-10 min to cause a hard lockup

https://lists.freebsd.org/pipermail/freebsd-current/2018-June/069799.html

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-03 Thread Pete French




On 03/06/2018 18:49, Mike Tancsa wrote:
> I have c-states disabled on the ryzen both for FreeBSD and Linux.  To
> disable SMP on the Epyc doesnt seem to be possible. But then again kill
> off 31 cores is a heavy cost to pay for stability :(  When I am back at
> the office, I will see if a recent checkout of HEAD still freezes the Epyc.

Agreed about the performance hit turning off SMP! Am hoping it's a
temporary thing, and if we can show it *is* SMP then maybe that will
help with a fix, as there can't be that many places where SMP is handled
differently to real cores I would think. If you could test then that
would be great!


cheers,

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-03 Thread Mike Tancsa
On 6/3/2018 1:10 PM, Pete French wrote:
> 
>> The compile bug has been fixed for me.  However, last I checked I can
>> still freeze a system by generating a lot of network traffic between VMs
>> in either bhyve or virtbox.  Its been a while since I tested (couple of
>> months) but I dont recall anything obviously committed that highlighted
>> that issue.  Note this is on Epyc and Ryzen boxes
> 
> Yes, it was your iperf3 tests which enabled me to freeze the system with
> SMP enabled. With SMP disabled it works fine, however. Have you tried
> that at all ? The only other settings I have are the global C-states
> and cool-n-quiet being disabled. I don't really care about either so
> haven't tested re-enabling them.

I have c-states disabled on the Ryzen both for FreeBSD and Linux.  Disabling
SMP on the Epyc doesn't seem to be possible. But then again, killing
off 31 cores is a heavy cost to pay for stability :(  When I am back at
the office, I will see if a recent checkout of HEAD still freezes the Epyc.

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-03 Thread Pete French




> The compile bug has been fixed for me.  However, last I checked I can
> still freeze a system by generating a lot of network traffic between VMs
> in either bhyve or virtbox.  Its been a while since I tested (couple of
> months) but I dont recall anything obviously committed that highlighted
> that issue.  Note this is on Epyc and Ryzen boxes

Yes, it was your iperf3 tests which enabled me to freeze the system with
SMP enabled. With SMP disabled it works fine, however. Have you tried
that at all ? The only other settings I have are the global C-states
and cool-n-quiet being disabled. I don't really care about either so
haven't tested re-enabling them.

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-03 Thread Nimrod Levy
For my case, it's been a while since I fiddled with anything. From memory,
the last time I worked on this, I updated to the latest BIOS on the Asus
Prime B350+, leaving the settings we've talked about here at stock
(C-states, memory clock, etc.), updated to 11-STABLE, and let things run
for a while. This time around, I've been up for almost 3 weeks, which is
about what I've been getting, so I'm expecting a lockup soon.

I think the last lockup I saw was a little different from before. In the
past, the console went blank and unresponsive with the monitor showing no
signal from the computer. The last time, the screen showed the normal
output with a login prompt, but was otherwise unresponsive both on the
console and from the network.


On Sun, Jun 3, 2018 at 9:49 AM Mike Tancsa  wrote:

> On 6/2/2018 7:48 PM, Don Lewis wrote:
> > On  2 Jun, Pete French wrote:
> >> So,I notice that
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225584 was
> >> closes as fixed. I cant remember if there was another bug report for
> ongoing
> >> Ryzen issues at all - I have been experimenting and have re-enabled
> most of
> >> the BIOS setting and tweaks fne, but I still need SMP disbled or it
> >> locks up. Havent tried for a week or so with that, and maybe some
> >> of the latest chnages in STABLE will help. is this fixed for
> >> everyone else, or are you all stll running with SMP off ?
> >
> > With that bug fix, I get pretty much the same behavior on my Ryzen
> > machine as on my AMD FX-8320E.  BIOS settings are pretty much the just
> > the defaults.  I'm running 12.0-CURRENT, so I can't really comment on
> > 11-STABLE.
>
> The compile bug has been fixed for me.  However, last I checked I can
> still freeze a system by generating a lot of network traffic between VMs
> in either bhyve or virtbox.  Its been a while since I tested (couple of
> months) but I dont recall anything obviously committed that highlighted
> that issue.  Note this is on Epyc and Ryzen boxes
>
> ---Mike


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-03 Thread Mike Tancsa
On 6/2/2018 7:48 PM, Don Lewis wrote:
> On  2 Jun, Pete French wrote:
>> So,I notice that https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225584 was
>> closes as fixed. I cant remember if there was another bug report for ongoing
>> Ryzen issues at all - I have been experimenting and have re-enabled most of
>> the BIOS setting and tweaks fne, but I still need SMP disbled or it
>> locks up. Havent tried for a week or so with that, and maybe some
>> of the latest chnages in STABLE will help. is this fixed for
>> everyone else, or are you all stll running with SMP off ?
> 
> With that bug fix, I get pretty much the same behavior on my Ryzen
> machine as on my AMD FX-8320E.  BIOS settings are pretty much the just
> the defaults.  I'm running 12.0-CURRENT, so I can't really comment on
> 11-STABLE.

The compile bug has been fixed for me.  However, last I checked I can
still freeze a system by generating a lot of network traffic between VMs
in either bhyve or VirtualBox.  It's been a while since I tested (a couple
of months), but I don't recall anything obviously committed that
highlighted that issue.  Note this is on Epyc and Ryzen boxes.

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-02 Thread Don Lewis
On  2 Jun, Pete French wrote:
> So,I notice that https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225584 was
> closes as fixed. I cant remember if there was another bug report for ongoing
> Ryzen issues at all - I have been experimenting and have re-enabled most of
> the BIOS setting and tweaks fne, but I still need SMP disbled or it
> locks up. Havent tried for a week or so with that, and maybe some
> of the latest chnages in STABLE will help. is this fixed for
> everyone else, or are you all stll running with SMP off ?

With that bug fix, I get pretty much the same behavior on my Ryzen
machine as on my AMD FX-8320E.  BIOS settings are pretty much just
the defaults.  I'm running 12.0-CURRENT, so I can't really comment on
11-STABLE.





Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-06-02 Thread Pete French
So, I notice that https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225584 was
closed as fixed. I can't remember if there was another bug report for ongoing
Ryzen issues at all - I have been experimenting and have re-enabled most of
the BIOS settings and tweaks fine, but I still need SMP disabled or it
locks up. I haven't tried for a week or so with that, and maybe some
of the latest changes in STABLE will help. Is this fixed for
everyone else, or are you all still running with SMP off?

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-25 Thread Mike Tancsa
On 4/24/2018 8:04 PM, Don Lewis wrote:
>>
>> I was able to lock it up with vbox. bhyve was just a little easier to
>> script and also I figured would be good to get VBox out of the mix in
>> case it was something specific to VBox.  I dont recall if I tried it
>> with SMT disabled.  Regardless, on Intel based systems I ran these tests
>> for 72hrs straight without issue.   I can sort of believe hardware issue
>> or flaky motherboard BIOSes (2 ASUS MBs, 1 MSI MB, 3 Ryzen chips), but
>> the fact that two server class MBs from SuperMicro along with an Epyc
>> chip also does the same thing makes me think something specific to
>> FreeBSD and this class of AMD CPU :(
> 
> It would be interesting to test other AMD CPUs.  I think AMD and Intel
> have some differences in the virtualization implementations.

I don't have any Opterons handy, unfortunately. However, I have a feeling
it's not so much due to virtualization, as I am pretty sure I had a crash
when I wasn't doing any VM testing as well.
Peter Grehan speculated it might have something to do with a lot of IPIs
being generated.  Are there any non-VM workloads that will generate many
IPIs?

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-25 Thread Pete French



> It would be interesting to test other AMD CPUs.  I think AMD and Intel
> have some differences in the virtualization implementations.


I've been using AMD CPUs pretty extensively since the early '90s - back
then I was running NeXTStep, and out of the three Intel alternatives it
was the only one which ran properly. I've never seen anything like this
before now though. In particular, this Ryzen replaces an old Phenom II -
i.e. it's the same disc and hence the same OS and VirtualBox config -
and that ran fine.


So there's something odd about the Ryzen here.

I'm also not sure it's to do with the virtualisation - I got my first
lockup with no virtualisation going on at all. With SMT disabled, the
system hasn't locked up, despite heavy VirtualBox use.


-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-24 Thread Don Lewis
On 24 Apr, Mike Tancsa wrote:
> On 4/24/2018 10:01 AM, Pete French wrote:
>> 
>> Well, I ranh the iperf tests between real machine for 24 hours and that
>> worked fine. I also then spun up a Virtualbox with Win10 in it, and ran
>> iuperf to there at the same time as doing it betwene real machines, and
>> also did a full virus scan to exercise the disc. the idea being to
>> replicate Mondays lockup.
> 
> I was able to lock it up with vbox. bhyve was just a little easier to
> script and also I figured would be good to get VBox out of the mix in
> case it was something specific to VBox.  I dont recall if I tried it
> with SMT disabled.  Regardless, on Intel based systems I ran these tests
> for 72hrs straight without issue.   I can sort of believe hardware issue
> or flaky motherboard BIOSes (2 ASUS MBs, 1 MSI MB, 3 Ryzen chips), but
> the fact that two server class MBs from SuperMicro along with an Epyc
> chip also does the same thing makes me think something specific to
> FreeBSD and this class of AMD CPU :(

It would be interesting to test other AMD CPUs.  I think AMD and Intel
have some differences in the virtualization implementations.



Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-24 Thread Pete French
Yes, I too think it's definitely a Ryzen issue with FreeBSD - is there a
way you could try it (at least on the desktop) with SMT off? I have
the following options turned off:


SMT
Global C-states
Cool'n'Quiet
Core boost

The last few obviously tweak processor clocks and thus don't surprise me
too much as a source of instability, but I am surprised that SMT causes
issues, as that should (I believe) simply present as two cores.


On 24/04/2018 15:22, Mike Tancsa wrote:

> On 4/24/2018 10:01 AM, Pete French wrote:
>>
>> Well, I ranh the iperf tests between real machine for 24 hours and that
>> worked fine. I also then spun up a Virtualbox with Win10 in it, and ran
>> iuperf to there at the same time as doing it betwene real machines, and
>> also did a full virus scan to exercise the disc. the idea being to
>> replicate Mondays lockup.
>
> I was able to lock it up with vbox. bhyve was just a little easier to
> script and also I figured would be good to get VBox out of the mix in
> case it was something specific to VBox.  I dont recall if I tried it
> with SMT disabled.  Regardless, on Intel based systems I ran these tests
> for 72hrs straight without issue.   I can sort of believe hardware issue
> or flaky motherboard BIOSes (2 ASUS MBs, 1 MSI MB, 3 Ryzen chips), but
> the fact that two server class MBs from SuperMicro along with an Epyc
> chip also does the same thing makes me think something specific to
> FreeBSD and this class of AMD CPU :(
>
> ---Mike





Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-24 Thread Mike Tancsa
On 4/24/2018 10:01 AM, Pete French wrote:
> 
> Well, I ranh the iperf tests between real machine for 24 hours and that
> worked fine. I also then spun up a Virtualbox with Win10 in it, and ran
> iuperf to there at the same time as doing it betwene real machines, and
> also did a full virus scan to exercise the disc. the idea being to
> replicate Mondays lockup.

I was able to lock it up with vbox. bhyve was just a little easier to
script and also I figured would be good to get VBox out of the mix in
case it was something specific to VBox.  I dont recall if I tried it
with SMT disabled.  Regardless, on Intel based systems I ran these tests
for 72hrs straight without issue.   I can sort of believe hardware issue
or flaky motherboard BIOSes (2 ASUS MBs, 1 MSI MB, 3 Ryzen chips), but
the fact that two server class MBs from SuperMicro along with an Epyc
chip also does the same thing makes me think something specific to
FreeBSD and this class of AMD CPU :(

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-24 Thread Pete French



On 24/04/2018 14:56, Mike Tancsa wrote:

> I was doing the tests with bhyve, and the iperf tests were between VMs
> on the same box.  That seems to trigger it fairly quickly. The Epyc took
> a bit more work, but I could reliable do it there too.


Well, I ran the iperf tests between real machines for 24 hours and that
worked fine. I also then spun up a VirtualBox with Win10 in it, and ran
iperf to there at the same time as doing it between real machines, and
also did a full virus scan to exercise the disc, the idea being to
replicate Monday's lockup.


But it's all fine - the only difference being that I have disabled SMT
again. Do you get lockups with SMT disabled?


I don't have any experience with bhyve and don't have the time to start
learning right now, hence me testing with VirtualBox, which I already have
up and running. I can repeat with SMT on again to verify that it does then
lock up.


All very odd though - am pleased it's stable, but disappointed at the
amount of stuff I have had to turn off to get it to that state.


-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-24 Thread Mike Tancsa
On 4/23/2018 8:19 AM, Pete French wrote:
> 
> All worked fine until just no,w when it did lockup. But the lockup
> happened when I fired up VirtualBox as well on the Ryzen machine, had that
> doing a lot of disc activity and also did a lot of ZFS activity on the box
> itself. Which I find interesting, because your lockups were also happenng
> when VirtualBox was running were they not ?
> 
> So, am going to try that again (the iperf test and simultaneous Win10
> in VirtualBox), but with SMT off, to see if I can reproduce it.
> 


I was doing the tests with bhyve, and the iperf tests were between VMs
on the same box.  That seems to trigger it fairly quickly. The Epyc took
a bit more work, but I could reliably do it there too.

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-23 Thread Don Lewis
On 23 Apr, Mike Tancsa wrote:
> On 4/22/2018 5:29 PM, Don Lewis wrote:
>> Pretty much all of my BIOS settings are the defaults.
>> 
>> I suspect that the idle hang issues are motherboard and/or BIOS
>> specific.  For the record my motherboard is a Gigabyte
>> GA-AX370-Gaming 5.
> Hi Don,
>   Any chance you could try that bhyve test ? Basically, or 3 VMs and then
> run iperf3 between the instances.  I can lock up all 3 of my AMD boards
> (2 ASUS, one MSI) and both my Epyc (SuperMicro) boards.

I should be able to do that, but it will likely be a few days before I
can get around to it.



Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-23 Thread Mike Tancsa
On 4/22/2018 5:29 PM, Don Lewis wrote:
> Pretty much all of my BIOS settings are the defaults.
> 
> I suspect that the idle hang issues are motherboard and/or BIOS
> specific.  For the record my motherboard is a Gigabyte
> GA-AX370-Gaming 5.
Hi Don,
Any chance you could try that bhyve test ? Basically, or 3 VMs and then
run iperf3 between the instances.  I can lock up all 3 of my AMD boards
(2 ASUS, one MSI) and both my Epyc (SuperMicro) boards.
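
[Editor's sketch, not part of Mike's message.] The test described above amounts to running iperf3 in every direction between a handful of guests on the same host. As a rough illustration only, the Python snippet below prints the commands one might run inside each VM. The guest addresses, the 600-second duration and the helper name `mesh_commands` are assumptions made for this example; the actual script is the one linked elsewhere in the thread. Each direction gets its own port because a single iperf3 server process handles one test at a time.

```python
from itertools import permutations

BASE_PORT = 5201  # iperf3's default port; each further pair counts up from here


def mesh_commands(vm_addrs, seconds=600, base_port=BASE_PORT):
    """Build an iperf3 server/client command pair for every ordered VM pair.

    The server command is meant to be run on "dst", the client command on
    "src". Ports are unique per pair so tests can run concurrently.
    """
    plan = []
    for offset, (src, dst) in enumerate(permutations(vm_addrs, 2)):
        port = str(base_port + offset)
        plan.append({
            "src": src,
            "dst": dst,
            "server": ["iperf3", "-s", "-p", port],
            "client": ["iperf3", "-c", dst, "-p", port, "-t", str(seconds)],
        })
    return plan


if __name__ == "__main__":
    # Hypothetical guest addresses; substitute whatever the VMs really use.
    vms = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
    for job in mesh_commands(vms):
        print(f"on {job['dst']}: {' '.join(job['server'])}")
        print(f"on {job['src']}: {' '.join(job['client'])}")
```

With three guests this yields six server/client pairs, which keeps traffic flowing in both directions between every pair of VMs at once.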

---Mike

-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-23 Thread Pete French
> fairly pleased with that. Am going to try the test case that Mike Tancsa
> posted to see if I can get it to lockup today though,. now that I am in
> front of it to reset of required. Will report results either way...

So, I have been running the iperf scripts all day (about 4 hours now) with
the Ryzen box being the receiver and another BSD box as the sender. I
had re-enabled SMT on the Ryzen box out of curiosity.

All worked fine until just now, when it did lock up. But the lockup
happened when I fired up VirtualBox as well on the Ryzen machine, had that
doing a lot of disc activity and also did a lot of ZFS activity on the box
itself. Which I find interesting, because your lockups were also happening
when VirtualBox was running, were they not?

So, am going to try that again (the iperf test and simultaneous Win10
in VirtualBox), but with SMT off, to see if I can reproduce it.

-pete.



Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-23 Thread Pete French
> I have a Ryzen system
> parts: https://pcpartpicker.com/list/HhycYT
> dmesg: http://dmesgd.nycbug.org/index.cgi?do=view=3516
>
> which I have been using for months without issue and on completely
> default settings.

That's encouraging - and I think you are the first person to report
that without any tweaking, to the best of my recollection!

> What mobo do you have? Did you update the BIOS to the latest version?

MSI X370 Xpower Titanium - I upgraded to that one from my original choice
after comments on this thread about power, which I hadn't considered (I
had also decided to buy a larger Ryzen 7). The BIOS is the latest version,
and I cleared CMOS to get the default settings. Memory is some Crucial
DDR4 3200 which is underclocked to 2133. The processor itself
is a Ryzen 7 1700 (not X) which was made in week 33, so it should not have
the hardware fault.

It's been stable over the weekend, as I posted in a different email. I am
going to try and knock it over today with some tests if I can.

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-23 Thread Pete French
> And of course now that we say something about it, I get a lockup
> overnight...

Sod's law, I guess. Mine survived the weekend quite happily though, so am
fairly pleased with that. Am going to try the test case that Mike Tancsa
posted to see if I can get it to lock up today though, now that I am in
front of it to reset if required. Will report results either way...

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-22 Thread Don Lewis
On 20 Apr, Pete French wrote:
> So, resurrecting the thread from a few weeks ago, as I finally found
> time yesterday to out together the Ryzen machine I bought the parts for in
> Jaunary (busy year at work). All went smoothl;y, checked it
> booted up, used it for 15 minutes, was impressed by the speed and went home.
> 
> ...and by the time I got home, an hour or so later, it had locked up hard.
> 
> I was somewhat dissapointed, as I had seen various fixes go in,. and had hoped
> the issues were fixed. This morning I have booted the machine back up,
> tweaking the BIOPS to do things mentioned in this thread, viz:
> 
>   Disable Turbo Boost
>   Disable SMT
>   Disable global C-states
> 
> The memory was already ruunning correctly at 2133 (though I have locked that
> in the BIOS too) and I was already using kern.eventtimer.periodic=1, so
> the lockup was not related to those. Its the latest BIOS, and a -STABLE
> build from yesterday.
> 
> I suspect it will now be stable, but I was wondering if anyone was any further
> forward on working out which of the settings above are the ones which 'fix'
> the issue - or indeed if its really fixed, by them or just made far less 
> likely
> to happen.
> 
> Anyone got any more comments on this ?

In terms of hangs and system crashes, my Ryzen system has been stable
since the fix to relocate the shared page.

The random segfault problem during parallel builds went away when I
RMAed my original, early-build CPU.

This commit:
  r329254 | kib | 2018-02-13 16:31:45 -0800 (Tue, 13 Feb 2018) | 43 lines

  Ensure memory consistency on COW.

fixed most of the remaining random port build errors that I had.  I
think the only remaining problem is random build failures of
guile-related ports, but I also see these on my FX-8320E, so they are
not Ryzen-specific.

Pretty much all of my BIOS settings are the defaults.

I suspect that the idle hang issues are motherboard and/or BIOS
specific.  For the record my motherboard is a Gigabyte
GA-AX370-Gaming 5.



Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-22 Thread Nimrod Levy
And of course now that we say something about it, I get a lockup
overnight...

--
Nimrod

On Fri, Apr 20, 2018 at 7:48 AM Nimrod Levy  wrote:

> I'm really glad to see that I'm not the only one still interested in this
> thread.  I don't really have anything new to contribute. I've been getting
> about a week or so uptime out of my box. My habit has been to see if it
> hangs, then reboot and rebuild latest stable and reboot and run that for a
> while. Although now that this thread pops back up, I just realized it's
> been 2 weeks since the last cycle. I don't trust it yet, but I've seen
> clear improvement since this started.
>
>
> On Fri, Apr 20, 2018 at 7:18 AM Pete French 
> wrote:
>
>> So, resurrecting the thread from a few weeks ago, as I finally found
>> time yesterday to out together the Ryzen machine I bought the parts for in
>> Jaunary (busy year at work). All went smoothl;y, checked it
>> booted up, used it for 15 minutes, was impressed by the speed and went
>> home.
>>
>> ...and by the time I got home, an hour or so later, it had locked up hard.
>>
>> I was somewhat dissapointed, as I had seen various fixes go in,. and had
>> hoped
>> the issues were fixed. This morning I have booted the machine back up,
>> tweaking the BIOPS to do things mentioned in this thread, viz:
>>
>> Disable Turbo Boost
>> Disable SMT
>> Disable global C-states
>>
>> The memory was already ruunning correctly at 2133 (though I have locked
>> that
>> in the BIOS too) and I was already using kern.eventtimer.periodic=1, so
>> the lockup was not related to those. Its the latest BIOS, and a -STABLE
>> build from yesterday.
>>
>> I suspect it will now be stable, but I was wondering if anyone was any
>> further
>> forward on working out which of the settings above are the ones which
>> 'fix'
>> the issue - or indeed if its really fixed, by them or just made far less
>> likely
>> to happen.
>>
>> Anyone got any more comments on this ?
>>
>> cheers,
>>
>> -pete.
-- 

--
Nimrod


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-21 Thread Eitan Adler
On 20 April 2018 at 04:15, Pete French  wrote:
> So, resurrecting the thread from a few weeks ago, as I finally found
> time yesterday to out together the Ryzen machine I bought the parts for in
> Jaunary (busy year at work). All went smoothl;y, checked it
> booted up, used it for 15 minutes, was impressed by the speed and went home.
>
> ...and by the time I got home, an hour or so later, it had locked up hard.
...

I have a Ryzen system
parts: https://pcpartpicker.com/list/HhycYT
dmesg: http://dmesgd.nycbug.org/index.cgi?do=view=3516

which I have been using for months without issue and on completely
default settings.

What mobo do you have? Did you update the BIOS to the latest version?


-- 
Eitan Adler


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-21 Thread Pete French
> Its not just the Ryzen, I can lock up Epyc CPUs as well. They take a bit
> longer, but its not that hard to repeat. Unfortunately, I had to
> allocate this hardware over the Linux where it works reliably under all
> workloads without any such lock ups with all the default BIOS settings :(

Ah, that's a shame to hear - was hoping it had worked OK for you!

> Here is a test case.
> https://lists.freebsd.org/pipermail/freebsd-stable/2018-February/088433.html

Will give that a try next week - when remote from the box I don't want to
crash it, so no such tests at the weekend, but when I am back I will see.

The lockup from Thursday was an idle lockup, and that hasn't happened since
I changed the BIOS settings. I may re-enable those one at a time to see if
any single one is responsible, unless anyone has any insight. My hunch would
be that SMT may be OK to re-enable, as that has nothing to do with loading,
which the other two do.
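
[Editor's sketch, not part of Pete's message.] Re-enabling the settings one at a time is a simple isolation test: each trial turns exactly one setting back on and leaves the rest off. The Python snippet below just enumerates those trials. The three setting names are the ones listed earlier in the thread (SMT, Turbo Boost, global C-states); the function name is made up for illustration, SMT is put first only because of the hunch above, and how long to soak each trial is left open, since lockups in this thread took anywhere from an hour to weeks to appear.

```python
def single_reenable_trials(disabled):
    """Return one trial per setting: that setting re-enabled, the rest left off."""
    trials = []
    for setting in disabled:
        trials.append({
            "re_enabled": setting,
            "still_disabled": [s for s in disabled if s != setting],
        })
    return trials


if __name__ == "__main__":
    # Setting names as listed earlier in the thread; SMT first, per the hunch.
    settings = ["SMT", "Turbo Boost", "global C-states"]
    for number, trial in enumerate(single_reenable_trials(settings), start=1):
        kept_off = ", ".join(trial["still_disabled"])
        print(f"trial {number}: re-enable {trial['re_enabled']}; keep off: {kept_off}")
```

A plan like this can only implicate a single setting; if the lockup needs two settings enabled together, none of these trials would show it.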

cheers,

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-20 Thread Mike Tancsa
It's not just the Ryzen; I can lock up Epyc CPUs as well. They take a bit
longer, but it's not that hard to repeat. Unfortunately, I had to
allocate this hardware over to Linux, where it works reliably under all
workloads without any such lockups with all the default BIOS settings :(

Here is a test case.
https://lists.freebsd.org/pipermail/freebsd-stable/2018-February/088433.html

Peter was able to recreate it as well.

https://lists.freebsd.org/pipermail/freebsd-virtualization/2018-March/006187.html

I don't think it's VM / virtualization per se.  I think Peter mentioned
something about IPIs.

---Mike



On 4/20/2018 8:39 AM, Nimrod Levy wrote:
> To be honest, I can't remember for certain what's set in the BIOS right
> now. The last post I made was from 3/17 and it says I have the memory clock
> lowered and c-states disabled. I don't think I've changed anything but the
> build of stable since then.
> 
> --
> Nimrod
> 
> On Fri, Apr 20, 2018 at 7:55 AM Pete French 
> wrote:
> 
>>
>>
>> On 20/04/2018 12:48, Nimrod Levy wrote:
>>> I'm really glad to see that I'm not the only one still interested in
>>> this thread.  I don't really have anything new to contribute. I've been
>>> getting about a week or so uptime out of my box. My habit has been to
>>> see if it hangs, then reboot and rebuild latest stable and reboot and
>>> run that for a while. Although now that this thread pops back up, I just
>>> realized it's been 2 weeks since the last cycle. I don't trust it yet,
>>> but I've seen clear improvement since this started.
>>
>> What setting sdo you have - just the ones I listed or some in the BSD
>> setup itself ? I am surprised that the operson who said they were
>> getting an actual Epyc server hasn't commented again on the thread.
>> Hopefully that means it works fine on the server hardware :-)
>>
>> I have seen a few commits go through in STABLE related to VM which
>> interest me, and which might cause hangs, so we shall see how it goes.
>> Possibly looking at the Ryzen issue has flushed out a few VM problems
>> anyway, which would be good, as more stbaility is always good (and I
>> have a vague hope it might fix the mysetrious Go hangs too).
>>
>> -pete.
>>


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-20 Thread Nimrod Levy
To be honest, I can't remember for certain what's set in the BIOS right
now. The last post I made was from 3/17 and it says I have the memory clock
lowered and c-states disabled. I don't think I've changed anything but the
build of stable since then.

--
Nimrod

On Fri, Apr 20, 2018 at 7:55 AM Pete French 
wrote:

>
>
> On 20/04/2018 12:48, Nimrod Levy wrote:
> > I'm really glad to see that I'm not the only one still interested in
> > this thread.  I don't really have anything new to contribute. I've been
> > getting about a week or so uptime out of my box. My habit has been to
> > see if it hangs, then reboot and rebuild latest stable and reboot and
> > run that for a while. Although now that this thread pops back up, I just
> > realized it's been 2 weeks since the last cycle. I don't trust it yet,
> > but I've seen clear improvement since this started.
>
> What setting sdo you have - just the ones I listed or some in the BSD
> setup itself ? I am surprised that the operson who said they were
> getting an actual Epyc server hasn't commented again on the thread.
> Hopefully that means it works fine on the server hardware :-)
>
> I have seen a few commits go through in STABLE related to VM which
> interest me, and which might cause hangs, so we shall see how it goes.
> Possibly looking at the Ryzen issue has flushed out a few VM problems
> anyway, which would be good, as more stbaility is always good (and I
> have a vague hope it might fix the mysetrious Go hangs too).
>
> -pete.
>
-- 

--
Nimrod


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-20 Thread Pete French



On 20/04/2018 12:48, Nimrod Levy wrote:
> I'm really glad to see that I'm not the only one still interested in
> this thread.  I don't really have anything new to contribute. I've been
> getting about a week or so uptime out of my box. My habit has been to
> see if it hangs, then reboot and rebuild latest stable and reboot and
> run that for a while. Although now that this thread pops back up, I just
> realized it's been 2 weeks since the last cycle. I don't trust it yet,
> but I've seen clear improvement since this started.


What settings do you have - just the ones I listed or some in the BSD
setup itself ? I am surprised that the person who said they were
getting an actual Epyc server hasn't commented again on the thread.
Hopefully that means it works fine on the server hardware :-)


I have seen a few commits go through in STABLE related to VM which
interest me, and which might cause hangs, so we shall see how it goes.
Possibly looking at the Ryzen issue has flushed out a few VM problems
anyway, which would be good, as more stability is always good (and I
have a vague hope it might fix the mysterious Go hangs too).


-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-20 Thread Nimrod Levy
I'm really glad to see that I'm not the only one still interested in this
thread.  I don't really have anything new to contribute. I've been getting
about a week or so uptime out of my box. My habit has been to see if it
hangs, then reboot and rebuild latest stable and reboot and run that for a
while. Although now that this thread pops back up, I just realized it's
been 2 weeks since the last cycle. I don't trust it yet, but I've seen
clear improvement since this started.


On Fri, Apr 20, 2018 at 7:18 AM Pete French 
wrote:

> So, resurrecting the thread from a few weeks ago, as I finally found
> time yesterday to put together the Ryzen machine I bought the parts for in
> January (busy year at work). All went smoothly, checked it
> booted up, used it for 15 minutes, was impressed by the speed and went
> home.
>
> ...and by the time I got home, an hour or so later, it had locked up hard.
>
> I was somewhat disappointed, as I had seen various fixes go in, and had
> hoped the issues were fixed. This morning I have booted the machine back up,
> tweaking the BIOS to do things mentioned in this thread, viz:
>
> Disable Turbo Boost
> Disable SMT
> Disable global C-states
>
> The memory was already running correctly at 2133 (though I have locked
> that in the BIOS too) and I was already using kern.eventtimer.periodic=1, so
> the lockup was not related to those. It's the latest BIOS, and a -STABLE
> build from yesterday.
>
> I suspect it will now be stable, but I was wondering if anyone was any
> further forward on working out which of the settings above are the ones
> which 'fix' the issue - or indeed if it's really fixed by them, or just
> made far less likely to happen.
>
> Anyone got any more comments on this ?
>
> cheers,
>
> -pete.
>
-- 

--
Nimrod


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-04-20 Thread Pete French
So, resurrecting the thread from a few weeks ago, as I finally found
time yesterday to put together the Ryzen machine I bought the parts for in
January (busy year at work). All went smoothly, checked it
booted up, used it for 15 minutes, was impressed by the speed and went home.

...and by the time I got home, an hour or so later, it had locked up hard.

I was somewhat disappointed, as I had seen various fixes go in, and had hoped
the issues were fixed. This morning I have booted the machine back up,
tweaking the BIOS to do things mentioned in this thread, viz:

Disable Turbo Boost
Disable SMT
Disable global C-states

The memory was already running correctly at 2133 (though I have locked that
in the BIOS too) and I was already using kern.eventtimer.periodic=1, so
the lockup was not related to those. It's the latest BIOS, and a -STABLE
build from yesterday.

I suspect it will now be stable, but I was wondering if anyone was any further
forward on working out which of the settings above are the ones which 'fix'
the issue - or indeed if it's really fixed by them, or just made far less
likely to happen.

Anyone got any more comments on this ?

cheers,

-pete.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-02-13 Thread Eric van Gyzen
On 02/12/2018 21:54, Peter Moody wrote:
>> I'm having really good luck with the kernel patch attached to this
>> message:
>> https://docs.freebsd.org/cgi/getmsg.cgi?fetch=417183+0+archive/2018/freebsd-hackers/20180211.freebsd-hackers
> 
> I'm new to this; what are the chances that this gets into -STABLE in
> the near future?

Pretty high, I imagine.  Add yourself to these to stay informed:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225584

https://reviews.freebsd.org/D14347

Eric


Re: Ryzen issues on FreeBSD ? (summary of 4 issues) (seemingly solved!)

2018-02-13 Thread Mike Tancsa
OK, this is all mostly solved for me it seems.

points below inline


On 1/24/2018 9:42 AM, Mike Tancsa wrote:
> I think perhaps a good time to summarize as a few issues seem to be going on
> 
> a) fragile BIOS settings. There seems to be a number of issues around
> RAM speeds and disabled C-STATES that impact stability.  Specifically,
> lowering the default frequency from 2400 to 2133 seems to help some
> users with crashes / lockups under heavy loads.

Also disabling core boost on non X cpus (ie 1600 vs 1600x) and making
sure the CPU is not overheating.  On my ASUS board using a back ported
version of amdtemp and amdsmn I confirmed the temp does not go above 50C
at full load.  Setting the FAN speed to turbo seems to help reduce the
max temp the CPU would get.

> b) CPUs manufactured prior to week 25 (some say week 33?) have a
> hardware defect that manifests itself as segfaults in heavy compiles.  I
> was able to confirm this on 1 of the CPUs I had using a Linux setup. It
> seems to confirm this, you need to physically look at the CPU for the
> manufacturing date :( Not sure how to trigger it on FreeBSD reliably,
> but there is a github project I used to verify on Linux
> (https://github.com/suaefar/ryzen-test)

AMD sent me 3 new CPUs without issue.  Turn around was about 1 week from
Canada to the US and back.

> 
> c) The idle lockup bug.  This *seems* to be confirmed on Linux as well
> http://blog.programster.org/ubuntu-16-04-compile-custom-kernel-for-ryzen
> and
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1690085

The settings in a), together with the most recent BIOS update, seem
to have fixed this issue for me.  It sure seemed like a hardware issue,
but then again it could be a side effect of d). However, I was never
able to break into the debugger using a debugging kernel in HEAD so I
suspect it was more hardware related than anything.

BIOS Information
Vendor: American Megatrends Inc.
Version: 3803
Release Date: 01/22/2018
Address: 0xF
This is on a
Product Name: PRIME X370-PRO
Version: Rev X.0x


> 
> d) Compile failures of some ports.  For myself and one other user,
> compiling net/samba47 reliably hangs in roughly the same place.  It's not
> clear if this is related to any of the above bugs or not.

This too seems to be fixed!
The patch in
https://docs.freebsd.org/cgi/getmsg.cgi?fetch=417183+0+archive/2018/freebsd-hackers/20180211.freebsd-hackers

seems to stop the deadlock. I did 90 builds on RELENG_11 with this patch
overnight with no deadlocks. For half the builds I had 2 guest VMs also
building. For the second half, it was the only thing running on the box,
and it's working as expected.

All this just in time for my Epyc based system to arrive!


---Mike


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-02-12 Thread Peter Moody
On Mon, Feb 12, 2018 at 1:49 PM, Don Lewis  wrote:
> On 31 Jan, Eugene Grosbein wrote:
>> 31.01.2018 4:36, Mike Tancsa wrote:
>>> On 1/30/2018 2:51 PM, Mike Tancsa wrote:
>>>>
>>>> And sadly, I am still able to hang the compile in about the same place.
>>>> However, if I set
>>>
>>>
>>> OK, here is a sort of work around. If I have the box a little more busy,
>>> I can avoid whatever deadlock is going on.  In another console I have
>>> cat /dev/urandom | sha256
>>> running while the build runs
>>>
>>> ... and I can compile net/samba47 from scratch without the compile
>>> hanging.  This problem also happens on HEAD from today.  Should I start
>>> a new thread on freebsd-current ? Or just file a bug report ?
>>> The compile worked 4/4
>>
>> That's really strange. Could you try to do "sysctl kern.eventtimer.periodic=1"
>> and re-do the test without extra load?
>
> I'm having really good luck with the kernel patch attached to this
> message:
> https://docs.freebsd.org/cgi/getmsg.cgi?fetch=417183+0+archive/2018/freebsd-hackers/20180211.freebsd-hackers

I'm new to this; what are the chances that this gets into -STABLE in
the near future?


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-02-12 Thread Mike Tancsa
On 2/12/2018 4:49 PM, Don Lewis wrote:
> On 31 Jan, Eugene Grosbein wrote:
>> 31.01.2018 4:36, Mike Tancsa wrote:
>>> On 1/30/2018 2:51 PM, Mike Tancsa wrote:
>>>>
>>>> And sadly, I am still able to hang the compile in about the same place.
>>>> However, if I set
>>>
>>>
>>> OK, here is a sort of work around. If I have the box a little more busy,
>>> I can avoid whatever deadlock is going on.  In another console I have
>>> cat /dev/urandom | sha256
>>> running while the build runs
>>>
>>> ... and I can compile net/samba47 from scratch without the compile
>>> hanging.  This problem also happens on HEAD from today.  Should I start
>>> a new thread on freebsd-current ? Or just file a bug report ?
>>> The compile worked 4/4
>>
>> That's really strange. Could you try to do "sysctl kern.eventtimer.periodic=1"
>> and re-do the test without extra load?
> 
> I'm having really good luck with the kernel patch attached to this
> message:
> https://docs.freebsd.org/cgi/getmsg.cgi?fetch=417183+0+archive/2018/freebsd-hackers/20180211.freebsd-hackers
> 
> Since applying that patch, I did three poudriere runs to build the set
> of ~1700 ports that I use.  Other than one gmake-related build runaway
> that I've also seen on my AMD FX-8320E, I didn't see any random port
> build failures.  When I last did some testing a few weeks ago,
> lang/go would almost always fail.  I also would see random build
> failures in lang/guile or finance/gnucash (which uses guile during its
> build) on both my Ryzen and FX-8320E machines, but those built cleanly
> all three times.
> 
> I even built samba 16 times in a row without a hang.
> 


Cool!  I will give it a try!

In other news, the latest BIOS patch from ASUS for my motherboard
*seems* to have fixed the random hangs.  In the BIOS, I had to dial down
the memory one notch, disable q-states and disable core boost for my non
X Ryzen CPUs.  I did that Friday, and the box, running a load that would
typically lock it up, survived the weekend.  Same with the box I
have in Zoo.  No random freeze ups.

I was also able to take the amdtemp and amdsmn code from HEAD and
compile it on STABLE to confirm the CPU is / was not overheating.  Peak
temp in the low 50s even when the CPU cores are all maxed out.

---Mike

-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD (can we sum up yet)?

2018-02-12 Thread George Mitchell
On 02/12/18 16:49, Don Lewis wrote:
> [...]
> I'm having really good luck with the kernel patch attached to this
> message:
> https://docs.freebsd.org/cgi/getmsg.cgi?fetch=417183+0+archive/2018/freebsd-hackers/20180211.freebsd-hackers
> 
> Since applying that patch, I did three poudriere runs to build the set
> of ~1700 ports that I use.  Other than one gmake-related build runaway
> that I've also seen on my AMD FX-8320E, I didn't see any random port
> build failures.  When I last did some testing a few weeks ago,
> lang/go would almost always fail.  I also would see random build
> failures in lang/guile or finance/gnucash (which uses guile during its
> build) on both my Ryzen and FX-8320E machines, but those built cleanly
> all three times.
> 
> I even built samba 16 times in a row without a hang.
> [...]

Until this thread started last year, I had been on the verge of
upgrading the build server on my net to a Ryzen.  Needless to say, I
postponed the change.  Now it seems there is hope that a resolution for
the issue may be in sight.  Is it time to survey everyone's experience
with the problem, and determine whether the cited patch helps?  My
sincere thanks in advance.  -- George





Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-02-12 Thread Don Lewis
On 31 Jan, Eugene Grosbein wrote:
> 31.01.2018 4:36, Mike Tancsa wrote:
>> On 1/30/2018 2:51 PM, Mike Tancsa wrote:
>>>
>>> And sadly, I am still able to hang the compile in about the same place.
>>> However, if I set
>> 
>> 
>> OK, here is a sort of work around. If I have the box a little more busy,
>> I can avoid whatever deadlock is going on.  In another console I have
>> cat /dev/urandom | sha256
>> running while the build runs
>> 
>> ... and I can compile net/samba47 from scratch without the compile
>> hanging.  This problem also happens on HEAD from today.  Should I start
>> a new thread on freebsd-current ? Or just file a bug report ?
>> The compile worked 4/4
> 
> That's really strange. Could you try to do "sysctl kern.eventtimer.periodic=1"
> and re-do the test without extra load?

I'm having really good luck with the kernel patch attached to this
message:
https://docs.freebsd.org/cgi/getmsg.cgi?fetch=417183+0+archive/2018/freebsd-hackers/20180211.freebsd-hackers

Since applying that patch, I did three poudriere runs to build the set
of ~1700 ports that I use.  Other than one gmake-related build runaway
that I've also seen on my AMD FX-8320E, I didn't see any random port
build failures.  When I last did some testing a few weeks ago,
lang/go would almost always fail.  I also would see random build
failures in lang/guile or finance/gnucash (which uses guile during its
build) on both my Ryzen and FX-8320E machines, but those built cleanly
all three times.

I even built samba 16 times in a row without a hang.
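
A repeat-until-hang test like the one described above can be scripted so
that a stuck build is detected by a time limit rather than by someone
watching the console. This is only a rough sketch: the function names, the
port path in the comment and the time limit are assumptions for
illustration, not what was actually run here.

```python
import subprocess


def run_once(cmd, timeout_s):
    """Run cmd once; return 'ok', 'failed', or 'hung' if it exceeds timeout_s."""
    try:
        result = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the direct child on timeout; grandchildren
        # spawned by a real build may need separate cleanup.
        return "hung"
    return "ok" if result.returncode == 0 else "failed"


def repeat_build(cmd, runs, timeout_s):
    """Repeat cmd up to `runs` times, stopping at the first non-ok outcome."""
    outcomes = []
    for _ in range(runs):
        outcome = run_once(cmd, timeout_s)
        outcomes.append(outcome)
        if outcome != "ok":
            break
    return outcomes


# Hypothetical invocation (port path and 2 hour limit are assumptions):
# repeat_build(["make", "-C", "/usr/ports/net/samba47", "clean", "build"],
#              runs=16, timeout_s=2 * 60 * 60)
```

A list of sixteen "ok" entries would correspond to the 16 clean builds in a
row; a trailing "hung" marks the iteration where the compile stopped making
progress.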




Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-02-09 Thread Nimrod Levy
Mine was running so well.  I had almost 3 weeks of uptime before it locked
up twice last night.



On Thu, Feb 8, 2018 at 10:24 PM Peter Moody  wrote:

> to close the loop on this thread (for me at least); I replaced the
> asrock ab350 with an msi x370 and things now, just work (c). the msi
> detects the memory as 2133 (rather than the 3000 mhz it was sold as).
> i'm not sure if this is because it really is 2133 mhz, or because the msi
> just likes that frequency better. i'm also not sure to what extent the
> issue was fixed by the slower memory versus the different mobo brand &
> chipset.
>
> now my only issue is crashes when I scrub my zfs pool, though I
> strongly suspect a dying disk is the cause of that.
>
> Cheers,
> peter
>
>
> On Thu, Feb 1, 2018 at 10:51 AM, Mike Tancsa  wrote:
> > On 2/1/2018 1:49 PM, Mike Tancsa wrote:
> >> On 2/1/2018 1:40 PM, Ed Maste wrote:
> >>>> root@amdtestr12:/home/mdtancsa # procstat -kk 6067
> >>>>   PID    TID COMM    TDNAME  KSTACK
> >>>>
> >>>>  6067 100865 python2.7   -   ??+0 ??+0 ??+0 ??+0
> >>>> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
> >>>
> >>> I think this part is due to the broken loader change in r328536.
> >>> Kernel symbol loading is broken, and this in particular isn't related
> >>> to Ryzen issues.
> >>
> >> Just for the archives, after a buildworld to a newer rev of the source
> >> tree, all was good :)
> > Ugh, to clarify, all was good with the procstat issue.
> >
> > ---Mike
> >
> >
> > --
> > ---
> > Mike Tancsa, tel +1 519 651 3400 x203
> > Sentex Communications, m...@sentex.net
> > Providing Internet services since 1994 www.sentex.net
> > Cambridge, Ontario Canada
>
-- 

--
Nimrod


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-02-08 Thread Peter Moody
to close the loop on this thread (for me at least); I replaced the
asrock ab350 with an msi x370 and things now, just work (c). the msi
detects the memory as 2133 (rather than the 3000 mhz it was sold as).
i'm not sure if this is because it really is 2133 mhz, or because the msi
just likes that frequency better. i'm also not sure to what extent the
issue was fixed by the slower memory versus the different mobo brand &
chipset.

now my only issue is crashes when I scrub my zfs pool, though I
strongly suspect a dying disk is the cause of that.

Cheers,
peter


On Thu, Feb 1, 2018 at 10:51 AM, Mike Tancsa  wrote:
> On 2/1/2018 1:49 PM, Mike Tancsa wrote:
>> On 2/1/2018 1:40 PM, Ed Maste wrote:
>>>> root@amdtestr12:/home/mdtancsa # procstat -kk 6067
>>>>   PID    TID COMM    TDNAME  KSTACK
>>>>
>>>>  6067 100865 python2.7   -   ??+0 ??+0 ??+0 ??+0
>>>> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>>>
>>> I think this part is due to the broken loader change in r328536.
>>> Kernel symbol loading is broken, and this in particular isn't related
>>> to Ryzen issues.
>>
>> Just for the archives, after a buildworld to a newer rev of the source
>> tree, all was good :)
> Ugh, to clarify, all was good with the procstat issue.
>
> ---Mike
>
>
> --
> ---
> Mike Tancsa, tel +1 519 651 3400 x203
> Sentex Communications, m...@sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-02-01 Thread Mike Tancsa
On 2/1/2018 1:49 PM, Mike Tancsa wrote:
> On 2/1/2018 1:40 PM, Ed Maste wrote:
>>> root@amdtestr12:/home/mdtancsa # procstat -kk 6067
>>>   PID    TID COMM    TDNAME  KSTACK
>>>
>>>  6067 100865 python2.7   -   ??+0 ??+0 ??+0 ??+0
>>> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>>
>> I think this part is due to the broken loader change in r328536.
>> Kernel symbol loading is broken, and this in particular isn't related
>> to Ryzen issues.
> 
> Just for the archives, after a buildworld to a newer rev of the source
> tree, all was good :)
Ugh, to clarify, all was good with the procstat issue.

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-02-01 Thread Mike Tancsa
On 2/1/2018 1:40 PM, Ed Maste wrote:
>> root@amdtestr12:/home/mdtancsa # procstat -kk 6067
>>   PID    TID COMM    TDNAME  KSTACK
>>
>>  6067 100865 python2.7   -   ??+0 ??+0 ??+0 ??+0
>> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
> 
> I think this part is due to the broken loader change in r328536.
> Kernel symbol loading is broken, and this in particular isn't related
> to Ryzen issues.

Just for the archives, after a buildworld to a newer rev of the source
tree, all was good :)

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-02-01 Thread Ed Maste
On 30 January 2018 at 19:48, Mike Tancsa <m...@sentex.net> wrote:
>>
> I also just tried upgrading to the latest HEAD with a generic kernel and
> same / similar lockups although procstat -kk gives some odd results
>
>
> root@amdtestr12:/home/mdtancsa # procstat -kk 6067
>   PID    TID COMM    TDNAME  KSTACK
>
>  6067 100865 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0

I think this part is due to the broken loader change in r328536.
Kernel symbol loading is broken, and this in particular isn't related
to Ryzen issues.


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-01-31 Thread Mike Tancsa
On 1/31/2018 8:33 AM, Eugene Grosbein wrote:
> 31.01.2018 4:36, Mike Tancsa wrote:
>> On 1/30/2018 2:51 PM, Mike Tancsa wrote:
>>>
>>> And sadly, I am still able to hang the compile in about the same place.
>>> However, if I set
>>
>>
>> OK, here is a sort of work around. If I have the box a little more busy,
>> I can avoid whatever deadlock is going on.  In another console I have
>> cat /dev/urandom | sha256
>> running while the build runs
>>
>> ... and I can compile net/samba47 from scratch without the compile
>> hanging.  This problem also happens on HEAD from today.  Should I start
>> a new thread on freebsd-current ? Or just file a bug report ?
>> The compile worked 4/4
> 
> That's really strange. Could you try to do "sysctl kern.eventtimer.periodic=1"
> and re-do the test without extra load?

Thanks for the suggestion!  I actually upgraded the box to HEAD last
night and will try there since the problem is there too. I just created
a bug report and started a thread in freebsd-current and will follow up
there with your test.

---Mike



-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-01-31 Thread Eugene Grosbein
31.01.2018 4:36, Mike Tancsa wrote:
> On 1/30/2018 2:51 PM, Mike Tancsa wrote:
>>
>> And sadly, I am still able to hang the compile in about the same place.
>> However, if I set
> 
> 
> OK, here is a sort of work around. If I have the box a little more busy,
> I can avoid whatever deadlock is going on.  In another console I have
> cat /dev/urandom | sha256
> running while the build runs
> 
> ... and I can compile net/samba47 from scratch without the compile
> hanging.  This problem also happens on HEAD from today.  Should I start
> a new thread on freebsd-current ? Or just file a bug report ?
> The compile worked 4/4

That's really strange. Could you try to do "sysctl kern.eventtimer.periodic=1"
and re-do the test without extra load?
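
For anyone wanting to switch the extra load on and off while comparing
runs: the quoted `cat /dev/urandom | sha256` workaround just hashes an
endless random stream. A bounded stand-in might look like the sketch
below; the function name, chunk size and duration parameter are
assumptions made for the example, and it only keeps one core busy.

```python
import hashlib
import os
import time


def busy_hash(duration_s, chunk_size=1 << 16):
    """Hash random bytes for roughly duration_s seconds; return bytes hashed."""
    digest = hashlib.sha256()
    hashed = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        # os.urandom() plays the role of reading /dev/urandom.
        digest.update(os.urandom(chunk_size))
        hashed += chunk_size
    return hashed
```

Running it in a second console for the length of a build gives the "extra
load" case; leaving it out gives the no-load case asked for here.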




Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-01-30 Thread Ryan Root
If you want to use an operating system that takes advantage of CPU to memory 
optimizations without taking steps backwards you have to pay a price.


Sent from my Verizon, Samsung Galaxy smartphone

-------- Original message --------
From: Don Lewis <truck...@freebsd.org>
Date: 1/30/18 4:58 PM (GMT-08:00)
To: Mike Tancsa <m...@sentex.net>
Cc: Pete French <petefre...@ingresso.co.uk>, freebsd-stable@freebsd.org,
    Andriy Gapon <a...@freebsd.org>, Peter Moody <free...@hda3.com>,
    Nimrod Levy <nimr...@gmail.com>
Subject: Re: Ryzen issues on FreeBSD ? (with sort of workaround)

On 30 Jan, Mike Tancsa wrote:
> On 1/30/2018 5:23 PM, Nimrod Levy wrote:
>> That's really strange. I never saw those kinds of deadlocks, but I did
>> notice that if I kept the cpu busy using distributed.net
>> <http://distributed.net> I could keep the full system lockups away for
>> at least a week if not longer.
>> 
>> Not to keep harping on it, but what worked for me was lowering the
>> memory speed. I'm at 11 days of uptime so far without anything running
>> the cpu. Before the change it would lock up anywhere from an hour to a day.
>> 
> Spoke too soon. After a dozen loops, the process has hung again.  Note,
> this is not the box locking up, just the compile.  I do have memory at a
> lower speed too. -- 2133 instead of the default 2400

I suspect the problem is a race condition that causes a wakeup to be
lost.  Adding load changes the timing enough to avoid the problem most
of the time.

> I also just tried upgrading to the latest HEAD with a generic kernel and
> same / similar lockups although procstat -kk gives some odd results
> 
> 
> root@amdtestr12:/home/mdtancsa # procstat -kk 6067
>   PID    TID COMM    TDNAME  KSTACK
> 
>  6067 100865 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100900 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100901 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100902 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100903 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100904 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100905 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100906 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100907 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100908 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100909 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100910 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100911 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0

Strange ... kernel vs. world mismatch?  Some other new regression in
HEAD?




Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-01-30 Thread Don Lewis
On 30 Jan, Mike Tancsa wrote:
> On 1/30/2018 5:23 PM, Nimrod Levy wrote:
>> That's really strange. I never saw those kinds of deadlocks, but I did
>> notice that if I kept the cpu busy using distributed.net
>>  I could keep the full system lockups away for
>> at least a week if not longer.
>> 
>> Not to keep harping on it, but what worked for me was lowering the
>> memory speed. I'm at 11 days of uptime so far without anything running
>> the cpu. Before the change it would lock up anywhere from an hour to a day.
>> 
> Spoke too soon. After a dozen loops, the process has hung again.  Note,
> this is not the box locking up, just the compile.  I do have memory at a
> lower speed too. -- 2133 instead of the default 2400

I suspect the problem is a race condition that causes a wakeup to be
lost.  Adding load changes the timing enough to avoid the problem most
of the time.
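
To make "lost wakeup" concrete: the sketch below is a userland Python
analogy only, not the FreeBSD code path involved, and every name in it is
made up for the example. It shows a waiter that sleeps unconditionally
missing a notification sent before it started waiting, while a waiter that
re-checks shared state under the lock does not.

```python
import threading


def wait_without_predicate(cond, timeout_s):
    """Broken pattern: sleep unconditionally, so an earlier notify is lost."""
    with cond:
        # Condition.wait() returns False when the timeout expires.
        return cond.wait(timeout_s)


def wait_with_predicate(cond, state, timeout_s):
    """Safer pattern: check shared state under the lock before sleeping."""
    with cond:
        return cond.wait_for(lambda: state["ready"], timeout_s)


def demo():
    cond = threading.Condition()
    state = {"ready": False}
    # The wakeup is delivered before anyone is waiting.
    with cond:
        state["ready"] = True
        cond.notify_all()
    lost = not wait_without_predicate(cond, timeout_s=0.2)
    seen = wait_with_predicate(cond, state, timeout_s=0.2)
    return lost, seen
```

In the broken variant the outcome depends entirely on whether the
notification lands before or after the waiter goes to sleep, which is why
a change in timing (such as extra load) could hide a bug of this kind most
of the time.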

> I also just tried upgrading to the latest HEAD with a generic kernel and
> same / similar lockups although procstat -kk gives some odd results
> 
> 
> root@amdtestr12:/home/mdtancsa # procstat -kk 6067
>   PID    TID COMM    TDNAME  KSTACK
> 
>  6067 100865 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100900 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100901 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100902 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100903 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100904 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100905 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100906 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100907 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100908 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100909 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100910 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
>  6067 100911 python2.7   -   ??+0 ??+0 ??+0 ??+0
> ??+0 ??+0 ??+0 ??+0 ??+0 ??+0

Strange ... kernel vs. world mismatch?  Some other new regression in
HEAD?




Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-01-30 Thread Mike Tancsa
On 1/30/2018 5:23 PM, Nimrod Levy wrote:
> That's really strange. I never saw those kinds of deadlocks, but I did
> notice that if I kept the cpu busy using distributed.net
>  I could keep the full system lockups away for
> at least a week if not longer.
> 
> Not to keep harping on it, but what worked for me was lowering the
> memory speed. I'm at 11 days of uptime so far without anything running
> the cpu. Before the change it would lock up anywhere from an hour to a day.
> 
Spoke too soon. After a dozen loops, the process has hung again.  Note,
this is not the box locking up, just the compile.  I do have memory at a
lower speed too. -- 2133 instead of the default 2400


I also just tried upgrading to the latest HEAD with a generic kernel and
same / similar lockups although procstat -kk gives some odd results


root@amdtestr12:/home/mdtancsa # procstat -kk 6067
  PID    TID COMM       TDNAME  KSTACK
 6067 100865 python2.7  -       ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
 6067 100900 python2.7  -       ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
 6067 100901 python2.7  -       ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
 6067 100902 python2.7  -       ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
 6067 100903 python2.7  -       ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
 6067 100904 python2.7  -       ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
 6067 100905 python2.7  -       ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
 6067 100906 python2.7  -       ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
 6067 100907 python2.7  -       ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
 6067 100908 python2.7  -       ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
 6067 100909 python2.7  -       ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
 6067 100910 python2.7  -       ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
 6067 100911 python2.7  -       ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0 ??+0
root@amdtestr12:/home/mdtancsa # procstat -t 6067
  PID    TID COMM       TDNAME  CPU  PRI STATE  WCHAN
 6067 100865 python2.7  -        -1  152 sleep  usem
 6067 100900 python2.7  -        -1  152 sleep  umtxn
 6067 100901 python2.7  -        -1  152 sleep  umtxn
 6067 100902 python2.7  -        -1  152 sleep  umtxn
 6067 100903 python2.7  -        -1  152 sleep  umtxn
 6067 100904 python2.7  -        -1  152 sleep  umtxn
 6067 100905 python2.7  -        -1  152 sleep  umtxn
 6067 100906 python2.7  -        -1  152 sleep  umtxn
 6067 100907 python2.7  -        -1  152 sleep  umtxn
 6067 100908 python2.7  -        -1  152 sleep  umtxn
 6067 100909 python2.7  -        -1  152 sleep  umtxn
 6067 100910 python2.7  -        -1  152 sleep  umtxn
 6067 100911 python2.7  -        -1  152 sleep  umtxn
root@amdtestr12:/home/mdtancsa #


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-01-30 Thread Don Lewis
On 30 Jan, Mike Tancsa wrote:
> On 1/30/2018 2:51 PM, Mike Tancsa wrote:
>> 
>> And sadly, I am still able to hang the compile in about the same place.
>> However, if I set
> 
> 
> OK, here is a sort of work around. If I have the box a little more busy,
> I can avoid whatever deadlock is going on.  In another console I have
> cat /dev/urandom | sha256
> running while the build runs

Interesting ...

> ... and I can compile net/samba47 from scratch without the compile
> hanging.  This problem also happens on HEAD from today.  Should I start
> a new thread on freebsd-current ? Or just file a bug report ?
> The compile worked 4/4

I'd file a PR to capture all the information in one place and drop a
pointer on freebsd-current.



Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-01-30 Thread Nimrod Levy
That's really strange. I never saw those kinds of deadlocks, but I did
notice that if I kept the cpu busy using distributed.net I could keep the
full system lockups away for at least a week if not longer.

Not to keep harping on it, but what worked for me was lowering the memory
speed. I'm at 11 days of uptime so far without anything running the cpu.
Before the change it would lock up anywhere from an hour to a day.


On Tue, Jan 30, 2018 at 4:39 PM Mike Tancsa  wrote:

> On 1/30/2018 2:51 PM, Mike Tancsa wrote:
> >
> > And sadly, I am still able to hang the compile in about the same place.
> > However, if I set
>
>
> OK, here is a sort of work around. If I have the box a little more busy,
> I can avoid whatever deadlock is going on.  In another console I have
> cat /dev/urandom | sha256
> running while the build runs
>
> ... and I can compile net/samba47 from scratch without the compile
> hanging.  This problem also happens on HEAD from today.  Should I start
> a new thread on freebsd-current ? Or just file a bug report ?
> The compile worked 4/4
>
> ---Mike

Re: Ryzen issues on FreeBSD ? (with sort of workaround)

2018-01-30 Thread Mike Tancsa
On 1/30/2018 2:51 PM, Mike Tancsa wrote:
> 
> And sadly, I am still able to hang the compile in about the same place.
> However, if I set


OK, here is a sort of work around. If I have the box a little more busy,
I can avoid whatever deadlock is going on.  In another console I have
cat /dev/urandom | sha256
running while the build runs

... and I can compile net/samba47 from scratch without the compile
hanging.  This problem also happens on HEAD from today.  Should I start
a new thread on freebsd-current ? Or just file a bug report ?
The compile worked 4/4
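
For anyone who wants to script the same workaround, here is a minimal sketch in Python. It is only an illustration under assumptions: the function names are made up for this example, the background load is a single thread hashing random bytes (roughly what cat /dev/urandom | sha256 does), and whether this particular load avoids the hang in the same way has not been tested.

```python
import hashlib
import os
import subprocess
import threading


def _burn_cpu(stop):
    """Hash random bytes until told to stop, similar to `cat /dev/urandom | sha256`."""
    digest = hashlib.sha256()
    while not stop.is_set():
        digest.update(os.urandom(65536))


def run_with_background_load(command):
    """Run `command` while one thread keeps a CPU core busy; return its exit code.

    `command` is an argument list as accepted by subprocess.run.
    This helper is hypothetical and not part of any FreeBSD tool.
    """
    stop = threading.Event()
    burner = threading.Thread(target=_burn_cpu, args=(stop,), daemon=True)
    burner.start()
    try:
        return subprocess.run(command).returncode
    finally:
        # Always stop the background load, even if the command fails to start.
        stop.set()
        burner.join()
```

Usage would be something like run_with_background_load(["make", "-C", "/usr/ports/net/samba47"]), with the port path taken from the build mentioned above.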

---Mike

Re: Ryzen issues on FreeBSD ?

2018-01-30 Thread Mike Tancsa
On 1/28/2018 7:41 PM, Don Lewis wrote:
> 
> My suspicion is a FreeBSD bug, probably a locking / race issue.  I know
> that we've had to make some tweaks to our code for AMD CPUs, like this:


OK, I got back the CPUs from AMD (fast turn around!)

And sadly, I am still able to hang the compile in about the same place.
However, if I set

hw.lower_amd64_sharedpage=0

it seems to hang in a different way. CTRL+t shows

load: 0.43  cmd: python2.7 15736 [umtxn] 165.00r 14.46u 6.65s 0% 233600k
make[1]: Working in: /usr/ports/net/samba47
make: Working in: /usr/ports/net/samba47


# procstat -t 15736
  PID    TID COMM       TDNAME  CPU  PRI STATE  WCHAN
15736 100855 python2.7  -        -1  152 sleep  usem
15736 100956 python2.7  -        -1  124 sleep  umtxn
15736 100957 python2.7  -        -1  126 sleep  umtxn
15736 100958 python2.7  -        -1  124 sleep  umtxn
15736 100959 python2.7  -        -1  127 sleep  umtxn
15736 100960 python2.7  -        -1  126 sleep  umtxn
15736 100961 python2.7  -        -1  126 sleep  umtxn
15736 100962 python2.7  -        -1  126 sleep  umtxn
15736 100963 python2.7  -        -1  126 sleep  umtxn
15736 100964 python2.7  -        -1  127 sleep  umtxn
15736 100965 python2.7  -        -1  126 sleep  umtxn
15736 100966 python2.7  -        -1  126 sleep  umtxn
15736 100967 python2.7  -        -1  126 sleep  umtxn
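
To make a dump like this easier to compare between hangs, the wait channels can be tallied with a few lines of Python. This is only a sketch under assumptions: it expects unwrapped procstat -t output with one thread per line in the column order PID TID COMM TDNAME CPU PRI STATE WCHAN, it assumes COMM and TDNAME contain no spaces, and the function name and the SAMPLE text are made up for this example.

```python
from collections import Counter


def tally_wait_channels(procstat_output):
    """Count threads per WCHAN in `procstat -t` output (one thread per line)."""
    counts = Counter()
    for line in procstat_output.splitlines():
        fields = line.split()
        # Data rows start with a numeric PID and TID; this skips the header line.
        if len(fields) < 7 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        # Columns: PID TID COMM TDNAME CPU PRI STATE [WCHAN]
        wchan = fields[7] if len(fields) > 7 else "-"
        counts[wchan] += 1
    return counts


# Shortened sample in the assumed unwrapped layout (three threads only).
SAMPLE = """\
  PID    TID COMM       TDNAME  CPU  PRI STATE  WCHAN
15736 100855 python2.7  -        -1  152 sleep  usem
15736 100956 python2.7  -        -1  124 sleep  umtxn
15736 100957 python2.7  -        -1  126 sleep  umtxn
"""

print(tally_wait_channels(SAMPLE))
```

For the three-row sample this counts two threads in umtxn and one in usem, the same pattern as the full dump: one thread waiting on a semaphore and the rest waiting on a userland mutex.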

 # procstat -kk 15736
  PID    TID COMM       TDNAME  KSTACK

15736 100855 python2.7   -   mi_switch+0xf5
sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
amd64_syscall+0xa48 fast_syscall_common+0xfc
15736 100956 python2.7   -   mi_switch+0xf5
sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
amd64_syscall+0xa48 fast_syscall_common+0xfc
15736 100957 python2.7   -   mi_switch+0xf5
sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
amd64_syscall+0xa48 fast_syscall_common+0xfc
15736 100958 python2.7   -   mi_switch+0xf5
sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
amd64_syscall+0xa48 fast_syscall_common+0xfc
15736 100959 python2.7   -   mi_switch+0xf5
sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
amd64_syscall+0xa48 fast_syscall_common+0xfc
15736 100960 python2.7   -   mi_switch+0xf5
sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
amd64_syscall+0xa48 fast_syscall_common+0xfc
15736 100961 python2.7   -   mi_switch+0xf5
sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
amd64_syscall+0xa48 fast_syscall_common+0xfc
15736 100962 python2.7   -   mi_switch+0xf5
sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
amd64_syscall+0xa48 fast_syscall_common+0xfc
15736 100963 python2.7   -   mi_switch+0xf5
sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
amd64_syscall+0xa48 fast_syscall_common+0xfc
15736 100964 python2.7   -   mi_switch+0xf5
sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
amd64_syscall+0xa48 fast_syscall_common+0xfc
15736 100965 python2.7   -   mi_switch+0xf5
sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
amd64_syscall+0xa48 fast_syscall_common+0xfc
15736 100966 python2.7   -   mi_switch+0xf5
sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
amd64_syscall+0xa48 fast_syscall_common+0xfc
15736 100967 python2.7   -   mi_switch+0xf5
sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
amd64_syscall+0xa48 fast_syscall_common+0xfc

If I kill the make, reboot and just type make, it completes after the
reboot.  If after the reboot, I do an rm -R work, it will hang again.
With the default of

Re: Ryzen issues on FreeBSD ?

2018-01-28 Thread Don Lewis
On 27 Jan, Mike Tancsa wrote:
> On 1/27/2018 3:23 AM, Don Lewis wrote:
>> 
>> I just ran into this for the first time with samba46.  I kicked off a
>> ports build this evening before leaving for several hours.  When I
>> returned, samba46 had failed with a build runaway.  I just tried again
>> and I see python stuck in the usem state.  This is what I see with
>> procstat -k:
> 
> Hmmm, is this indicative of a processor bug or a FreeBSD bug, or is it
> indeterminate at this point?

My suspicion is a FreeBSD bug, probably a locking / race issue.  I know
that we've had to make some tweaks to our code for AMD CPUs, like this:


r321608 | kib | 2017-07-27 01:37:07 -0700 (Thu, 27 Jul 2017) | 9 lines

Use MFENCE to serialize RDTSC on non-Intel CPUs.

Kernel already used the stronger barrier instruction for AMDs, correct
the userspace fast gettimeofday() implementation as well.



I did go back and look at the build runaways that I've occasionally seen
on my AMD FX-8320E package builder.  I haven't seen the python issue
there, but have seen gmake get stuck in a sleeping state with a bunch of
zombie offspring.



Re: Ryzen issues on FreeBSD ?

2018-01-28 Thread Don Lewis
On 27 Jan, Peter Moody wrote:
> Whelp, I replaced the r5 1600x with an r7 1700 (au 1734) and I'm now
> getting minutes of uptime before I hard crash. With smt, without, with c
> states, without, with opcache, without. No difference.

Check the temperatures.  Maybe the heat sink isn't making good contact
after the CPU replacement.



Re: Ryzen issues on FreeBSD ?

2018-01-28 Thread Don Lewis
On 28 Jan, Pete French wrote:
> 
> 
> On 28/01/2018 20:28, Don Lewis wrote:
>> I'd be wary of the B350 boards with the higher TDP eight core Ryzen CPUs
>> since the cheaper boards tend to have less robust VRM designs.
> 
> Gah! Yes, I forgot that. I originally spec'd the board for a smaller
> Ryzen, then thought "what the hell" and got the 1700 without going back
> and checking that kind of stuff. Hmm, shall swap for a different one if
> I can. Thanks for pointing that out.

I started off with a Gigabyte AB350 Gaming for my 1700X back when there
was enough ambiguity about ECC support to give me hope that it would
work.  Everything seemed to work other than ECC and the problems caused
by my buggy CPU and the shared page issue, but the VRM temps in the BIOS
were really high (and I had no way to monitor that under load).  When I
upgraded to get working ECC, I also looked at reviews about VRM quality.


