On Tue, Jul 18, 2023 at 09:43:51AM +0100, Laurence Tratt wrote:
> A small number of us with AMD Ryzen 9 (i.e. chips in the 7x000 range)
> machines have been experiencing regular (often daily), or semi-regular
> hangs, but without any obvious cause.
>
> What we don't know is if we're the unlucky few, or whether this might be a
> wider issue. So, to see if there is some sort of pattern going on (e.g. are
> certain motherboards / BIOSes correlated with hangs or not?), I'd like to
> poll Ryzen 9 OpenBSD users. At a minimum we'd need to know:
>
>   CPU model (e.g. "7900x")
>   Motherboard (e.g. "MSI PRO670-X")
>   Have you experienced crashes? (Yes/No)
>   If "Yes":
>     what frequency (e.g. "daily/weekly/no obvious pattern")?
>     are there any obvious causes (e.g. "happens when I run program X")?
>     have you found any mitigations (e.g. "updated BIOS")?
>   Ideally a dmesg too
>
> We're as interested in Ryzen 9 users who aren't experiencing hangs as who
> are! Please feel free to reply to the list, or to me individually, and I'll
> collate the information and see if there are any patterns or not.
>
>
> Laurie
> --
> Personal                                    https://tratt.net/laurie/
> Software Development Team                     https://soft-dev.org/
>    https://github.com/ltratt     https://twitter.com/laurencetratt
>
A bit of color commentary here...

Laurie, a few other folks, and I have been trying to debug the hangs that
some people are seeing on these machines. He and I have identical
hardware, yet he sees regular hangs and I rarely see any (in the span of
7 months I think I've seen maybe 2 or 3 total). I've been using this
machine in anger as a daily driver and I can't make it break, while other
people can't even make it a day without a hang.

We've tried to debug the issue and narrow down which device(s) or
workload might be causing the problem, but nothing points in any
specific direction.

We've also seen reports of "long slow death" crashes, where existing
processes continue to work for some time but nothing new can be execed,
and eventually even the existing processes freeze. To me that sounds like
a lock issue, but it never happens on my machine and only infrequently
elsewhere, so I can't really debug it.

We'd like to know if others have similar machines and whether they are
stable or not.

-ml