These freezes can have more than one cause.
I was running a 1800X. Freezes occurred within 24-48h of uptime. Disabling
Global C-State or enabling typical power idle in UEFI stopped those freezes.
The latter option disables PC6 on top of using 0.85V idle voltage. Disabling
PC6 requires a cold boot; otherwise only the voltage change is applied and
PC6 remains enabled, and the system crashes as usual.

I upgraded to a 2700X (UEFI cleared) and the issue seemed to
disappear. It did not; it is just much less frequent. My current
uptime is 23d. Before that I had another 2700X running for 17d, then
tested a 1800X again, which crashed within 24h. After that I inserted
the current 2700X, which crashed within 48h. The boot that followed
that crash is the one with an uptime of 23d.

All in all, this might be related either to the PSU or to a kernel
workaround that is required for a processor bug but does not exist yet,
as detailed in the AMD Revision Guide [1]. The 1800X would crash more
because it has more bugs that need workarounds, while the 2700X has
fewer, especially in the PCIe controller. Ironically, the 2700X
consumes less power at idle than the 1800X because it requires lower
voltages: 12nm+ vs 14nm, and the voltages specified at every power level
are also lower. So I don't see much logic in saying that the PSU is the
culprit of the system instability.

Everyone trying "idle=nomwait", "idle=halt" or
"processor.max_cstate=5" should be warned that those options are useless.
There are only 2 C-states available on Ryzen systems, so if you want to
limit C-states you have to set the limit to 1 at most:
"processor.max_cstate=1". The use of the MWAIT instruction is already
disabled by the UEFI if you insert a Ryzen 1800X processor, and the
2700X and the rest of 2nd gen Ryzen are not affected by any MWAIT bug.
The idle option is thus useless too.
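
As a rough illustration of the paragraph above, here is a minimal Python
sketch that scans a kernel command line and flags the options described
here as having no effect on Ryzen. The function name review_cmdline and
the remark strings are made up for this example; the classification only
mirrors the claims in this comment and is not an authoritative list.

```python
# Sketch: flag kernel command-line options that this comment calls useless
# on Ryzen. review_cmdline is a hypothetical helper, not an existing tool.

def review_cmdline(cmdline):
    """Return (option, remark) pairs for options the comment calls useless."""
    remarks = []
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == "idle" and value in ("nomwait", "halt"):
            # The comment says MWAIT is already handled by UEFI or unaffected.
            remarks.append((token, "idle= option, described as useless"))
        elif key == "processor.max_cstate" and value.isdigit() and int(value) > 1:
            # Only 2 C-states are said to exist, so a limit above 1 limits nothing.
            remarks.append((token, "limit above 1, use processor.max_cstate=1"))
    return remarks


# Example input; on a real system the string would come from /proc/cmdline.
sample = "root=/dev/sda1 ro quiet idle=nomwait processor.max_cstate=5"
for option, remark in review_cmdline(sample):
    print(f"{option}: {remark}")
```

On a running system the same function can be fed the contents of
/proc/cmdline instead of the sample string.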


It's better to try "pcie_aspm=off" or "pcie_aspm=force
pcie_aspm.policy=performance" and/or "nvme_core.default_ps_max_latency_us=0".
Maybe the PCIe root or something related to it does not wake up, and the
processor stalls waiting for an interrupt to be served.
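
To check whether those parameters actually took effect after a reboot,
something like the following Python sketch may help. It assumes the usual
sysfs layout, where the ASPM policy file lists all policies and marks the
active one in square brackets; the two paths and the helper names
(active_aspm_policy, read_param) are assumptions for this example and may
not exist on every kernel or configuration.

```python
# Sketch: report the active PCIe ASPM policy and the NVMe power-state
# latency limit. Paths are assumed; missing files are reported, not fatal.

ASPM_POLICY = "/sys/module/pcie_aspm/parameters/policy"
NVME_LATENCY = "/sys/module/nvme_core/parameters/default_ps_max_latency_us"


def active_aspm_policy(policy_text):
    """Return the bracketed (active) policy name, or None if none is marked."""
    for word in policy_text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_param(path):
    """Return the stripped file contents, or None if the file is unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


policy = read_param(ASPM_POLICY)
print("ASPM policy:", active_aspm_policy(policy) if policy else "unavailable")
print("NVMe max PS latency (us):", read_param(NVME_LATENCY) or "unavailable")
```

If the policy shows up as something other than "performance", or the
NVMe latency is not 0, the parameters were probably not picked up by the
bootloader configuration.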

[1]
https://www.amd.com/system/files/TechDocs/55449_Fam_17h_M_00h-0Fh_Rev_Guide.pdf

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1690085

Title:
  Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1690085/+subscriptions
