My latest post on this bug, from the AMD community forum, is copied
here FYI:

https://community.amd.com/thread/225795?start=90&tstart=0


102. Re: Ryzen linux kernel bug 196683 - Random Soft Lockup
uncle yap Jan 4, 2019 3:29 AM (in response to imshalla)

Dear All,


Some good news and a discovery.


My crisis has greatly improved: the system has now run for its first 5
hours without a lockup. All I did, essentially, was change my Linux
kernel from 4.18.0-11-generic to 4.15.0-43-generic.

I had previously also tried 4.18.0-13-generic and found it equally bad.


My strongest suspicion is that the thread scheduler in the 4.18.0-X
kernels has a bug that freezes some threads at random, for up to 12
hours, and later unfreezes them at random. I call it random because I
cannot find any consistent pattern in how it freezes and unfreezes. These
freezes rarely require a hard reset unless the system is left frozen for
a very long time. If I notice soon enough and do a soft reset over SSH
with the command sudo systemctl restart sddm, the system recovers. The
unit would be gdm instead of sddm if you are on Ubuntu rather than
Kubuntu.
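
To avoid having to remember whether the machine runs sddm or gdm, the
soft reset could look up the active display manager first. This is only
a sketch: it assumes the distribution keeps the usual
display-manager.service symlink under /etc/systemd/system (Debian and
Ubuntu flavors normally do), and the function name dm_unit_from_link is
made up for illustration.

```shell
# Print the unit file name that a display-manager.service symlink points to.
# Assumption: the symlink path below is the systemd default on Ubuntu/Kubuntu.
dm_unit_from_link() {
    link=${1:-/etc/systemd/system/display-manager.service}
    target=$(readlink "$link") || return 1
    basename "$target"
}

# Usage over SSH on the frozen machine (run manually, not executed here):
#   sudo systemctl restart "$(dm_unit_from_link)"
```

On a Kubuntu install this would be expected to print sddm.service, which
matches the restart command quoted above.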


My guess for this difference (needing the motherboard reset switch
versus a soft reset command) is that too many threads freeze, repeatedly,
while the system is left unattended for a long time. It is only a guess
because I cannot afford the time to test and prove it. My reasoning is
that this kernel thread scheduler bug freezes more threads than it
unfreezes the longer the system is left unattended, so critical kernel or
driver module threads, or ssh or bash itself, may end up frozen, and then
there is no way left to soft reset and recover.


I have confirmed that when only 1 or 2 threads are frozen, servers, ssh,
bash, and even ksysguard (CPU usage / load percentage graphs) keep
running, and I have never found a single CPU core or logical CPU
(hyperthread) completely stuck at zero percent usage.


[Image: 265px-Ksysguard1.png]
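
While ssh still responds, one way to look for stuck tasks from the
command line is to list processes in uninterruptible sleep (state D).
This is a rough sketch only: a D state is not necessarily the same thing
as the frozen threads described above, and the function name
filter_d_state is invented for this example.

```shell
# Read "STATE PID COMMAND" lines on stdin and print only the tasks whose
# state starts with D (uninterruptible sleep).
filter_d_state() {
    awk '$1 ~ /^D/ { print }'
}

# Usage with header-less ps output (run manually, not executed here):
#   ps -eo state=,pid=,comm= | filter_d_state
```

An empty result only means no task was in state D at that instant, so
the check is worth repeating a few times while the freeze is happening.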

When my X.org console freezes, the mouse freezes and the CPU usage
graphs all freeze, but there is usually still a good chance of recovery
if I quickly run my favorite reset command, sudo systemctl restart sddm,
over ssh. If I was not checking and left it frozen for a long time,
there was a high chance that it could not be recovered by any ssh
command, and the reset switch became the only way to get the system
rebooted.


Today, when I checked my CPU idle states via the kernel, I found it is
not using C6, although in my BIOS settings I have neither disabled C6
nor selected Typical Current Idle, and I am not booting the kernel with
idle=nowait. I think the F4E BIOS version for my Gigabyte X470 board has
disabled the C6 power state and forced Typical Current Idle:

    ~$ cat /sys/devices/system/cpu/cpu*/cpuidle/state*/name
    POLL
    C1
    C2
    POLL
    C1
    C2
    POLL
    C1
    C2
    POLL
    C1
    C2
    POLL
    C1
    C2
    POLL
    C1
    C2
    POLL
    C1
    C2
    POLL
    C1
    C2
    POLL
    C1
    C2
    POLL
    C1
    C2
    POLL
    C1
    C2
    POLL
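
Since the listing above repeats the same names for every logical CPU, a
tally per state name is easier to read and makes a missing C6 obvious.
The following is just a sketch that reuses the same sysfs path; the
function name count_idle_states is made up here.

```shell
# Read idle state names (one per line) on stdin and print "NAME COUNT"
# for each distinct name.
count_idle_states() {
    LC_ALL=C sort | uniq -c | awk '{ print $2, $1 }'
}

# Only query sysfs if the cpuidle directory exists on this machine.
if [ -d /sys/devices/system/cpu/cpu0/cpuidle ]; then
    cat /sys/devices/system/cpu/cpu*/cpuidle/state*/name | count_idle_states
fi
```

If no line for a deeper state appears in the tally, the kernel is not
exposing that state on any logical CPU.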

From the current state of stability, I am optimistic and expect no
further debugging on my system for now.


My proposal for Kubuntu/Ubuntu users is to check whether your kernel is
version 4.18.0-X and, if so, to try the older 4.15.x first, or a newer
version once one is released. If your stability improves with an
alternate kernel, stay with it, wait for improved kernels, and try them
when they become available.
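
The kernel check proposed above can be scripted. This is a minimal
sketch, assuming the release string has the same form as the ones quoted
in this post (for example 4.18.0-11-generic); the function name
is_4_18_series is hypothetical.

```shell
# Return success (0) if the given kernel release is in the 4.18.0-X series
# that this post reports as unstable. Defaults to the running kernel.
is_4_18_series() {
    release=${1:-$(uname -r)}
    case "$release" in
        4.18.0-*) return 0 ;;
        *)        return 1 ;;
    esac
}

if is_4_18_series; then
    echo "Running $(uname -r): consider booting a 4.15.x kernel instead"
else
    echo "Running $(uname -r): not in the 4.18.0-X series"
fi
```

The match is on the version prefix only, so it says nothing about
whether a given kernel is actually stable on your hardware.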


Thanks & regards

uy

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1798961

Title:
  Random unrecoverable freezes on Ubuntu 18.10

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged
Status in linux source package in Cosmic:
  Triaged
Status in linux source package in Disco:
  Triaged

Bug description:
  First thing I notice is that the mouse cursor freezes as I'm using it,
  then I hit the CAPS LOCK key and the LED indicator doesn't respond.
  Then I try the "REISUB" command, but it doesn't do anything either.
  Only a hard reset works, pressing down the power button for a few
  seconds.

  How to reproduce?
  I couldn't figure out a consistent method. It is still random to me.

  Version: Ubuntu 4.18.0-10.11-generic 4.18.12
  System information attached.

  Also happens under Arch Linux and Fedora.
  I've talked to another user on IRC who seems to be having the same freezes.

  ProblemType: Bug
  DistroRelease: Ubuntu 18.10
  Package: linux-image-4.18.0-10-generic 4.18.0-10.11
  ProcVersionSignature: Ubuntu 4.18.0-10.11-generic 4.18.12
  Uname: Linux 4.18.0-10-generic x86_64
  ApportVersion: 2.20.10-0ubuntu13
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC1:  dsilva     1213 F.... pulseaudio
   /dev/snd/controlC0:  dsilva     1213 F.... pulseaudio
  CurrentDesktop: XFCE
  Date: Sat Oct 20 09:54:50 2018
  InstallationDate: Installed on 2018-10-20 (0 days ago)
  InstallationMedia: Xubuntu 18.10 "Cosmic Cuttlefish" - Release amd64 (20181017.2)
  MachineType: Dell Inc. Inspiron 5458
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.18.0-10-generic root=/dev/mapper/xubuntu--vg-root ro quiet splash vt.handoff=1
  RelatedPackageVersions:
   linux-restricted-modules-4.18.0-10-generic N/A
   linux-backports-modules-4.18.0-10-generic  N/A
   linux-firmware                             1.175
  RfKill:
   0: phy0: Wireless LAN
    Soft blocked: no
    Hard blocked: no
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 02/02/2018
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: A15
  dmi.board.name: 09WGNT
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A00
  dmi.chassis.type: 9
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: dmi:bvnDellInc.:bvrA15:bd02/02/2018:svnDellInc.:pnInspiron5458:pvr01:rvnDellInc.:rn09WGNT:rvrA00:cvnDellInc.:ct9:cvr:
  dmi.product.name: Inspiron 5458
  dmi.product.sku: Inspiron 5458
  dmi.product.version: 01
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798961/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
