It sounds like what I was getting.

On Thu, Jan 16, 2020 at 11:05 PM Colin Ian King <1799...@bugs.launchpad.net>
wrote:

> After quite a bit of experimentation I found that I can reproduce the bug
> if I have zram *and* also swap on the filesystem enabled while exercising
> the brk stressors and aiol (to cause lots of I/O). Eventually the system
> grinds to a halt, we lose interactivity and we eventually get lockups as
> follows:
> [ 2012.040006] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [stress-ng-brk:1632]
> [ 2012.040922] Modules linked in: zram(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) aes_x86_64(E) crypto_simd(E) glue_helper(E) cryptd(E) psmouse(E) input_leds(E) floppy(E) virtio_scsi(E) serio_raw(E) i2c_piix4(E) mac_hid(E) pata_acpi(E) qemu_fw_cfg(E) 9pnet_virtio(E) 9p(E) 9pnet(E) fscache(E)
> [ 2012.044655] CPU: 2 PID: 1632 Comm: stress-ng-brk Tainted: G EL   4.15.18 #1
> [ 2012.045581] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> [ 2012.046555] RIP: 0010:__raw_callee_save___pv_queued_spin_unlock+0x10/0x17
> [ 2012.047340] RSP: 0018:ffffb73382083718 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
> [ 2012.048238] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000002
> [ 2012.049078] RDX: 0000000000000000 RSI: ffff9d327c2f6918 RDI: ffffffffa3269978
> [ 2012.049909] RBP: ffffb73382083720 R08: ffff9d327c2f6918 R09: ffff9d327c0a5328
> [ 2012.050746] R10: ffff9d327c1e2310 R11: ffff9d327c1e2328 R12: ffff9d327c2f6800
> [ 2012.051574] R13: ffff9d327c1e2328 R14: ffff9d327c1e2310 R15: ffff9d327c1e2200
> [ 2012.052436] FS:  00007f89f2ccd740(0000) GS:ffff9d327f280000(0000) knlGS:0000000000000000
> [ 2012.053382] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2012.054058] CR2: 00007f1350a8dd90 CR3: 00000000311a4004 CR4: 0000000000160ee0
> [ 2012.054889] Call Trace:
> [ 2012.055192]  get_swap_pages+0x193/0x360
> [ 2012.055652]  get_swap_page+0x13f/0x1e0
> [ 2012.056123]  add_to_swap+0x14/0x70
> [ 2012.056530]  shrink_page_list+0x81d/0xbc0
> [ 2012.057013]  shrink_inactive_list+0x242/0x590
> [ 2012.057523]  shrink_node_memcg+0x364/0x770
> [ 2012.058012]  shrink_node+0xf7/0x300
> [ 2012.058432]  ? shrink_node+0xf7/0x300
> [ 2012.058863]  do_try_to_free_pages+0xc9/0x330
> [ 2012.059368]  try_to_free_pages+0xee/0x1b0
> [ 2012.059842]  __alloc_pages_slowpath+0x3fc/0xe00
> [ 2012.060424]  __alloc_pages_nodemask+0x29a/0x2c0
> [ 2012.060963]  alloc_pages_vma+0x88/0x1f0
> [ 2012.061414]  __handle_mm_fault+0x8b7/0x12e0
> [ 2012.061909]  handle_mm_fault+0xb1/0x210
> [ 2012.062375]  __do_page_fault+0x281/0x4b0
> [ 2012.062848]  do_page_fault+0x2e/0xe0
> [ 2012.063274]  ? async_page_fault+0x2f/0x50
> [ 2012.063751]  do_async_page_fault+0x51/0x80
> [ 2012.064262]  async_page_fault+0x45/0x50
> [ 2012.064719] RIP: 0033:0x55ec1997bd0a
> [ 2012.065147] RSP: 002b:00007ffeacd21600 EFLAGS: 00010246
> [ 2012.065754] RAX: 000055ec28601000 RBX: 0000000000000005 RCX: 00007f89f2de956b
> [ 2012.066580] RDX: 000055ec28601000 RSI: 00007ffeacd216d0 RDI: 000055ec28602000
> [ 2012.067410] RBP: 00007ffeacd216c0 R08: 0000000000000000 R09: 00007f89f3d0c2f0
> [ 2012.068290] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> [ 2012.069129] R13: 0000000000000002 R14: 0000000000000001 R15: 00007ffeacd216d0
> [ 2012.069965] Code: 50 41 51 41 52 41 53 e8 3b 05 00 00 41 5b 41 5a 41 59 41 58 5f 5e 5a 59 5d c3 90 55 48 89 e5 52 b8 01 00 00 00 31 d2 f0 0f b0 17 <3c> 01 75 03 5a 5d c3 56 0f b6 f0 e8 bc ff ff ff 5e 5a 5d c3 0f
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1799497
>
> Title:
>   4.15 kernel hard lockup about once a week
>
> Status in linux package in Ubuntu:
>   Incomplete
> Status in zram-config package in Ubuntu:
>   Incomplete
> Status in linux source package in Bionic:
>   Confirmed
> Status in zram-config source package in Bionic:
>   Confirmed
>
> Bug description:
>   My main server has been running into hard lockups about once a week
>   ever since I switched to the 4.15 Ubuntu 18.04 kernel.
>
>   When this happens, nothing is printed to the console, it's effectively
>   stuck showing a login prompt. The system is running with panic=1 on
>   the cmdline but isn't rebooting so the kernel isn't even processing
>   this as a kernel panic.
>
>
>   As this felt like a potential hardware issue, I had my hosting provider
> give me a completely different system, different motherboard, different
> CPU, different RAM and different storage, I installed that system on 18.04
> and moved my data over, a week later, I hit the issue again.
>
>   We've since also had a LXD user reporting similar symptoms here also on
> varying hardware:
>     https://github.com/lxc/lxd/issues/5197
>
>
>   My system doesn't have a lot of memory pressure with about 50% of free
> memory:
>
>   root@vorash:~# free -m
>                 total        used        free      shared  buff/cache   available
>   Mem:          31819       17574         402         513       13842       13292
>   Swap:         15909        2687       13222
>
>   I will now try to increase console logging as much as possible on the
>   system in the hopes that next time it hangs we can get a better idea
>   of what happened but I'm not too hopeful given the complete silence on
>   the console when this occurs.
>
>   System is currently on:
>     Linux vorash 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC
> 2018 x86_64 x86_64 x86_64 GNU/Linux
>
>   But I've seen this since the GA kernel on 4.15 so it's not a recent
> regression.
>   ---
>   ProblemType: Bug
>   AlsaDevices:
>    total 0
>    crw-rw---- 1 root audio 116,  1 Oct 23 16:12 seq
>    crw-rw---- 1 root audio 116, 33 Oct 23 16:12 timer
>   AplayDevices: Error: [Errno 2] No such file or directory: 'aplay':
> 'aplay'
>   ApportVersion: 2.20.9-0ubuntu7.4
>   Architecture: amd64
>   ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord':
> 'arecord'
>   AudioDevicesInUse:
>    Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed
> with exit code 1: Cannot stat file /proc/22822/fd/10: Permission denied
>    Cannot stat file /proc/22831/fd/10: Permission denied
>   DistroRelease: Ubuntu 18.04
>   HibernationDevice:
>    RESUME=none
>    CRYPTSETUP=n
>   IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig':
> 'iwconfig'
>   Lsusb:
>    Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
>    Bus 001 Device 002: ID 046b:ff10 American Megatrends, Inc. Virtual
> Keyboard and Mouse
>    Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
>   MachineType: Intel Corporation S1200SP
>   NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
>   Package: linux (not installed)
>   PciMultimedia:
>
>   ProcEnviron:
>    TERM=xterm
>    PATH=(custom, no user)
>    XDG_RUNTIME_DIR=<set>
>    LANG=en_US.UTF-8
>    SHELL=/bin/bash
>   ProcFB: 0 mgadrmfb
>   ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-38-generic
> root=UUID=575c878a-0be6-4806-9c83-28f67aedea65 ro biosdevname=0
> net.ifnames=0 panic=1 verbose console=tty0 console=ttyS0,115200n8
>   ProcVersionSignature: Ubuntu 4.15.0-38.41-generic 4.15.18
>   RelatedPackageVersions:
>    linux-restricted-modules-4.15.0-38-generic N/A
>    linux-backports-modules-4.15.0-38-generic  N/A
>    linux-firmware                             1.173.1
>   RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
>   Tags:  bionic
>   Uname: Linux 4.15.0-38-generic x86_64
>   UnreportableReason: This report is about a package that is not installed.
>   UpgradeStatus: No upgrade log present (probably fresh install)
>   UserGroups:
>
>   _MarkForUpload: False
>   dmi.bios.date: 01/25/2018
>   dmi.bios.vendor: Intel Corporation
>   dmi.bios.version: S1200SP.86B.03.01.1029.012520180838
>   dmi.board.asset.tag: Base Board Asset Tag
>   dmi.board.name: S1200SP
>   dmi.board.vendor: Intel Corporation
>   dmi.board.version: H57532-271
>   dmi.chassis.asset.tag: ....................
>   dmi.chassis.type: 23
>   dmi.chassis.vendor: ...............................
>   dmi.chassis.version: ..................
>   dmi.modalias:
> dmi:bvnIntelCorporation:bvrS1200SP.86B.03.01.1029.012520180838:bd01/25/2018:svnIntelCorporation:pnS1200SP:pvr....................:rvnIntelCorporation:rnS1200SP:rvrH57532-271:cvn...............................:ct23:cvr..................:
>   dmi.product.family: Family
>   dmi.product.name: S1200SP
>   dmi.product.version: ....................
>   dmi.sys.vendor: Intel Corporation
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1799497/+subscriptions
>
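
For anyone who wants to check whether a machine is in the configuration
Colin describes (zram plus a second, non-zram swap area both active), here
is a rough Python sketch. The function names are mine, and it assumes the
usual /proc/swaps layout: one header row, then one row per swap area with
the device or file path in the first column. It only reports the swap
setup; it does not reproduce or diagnose the lockup.

```python
def classify_swaps(proc_swaps_text):
    """Split the swap areas listed in /proc/swaps-style text into
    (zram_devices, other_swaps). Both are lists of paths."""
    zram, other = [], []
    for line in proc_swaps_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        name = fields[0]
        # Assumption: zram swap devices show up as /dev/zram0, /dev/zram1, ...
        if name.startswith("/dev/zram"):
            zram.append(name)
        else:
            other.append(name)
    return zram, other


def has_mixed_swap(proc_swaps_text):
    """True when at least one zram swap and one non-zram swap are active."""
    zram, other = classify_swaps(proc_swaps_text)
    return bool(zram) and bool(other)
```

On a live system it could be called as
has_mixed_swap(open("/proc/swaps").read()).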

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1799497

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
