Re: [Kernel-packages] [Bug 1799497] Re: 4.15 kernel hard lockup about once a week
It sounds like what I was getting.

On Thu, Jan 16, 2020 at 11:05 PM Colin Ian King <1799...@bugs.launchpad.net> wrote:

> After quite a bit of experimentation I found that I can reproduce the bug
> if I have zram *and* also swap on the filesystem enabled while exercising
> the brk stressors and aiol (to cause lots of I/O). Eventually the system
> grinds to a halt, we lose interactivity and we eventually get lockups as
> follows:
>
> [ 2012.040006] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [stress-ng-brk:1632]
> [ 2012.040922] Modules linked in: zram(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) aes_x86_64(E) crypto_simd(E) glue_helper(E) cryptd(E) psmouse(E) input_leds(E) floppy(E) virtio_scsi(E) serio_raw(E) i2c_piix4(E) mac_hid(E) pata_acpi(E) qemu_fw_cfg(E) 9pnet_virtio(E) 9p(E) 9pnet(E) fscache(E)
> [ 2012.044655] CPU: 2 PID: 1632 Comm: stress-ng-brk Tainted: G EL 4.15.18 #1
> [ 2012.045581] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> [ 2012.046555] RIP: 0010:__raw_callee_save___pv_queued_spin_unlock+0x10/0x17
> [ 2012.047340] RSP: 0018:b73382083718 EFLAGS: 0246 ORIG_RAX: ff11
> [ 2012.048238] RAX: 0001 RBX: RCX: 0002
> [ 2012.049078] RDX: RSI: 9d327c2f6918 RDI: a3269978
> [ 2012.049909] RBP: b73382083720 R08: 9d327c2f6918 R09: 9d327c0a5328
> [ 2012.050746] R10: 9d327c1e2310 R11: 9d327c1e2328 R12: 9d327c2f6800
> [ 2012.051574] R13: 9d327c1e2328 R14: 9d327c1e2310 R15: 9d327c1e2200
> [ 2012.052436] FS: 7f89f2ccd740() GS:9d327f28() knlGS:
> [ 2012.053382] CS: 0010 DS: ES: CR0: 80050033
> [ 2012.054058] CR2: 7f1350a8dd90 CR3: 311a4004 CR4: 00160ee0
> [ 2012.054889] Call Trace:
> [ 2012.055192] get_swap_pages+0x193/0x360
> [ 2012.055652] get_swap_page+0x13f/0x1e0
> [ 2012.056123] add_to_swap+0x14/0x70
> [ 2012.056530] shrink_page_list+0x81d/0xbc0
> [ 2012.057013] shrink_inactive_list+0x242/0x590
> [ 2012.057523] shrink_node_memcg+0x364/0x770
> [ 2012.058012] shrink_node+0xf7/0x300
> [ 2012.058432] ? shrink_node+0xf7/0x300
> [ 2012.058863] do_try_to_free_pages+0xc9/0x330
> [ 2012.059368] try_to_free_pages+0xee/0x1b0
> [ 2012.059842] __alloc_pages_slowpath+0x3fc/0xe00
> [ 2012.060424] __alloc_pages_nodemask+0x29a/0x2c0
> [ 2012.060963] alloc_pages_vma+0x88/0x1f0
> [ 2012.061414] __handle_mm_fault+0x8b7/0x12e0
> [ 2012.061909] handle_mm_fault+0xb1/0x210
> [ 2012.062375] __do_page_fault+0x281/0x4b0
> [ 2012.062848] do_page_fault+0x2e/0xe0
> [ 2012.063274] ? async_page_fault+0x2f/0x50
> [ 2012.063751] do_async_page_fault+0x51/0x80
> [ 2012.064262] async_page_fault+0x45/0x50
> [ 2012.064719] RIP: 0033:0x55ec1997bd0a
> [ 2012.065147] RSP: 002b:7ffeacd21600 EFLAGS: 00010246
> [ 2012.065754] RAX: 55ec28601000 RBX: 0005 RCX: 7f89f2de956b
> [ 2012.066580] RDX: 55ec28601000 RSI: 7ffeacd216d0 RDI: 55ec28602000
> [ 2012.067410] RBP: 7ffeacd216c0 R08: R09: 7f89f3d0c2f0
> [ 2012.068290] R10: R11: 0246 R12:
> [ 2012.069129] R13: 0002 R14: 0001 R15: 7ffeacd216d0
> [ 2012.069965] Code: 50 41 51 41 52 41 53 e8 3b 05 00 00 41 5b 41 5a 41 59 41 58 5f 5e 5a 59 5d c3 90 55 48 89 e5 52 b8 01 00 00 00 31 d2 f0 0f b0 17 <3c> 01 75 03 5a 5d c3 56 0f b6 f0 e8 bc ff ff ff 5e 5a 5d c3 0f
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1799497
>
> Title:
>   4.15 kernel hard lockup about once a week
>
> Status in linux package in Ubuntu:
>   Incomplete
> Status in zram-config package in Ubuntu:
>   Incomplete
> Status in linux source package in Bionic:
>   Confirmed
> Status in zram-config source package in Bionic:
>   Confirmed
>
> Bug description:
>   My main server has been running into hard lockups about once a week
>   ever since I switched to the 4.15 Ubuntu 18.04 kernel.
>
>   When this happens, nothing is printed to the console, it's effectively
>   stuck showing a login prompt. The system is running with panic=1 on
>   the cmdline but isn't rebooting so the kernel isn't even processing
>   this as a kernel panic.
>
>   As this felt like a potential hardware issue, I had my hosting provider
>   give me a completely different system, different motherboard, different
>   CPU, different RAM and different storage, I installed that system on 18.04
>   and moved my data over, a week later, I hit the issue again.
>
>   We've since also had a LXD user reporting similar symptoms here also on
>   varying hardware:
>     https://github.com/lxc/lxd/issues/5197
>
>   My system doesn't have a lot of memory pressure with about 50% of free
>   memory:
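
The reproduction condition quoted above (zram swap and filesystem swap both enabled) can be checked on a given host by reading /proc/swaps. What follows is only a minimal sketch, not something posted in the thread: the names `SwapArea`, `parse_proc_swaps` and `has_mixed_swap` are made up for illustration, the sample text is hypothetical, and the sketch assumes that zram swap devices are named /dev/zramN.

```python
from typing import List, NamedTuple


class SwapArea(NamedTuple):
    name: str
    kind: str
    size_kib: int
    used_kib: int
    priority: int


def parse_proc_swaps(text: str) -> List[SwapArea]:
    """Parse the contents of /proc/swaps: one header line, then one row per swap area."""
    areas = []
    for line in text.splitlines()[1:]:  # skip the "Filename Type Size Used Priority" header
        fields = line.split()
        if len(fields) != 5:
            continue  # ignore blank or unexpected lines
        name, kind, size, used, prio = fields
        areas.append(SwapArea(name, kind, int(size), int(used), int(prio)))
    return areas


def has_mixed_swap(areas: List[SwapArea]) -> bool:
    """True when a zram-backed swap area and at least one other swap area are both active."""
    # Assumption: zram swap devices show up as /dev/zram0, /dev/zram1, ...
    zram = [a for a in areas if a.name.startswith("/dev/zram")]
    other = [a for a in areas if not a.name.startswith("/dev/zram")]
    return bool(zram) and bool(other)


# Hypothetical sample; on a live system use open("/proc/swaps").read() instead.
SAMPLE = """\
Filename                                Type            Size    Used    Priority
/swapfile                               file            2097148 0       -2
/dev/zram0                              partition       1020    0       5
"""

if __name__ == "__main__":
    print(has_mixed_swap(parse_proc_swaps(SAMPLE)))
```

A host for which this returns True matches the configuration Colin describes; it says nothing about whether that host will actually lock up.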
Re: [Kernel-packages] [Bug 1799497] Re: 4.15 kernel hard lockup about once a week
Hi,

I had to remove zram-config from my production servers long ago, and since then I have not had the issue. I was using LXD containers a lot on those hosts, with different kinds of usage, but I don't have any other setup at the moment.

On Fri, Jan 10, 2020 at 12:11 AM Colin Ian King <1799...@bugs.launchpad.net> wrote:

> Can reproduce this with stress-ng exercising high memory pressure scenario
> using:
> stress-ng --brk 0 -v --aiol 0
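
For anyone scripting the reproducer quoted in this message, here is a rough sketch of a wrapper. It is an illustration only: `build_stress_command` and `run_reproducer` are hypothetical names, and the optional time limit is an addition that was not part of Colin's command (it uses the stress-ng `--timeout` option, given in seconds). Only run it on a disposable test machine or VM, since the whole point of the command is to provoke a lockup.

```python
import shutil
import subprocess
from typing import List, Optional


def build_stress_command(timeout_s: Optional[int] = None) -> List[str]:
    """Return the argv for the reproducer quoted in the thread."""
    argv = ["stress-ng", "--brk", "0", "-v", "--aiol", "0"]
    if timeout_s is not None:
        # Assumption: bound the run; this flag was not in the original command.
        argv += ["--timeout", str(timeout_s)]
    return argv


def run_reproducer(timeout_s: Optional[int] = None) -> Optional[int]:
    """Run the reproducer if stress-ng is installed and return its exit code."""
    if shutil.which("stress-ng") is None:
        return None  # stress-ng is not on PATH, nothing was run
    return subprocess.run(build_stress_command(timeout_s)).returncode


if __name__ == "__main__":
    # Only print the command here; call run_reproducer() deliberately.
    print(" ".join(build_stress_command(600)))
```

According to the later report in this thread, the lockups were only reproduced with both zram and filesystem swap enabled, so the command alone may not trigger anything on a host without that swap setup.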
[Kernel-packages] [Bug 1799497] Re: 4.15 kernel hard lockup about once a week
OK, it has been quite a while with no lockups. I had one after the zram-config package was removed, but no others since then. The kernel version is 4.15.0-33 to 4.15.0-38 across different servers.

I am going to update the servers to the latest version, reboot, and wait a little longer. Then I am going to reinstall zram-config on certain servers to see if the lockups show up again.

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1799497

Title:
  4.15 kernel hard lockup about once a week

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  My main server has been running into hard lockups about once a week
  ever since I switched to the 4.15 Ubuntu 18.04 kernel.

  When this happens, nothing is printed to the console, it's effectively
  stuck showing a login prompt. The system is running with panic=1 on
  the cmdline but isn't rebooting so the kernel isn't even processing
  this as a kernel panic.

  As this felt like a potential hardware issue, I had my hosting provider
  give me a completely different system, different motherboard, different
  CPU, different RAM and different storage, I installed that system on 18.04
  and moved my data over, a week later, I hit the issue again.

  We've since also had a LXD user reporting similar symptoms here also on
  varying hardware:
    https://github.com/lxc/lxd/issues/5197

  My system doesn't have a lot of memory pressure with about 50% of free
  memory:

  root@vorash:~# free -m
                total        used        free      shared  buff/cache   available
  Mem:          31819       17574         402         513       13842       13292
  Swap:         15909        2687       13222

  I will now try to increase console logging as much as possible on the
  system in the hopes that next time it hangs we can get a better idea
  of what happened but I'm not too hopeful given the complete silence on
  the console when this occurs.

  System is currently on:
    Linux vorash 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  But I've seen this since the GA kernel on 4.15 so it's not a recent
  regression.
  ---
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116, 1 Oct 23 16:12 seq
   crw-rw 1 root audio 116, 33 Oct 23 16:12 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.4
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse:
   Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1: Cannot stat file /proc/22822/fd/10: Permission denied
   Cannot stat file /proc/22831/fd/10: Permission denied
  DistroRelease: Ubuntu 18.04
  HibernationDevice:
   RESUME=none
   CRYPTSETUP=n
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  Lsusb:
   Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
   Bus 001 Device 002: ID 046b:ff10 American Megatrends, Inc. Virtual Keyboard and Mouse
   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  MachineType: Intel Corporation S1200SP
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  Package: linux (not installed)
  PciMultimedia:

  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 mgadrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-38-generic root=UUID=575c878a-0be6-4806-9c83-28f67aedea65 ro biosdevname=0 net.ifnames=0 panic=1 verbose console=tty0 console=ttyS0,115200n8
  ProcVersionSignature: Ubuntu 4.15.0-38.41-generic 4.15.18
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-38-generic N/A
   linux-backports-modules-4.15.0-38-generic N/A
   linux-firmware 1.173.1
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  Tags: bionic
  Uname: Linux 4.15.0-38-generic x86_64
  UnreportableReason: This report is about a package that is not installed.
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:

  _MarkForUpload: False
  dmi.bios.date: 01/25/2018
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: S1200SP.86B.03.01.1029.012520180838
  dmi.board.asset.tag: Base Board Asset Tag
  dmi.board.name: S1200SP
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H57532-271
  dmi.chassis.asset.tag:
  dmi.chassis.type: 23
  dmi.chassis.vendor: ...
  dmi.chassis.version: ..
  dmi.modalias: dmi:bvnIntelCorporation:bvrS1200SP.86B.03.01.1029.012520180838:bd01/25/2018:svnIntelCorporation:pnS1200SP:pvr:rvnIntelCorporation:rnS1200SP:rvrH57532-271:cvn...:ct23:cvr..:
  dmi.product.family: Family
  dmi.product.name: S1200SP
  dmi.product.version:
  dmi.sys.vendor: Intel Corporation
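
The `free -m` figures in the bug description can be turned into ratios to see how much headroom the host really had. This is a small sketch for illustration, with a hypothetical helper name (`parse_free`). It assumes the Swap row, whose first two numbers are run together in the archived copy, reads 15909 total, 2687 used and 13222 free; that split is the one for which used plus free equals total.

```python
from typing import Dict

# Figures from the bug description; the Swap row split is an assumption (see above).
FREE_OUTPUT = """\
              total        used        free      shared  buff/cache   available
Mem:          31819       17574         402         513       13842       13292
Swap:         15909        2687       13222
"""


def parse_free(text: str) -> Dict[str, Dict[str, int]]:
    """Map each row label of `free -m` output to a {column: MiB} dictionary."""
    lines = text.strip().splitlines()
    columns = lines[0].split()
    table = {}
    for line in lines[1:]:
        label, *values = line.split()
        # The Swap row has fewer columns; zip() stops at the shortest input.
        table[label.rstrip(":")] = dict(zip(columns, map(int, values)))
    return table


if __name__ == "__main__":
    table = parse_free(FREE_OUTPUT)
    mem, swap = table["Mem"], table["Swap"]
    print(f"available: {mem['available'] / mem['total']:.1%} of RAM")
    print(f"swap used: {swap['used'] / swap['total']:.1%} of swap")
```

With these numbers roughly 42% of RAM is reported as available and about 17% of swap is in use, which fits the reporter's point that the machine was not under heavy memory pressure when the figures were taken.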
[Kernel-packages] [Bug 1799497] Re: 4.15 kernel hard lockup about once a week
Got a lockup with no zram-config installed. Same behaviour: no log information, can't even type at the console, no ssh, no ping. None of the LXD containers respond to ping either.
[Kernel-packages] [Bug 1799497] Re: 4.15 kernel hard lockup about once a week
Correct. I would like to give it some more time to see whether it happens again. So far so good: no lockups, and I haven't had to restart any server in a week and a half.

I'll try to prepare the same setup on another server with zram-config to see if it happens again on that particular server.
[Kernel-packages] [Bug 1799497] Re: 4.15 kernel hard lockup about once a week
In my case it hasn't happened again, although I removed the zram-config package from the host servers (I think it is the only software difference between 16.04 and 18.04 that I added). I would like to either rule out or confirm that it has an effect on the issue.
[Kernel-packages] [Bug 1799497] Re: 4.15 kernel hard lockup about once a week
Hello,

I submitted the report on LXD, since LXD is the only thing I have installed on the server that is actively running, as Stéphane mentioned:
https://github.com/lxc/lxd/issues/5197

I also thought it might be a hardware issue, but since upgrading to 18.04 in May I have experienced this on a variety of hardware. I also thought it might be an upgrade issue, but that is not the case either.

I also thought it was memory related. It now occurs, as Stéphane mentions, around once a week, but in my case on different servers. The last server where it happened hadn't had any issue for maybe two months and was not that loaded in terms of memory, but the lockups seem more frequent on servers that are actively used in both memory and CPU.

It doesn't happen on blade hosts that only have 2-4 LXD containers and 4GB of RAM. It has only happened on HP and Dell servers with 16GB, 24GB, 48GB and 128GB of RAM that carry a little more load (from a minimum of 6 containers up to 20).

At least I am not alone, but I have no clue how to recreate or address this issue (the logs provide no information either).

I could also try some kernels. On 4.4, as Stéphane mentioned, it didn't happen; it only started happening with the GA kernel of 18.04 (as he also mentions). I have been constantly upgrading the kernel to no avail, so it seems it could have been introduced before.

Strangely and thankfully it doesn't happen on my main production servers (except for yesterday's crash on one of them). It happens mostly on development servers that are actively used (the developers are not happy).

** Bug watch added: LXD bug tracker #5197
   https://github.com/lxc/lxd/issues/5197