Confirming that the instances with tdp_mmu=0 do not seem to crash. I have had 5 instances running for 4 days.
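For anyone who wants to verify that the workaround is actually in effect on a given machine, here is a minimal sketch. It assumes the kvm module exposes the parameter at /sys/module/kvm/parameters/tdp_mmu as a boolean ("Y" or "N"), which I have not verified on every affected kernel. The function name and the default path constant are only illustrative.

```python
from pathlib import Path
from typing import Optional

# Assumed sysfs location of the kvm module parameter; adjust if your kernel differs.
TDP_MMU_PARAM = Path("/sys/module/kvm/parameters/tdp_mmu")


def tdp_mmu_enabled(param_path: Path = TDP_MMU_PARAM) -> Optional[bool]:
    """Return True/False for the tdp_mmu setting, or None if it cannot be determined."""
    try:
        value = param_path.read_text().strip()
    except OSError:
        # kvm not loaded, parameter not present, or no permission to read it.
        return None
    if value in ("Y", "1"):
        return True
    if value in ("N", "0"):
        return False
    return None


if __name__ == "__main__":
    state = tdp_mmu_enabled()
    if state is None:
        print("tdp_mmu: could not be determined (is the kvm module loaded?)")
    else:
        print("tdp_mmu:", "enabled" if state else "disabled")
```

A result of "disabled" would match the tdp_mmu=0 setup described above. "Could not be determined" most likely means the kvm module is not loaded on that machine.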
-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2032176

Title:
  Crashing with CPU soft lock on GA kernel 5.15.0.79.76 and HWE kernel
  5.19.0-46.47-22.04.1

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Jammy:
  Triaged

Bug description:
  The crash occurred on a juju machine, and the juju agent was lost.
  The juju machine is on an openstack instance provisioned by juju.

  The openstack console log indicates that it is related to spin_lock
  and KVM MMU:

  [418200.348830] ? _raw_spin_lock+0x22/0x30
  [418200.349588] _raw_write_lock+0x20/0x30
  [418200.350196] kvm_tdp_mmu_map+0x2b1/0x490 [kvm]
  [418200.351014] kvm_mmu_notifier_invalidate_range_start+0x1ad/0x300 [kvm]
  [418200.351796] direct_page_fault+0x206/0x310 [kvm]
  [418200.352667] __mmu_notifier_invalidate_range_start+0x91/0x1b0
  [418200.353624] kvm_tdp_page_fault+0x72/0x90 [kvm]
  [418200.354496] try_to_migrate_one+0x691/0x730
  [418200.355436] kvm_mmu_page_fault+0x73/0x1c0 [kvm]

  openstack console log: https://pastebin.canonical.com/p/spmH8r3crQ/
  syslog: https://pastebin.canonical.com/p/wFPsFD8G9n/

  The syslog was rotated after the crash occurred, so the syslog at the
  time of the initial crash was lost.

  Other juju machines with the 5.15.0.79.76 kernel seem to have the same
  issue.

  We previously had a similar issue with 5.15.0-73.
  The juju machine crashed with raw_spin_lock and kvm mmu in the logs as well:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2026229

  ProblemType: Bug
  DistroRelease: Ubuntu 22.04
  Package: linux-image-5.19.0-46-generic 5.19.0-46.47~22.04.1
  ProcVersionSignature: Ubuntu 5.19.0-46.47~22.04.1-generic 5.19.17
  Uname: Linux 5.19.0-46-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu82.5
  Architecture: amd64
  CasperMD5CheckResult: unknown
  CloudArchitecture: x86_64
  CloudID: openstack
  CloudName: openstack
  CloudPlatform: openstack
  CloudSubPlatform: metadata (http://169.254.169.254)
  Date: Mon Aug 21 08:59:46 2023
  Ec2AMI: ami-00000c61
  Ec2AMIManifest: FIXME
  Ec2AvailabilityZone: availability-zone-1
  Ec2InstanceType: builder-cpu4-ram72-disk20
  Ec2Kernel: unavailable
  Ec2Ramdisk: unavailable
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=C.UTF-8
   SHELL=/bin/bash
  SourcePackage: linux-signed-hwe-5.19
  UpgradeStatus: No upgrade log present (probably fresh install)
  ---
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Aug 23 03:23 seq
   crw-rw---- 1 root audio 116, 33 Aug 23 03:23 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.11-0ubuntu82.5
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: N/A
  CasperMD5CheckResult: unknown
  CloudArchitecture: x86_64
  CloudID: openstack
  CloudName: openstack
  CloudPlatform: openstack
  CloudSubPlatform: metadata (http://169.254.169.254)
  DistroRelease: Ubuntu 22.04
  Ec2AMI: ami-00000fbb
  Ec2AMIManifest: FIXME
  Ec2AvailabilityZone: availability-zone-2
  Ec2InstanceType: builder-cpu2-ram44-disk20
  Ec2Kernel: unavailable
  Ec2Ramdisk: unavailable
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
  Lsusb-t:
   /:
    Bus 01.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
  MachineType: OpenStack Foundation OpenStack Nova
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  Package: linux (not installed)
  PciMultimedia:

  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=C.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 qxldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.15.0-83-generic root=UUID=a6de04b8-3631-4ce4-bb96-48076f4a56bf ro console=tty1 console=ttyS0
  ProcVersionSignature: Ubuntu 5.15.0-83.92-generic 5.15.116
  RelatedPackageVersions:
   linux-restricted-modules-5.15.0-83-generic N/A
   linux-backports-modules-5.15.0-83-generic N/A
   linux-firmware 20220329.git681281e4-0ubuntu3.17
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  Tags: jammy ec2-images
  Uname: Linux 5.15.0-83-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: N/A
  _MarkForUpload: True
  dmi.bios.date: 04/01/2014
  dmi.bios.release: 0.0
  dmi.bios.vendor: SeaBIOS
  dmi.bios.version: 1.13.0-1ubuntu1.1
  dmi.chassis.type: 1
  dmi.chassis.vendor: QEMU
  dmi.chassis.version: pc-i440fx-4.2
  dmi.modalias: dmi:bvnSeaBIOS:bvr1.13.0-1ubuntu1.1:bd04/01/2014:br0.0:svnOpenStackFoundation:pnOpenStackNova:pvr21.2.4:cvnQEMU:ct1:cvrpc-i440fx-4.2:sku:
  dmi.product.family: Virtual Machine
  dmi.product.name: OpenStack Nova
  dmi.product.version: 21.2.4
  dmi.sys.vendor: OpenStack Foundation

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2032176/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp