[Kernel-packages] [Bug 2059978] Re: linux-aws-5.15 ADT test MISS because it's unable to find package
@paride: Yes, I've seen this with other kernels, mostly with the nvidia drivers. I think all of the runs of the following since March 20 show this problem:

https://autopkgtest.ubuntu.com/packages/n/nvidia-graphics-drivers-510-server/focal/amd64
https://autopkgtest.ubuntu.com/packages/n/nvidia-graphics-drivers-515/focal/amd64
https://autopkgtest.ubuntu.com/packages/n/nvidia-graphics-drivers-460-server/focal/amd64

As mentioned in MM, all of these started failing between March 11 and March 20. All of these drivers are transitional packages (they don't really do anything other than depend on the next driver in the series to keep users on a supported driver), but I don't know whether this is relevant to the problem, as I've seen this with other packages too.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2059978

Title:
  linux-aws-5.15 ADT test MISS because it's unable to find package

Status in Auto Package Testing: Invalid
Status in autopkgtest package in Ubuntu: Confirmed
Status in linux package in Ubuntu: New

Bug description:
  SRU cycle 2024.03.04 Focal aws-5.15

  ADT test linux-aws-5.15 for both amd64 and arm64 results in a MISS with the error message below. It seems that the test was unable to locate the kernel-testing--linux-aws-5.15--modules-extra--preferred$ package, which led to other missing packages and the test erroring out with exit code 1. This test has been SKIPPED before, but for seemingly different reasons. I have also attached the whole amd64 log output to this bug report.

  "339s Reading state information...
  339s E: Unable to locate package ^kernel-testing--linux-aws-5.15--modules-extra--preferred$
  339s E: Couldn't find any package by glob '^kernel-testing--linux-aws-5.15--modules-extra--preferred$'
  339s E: Couldn't find any package by regex '^kernel-testing--linux-aws-5.15--modules-extra--preferred$'
  339s Reading package lists...
  339s Building dependency tree...
  339s Reading state information...
  339s E: Unable to locate package ^linux-modules-extra-aws-5.15$
  339s E: Couldn't find any package by glob '^linux-modules-extra-aws-5.15$'
  339s E: Couldn't find any package by regex '^linux-modules-extra-aws-5.15$'
  339s autopkgtest [16:53:45]: rebooting testbed after setup commands that affected boot
  363s autopkgtest [16:54:09]: testbed running kernel: Linux 5.15.0-1057-aws #63~20.04.1-Ubuntu SMP Mon Mar 25 10:28:36 UTC 2024
  363s autopkgtest [16:54:09]:  apt-source linux-aws-5.15
  364s blame: linux-aws-5.15
  364s badpkg: rules extract failed with exit code 1
  364s autopkgtest [16:54:10]: ERROR: erroneous package: rules extract failed with exit code 1"

To manage notifications about this bug go to:
https://bugs.launchpad.net/auto-package-testing/+bug/2059978/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
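When triaging logs like the one quoted in bug 2059978, it can help to pull out just the package patterns that apt failed to resolve. The following is a minimal sketch, not part of autopkgtest itself: the function name `missing_packages` and the `SAMPLE` text are illustrative assumptions, and the regex only assumes the "NNNs E: Unable to locate package ..." line shape seen in the excerpt above.

```python
import re

# Matches apt's "Unable to locate package" error. The optional "NNNs " prefix
# is the elapsed-seconds stamp seen at the start of each quoted log line.
_MISSING = re.compile(
    r"^(?:\d+s )?E: Unable to locate package (\S+)$", re.MULTILINE
)


def missing_packages(log_text: str) -> list[str]:
    """Return the package names/patterns apt could not locate, in log order."""
    seen: list[str] = []
    for name in _MISSING.findall(log_text):
        if name not in seen:  # de-duplicate while preserving order
            seen.append(name)
    return seen


# Lines copied from the bug report excerpt (hypothetical variable name).
SAMPLE = """\
339s E: Unable to locate package ^kernel-testing--linux-aws-5.15--modules-extra--preferred$
339s E: Couldn't find any package by glob '^kernel-testing--linux-aws-5.15--modules-extra--preferred$'
339s E: Unable to locate package ^linux-modules-extra-aws-5.15$
"""

if __name__ == "__main__":
    for name in missing_packages(SAMPLE):
        print(name)
```

The glob and regex follow-up errors are deliberately ignored, since they repeat the same pattern that the first error already names.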
[Kernel-packages] [Bug 2052640] Re: New NVIDIA release 470.239.06
I have done my typical CUDA-based testing with this package using the generic, nvidia and gcp kernels on bionic, focal, jammy and mantic (amd64 only so far).

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-graphics-drivers-470 in Ubuntu.
https://bugs.launchpad.net/bugs/2052640

Title:
  New NVIDIA release 470.239.06

Status in fabric-manager-470 package in Ubuntu: In Progress
Status in libnvidia-nscq-470 package in Ubuntu: In Progress
Status in nvidia-graphics-drivers-470 package in Ubuntu: In Progress
Status in nvidia-graphics-drivers-470-server package in Ubuntu: In Progress
Status in fabric-manager-470 source package in Bionic: In Progress
Status in libnvidia-nscq-470 source package in Bionic: In Progress
Status in nvidia-graphics-drivers-470 source package in Bionic: In Progress
Status in nvidia-graphics-drivers-470-server source package in Bionic: In Progress
Status in fabric-manager-470 source package in Focal: In Progress
Status in libnvidia-nscq-470 source package in Focal: In Progress
Status in nvidia-graphics-drivers-470 source package in Focal: In Progress
Status in nvidia-graphics-drivers-470-server source package in Focal: In Progress
Status in fabric-manager-470 source package in Jammy: In Progress
Status in libnvidia-nscq-470 source package in Jammy: In Progress
Status in nvidia-graphics-drivers-470 source package in Jammy: In Progress
Status in nvidia-graphics-drivers-470-server source package in Jammy: In Progress
Status in fabric-manager-470 source package in Mantic: In Progress
Status in libnvidia-nscq-470 source package in Mantic: In Progress
Status in nvidia-graphics-drivers-470 source package in Mantic: In Progress
Status in nvidia-graphics-drivers-470-server source package in Mantic: In Progress
Status in fabric-manager-470 source package in Noble: In Progress
Status in libnvidia-nscq-470 source package in Noble: In Progress
Status in nvidia-graphics-drivers-470 source package in Noble: In Progress
Status in nvidia-graphics-drivers-470-server source package in Noble: In Progress

Bug description:
  [Impact]

  These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]

  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened.

  Note that, as this is a legacy driver, the hardware available to the QA team might be limited, if not nonexistent. Tests on GKE might be suitable, as they still default to the 470 series.

  [Regression Potential]

  In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/fabric-manager-470/+bug/2052640/+subscriptions
[Kernel-packages] [Bug 2029934] Re: arm64 AWS host hangs during modprobe nvidia on lunar and mantic
I can reproduce the failure on mantic with both the DKMS and LRM drivers. Specifically, this is what I'm doing to install them:

for DKMS:
  sudo DEBIAN_FRONTEND=noninteractive apt-get install -y nvidia-driver-535-server

for LRM:
  sudo DEBIAN_FRONTEND=noninteractive apt-get install -y nvidia-headless-no-dkms-535-server linux-modules-nvidia-535-server-generic nvidia-utils-535-server

I'm intentionally not using `ubuntu-drivers` to isolate this testing to just the installation and functioning of the drivers.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/2029934

Title:
  arm64 AWS host hangs during modprobe nvidia on lunar and mantic

Status in linux-aws package in Ubuntu: Confirmed
Status in nvidia-graphics-drivers-525 package in Ubuntu: Confirmed
Status in nvidia-graphics-drivers-525-server package in Ubuntu: Confirmed
Status in nvidia-graphics-drivers-535 package in Ubuntu: Confirmed
Status in nvidia-graphics-drivers-535-server package in Ubuntu: Confirmed

Bug description:
  Loading the nvidia driver dkms modules with "modprobe nvidia" will result in the host hanging and being completely unusable. This was reproduced using both the linux generic and linux-aws kernels on lunar and mantic using an AWS g5g.xlarge instance.

  To reproduce using the generic kernel:

  # Deploy an arm64 host with an nvidia gpu, such as an AWS g5g.xlarge.

  # Install the linux generic kernel from lunar-updates:
  $ sudo DEBIAN_FRONTEND=noninteractive apt-get install -y -o DPkg::Options::=--force-confold linux-generic

  # Boot to the linux-generic kernel (this can be accomplished by removing the existing kernel, in this case it was the linux-aws 6.2.0-1008-aws kernel)
  $ sudo DEBIAN_FRONTEND=noninteractive apt-get purge -y -o DPkg::Options::=--force-confold linux-aws linux-aws-headers-6.2.0-1008 linux-headers-6.2.0-1008-aws linux-headers-aws linux-image-6.2.0-1008-aws linux-image-aws linux-modules-6.2.0-1008-aws linux-headers-6.2.0-1008-aws linux-image-6.2.0-1008-aws linux-modules-6.2.0-1008-aws
  $ reboot

  # Install the Nvidia 535-server driver DKMS package:
  $ sudo DEBIAN_FRONTEND=noninteractive apt-get install -y nvidia-driver-535-server

  # Enable the driver
  $ sudo modprobe nvidia

  # At this point the system will hang and never return.
  # A reboot instead of a modprobe will result in a system that never boots up all the way. I was able to recover the console logs from such a system and found (the full captured log is attached):

  [1.964942] nvidia: loading out-of-tree module taints kernel.
  [1.965475] nvidia: module license 'NVIDIA' taints kernel.
  [1.965905] Disabling lock debugging due to kernel taint
  [1.980905] nvidia: module verification failed: signature and/or required key missing - tainting kernel
  [2.012067] nvidia-nvlink: Nvlink Core is being initialized, major device number 510
  [2.012715]
  [ 62.025143] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  [ 62.025807] rcu: 3-...0: (14 ticks this GP) idle=c04c/1/0x4000 softirq=653/654 fqs=3301
  [ 62.026516](detected by 0, t=15003 jiffies, g=-699, q=216 ncpus=4)
  [ 62.027018] Task dump for CPU 3:
  [ 62.027290] task:systemd-udevd state:R running task stack:0 pid:164 ppid:144flags:0x000e
  [ 62.028066] Call trace:
  [ 62.028273] __switch_to+0xbc/0x100
  [ 62.028567] 0x228
  Timed out for waiting the udev queue being empty.
  Timed out for waiting the udev queue being empty.
  [ 242.045143] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  [ 242.045655] rcu: 3-...0: (14 ticks this GP) idle=c04c/1/0x4000 softirq=653/654 fqs=12303
  [ 242.046373](detected by 1, t=60008 jiffies, g=-699, q=937 ncpus=4)
  [ 242.046874] Task dump for CPU 3:
  [ 242.047146] task:systemd-udevd state:R running task stack:0 pid:164 ppid:144flags:0x000f
  [ 242.047922] Call trace:
  [ 242.048128] __switch_to+0xbc/0x100
  [ 242.048417] 0x228
  Timed out for waiting the udev queue being empty.
  Begin: Loading essential drivers ...
  [ 384.001142] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [modprobe:215]
  [ 384.001738] Modules linked in: nvidia(POE+) crct10dif_ce video polyval_ce polyval_generic drm_kms_helper ghash_ce syscopyarea sm4 sysfillrect sha2_ce sysimgblt sha256_arm64 sha1_ce drm nvme nvme_core ena nvme_common aes_neon_bs aes_neon_blk aes_ce_blk aes_ce_cipher
  [ 384.003513] CPU: 2 PID: 215 Comm: modprobe Tainted: P OE 6.2.0-26-generic #26-Ubuntu
  [ 384.004210] Hardware name: Amazon EC2 g5g.xlarge/, BIOS 1.0 11/1/2018
  [ 384.004715] pstate: 8045 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  [ 384.005259] pc : smp_call_function_many_cond+0x1b4/0x4b4
  [ 384.005683] lr : smp_call_function_many_cond+0x1d0/0x4b4
  [ 384.006108] sp : 889a
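The console log in bug 2029934 shows two kinds of hang indicators: repeated RCU stall reports and a watchdog soft lockup. A small scanner can summarise them when comparing captured logs across kernels. This is only a sketch under stated assumptions: the helper names `rcu_stall_times` and `soft_lockups` and the `SAMPLE` variable are hypothetical, and the regexes assume the exact message wording seen in the excerpt above.

```python
import re

# Kernel timestamp such as "[ 62.025143]" or "[1.964942]".
_TS = r"\[\s*(\d+\.\d+)\]\s*"

_RCU_STALL = re.compile(_TS + r"rcu: INFO: \S+ detected stalls on CPUs/tasks")
_SOFT_LOCKUP = re.compile(
    _TS
    + r"watchdog: BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! "
    + r"\[([^:\]]+):(\d+)\]"
)


def rcu_stall_times(log: str) -> list[float]:
    """Timestamps (seconds since boot) of each RCU stall report."""
    return [float(m.group(1)) for m in _RCU_STALL.finditer(log)]


def soft_lockups(log: str) -> list[tuple[float, int, int, str, int]]:
    """(timestamp, cpu, stuck_seconds, comm, pid) for each soft lockup."""
    return [
        (float(m.group(1)), int(m.group(2)), int(m.group(3)),
         m.group(4), int(m.group(5)))
        for m in _SOFT_LOCKUP.finditer(log)
    ]


# Lines copied from the console log quoted in the bug description.
SAMPLE = """\
[ 62.025143] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 242.045143] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 384.001142] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [modprobe:215]
"""

if __name__ == "__main__":
    print("RCU stalls at:", rcu_stall_times(SAMPLE))
    print("Soft lockups:", soft_lockups(SAMPLE))
```

Keeping the two patterns separate makes it easy to see whether a given run only stalls in udev or also reaches the soft lockup in `modprobe`.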
[Kernel-packages] [Bug 2042564] Re: Performance regression in the 5.15 Ubuntu 20.04 kernel compared to 5.4 Ubuntu 20.04 kernel
We are still looking into this issue. While we can reproduce the test case and see a difference in performance, the delta is not as significant and our results have not been very consistent.

I'm taking the approach of setting up a more comprehensive test environment to run more tests faster. Hopefully we can then go through an analysis and bisect process with more meaningful results.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2042564

Title:
  Performance regression in the 5.15 Ubuntu 20.04 kernel compared to 5.4 Ubuntu 20.04 kernel

Status in linux package in Ubuntu: New
Status in linux source package in Focal: New

Bug description:
  We in the Canonical Public Cloud team have received a report from our colleagues at Google regarding a potential performance regression with the 5.15 kernel vs the 5.4 kernel on Ubuntu 20.04. Their tests were performed using the linux-gkeop and linux-gkeop-5.15 kernels.

  I have verified with the generic Ubuntu 20.04 5.4 linux-generic and the Ubuntu 20.04 5.15 linux-generic-hwe-20.04 kernels.

  The tests were run using `fio`.

  fio commands:

  * 4k initwrite: `fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdc`
  * 4k overwrite: `fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdc`

  My reproducer was to launch an Ubuntu 20.04 cloud image locally with qemu; the results are below:

  Using 5.4 kernel

  ```
  ubuntu@cloudimg:~$ uname --kernel-release
  5.4.0-164-generic

  ubuntu@cloudimg:~$ sudo fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sda
  fiojob1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
  ...
  fio-3.16
  Starting 8 processes
  Jobs: 8 (f=8): [W(8)][99.6%][w=925MiB/s][w=237k IOPS][eta 00m:01s]
  fiojob1: (groupid=0, jobs=8): err= 0: pid=2443: Thu Nov 2 09:15:22 2023
    write: IOPS=317k, BW=1237MiB/s (1297MB/s)(320GiB/264837msec); 0 zone resets
      slat (nsec): min=628, max=37820k, avg=7207.71, stdev=101058.61
      clat (nsec): min=457, max=56099k, avg=340.45, stdev=1707823.38
       lat (usec): min=23, max=56100, avg=3229.78, stdev=1705.80
      clat percentiles (usec):
       | 1.00th=[ 775], 5.00th=[ 1352], 10.00th=[ 1647], 20.00th=[ 2024],
       | 30.00th=[ 2343], 40.00th=[ 2638], 50.00th=[ 2933], 60.00th=[ 3261],
       | 70.00th=[ 3654], 80.00th=[ 4146], 90.00th=[ 5014], 95.00th=[ 5932],
       | 99.00th=[ 8979], 99.50th=[10945], 99.90th=[18220], 99.95th=[22676],
       | 99.99th=[32113]
     bw ( MiB/s): min= 524, max= 1665, per=100.00%, avg=1237.72, stdev=20.42, samples=4232
     iops: min=134308, max=426326, avg=316855.16, stdev=5227.36, samples=4232
    lat (nsec) : 500=0.01%, 750=0.01%, 1000=0.01%
    lat (usec) : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01%
    lat (usec) : 250=0.05%, 500=0.54%, 750=0.37%, 1000=0.93%
    lat (msec) : 2=17.40%, 4=58.02%, 10=22.01%, 20=0.60%, 50=0.07%
    lat (msec) : 100=0.01%
    cpu : usr=3.29%, sys=7.45%, ctx=1262621, majf=0, minf=103
    IO depths: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
       submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
       complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
       issued rwts: total=0,83886080,0,8 short=0,0,0,0 dropped=0,0,0,0
       latency : target=0, window=0, percentile=100.00%, depth=128

  Run status group 0 (all jobs):
    WRITE: bw=1237MiB/s (1297MB/s), 1237MiB/s-1237MiB/s (1297MB/s-1297MB/s), io=320GiB (344GB), run=264837-264837msec

  Disk stats (read/write):
    sda: ios=36/32868891, merge=0/50979424, ticks=5/27498602, in_queue=1183124, util=100.00%
  ```

  After upgrading to linux-generic-hwe-20.04 kernel and rebooting

  ```
  ubuntu@cloudimg:~$ uname --kernel-release
  5.15.0-88-generic

  ubuntu@cloudimg:~$ sudo fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sda
  fiojob1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
  ...
  fio-3.16
  Starting 8 processes
  Jobs: 1 (f=1): [_(7),W(1)][100.0%][w=410MiB/s][w=105k IOPS][eta 00m:00s]
  fiojob1: (groupid=0, jobs=8): err= 0: pid=1438: Thu Nov 2 09:46:49 2023
    write: IOPS=155k, BW=605MiB/s (634MB/s)(320GiB/541949msec); 0 zone resets
      slat (nsec): min=660, max=325426k, avg=10351.04, stdev=232438.50
      clat (nsec): min=1100, max=782743k, avg=6595
  ```
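To put a number on the gap between the two fio runs quoted in bug 2042564, the `write:` summary lines can be parsed and compared. The sketch below is illustrative only: `parse_write_summary` and `percent_drop` are hypothetical helper names, the regex assumes the `IOPS=...k, BW=...MiB/s` shape of these two lines, and the result describes just this single pair of runs (the comment above notes that later results were less consistent).

```python
import re

# Matches the fio summary line, e.g. "write: IOPS=317k, BW=1237MiB/s (...)".
_SUMMARY = re.compile(
    r"write: IOPS=(\d+(?:\.\d+)?)(k?), BW=(\d+(?:\.\d+)?)MiB/s"
)


def parse_write_summary(line: str) -> tuple[float, float]:
    """Return (iops, bandwidth_mib_per_s) from a fio write summary line."""
    m = _SUMMARY.search(line)
    if m is None:
        raise ValueError(f"not a fio write summary line: {line!r}")
    scale = 1000 if m.group(2) == "k" else 1
    return float(m.group(1)) * scale, float(m.group(3))


def percent_drop(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Summary lines copied from the two runs in the bug description.
KERNEL_5_4 = "write: IOPS=317k, BW=1237MiB/s (1297MB/s)(320GiB/264837msec); 0 zone resets"
KERNEL_5_15 = "write: IOPS=155k, BW=605MiB/s (634MB/s)(320GiB/541949msec); 0 zone resets"

if __name__ == "__main__":
    iops_old, bw_old = parse_write_summary(KERNEL_5_4)
    iops_new, bw_new = parse_write_summary(KERNEL_5_15)
    print(f"IOPS drop: {percent_drop(iops_old, iops_new):.1f}%")
    print(f"Bandwidth drop: {percent_drop(bw_old, bw_new):.1f}%")
```

For these two lines both figures come out at roughly 51%, which matches the run time approximately doubling from 264837 msec to 541949 msec for the same 320GiB written.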
[Kernel-packages] [Bug 2043431] [NEW] no scanner detected
Public bug reported:

Simple Scan (Document Scanner) 3.36.3 gives the error "no scanners detected" with my Epson ET-3830. It worked better on the 5.4.0.165 header, but even then it sometimes showed these symptoms. However, if it sits for an hour, so that the scanner goes to sleep, and I then launch Document Scanner, my HP tower finds it OK. Very puzzling. It's almost as if some USB device-detecting daemon is flaky.

System:
  Kernel: 5.4.0-166-generic x86_64 bits: 64 compiler: gcc v: 9.4.0 Desktop: MATE 1.26.0 wm: marco dm: LightDM Distro: Linux Mint 20.3 Una base: Ubuntu 20.04 focal
Machine:
  Type: Desktop System: HP product: HP Slim Desktop 290-p0xxx v: N/A serial:
  Chassis: type: 3 serial:
  Mobo: HP model: 843F v: 00 serial:
  UEFI: AMI v: F.43 date: 06/16/2020
CPU:
  Topology: Dual Core model: Intel Celeron G4900 bits: 64 type: MCP arch: Kaby Lake rev: B L2 cache: 2048 KiB
  flags: lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 12399
  Speed: 3100 MHz min/max: 800/3100 MHz Core speeds (MHz): 1: 3100 2: 3100
Graphics:
  Device-1: Intel UHD Graphics 610 vendor: Hewlett-Packard driver: i915 v: kernel bus ID: 00:02.0 chip ID: 8086:3e93
  Display: x11 server: X.Org 1.20.13 driver: modesetting unloaded: fbdev,vesa compositor: marco resolution: 1280x1024~60Hz
  OpenGL: renderer: Mesa Intel UHD Graphics 610 (CFL GT1) v: 4.6 Mesa 21.2.6 direct render: Yes
Audio:
  Device-1: Intel Cannon Lake PCH cAVS vendor: Hewlett-Packard driver: snd_hda_intel v: kernel bus ID: 00:1f.3 chip ID: 8086:a348
  Sound Server: ALSA v: k5.4.0-166-generic
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet vendor: Hewlett-Packard driver: r8169 v: kernel port: 4000 bus ID: 01:00.0 chip ID: 10ec:8168
  IF: enp1s0 state: down mac:
  Device-2: Realtek RTL8821CE 802.11ac PCIe Wireless Network Adapter vendor: Hewlett-Packard driver: rtl8821ce v: v5.5.2.1_35598.20191029 port: 3000 bus ID: 02:00.0 chip ID: 10ec:c821
  IF: wlp2s0 state: up mac:
Drives:
  Local Storage: total: 2.73 TiB used: 1.82 TiB (66.6%)
  ID-1: /dev/sda vendor: Seagate model: ST3000DM003-1F216N size: 2.73 TiB speed: 6.0 Gb/s serial:
Partition:
  ID-1: / size: 2.68 TiB used: 1.82 TiB (67.7%) fs: ext4 dev: /dev/dm-1
  ID-2: /boot size: 703.1 MiB used: 633.8 MiB (90.2%) fs: ext4 dev: /dev/sda2
  ID-3: swap-1 size: 976.0 MiB used: 0 KiB (0.0%) fs: swap dev: /dev/dm-2
USB:
  Hub: 1-0:1 info: Full speed (or root) Hub ports: 16 rev: 2.0 chip ID: 1d6b:0002
  Device-1: 1-1:2 info: Pixart Imaging Optical Mouse type: Mouse driver: hid-generic,usbhid rev: 2.0 chip ID: 093a:2510
  Device-2: 1-5:3 info: Seiko Epson ET-3830 Series type: driver: usblp rev: 2.0 chip ID: 04b8:1184
  Device-3: 1-8:4 info: IBM NetVista Full Width Keyboard type: Keyboard driver: hid-generic,usbhid rev: 1.1 chip ID: 04b3:3025
  Device-4: 1-14:5 info: Realtek Bluetooth Radio type: Bluetooth driver: btusb rev: 1.1 chip ID: 0bda:b00a
  Hub: 2-0:1 info: Full speed (or root) Hub ports: 8 rev: 3.1 chip ID: 1d6b:0003
Sensors:
  System Temperatures: cpu: 47.0 C mobo: 27.8 C
  Fan Speeds (RPM): N/A
Repos:
  No active apt repos in: /etc/apt/sources.list
  Active apt repos in: /etc/apt/sources.list.d/forkotov02-ppa-focal.list
  1: deb http: //ppa.launchpad.net/forkotov02/ppa/ubuntu focal main
  Active apt repos in: /etc/apt/sources.list.d/official-package-repositories.list
  1: deb http: //packages.linuxmint.com una main upstream import backport #id:linuxmint_main
  2: deb http: //archive.ubuntu.com/ubuntu focal main restricted universe multiverse
  3: deb http: //archive.ubuntu.com/ubuntu focal-updates main restricted universe multiverse
  4: deb http: //archive.ubuntu.com/ubuntu focal-backports main restricted universe multiverse
  5: deb http: //security.ubuntu.com/ubuntu/ focal-security main restricted universe multiverse
  6: deb http: //archive.canonical.com/ubuntu/ focal partner
  Active apt repos in: /etc/apt/sources.list.d/skype-stable.list
  1: deb [arch=amd64] https: //repo.skype.com/deb stable main
  Active apt repos in: /etc/apt/sources.list.d/vivaldi.list
  1: deb [arch=amd64] https: //repo.vivaldi.com/stable/deb/ stable main
  Active apt repos in: /etc/apt/sources.list.d/yandex-browser.list
  1: deb [arch=amd64] https: //repo.yandex.ru/yandex-browser/deb stable main
Info:
  Processes: 215 Uptime: 4h 54m Memory: 15.45 GiB used: 1.27 GiB (8.2%) Init: systemd v: 245 runlevel: 5 Compilers: gcc: 9.4.
[Kernel-packages] [Bug 2037417] Re: mantic images after 20230917 are failing to deploy with failure to mount root and kernel filesystems
The latest maas images from 20231008 are booting without issue:

ubuntu@akis:~$ lsb_release -sc
No LSB modules are available.
mantic
ubuntu@akis:~$ cat /etc/cloud/build.info
build_name: server
serial: 20231008
ubuntu@akis:~$ uname -a
Linux akis 6.5.0-7-generic #7-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 29 09:14:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2037417

Title:
  mantic images after 20230917 are failing to deploy with failure to mount root and kernel filesystems

Status in cloud-images: New
Status in maas-images: Confirmed
Status in The Ubuntu-power-systems project: Invalid
Status in Release Notes for Ubuntu: New
Status in linux package in Ubuntu: Invalid
Status in systemd package in Ubuntu: Invalid
Status in util-linux package in Ubuntu: Fix Released
Status in linux source package in Mantic: Invalid
Status in systemd source package in Mantic: Invalid
Status in util-linux source package in Mantic: Fix Released

Bug description:
  Mantic arm64 deploys started failing on Sept 18th with:

  [ 41.913552] systemd[1]: Starting systemd-remount-fs.service - Remount Root and Kernel File Systems...
           Starting systemd-remount-f…t Root and Kernel File Systems...
  [ 41.940748] systemd[1]: Starting systemd-udev-trigger.service - Coldplug All udev Devices...
           Starting systemd-udev-trig… - Coldplug All udev Devices...
  [ 41.964758] systemd[1]: Started systemd-journald.service - Journal Service.
  [  OK  ] Started systemd-journald.service - Journal Service.
  [  OK  ] Mounted dev-hugepages.mount - Huge Pages File System.
  [  OK  ] Mounted dev-mqueue.mount… - POSIX Message Queue File System.
  [  OK  ] Mounted sys-kernel-debug.m…t - Kernel Debug File System.
  [  OK  ] Mounted sys-kernel-tracing…t - Kernel Trace File System.
  [  OK  ] Finished keyboard-setup.se… - Set the console keyboard layout.
  [  OK  ] Finished kmod-static-nodes…eate List of Static Device Nodes.
  [  OK  ] Finished lvm2-monitor.serv…ing dmeventd or progress polling.
  [  OK  ] Finished modprobe@configfs… - Load Kernel Module configfs.
  [  OK  ] Finished modprobe@dm_mod.s… - Load Kernel Module dm_mod.
  [  OK  ] Finished modprobe@drm.service - Load Kernel Module drm.
  [  OK  ] Finished modprobe@efi_psto… - Load Kernel Module efi_pstore.
  [  OK  ] Finished modprobe@fuse.service - Load Kernel Module fuse.
  [  OK  ] Finished modprobe@loop.service - Load Kernel Module loop.
  [  OK  ] Finished systemd-modules-l…ervice - Load Kernel Modules.
  [FAILED] Failed to start systemd-re…unt Root and Kernel File Systems.
  See 'systemctl status systemd-remount-fs.service' for details.

  After this, many other services and cloud-init fail. See the full kopter-0918.log. For comparison, a log from the prior day's test is also attached.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-images/+bug/2037417/+subscriptions
[Kernel-packages] [Bug 2037417] Re: mantic images after 20230917 are failing to deploy with failure to mount root and kernel filesystems
Special maas image built with util-linux, 2.39.1-4ubuntu2, from https://ppa.launchpadcontent.net/xnox/release-critical/ubuntu is looking good. I have one machine deployed with this:

ubuntu@rumford:~$ uname -r
6.5.0-5-lowlatency
ubuntu@rumford:~$ apt-cache policy util-linux
util-linux:
  Installed: 2.39.1-4ubuntu2
  Candidate: 2.39.1-4ubuntu2
  Version table:
 *** 2.39.1-4ubuntu2 500
        500 https://ppa.launchpadcontent.net/xnox/release-critical/ubuntu mantic/main amd64 Packages
        100 /var/lib/dpkg/status
     2.39.1-4ubuntu1 500
        500 http://archive.ubuntu.com/ubuntu mantic/main amd64 Packages
ubuntu@rumford:~$ cat /etc/cloud/build.info
build_name: server
serial: 20231006.1732

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2037417

Title:
  mantic images after 20230917 are failing to deploy with failure to mount root and kernel filesystems

Status in cloud-images: New
Status in maas-images: Confirmed
Status in The Ubuntu-power-systems project: Invalid
Status in Release Notes for Ubuntu: New
Status in linux package in Ubuntu: Invalid
Status in systemd package in Ubuntu: Invalid
Status in util-linux package in Ubuntu: Triaged
Status in linux source package in Mantic: Invalid
Status in systemd source package in Mantic: Invalid
Status in util-linux source package in Mantic: Triaged
[Kernel-packages] [Bug 2037417] Re: mantic images after 20230917 are failing to deploy with failure to mount root and kernel filesystems
** Project changed: linux => linux (Ubuntu)

** Changed in: linux (Ubuntu)
    Milestone: None => ubuntu-23.10

** Also affects: linux (Ubuntu Mantic)
   Importance: Undecided
       Status: New

** Also affects: systemd (Ubuntu Mantic)
   Importance: Undecided
       Status: Confirmed

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2037417

Title:
  mantic images after 20230917 are failing to deploy with failure to mount root and kernel filesystems

Status in maas-images: New
Status in The Ubuntu-power-systems project: Confirmed
Status in linux package in Ubuntu: New
Status in systemd package in Ubuntu: Confirmed
Status in linux source package in Mantic: New
Status in systemd source package in Mantic: Confirmed
[Kernel-packages] [Bug 2034447] acpidump.txt
apport information

** Attachment added: "acpidump.txt"
   https://bugs.launchpad.net/bugs/2034447/+attachment/5697982/+files/acpidump.txt

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2034447

Title:
  `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic

Status in linux package in Ubuntu: Incomplete

Bug description:
  Seeing a panic on hidon (an Nvidia H100) after booting the 5.15.0-85-generic kernel:

  [ 58.935877] [ cut here ]
  [ 58.935893] refcount_t: underflow; use-after-free.
  [ 58.935920] WARNING: CPU: 207 PID: 2985 at lib/refcount.c:28 refcount_warn_saturate+0xf7/0x150
  [ 58.935943] Modules linked in: x86_pkg_temp_thermal(+) intel_powerclamp coretemp nls_iso8859_1 rapl irdma(+) i40e qat_4xxx(+) isst_if_mbox_pci intel_qat pmt_telemetry pmt_crashlog idxd(+) isst_if_mmio pmt_class isst_if_common authenc idxd_bus intel_th_gth mei_me intel_th_pci intel_th mei switchtec ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 multipath linear mlx5_ib ib_uverbs ib_core ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm drm_kms_helper raid0 mlx5_core syscopyarea sysfillrect sysimgblt crct10dif_pclmul fb_sys_fops crc32_pclmul ixgbe cec mlxfw ghash_clmulni_intel aesni_intel psample crypto_simd xfrm_algo ice rc_core cryptd tls nvme i2c_i801 dca xhci_pci intel_pmt drm
  [ 58.936077] pci_hyperv_intf i2c_ismt i2c_smbus
  [ 58.936080] QAT: Could not find a device on node 1
  [ 58.936080] QAT: Could not find a device on node 1
  [ 58.936080] QAT: Could not find a device on node 1
  [ 58.936080] QAT: Could not find a device on node 1
  [ 58.936080] QAT: Could not find a device on node 1
  [ 58.936080] QAT: Could not find a device on node 1
  [ 58.936083] mdio
  [ 58.936096] xhci_pci_renesas nvme_core wmi pinctrl_emmitsburg
  [ 58.936106] CPU: 207 PID: 2985 Comm: systemd-udevd Not tainted 5.15.0-85-generic #95-Ubuntu
  [ 58.936115] Hardware name: NVIDIA DGXH100/DGXH100, BIOS 1.0.7 05/08/2023
  [ 58.936119] RIP: 0010:refcount_warn_saturate+0xf7/0x150
  [ 58.936130] Code: eb 9e 0f b6 1d 5e e6 b9 01 80 fb 01 0f 87 f4 63 6f 00 83 e3 01 75 89 48 c7 c7 88 c3 23 9e c6 05 42 e6 b9 01 01 e8 d8 e4 6b 00 <0f> 0b e9 6f ff ff ff 0f b6 1d 2d e6 b9 01 80 fb 01 0f 87 b1 63 6f
  [ 58.936135] RSP: 0018:ff4d5d94b2c7fa28 EFLAGS: 00010282
  [ 58.936142] RAX: RBX: RCX: 0027
  [ 58.936146] RDX: ff314dbbbf9e0588 RSI: 0001 RDI: ff314dbbbf9e0580
  [ 58.936149] RBP: ff4d5d94b2c7fa30 R08: 0026 R09: ff4d5d94b2c7f9c0
  [ 58.936153] R10: 0028 R11: 0001 R12:
  [ 58.936156] R13: ff314cbfdbcb6900 R14: ff314cbfdbcb67b8 R15: ff314cbfd24b4000
  [ 58.936159] FS: 7fadd2f6c8c0() GS:ff314dbbbf9c() knlGS:
  [ 58.936163] CS: 0010 DS: ES: CR0: 80050033
  [ 58.936167] CR2: 7fadd243b584 CR3: 00012972c006 CR4: 00771ee0
  [ 58.936171] DR0: DR1: DR2:
  [ 58.936174] DR3: DR6: fffe07f0 DR7: 0400
  [ 58.936177] PKRU: 5554
  [ 58.936179] Call Trace:
  [ 58.936184]
  [ 58.936188] ? show_trace_log_lvl+0x1d6/0x2ea
  [ 58.936204] ? show_trace_log_lvl+0x1d6/0x2ea
  [ 58.936212] ? crypto_mod_put+0x6b/0x80
  [ 58.936225] ? show_regs.part.0+0x23/0x29
  [ 58.936232] ? show_regs.cold+0x8/0xd
  [ 58.936239] ? refcount_warn_saturate+0xf7/0x150
  [ 58.936246] ? __warn+0x8c/0x100
  [ 58.936255] ? refcount_warn_saturate+0xf7/0x150
  [ 58.936263] ? report_bug+0xa4/0xd0
  [ 58.936274] ? down_trylock+0x2e/0x40
  [ 58.936285] ? handle_bug+0x39/0x90
  [ 58.936296] ? exc_invalid_op+0x19/0x70
  [ 58.936301] ? asm_exc_invalid_op+0x1b/0x20
  [ 58.936310] ? refcount_warn_saturate+0xf7/0x150
  [ 58.936317] ? refcount_warn_saturate+0xf7/0x150
  [ 58.936323] crypto_mod_put+0x6b/0x80
  [ 58.936329] crypto_destroy_tfm+0x4e/0xa0
  [ 58.936336] pkcs1pad_exit_tfm+0x15/0x20
  [ 58.936345] crypto_akcipher_exit_tfm+0x13/0x20
  [ 58.936352] crypto_destroy_tfm+0x43/0xa0
  [ 58.936358] public_key_verify_signature+0x2dc/0x3c0
  [ 58.936366] ? find_asymmetric_key+0xd2/0x1d0
  [ 58.936374] ? kfree+0x1f7/0x250
  [ 58.936385] public_key_verify_signature_2+0x15/0x20
  [ 58.936389] verify_signature+0x37/0x60
  [ 58.936393] pkcs7_validate_trust_one.constprop.0+0x156/0x1e0
  [ 58.936400] pkcs7_validate_trust+0x4a/0xa0
  [ 58.93
[Kernel-packages] [Bug 2034447] WifiSyslog.txt
apport information ** Attachment added: "WifiSyslog.txt" https://bugs.launchpad.net/bugs/2034447/+attachment/5697981/+files/WifiSyslog.txt
[Kernel-packages] [Bug 2034447] UdevDb.txt
apport information ** Attachment added: "UdevDb.txt" https://bugs.launchpad.net/bugs/2034447/+attachment/5697980/+files/UdevDb.txt
[Kernel-packages] [Bug 2034447] ProcModules.txt
apport information ** Attachment added: "ProcModules.txt" https://bugs.launchpad.net/bugs/2034447/+attachment/5697979/+files/ProcModules.txt
[Kernel-packages] [Bug 2034447] ProcCpuinfoMinimal.txt
apport information ** Attachment added: "ProcCpuinfoMinimal.txt" https://bugs.launchpad.net/bugs/2034447/+attachment/5697977/+files/ProcCpuinfoMinimal.txt
[Kernel-packages] [Bug 2034447] ProcInterrupts.txt
apport information ** Attachment added: "ProcInterrupts.txt" https://bugs.launchpad.net/bugs/2034447/+attachment/5697978/+files/ProcInterrupts.txt
[Kernel-packages] [Bug 2034447] ProcCpuinfo.txt
apport information ** Attachment added: "ProcCpuinfo.txt" https://bugs.launchpad.net/bugs/2034447/+attachment/5697976/+files/ProcCpuinfo.txt
[Kernel-packages] [Bug 2034447] Lsusb-v.txt
apport information ** Attachment added: "Lsusb-v.txt" https://bugs.launchpad.net/bugs/2034447/+attachment/5697975/+files/Lsusb-v.txt
[Kernel-packages] [Bug 2034447] Lspci-vt.txt
apport information ** Attachment added: "Lspci-vt.txt" https://bugs.launchpad.net/bugs/2034447/+attachment/5697974/+files/Lspci-vt.txt -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2034447 Title: `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 2034447] Lspci.txt
apport information ** Attachment added: "Lspci.txt" https://bugs.launchpad.net/bugs/2034447/+attachment/5697973/+files/Lspci.txt -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2034447 Title: `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 2034447] Re: `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic
apport information ** Tags added: apport-collected jammy uec-images ** Description changed: Seeing a panic on hidon (an Nvidia H100) after booting the 5.15.0-85-generic kernel: [ 58.935877] [ cut here ] [ 58.935893] refcount_t: underflow; use-after-free. [ 58.935920] WARNING: CPU: 207 PID: 2985 at lib/refcount.c:28 refcount_warn_saturate+0xf7/0x150 [ 58.935943] Modules linked in: x86_pkg_temp_thermal(+) intel_powerclamp coretemp nls_iso8859_1 rapl irdma(+) i40e qat_4xxx(+) isst_if_mbox_pci intel_qat pmt_telemetry pmt_crashlog idxd(+) isst_if_mmio pmt_class isst_if_common authenc idxd_bus intel_th_gth mei_me intel_th_pci intel_th mei switchtec ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 multipath linear mlx5_ib ib_uverbs ib_core ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm drm_kms_helper raid0 mlx5_core syscopyarea sysfillrect sysimgblt crct10dif_pclmul fb_sys_fops crc32_pclmul ixgbe cec mlxfw ghash_clmulni_intel aesni_intel psample crypto_simd xfrm_algo ice rc_core cryptd tls nvme i2c_i801 dca xhci_pci intel_pmt drm [ 58.936077] pci_hyperv_intf i2c_ismt i2c_smbus [ 58.936080] QAT: Could not find a device on node 1 [ 58.936080] QAT: Could not find a device on node 1 [ 58.936080] QAT: Could not find a device on node 1 [ 58.936080] QAT: Could not find a device on node 1 [ 58.936080] QAT: Could not find a device on node 1 [ 58.936080] QAT: Could not find a device on node 1 [ 58.936083] mdio [ 58.936096] xhci_pci_renesas nvme_core wmi pinctrl_emmitsburg [ 58.936106] CPU: 207 PID: 2985 Comm: systemd-udevd Not tainted 5.15.0-85-generic #95-Ubuntu [ 58.936115] Hardware name: NVIDIA DGXH100/DGXH100, BIOS 1.0.7 05/08/2023 [ 58.936119] RIP: 
0010:refcount_warn_saturate+0xf7/0x150 [ 58.936130] Code: eb 9e 0f b6 1d 5e e6 b9 01 80 fb 01 0f 87 f4 63 6f 00 83 e3 01 75 89 48 c7 c7 88 c3 23 9e c6 05 42 e6 b9 01 01 e8 d8 e4 6b 00 <0f> 0b e9 6f ff ff ff 0f b6 1d 2d e6 b9 01 80 fb 01 0f 87 b1 63 6f [ 58.936135] RSP: 0018:ff4d5d94b2c7fa28 EFLAGS: 00010282 [ 58.936142] RAX: RBX: RCX: 0027 [ 58.936146] RDX: ff314dbbbf9e0588 RSI: 0001 RDI: ff314dbbbf9e0580 [ 58.936149] RBP: ff4d5d94b2c7fa30 R08: 0026 R09: ff4d5d94b2c7f9c0 [ 58.936153] R10: 0028 R11: 0001 R12: [ 58.936156] R13: ff314cbfdbcb6900 R14: ff314cbfdbcb67b8 R15: ff314cbfd24b4000 [ 58.936159] FS: 7fadd2f6c8c0() GS:ff314dbbbf9c() knlGS: [ 58.936163] CS: 0010 DS: ES: CR0: 80050033 [ 58.936167] CR2: 7fadd243b584 CR3: 00012972c006 CR4: 00771ee0 [ 58.936171] DR0: DR1: DR2: [ 58.936174] DR3: DR6: fffe07f0 DR7: 0400 [ 58.936177] PKRU: 5554 [ 58.936179] Call Trace: [ 58.936184] [ 58.936188] ? show_trace_log_lvl+0x1d6/0x2ea [ 58.936204] ? show_trace_log_lvl+0x1d6/0x2ea [ 58.936212] ? crypto_mod_put+0x6b/0x80 [ 58.936225] ? show_regs.part.0+0x23/0x29 [ 58.936232] ? show_regs.cold+0x8/0xd [ 58.936239] ? refcount_warn_saturate+0xf7/0x150 [ 58.936246] ? __warn+0x8c/0x100 [ 58.936255] ? refcount_warn_saturate+0xf7/0x150 [ 58.936263] ? report_bug+0xa4/0xd0 [ 58.936274] ? down_trylock+0x2e/0x40 [ 58.936285] ? handle_bug+0x39/0x90 [ 58.936296] ? exc_invalid_op+0x19/0x70 [ 58.936301] ? asm_exc_invalid_op+0x1b/0x20 [ 58.936310] ? refcount_warn_saturate+0xf7/0x150 [ 58.936317] ? refcount_warn_saturate+0xf7/0x150 [ 58.936323] crypto_mod_put+0x6b/0x80 [ 58.936329] crypto_destroy_tfm+0x4e/0xa0 [ 58.936336] pkcs1pad_exit_tfm+0x15/0x20 [ 58.936345] crypto_akcipher_exit_tfm+0x13/0x20 [ 58.936352] crypto_destroy_tfm+0x43/0xa0 [ 58.936358] public_key_verify_signature+0x2dc/0x3c0 [ 58.936366] ? find_asymmetric_key+0xd2/0x1d0 [ 58.936374] ? 
kfree+0x1f7/0x250 [ 58.936385] public_key_verify_signature_2+0x15/0x20 [ 58.936389] verify_signature+0x37/0x60 [ 58.936393] pkcs7_validate_trust_one.constprop.0+0x156/0x1e0 [ 58.936400] pkcs7_validate_trust+0x4a/0xa0 [ 58.936406] verify_pkcs7_message_sig+0x83/0x120 [ 58.936418] verify_pkcs7_signature+0x4f/0x80 [ 58.936424] mod_verify_sig+0xb5/0xf0 [ 58.936435] load_module+0x275/0xbc0 [ 58.936440] ? kernel_read_file_from_fd+0x56/0xa0 [ 58.936450] __do_sys_finit_module+0xbf/0x120 [ 58.936496] __x64_sys_finit_module+0x18/0x20 [ 58.936504] do
[Kernel-packages] [Bug 2034447] Re: `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic
Here's the full log from where that snippet was pulled. ** Attachment added: "hidon.log.1" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2034447/+attachment/5697793/+files/hidon.log.1 -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2034447 Title: `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 2034447] [NEW] `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic
Public bug reported: Seeing a panic on hidon (an Nvidia H100) after booting the 5.15.0-85-generic kernel: [ 58.935877] [ cut here ] [ 58.935893] refcount_t: underflow; use-after-free. [ 58.935920] WARNING: CPU: 207 PID: 2985 at lib/refcount.c:28 refcount_warn_saturate+0xf7/0x150 [ 58.935943] Modules linked in: x86_pkg_temp_thermal(+) intel_powerclamp coretemp nls_iso8859_1 rapl irdma(+) i40e qat_4xxx(+) isst_if_mbox_pci intel_qat pmt_telemetry pmt_crashlog idxd(+) isst_if_mmio pmt_class isst_if_common authenc idxd_bus intel_th_gth mei_me intel_th_pci intel_th mei switchtec ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 multipath linear mlx5_ib ib_uverbs ib_core ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm drm_kms_helper raid0 mlx5_core syscopyarea sysfillrect sysimgblt crct10dif_pclmul fb_sys_fops crc32_pclmul ixgbe cec mlxfw ghash_clmulni_intel aesni_intel psample crypto_simd xfrm_algo ice rc_core cryptd tls nvme i2c_i801 dca xhci_pci intel_pmt drm [ 58.936077] pci_hyperv_intf i2c_ismt i2c_smbus [ 58.936080] QAT: Could not find a device on node 1 [ 58.936080] QAT: Could not find a device on node 1 [ 58.936080] QAT: Could not find a device on node 1 [ 58.936080] QAT: Could not find a device on node 1 [ 58.936080] QAT: Could not find a device on node 1 [ 58.936080] QAT: Could not find a device on node 1 [ 58.936083] mdio [ 58.936096] xhci_pci_renesas nvme_core wmi pinctrl_emmitsburg [ 58.936106] CPU: 207 PID: 2985 Comm: systemd-udevd Not tainted 5.15.0-85-generic #95-Ubuntu [ 58.936115] Hardware name: NVIDIA DGXH100/DGXH100, BIOS 1.0.7 05/08/2023 [ 58.936119] RIP: 0010:refcount_warn_saturate+0xf7/0x150 [ 58.936130] Code: eb 9e 0f b6 1d 5e e6 b9 01 80 fb 01 0f 87 
f4 63 6f 00 83 e3 01 75 89 48 c7 c7 88 c3 23 9e c6 05 42 e6 b9 01 01 e8 d8 e4 6b 00 <0f> 0b e9 6f ff ff ff 0f b6 1d 2d e6 b9 01 80 fb 01 0f 87 b1 63 6f [ 58.936135] RSP: 0018:ff4d5d94b2c7fa28 EFLAGS: 00010282 [ 58.936142] RAX: RBX: RCX: 0027 [ 58.936146] RDX: ff314dbbbf9e0588 RSI: 0001 RDI: ff314dbbbf9e0580 [ 58.936149] RBP: ff4d5d94b2c7fa30 R08: 0026 R09: ff4d5d94b2c7f9c0 [ 58.936153] R10: 0028 R11: 0001 R12: [ 58.936156] R13: ff314cbfdbcb6900 R14: ff314cbfdbcb67b8 R15: ff314cbfd24b4000 [ 58.936159] FS: 7fadd2f6c8c0() GS:ff314dbbbf9c() knlGS: [ 58.936163] CS: 0010 DS: ES: CR0: 80050033 [ 58.936167] CR2: 7fadd243b584 CR3: 00012972c006 CR4: 00771ee0 [ 58.936171] DR0: DR1: DR2: [ 58.936174] DR3: DR6: fffe07f0 DR7: 0400 [ 58.936177] PKRU: 5554 [ 58.936179] Call Trace: [ 58.936184] [ 58.936188] ? show_trace_log_lvl+0x1d6/0x2ea [ 58.936204] ? show_trace_log_lvl+0x1d6/0x2ea [ 58.936212] ? crypto_mod_put+0x6b/0x80 [ 58.936225] ? show_regs.part.0+0x23/0x29 [ 58.936232] ? show_regs.cold+0x8/0xd [ 58.936239] ? refcount_warn_saturate+0xf7/0x150 [ 58.936246] ? __warn+0x8c/0x100 [ 58.936255] ? refcount_warn_saturate+0xf7/0x150 [ 58.936263] ? report_bug+0xa4/0xd0 [ 58.936274] ? down_trylock+0x2e/0x40 [ 58.936285] ? handle_bug+0x39/0x90 [ 58.936296] ? exc_invalid_op+0x19/0x70 [ 58.936301] ? asm_exc_invalid_op+0x1b/0x20 [ 58.936310] ? refcount_warn_saturate+0xf7/0x150 [ 58.936317] ? refcount_warn_saturate+0xf7/0x150 [ 58.936323] crypto_mod_put+0x6b/0x80 [ 58.936329] crypto_destroy_tfm+0x4e/0xa0 [ 58.936336] pkcs1pad_exit_tfm+0x15/0x20 [ 58.936345] crypto_akcipher_exit_tfm+0x13/0x20 [ 58.936352] crypto_destroy_tfm+0x43/0xa0 [ 58.936358] public_key_verify_signature+0x2dc/0x3c0 [ 58.936366] ? find_asymmetric_key+0xd2/0x1d0 [ 58.936374] ? 
kfree+0x1f7/0x250 [ 58.936385] public_key_verify_signature_2+0x15/0x20 [ 58.936389] verify_signature+0x37/0x60 [ 58.936393] pkcs7_validate_trust_one.constprop.0+0x156/0x1e0 [ 58.936400] pkcs7_validate_trust+0x4a/0xa0 [ 58.936406] verify_pkcs7_message_sig+0x83/0x120 [ 58.936418] verify_pkcs7_signature+0x4f/0x80 [ 58.936424] mod_verify_sig+0xb5/0xf0 [ 58.936435] load_module+0x275/0xbc0 [ 58.936440] ? kernel_read_file_from_fd+0x56/0xa0 [ 58.936450] __do_sys_finit_module+0xbf/0x120 [ 58.936496] __x64_sys_finit_module+0x18/0x20 [ 58.936504] do_syscall_64+0x59/0xc0 [ 58.936510] ? exit_to_user_mode_prepare+0x37/0xb0 [ 58.936521] ? syscall_exit_to_user_mode+0x35/0x50 [ 58.936530] ? __x64_sys_mmap+0x33/0x50 [ 58.936539] ? do_syscall_64+0x69/0xc0 [
[Kernel-packages] [Bug 2026891] Re: linux-nvidia-6.2 on DGX servers: "WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540"
I built and tested a 6.2.0-1004-nvidia based kernel with this patch applied and did not see the warning message on boot. I'll follow up further with Ian on Monday. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-nvidia-6.2 in Ubuntu. https://bugs.launchpad.net/bugs/2026891 Title: linux-nvidia-6.2 on DGX servers: "WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540" Status in linux-nvidia-6.2 package in Ubuntu: New Bug description: We started testing the jammy/linux-nvidia-6.2 kernels on the nvidia servers (DGX-1/DGX-2/H100) and hit the following warning during boot: [7.690486] [ cut here ] [7.690487] Interrupts were enabled early [7.690490] WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540 [7.690498] Modules linked in: [7.690501] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.2.0-1004-nvidia #4~22.04.1-Ubuntu [7.690504] Hardware name: NVIDIA NVIDIA DGX-2/NVIDIA DGX-2, BIOS 0.29 06/07/2021 [7.690505] RIP: 0010:start_kernel+0x4da/0x540 [7.690508] Code: ff 48 c7 c7 e8 26 f0 97 e8 b3 59 a8 fd 0f 0b e9 96 fd ff ff e8 a7 1d 04 00 e9 7c fe ff ff 48 c7 c7 18 27 f0 97 e8 96 59 a8 fd <0f> 0b e9 ed fd ff ff 48 c7 c7 b0 26 f0 97 e8 83 59 a8 fd 0f 0b ff [7.690510] RSP: :98803f08 EFLAGS: 00010246 [7.690512] RAX: RBX: RCX: [7.690513] RDX: RSI: RDI: [7.690514] RBP: 98803f20 R08: R09: [7.690515] R10: R11: R12: 00e0 [7.690516] R13: 5a1ccde0 R14: 5a1c7469 R15: 5a1d7ee0 [7.690518] FS: () GS:96490060() knlGS: [7.690520] CS: 0010 DS: ES: CR0: 80050033 [7.690521] CR2: 970bf000 CR3: 00ecd7810001 CR4: 000606f0 [7.690522] DR0: DR1: DR2: [7.690523] DR3: DR6: fffe0ff0 DR7: 0400 [7.690524] Call Trace: [7.690526] [7.690529] x86_64_start_kernel+0x102/0x180 [7.690536] secondary_startup_64_no_verify+0xe5/0xeb [7.690544] [7.690544] ---[ end trace ]--- I also see pretty much the same thing on some Ampere based arm64 servers: [0.000519] [ cut here ] [0.000521] Interrupts were enabled early [0.000525] 
WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x3ac/0x514 [0.000531] Modules linked in: [0.000535] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.2.0-1004-nvidia #4~22.04.1-Ubuntu [0.000538] pstate: 6049 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [0.000540] pc : start_kernel+0x3ac/0x514 [0.000543] lr : start_kernel+0x3ac/0x514 [0.000545] sp : dec5ff733e60 [0.000546] x29: dec5ff733e60 x28: 0819aa09baac x27: 403ffdd124e0 [0.000549] x26: bfdf3788 x25: 9b6fc000 x24: 001dba7b [0.000552] x23: 5ec57c98 x22: 0819ab2a x21: dec5ff749140 [0.000555] x20: dec5ff73d9c0 x19: dec5ffbe4000 x18: dec5ff74a1c8 [0.000558] x17: x16: x15: [0.000560] x14: x13: 0a796c7261652064 x12: 656c62616e652065 [0.000563] x11: 656820747563205b x10: 2d2d2d2d2d2d2d2d x9 : [0.000565] x8 : x7 : x6 : [0.000568] x5 : x4 : x3 : [0.000571] x2 : x1 : x0 : [0.000573] Call trace: [0.000574] start_kernel+0x3ac/0x514 [0.000577] __primary_switched+0xc0/0xc8 [0.000580] ---[ end trace ]--- The warning does not appear on an older thunderx2 server. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.2/+bug/2026891/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2026891] Re: linux-nvidia-6.2 on DGX servers: "WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540"
I ran through several kernels on our DGX-2 server, only the latest 6.2.0-1004-nvidia kernel emitted the warning. Here are all the kernels I tried: Lunar 6.2.0-24.24 generic - PASS Jammy 5.15.0-1028-nvidia - PASS Jammy 5.19.0-46-generic - PASS Jammy 5.19.0-1014-nvidia - PASS Jammy 6.2.0-25-generic - PASS Jammy 6.2.0-1003-nvidia - PASS Jammy 6.2.0-1004-nvidia - FAIL -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-nvidia-6.2 in Ubuntu. https://bugs.launchpad.net/bugs/2026891 Title: linux-nvidia-6.2 on DGX servers: "WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540" Status in linux-nvidia-6.2 package in Ubuntu: New
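The PASS/FAIL check in the kernel comparison above could be automated along these lines. This is only a minimal sketch, assuming the boot log (for example, saved `dmesg` output) is available as text. The names `has_early_irq_warning` and `classify` are hypothetical; the matched strings are taken from the warning quoted in this bug.

```python
import re

# Message printed just before the WARNING line in the traces quoted in this bug.
EARLY_IRQ_MESSAGE = "Interrupts were enabled early"

# Matches the WARNING line that names start_kernel, for example:
# "WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540"
WARNING_RE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at init/main\.c:\d+ start_kernel\+"
)


def has_early_irq_warning(boot_log: str) -> bool:
    """Return True if the log contains both the message and the WARNING line."""
    return EARLY_IRQ_MESSAGE in boot_log and WARNING_RE.search(boot_log) is not None


def classify(boot_log: str) -> str:
    """Map a boot log to the PASS/FAIL labels used in the kernel comparison."""
    return "FAIL" if has_early_irq_warning(boot_log) else "PASS"


if __name__ == "__main__":
    # Hypothetical sample input built from the two lines quoted in the report.
    sample = (
        "[7.690487] Interrupts were enabled early\n"
        "[7.690490] WARNING: CPU: 0 PID: 0 at init/main.c:1065 "
        "start_kernel+0x4da/0x540\n"
    )
    print(classify(sample))
```

Requiring both the message and the `start_kernel` WARNING line keeps unrelated warnings in the same log from being counted as this failure.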
[Kernel-packages] [Bug 2026891] [NEW] linux-nvidia-6.2 on DGX servers: "WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540"
Public bug reported: We started testing the jammy/linux-nvidia-6.2 kernels on the nvidia servers (DGX-1/DGX-2/H100) and hit the following warning during boot: [7.690486] [ cut here ] [7.690487] Interrupts were enabled early [7.690490] WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540 [7.690498] Modules linked in: [7.690501] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.2.0-1004-nvidia #4~22.04.1-Ubuntu [7.690504] Hardware name: NVIDIA NVIDIA DGX-2/NVIDIA DGX-2, BIOS 0.29 06/07/2021 [7.690505] RIP: 0010:start_kernel+0x4da/0x540 [7.690508] Code: ff 48 c7 c7 e8 26 f0 97 e8 b3 59 a8 fd 0f 0b e9 96 fd ff ff e8 a7 1d 04 00 e9 7c fe ff ff 48 c7 c7 18 27 f0 97 e8 96 59 a8 fd <0f> 0b e9 ed fd ff ff 48 c7 c7 b0 26 f0 97 e8 83 59 a8 fd 0f 0b ff [7.690510] RSP: :98803f08 EFLAGS: 00010246 [7.690512] RAX: RBX: RCX: [7.690513] RDX: RSI: RDI: [7.690514] RBP: 98803f20 R08: R09: [7.690515] R10: R11: R12: 00e0 [7.690516] R13: 5a1ccde0 R14: 5a1c7469 R15: 5a1d7ee0 [7.690518] FS: () GS:96490060() knlGS: [7.690520] CS: 0010 DS: ES: CR0: 80050033 [7.690521] CR2: 970bf000 CR3: 00ecd7810001 CR4: 000606f0 [7.690522] DR0: DR1: DR2: [7.690523] DR3: DR6: fffe0ff0 DR7: 0400 [7.690524] Call Trace: [7.690526] [7.690529] x86_64_start_kernel+0x102/0x180 [7.690536] secondary_startup_64_no_verify+0xe5/0xeb [7.690544] [7.690544] ---[ end trace ]--- I also see pretty much the same thing on some Ampere based arm64 servers: [0.000519] [ cut here ] [0.000521] Interrupts were enabled early [0.000525] WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x3ac/0x514 [0.000531] Modules linked in: [0.000535] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.2.0-1004-nvidia #4~22.04.1-Ubuntu [0.000538] pstate: 6049 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [0.000540] pc : start_kernel+0x3ac/0x514 [0.000543] lr : start_kernel+0x3ac/0x514 [0.000545] sp : dec5ff733e60 [0.000546] x29: dec5ff733e60 x28: 0819aa09baac x27: 403ffdd124e0 [0.000549] x26: bfdf3788 x25: 9b6fc000 x24: 001dba7b [0.000552] 
x23: 5ec57c98 x22: 0819ab2a x21: dec5ff749140 [0.000555] x20: dec5ff73d9c0 x19: dec5ffbe4000 x18: dec5ff74a1c8 [0.000558] x17: x16: x15: [0.000560] x14: x13: 0a796c7261652064 x12: 656c62616e652065 [0.000563] x11: 656820747563205b x10: 2d2d2d2d2d2d2d2d x9 : [0.000565] x8 : x7 : x6 : [0.000568] x5 : x4 : x3 : [0.000571] x2 : x1 : x0 : [0.000573] Call trace: [0.000574] start_kernel+0x3ac/0x514 [0.000577] __primary_switched+0xc0/0xc8 [0.000580] ---[ end trace ]--- The warning does not appear on an older thunderx2 server. ** Affects: linux-nvidia-6.2 (Ubuntu) Importance: Undecided Status: New -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-nvidia-6.2 in Ubuntu. https://bugs.launchpad.net/bugs/2026891 Title: linux-nvidia-6.2 on DGX servers: "WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540" Status in linux-nvidia-6.2 package in Ubuntu: New Bug description: We started testing the jammy/linux-nvidia-6.2 kernels on the nvidia servers (DGX-1/DGX-2/H100) and hit the following warning during boot: [7.690486] [ cut here ] [7.690487] Interrupts were enabled early [7.690490] WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540 [7.690498] Modules linked in: [7.690501] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.2.0-1004-nvidia #4~22.04.1-Ubuntu [7.690504] Hardware name: NVIDIA NVIDIA DGX-2/NVIDIA DGX-2, BIOS 0.29 06/07/2021 [7.690505] RIP: 0010:start_kernel+0x4da/0x540 [7.690508] Code: ff 48 c7 c7 e8 26 f0 97 e8 b3 59 a8 fd 0f 0b e9 96 fd ff ff e8 a7 1d 04 00 e9 7c fe ff ff 48 c7 c7 18 27 f0 97 e8 96 59 a8 fd <0f> 0b e9 ed fd ff ff 48 c7 c7 b0 26 f0 97 e8 83 59 a8 fd 0f 0b ff [7.690510] RSP: :98803f08 EFLAGS: 00010246 [7.690512] RAX: RBX: RCX: [7.690513] RDX: RSI: 0
[Kernel-packages] [Bug 2024675] Re: NVIDIA CVE-2023-25515, CVE-2023-25516
Automated testing of the DKMS drivers (450-server, 470-server, 525-server, 470, 525 and 535) has completed across bionic, focal, jammy, kinetic and lunar. This was performed with:
* Deploy host with gpgpu
* Install latest `linux-generic` kernel
* Install driver from ppa using `nvidia-driver-${DRIVER_NUMBER}` package
* Reboot
* Install cuda
* Execute select cuda samples
* Verify nvidia-smi output matches the expected DRIVER_NUMBER and version.
-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-graphics-drivers-470 in Ubuntu. https://bugs.launchpad.net/bugs/2024675 Title: NVIDIA CVE-2023-25515, CVE-2023-25516 Status in fabric-manager-450 package in Ubuntu: In Progress Status in fabric-manager-470 package in Ubuntu: Triaged Status in fabric-manager-525 package in Ubuntu: Triaged Status in libnvidia-nscq-450 package in Ubuntu: Triaged Status in libnvidia-nscq-470 package in Ubuntu: Triaged Status in libnvidia-nscq-525 package in Ubuntu: Triaged Status in nvidia-graphics-drivers-450-server package in Ubuntu: Triaged Status in nvidia-graphics-drivers-470 package in Ubuntu: Triaged Status in nvidia-graphics-drivers-470-server package in Ubuntu: Triaged Status in nvidia-graphics-drivers-525 package in Ubuntu: Triaged Status in nvidia-graphics-drivers-525-server package in Ubuntu: Triaged Status in nvidia-graphics-drivers-530 package in Ubuntu: Triaged Status in fabric-manager-450 source package in Focal: New Status in fabric-manager-470 source package in Focal: New Status in fabric-manager-525 source package in Focal: New Status in libnvidia-nscq-450 source package in Focal: New Status in libnvidia-nscq-470 source package in Focal: New Status in libnvidia-nscq-525 source package in Focal: New Status in nvidia-graphics-drivers-450-server source package in Focal: New Status in nvidia-graphics-drivers-470 source package in Focal: New Status in nvidia-graphics-drivers-470-server source package in Focal: New Status
in nvidia-graphics-drivers-525 source package in Focal: New Status in nvidia-graphics-drivers-525-server source package in Focal: New Status in nvidia-graphics-drivers-530 source package in Focal: New Status in fabric-manager-450 source package in Jammy: New Status in fabric-manager-470 source package in Jammy: New Status in fabric-manager-525 source package in Jammy: New Status in libnvidia-nscq-450 source package in Jammy: New Status in libnvidia-nscq-470 source package in Jammy: New Status in libnvidia-nscq-525 source package in Jammy: New Status in nvidia-graphics-drivers-450-server source package in Jammy: New Status in nvidia-graphics-drivers-470 source package in Jammy: New Status in nvidia-graphics-drivers-470-server source package in Jammy: New Status in nvidia-graphics-drivers-525 source package in Jammy: New Status in nvidia-graphics-drivers-525-server source package in Jammy: New Status in nvidia-graphics-drivers-530 source package in Jammy: New Status in fabric-manager-450 source package in Kinetic: New Status in fabric-manager-470 source package in Kinetic: New Status in fabric-manager-525 source package in Kinetic: New Status in libnvidia-nscq-450 source package in Kinetic: New Status in libnvidia-nscq-470 source package in Kinetic: New Status in libnvidia-nscq-525 source package in Kinetic: New Status in nvidia-graphics-drivers-450-server source package in Kinetic: New Status in nvidia-graphics-drivers-470 source package in Kinetic: New Status in nvidia-graphics-drivers-470-server source package in Kinetic: New Status in nvidia-graphics-drivers-525 source package in Kinetic: New Status in nvidia-graphics-drivers-525-server source package in Kinetic: New Status in nvidia-graphics-drivers-530 source package in Kinetic: New Status in fabric-manager-450 source package in Lunar: New Status in fabric-manager-470 source package in Lunar: New Status in fabric-manager-525 source package in Lunar: New Status in libnvidia-nscq-450 source package in Lunar: New 
Status in libnvidia-nscq-470 source package in Lunar: New Status in libnvidia-nscq-525 source package in Lunar: New Status in nvidia-graphics-drivers-450-server source package in Lunar: New Status in nvidia-graphics-drivers-470 source package in Lunar: New Status in nvidia-graphics-drivers-470-server source package in Lunar: New Status in nvidia-graphics-drivers-525 source package in Lunar: New Status in nvidia-graphics-drivers-525-server source package in Lunar: New Status in nvidia-graphics-drivers-530 source package in Lunar: New Bug description: CVE-2023-25515, CVE-2023-25516 https://nvidia.custhelp.com/app/answers/detail/a_id/5468 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/fabric-manager-450/+bug/2024675/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-p
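The last step of the test procedure in this message (checking that nvidia-smi reports the expected driver) could be sketched roughly as below. This is a minimal Python sketch, not the script that was used for the testing: the function name `driver_matches` and the sample version strings are hypothetical, and it assumes the version string was obtained with something like `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.

```python
def driver_matches(smi_version: str, driver_number: str) -> bool:
    """Return True if the reported version belongs to the expected driver series.

    smi_version   -- version string as reported by nvidia-smi, e.g. "525.105.17"
    driver_number -- series used in the package name, e.g. "525" or "470-server"
    """
    # The driver series is the first dotted component of the version string.
    major = smi_version.strip().split(".")[0]
    # Series such as "470-server" carry a suffix; compare the numeric part only.
    expected = driver_number.split("-")[0]
    return major == expected
```

A check like this only compares the series; comparing the full version would additionally need the expected package version as input.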
[Kernel-packages] [Bug 2023986] Re: Drivers not working using kernel linux-image-6.2.0-1003-oracle
Thanks to everyone supplying their logs. I'm still looking through these to try to understand what's going on here. For most that hit this issue, the solution would be to interrupt the boot loader to boot back into the generic kernel, then remove the oracle and lowlatency kernels. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-signed-oracle in Ubuntu. https://bugs.launchpad.net/bugs/2023986 Title: Drivers not working using kernel linux-image-6.2.0-1003-oracle Status in linux-signed-oracle package in Ubuntu: Confirmed Bug description: My Ubuntu 23.04 installed two kernels linux-image-6.2.0-1003-oracle and linux-image-6.2.0-1003-lowlatency using software updater along with other updates. After install and restart, most drivers including wifi, bluetooth, touchpad and ethernet stopped working. Rebooting using 6.2.0-1003-lowlatency or 6.2.0-20-generic solved the driver issues. ProblemType: Bug DistroRelease: Ubuntu 23.04 Package: linux-image-6.2.0-1003-oracle 6.2.0-1003.3 ProcVersionSignature: Ubuntu 6.2.0-1003.3-lowlatency 6.2.6 Uname: Linux 6.2.0-1003-lowlatency x86_64 ApportVersion: 2.26.1-0ubuntu2 Architecture: amd64 CasperMD5CheckResult: unknown CurrentDesktop: ubuntu:GNOME Date: Thu Jun 15 16:34:56 2023 ProcEnviron: LANG=en_US.UTF-8 PATH=(custom, no user) SHELL=/bin/bash TERM=xterm-256color XDG_RUNTIME_DIR= SourcePackage: linux-signed-oracle UpgradeStatus: No upgrade log present (probably fresh install) To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-signed-oracle/+bug/2023986/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2023986] Re: Drivers not working using kernel linux-image-6.2.0-1003-oracle
@navroop005, Hello, would you mind please sharing a copy of your `/var/log/apt/history.log`? This looks like a possible package dependency issue. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-signed-oracle in Ubuntu. https://bugs.launchpad.net/bugs/2023986 Title: Drivers not working using kernel linux-image-6.2.0-1003-oracle Status in linux-signed-oracle package in Ubuntu: Confirmed
[Kernel-packages] [Bug 2023042] Re: "couldn't communicate with the NVIDIA driver" when installing open dkms and LRM drivers concurrently
I've found a flaw in the test script in which it was installing the wrong LRM modules for the running kernel. It was installing the generic modules for a gcp kernel. Once I corrected this to install the gcp modules, it now passes. Attached are the logs with the addition of `lsmod` and `modinfo nvidia`. I think this can now be closed as a test error. ** Attachment added: "lunar-525-open-to-lrm-PASSED.txt" https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-525/+bug/2023042/+attachment/5679750/+files/lunar-525-open-to-lrm-PASSED.txt ** Changed in: nvidia-graphics-drivers-525 (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-graphics-drivers-525 in Ubuntu. https://bugs.launchpad.net/bugs/2023042 Title: "couldn't communicate with the NVIDIA driver" when installing open dkms and LRM drivers concurrently Status in nvidia-graphics-drivers-525 package in Ubuntu: Invalid Bug description: Installing "nvidia-driver-525-open" followed by "nvidia-headless-no-dkms-525 linux-modules-nvidia-525-gcp nvidia-utils-525" led to a system which complained about a "Driver/library version mismatch". Specifically what was done is: Deploy a clean google VM with: gcloud compute instances create fginther-kinetic-gpgpu-525 --image-project ubuntu-os-cloud --image-family ubuntu-2210-amd64 --machine-type n1-standard-4 --boot-disk-size=32GB --accelerator type=nvidia-tesla-t4,count=1 --maintenance-policy TERMINATE --restart-on-failure Enable kinetic-proposed (this was done with the 525.116.04-0ubuntu0.22.10.1 driver package). 
Install the 525-open driver first: apt-get install -y nvidia-driver-525-open Then install the proprietary driver: apt-get install nvidia-headless-no-dkms-525 linux-modules-nvidia-525-gcp nvidia-utils-525 After rebooting, "nvidia-smi" complained of the driver/library mismatch: ubuntu@fginther-kinetic-gpgpu-525:~$ nvidia-smi NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running. The /var/log/apt/history.log is attached which details the packages installed and removed. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-525/+bug/2023042/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
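The test-script flaw described in this message (installing the generic LRM modules on a gcp kernel) can be avoided by deriving the package name from the running kernel. The following is a minimal Python sketch under stated assumptions, not the actual test script: the helper names `kernel_flavour` and `lrm_package` are hypothetical, the release strings in the comments are illustrative, and the `linux-modules-nvidia-<series>-<flavour>` pattern is inferred from the single package name `linux-modules-nvidia-525-gcp` quoted in the bug description.

```python
def kernel_flavour(release: str) -> str:
    """Extract the flavour from a kernel release string.

    Assumes the usual Ubuntu layout <version>-<abi>-<flavour>,
    e.g. "5.19.0-1020-gcp" -> "gcp", "5.19.0-43-generic" -> "generic".
    """
    parts = release.split("-", 2)
    if len(parts) < 3 or not parts[2]:
        raise ValueError(f"unexpected kernel release format: {release!r}")
    return parts[2]


def lrm_package(driver_series: str, release: str) -> str:
    """Build the LRM meta-package name matching the given kernel release."""
    # Assumed naming pattern, based on "linux-modules-nvidia-525-gcp".
    return f"linux-modules-nvidia-{driver_series}-{kernel_flavour(release)}"
```

In a test script the release string would typically come from `uname -r` (or `platform.release()` in Python), so that the modules always match the kernel that is actually running.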
[Kernel-packages] [Bug 2023611] Re: Unable to remove efi variable with 6.2.0-21.21 or newer lunar kernel
I've reproduced this with the 6.3.0-7-generic kernel from mantic-proposed. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2023611 Title: Unable to remove efi variable with 6.2.0-21.21 or newer lunar kernel Status in linux package in Ubuntu: Incomplete Bug description: I'm seeing an issue on an isolated host, howzit, in which it fails to remove boot entries. In my limited testing this worked with the 6.2.0-20.20 kernel, but not the 21.21 or 23.23 kernel. I have not yet tried any of the 6.3 kernels. I've only seen this on one host so far, howzit, which is an arm64 server. I have tested on three other arm64 servers and they don't appear to be impacted, so this could be some firmware issue. It adversely impacts maas installs and will cause a mantic install (which is using 6.2.0-21.21) to fail because it can't modify the boot paths. Here's what I see trying to remove a boot entry: ubuntu@howzit-kernel:~$ efibootmgr -v BootCurrent: 0005 Timeout: 5 seconds BootOrder: 0005,0007,0004,0006 Boot0004 UEFI: Built-in EFI Shell VenMedia(5023b95c-db26-429b-a648-bd47664c8012)..BO Boot0005* UEFI: PXE IPv4 Mellanox Network Adapter - 0C:42:A1:52:3D:4C PcieRoot(0x3)/Pci(0x1,0x0)/Pci(0x0,0x0)/MAC(0c42a1523d4c,1)/IPv4(0.0.0.00.0.0.0,0,0)..BO Boot0006 UEFI: PXE IPv4 Mellanox Network Adapter - 0C:42:A1:52:3D:4D PcieRoot(0x3)/Pci(0x1,0x0)/Pci(0x0,0x1)/MAC(0c42a1523d4d,1)/IPv4(0.0.0.00.0.0.0,0,0)..BO Boot0007* ubuntu HD(1,GPT,6d8df92f-72ad-4c24-bc8d-8236a4c5e222,0x800,0x10)/File(\EFI\UBUNTU\GRUBAA64.EFI)..BO ubuntu@howzit-kernel:~$ sudo efibootmgr -B -b 0007 Could not delete variable: Invalid argument The same command will work with the 6.2.0-20.20 kernel. 
[Kernel-packages] [Bug 2023611] [NEW] Unable to remove efi variable with 6.2.0-21.21 or newer lunar kernel
Public bug reported: I'm seeing an issue on an isolated host, howzit, in which it fails to remove boot entries. In my limited testing this worked with the 6.2.0-20.20 kernel, but not the 21.21 or 23.23 kernel. I have not yet tried any of the 6.3 kernels. I've only seen this on one host so far, howzit, which is an arm64 server. I have tested on three other arm64 servers and they don't appear to be impacted, so this could be some firmware issue. It adversely impacts maas installs and will cause a mantic install (which is using 6.2.0-21.21) to fail because it can't modify the boot paths. Here's what I see trying to remove a boot entry: ubuntu@howzit-kernel:~$ efibootmgr -v BootCurrent: 0005 Timeout: 5 seconds BootOrder: 0005,0007,0004,0006 Boot0004 UEFI: Built-in EFI Shell VenMedia(5023b95c-db26-429b-a648-bd47664c8012)..BO Boot0005* UEFI: PXE IPv4 Mellanox Network Adapter - 0C:42:A1:52:3D:4C PcieRoot(0x3)/Pci(0x1,0x0)/Pci(0x0,0x0)/MAC(0c42a1523d4c,1)/IPv4(0.0.0.00.0.0.0,0,0)..BO Boot0006 UEFI: PXE IPv4 Mellanox Network Adapter - 0C:42:A1:52:3D:4D PcieRoot(0x3)/Pci(0x1,0x0)/Pci(0x0,0x1)/MAC(0c42a1523d4d,1)/IPv4(0.0.0.00.0.0.0,0,0)..BO Boot0007* ubuntu HD(1,GPT,6d8df92f-72ad-4c24-bc8d-8236a4c5e222,0x800,0x10)/File(\EFI\UBUNTU\GRUBAA64.EFI)..BO ubuntu@howzit-kernel:~$ sudo efibootmgr -B -b 0007 Could not delete variable: Invalid argument The same command will work with the 6.2.0-20.20 kernel. ** Affects: linux (Ubuntu) Importance: Undecided Status: New ** Tags: lunar ** Tags added: lunar -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2023611 Title: Unable to remove efi variable with 6.2.0-21.21 or newer lunar kernel Status in linux package in Ubuntu: New Bug description: I'm seeing an issue on an isolated host, howzit, in which it fails to remove boot entries. In my limited testing this worked with the 6.2.0-20.20 kernel, but not the 21.21 or 23.23 kernel. 
I have not yet tried any of the 6.3 kernels. I've only seen this on one host so far, howzit, which is an arm64 server. I have tested on three other arm64 servers and they don't appear to be impacted, so this could be some firmware issue. It adversely impacts maas installs and will cause a mantic install (which is using 6.2.0-21.21) to fail because it can't modify the boot paths. Here's what I see trying to remove a boot entry: ubuntu@howzit-kernel:~$ efibootmgr -v BootCurrent: 0005 Timeout: 5 seconds BootOrder: 0005,0007,0004,0006 Boot0004 UEFI: Built-in EFI Shell VenMedia(5023b95c-db26-429b-a648-bd47664c8012)..BO Boot0005* UEFI: PXE IPv4 Mellanox Network Adapter - 0C:42:A1:52:3D:4C PcieRoot(0x3)/Pci(0x1,0x0)/Pci(0x0,0x0)/MAC(0c42a1523d4c,1)/IPv4(0.0.0.00.0.0.0,0,0)..BO Boot0006 UEFI: PXE IPv4 Mellanox Network Adapter - 0C:42:A1:52:3D:4D PcieRoot(0x3)/Pci(0x1,0x0)/Pci(0x0,0x1)/MAC(0c42a1523d4d,1)/IPv4(0.0.0.00.0.0.0,0,0)..BO Boot0007* ubuntu HD(1,GPT,6d8df92f-72ad-4c24-bc8d-8236a4c5e222,0x800,0x10)/File(\EFI\UBUNTU\GRUBAA64.EFI)..BO ubuntu@howzit-kernel:~$ sudo efibootmgr -B -b 0007 Could not delete variable: Invalid argument The same command will work with the 6.2.0-20.20 kernel. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2023611/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
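To check from a script whether an entry such as Boot0007 is still present after a delete attempt, the `efibootmgr` listing can be parsed. The following is a minimal Python sketch, not something taken from the bug report: the function name `parse_efibootmgr` is hypothetical, and `SAMPLE` is an abbreviated copy of the output quoted above with the device paths left out.

```python
import re

# Matches entry lines such as "Boot0007* ubuntu"; the "*" marks an active entry.
ENTRY_RE = re.compile(r"^Boot([0-9A-Fa-f]{4})(\*?)\s+(.*)$")

SAMPLE = """BootCurrent: 0005
Timeout: 5 seconds
BootOrder: 0005,0007,0004,0006
Boot0004  UEFI: Built-in EFI Shell
Boot0005* UEFI: PXE IPv4 Mellanox Network Adapter - 0C:42:A1:52:3D:4C
Boot0007* ubuntu
"""


def parse_efibootmgr(text):
    """Return (boot_order, entries) parsed from efibootmgr output."""
    order = []
    entries = {}
    for line in text.splitlines():
        if line.startswith("BootOrder:"):
            # Comma-separated list of four-digit hex entry numbers.
            order = [n.strip() for n in line.split(":", 1)[1].split(",") if n.strip()]
            continue
        match = ENTRY_RE.match(line)
        if match:
            number, star, rest = match.groups()
            entries[number] = {"active": star == "*", "description": rest}
    return order, entries
```

With this, a test could run the delete command and then assert that `"0007"` no longer appears in the parsed entries, rather than relying only on the command's exit message.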
[Kernel-packages] [Bug 2023042] [NEW] "Driver/library version mismatch" when installing open and proprietary drivers concurrently
Public bug reported: Installing "nvidia-driver-525-open" followed by "nvidia-headless-no-dkms-525 linux-modules-nvidia-525-gcp nvidia-utils-525" led to a system which complained about a "Driver/library version mismatch". Specifically what was done is: Deploy a clean google VM with: gcloud compute instances create fginther-kinetic-gpgpu-525 --image-project ubuntu-os-cloud --image-family ubuntu-2210-amd64 --machine-type n1-standard-4 --boot-disk-size=32GB --accelerator type=nvidia-tesla-t4,count=1 --maintenance-policy TERMINATE --restart-on-failure Enable kinetic-proposed (this was done with the 525.116.04-0ubuntu0.22.10.1 driver package). Install the 525-open driver first: apt-get install -y nvidia-driver-525-open Then install the proprietary driver: apt-get install nvidia-headless-no-dkms-525 linux-modules-nvidia-525-gcp nvidia-utils-525 After rebooting, "nvidia-smi" complained of the driver/library mismatch: ubuntu@fginther-kinetic-gpgpu-525:~$ nvidia-smi Failed to initialize NVML: Driver/library version mismatch The /var/log/apt/history.log is attached which details the packages installed and removed. ** Affects: nvidia-graphics-drivers-525 (Ubuntu) Importance: Undecided Status: New ** Attachment added: "history.log" https://bugs.launchpad.net/bugs/2023042/+attachment/5678203/+files/history.log -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-graphics-drivers-525 in Ubuntu. https://bugs.launchpad.net/bugs/2023042 Title: "Driver/library version mismatch" when installing open and proprietary drivers concurrently Status in nvidia-graphics-drivers-525 package in Ubuntu: New Bug description: Installing "nvidia-driver-525-open" followed by "nvidia-headless-no-dkms-525 linux-modules-nvidia-525-gcp nvidia-utils-525" led to a system which complained about a "Driver/library version mismatch". 
Specifically what was done is: Deploy a clean google VM with: gcloud compute instances create fginther-kinetic-gpgpu-525 --image-project ubuntu-os-cloud --image-family ubuntu-2210-amd64 --machine-type n1-standard-4 --boot-disk-size=32GB --accelerator type=nvidia-tesla-t4,count=1 --maintenance-policy TERMINATE --restart-on-failure Enable kinetic-proposed (this was done with the 525.116.04-0ubuntu0.22.10.1 driver package). Install the 525-open driver first: apt-get install -y nvidia-driver-525-open Then install the proprietary driver: apt-get install nvidia-headless-no-dkms-525 linux-modules-nvidia-525-gcp nvidia-utils-525 After rebooting, "nvidia-smi" complained of the driver/library mismatch: ubuntu@fginther-kinetic-gpgpu-525:~$ nvidia-smi Failed to initialize NVML: Driver/library version mismatch The /var/log/apt/history.log is attached which details the packages installed and removed. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-525/+bug/2023042/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2016908] Re: udev fails to make prctl() syscall with apparmor=0 (as used by maas by default)
I can confirm @xnox's findings with my maas server deploying lunar. Adding `apparmor=1` to the settings/configuration/kernel-parameters allows for a successful deployment with the lunar 6.2.0-20.20 kernel. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2016908 Title: udev fails to make prctl() syscall with apparmor=0 (as used by maas by default) Status in MAAS: Triaged Status in maas-images: Invalid Status in linux package in Ubuntu: Triaged Status in systemd package in Ubuntu: Invalid Bug description: I'm assuming the image being used for these deploys is 20230417 or 20230417.1 based on the fact that I saw a 6.2 kernel being used which I don't believe was part of the 20230319 serial. I don't have access to the maas server, so I can't directly check any log files. MAAS Version: 3.3.2 Here's where the serial log indicates it can't download the squashfs. The full log is attached as scobee-lunar-no-squashfs.log (there are some other console messages intermixed): no search or nameservers found in /run/net-BOOTIF.conf /run/net-*.conf /run/net6 -*.conf :: root=squash:http://10.229.32.21:5248/images/ubuntu/arm64/ga-23.04/lunar/candi date/squa[ 206.804704] Btrfs loaded, crc32c=crc32c-generic, zoned=yes, fsverity =yes shfs :: mount_squash downloading http://10.229.32.21:5248/images/ubuntu/arm64/ga-23.0 4/lunar/candidate/squashfs to /root.tmp.img Connecting to 10.229.32.21:5248 (10.229.32.21:5248) wget: can't connect to remote host (10.229.32.21): Network is unreachable :: mount -t squashfs -o loop '/root.tmp.img' '/root.tmp' mount: mounting /root.tmp.img on /root.tmp failed: No such file or directory done. Still gathering logs and info and will update as I go. 
Kernel Bug / Apparmor reproducer $ wget https://images.maas.io/ephemeral-v3/candidate/lunar/amd64/20230419/ga-23.04/generic/boot-kernel $ wget https://images.maas.io/ephemeral-v3/candidate/lunar/amd64/20230419/ga-23.04/generic/boot-initrd $ qemu-system-x86_64 -nographic -m 2G -kernel ./boot-kernel -initrd ./boot-initrd -append 'console=ttyS0 break=modules apparmor=0' #start the VM Starting systemd-udevd version 252.5-2ubuntu3 Spawning shell within the initramfs BusyBox v1.35.0 (Ubuntu 1:1.35.0-4ubuntu1) built-in shell (ash) Enter 'help' for a list of built-in commands. (initramfs) udevadm info --export-db Failed to set death signal: Invalid argument Observe that udevadm fails to set up the death signal, which in the systemd code is this: https://github.com/systemd/systemd/blob/08c2f9c626e0f0052d505b1b7e52f335c0fbfa1d/src/basic/process-util.c#L1252 if (flags & (FORK_DEATHSIG|FORK_DEATHSIG_SIGINT)) if (prctl(PR_SET_PDEATHSIG, (flags & FORK_DEATHSIG_SIGINT) ? SIGINT : SIGTERM) < 0) { log_full_errno(prio, errno, "Failed to set death signal: %m"); _exit(EXIT_FAILURE); } Workaround: set the kernel command line to `apparmor=1`. MAAS bug Why is maas setting `apparmor=0`? Ubuntu shouldn't be used without apparmor, even for deployment and commissioning. To manage notifications about this bug go to: https://bugs.launchpad.net/maas/+bug/2016908/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
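The failing call described in this report can also be exercised outside of systemd with a small script. The following is a minimal Python sketch, not code from the bug report: the function name `set_parent_death_signal` is hypothetical, and it assumes a Linux system where `prctl` is reachable through the C library via `ctypes`. `PR_SET_PDEATHSIG` has the value 1 in `<linux/prctl.h>`.

```python
import ctypes
import os
import signal

PR_SET_PDEATHSIG = 1  # option number from <linux/prctl.h>


def set_parent_death_signal(sig: int = signal.SIGTERM) -> None:
    """Ask the kernel to send `sig` to this process when its parent dies.

    Raises OSError carrying the kernel's errno if the prctl() call fails.
    """
    # Load the C library of the running process and keep errno accessible.
    libc = ctypes.CDLL(None, use_errno=True)
    result = libc.prctl(
        PR_SET_PDEATHSIG,
        ctypes.c_ulong(int(sig)),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
    )
    if result != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

On a kernel affected as described above and booted with `apparmor=0`, the call would presumably raise `OSError` with EINVAL, matching the "Invalid argument" message from udevadm; that expectation is an inference from the report, not something verified here.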
[Kernel-packages] [Bug 2012529] Re: NVIDIA CVE-2023-{0180 to 0195}
Cuda testing passed for all drivers (470, 515, 525, 450-server, 470-server, 515-server, 525-server) on bionic, focal, jammy and kinetic using both DKMS and LRM (when using the appropriate stream 2 ppa for the LRM packages). DKMS testing also passed on lunar. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-restricted-modules in Ubuntu. https://bugs.launchpad.net/bugs/2012529 Title: NVIDIA CVE-2023-{0180 to 0195} Status in fabric-manager-450 package in Ubuntu: New Status in fabric-manager-470 package in Ubuntu: New Status in fabric-manager-515 package in Ubuntu: New Status in fabric-manager-525 package in Ubuntu: New Status in libnvidia-nscq-450 package in Ubuntu: New Status in libnvidia-nscq-470 package in Ubuntu: New Status in libnvidia-nscq-515 package in Ubuntu: New Status in libnvidia-nscq-525 package in Ubuntu: New Status in linux-restricted-modules package in Ubuntu: Triaged Status in nvidia-graphics-drivers-450-server package in Ubuntu: Fix Released Status in nvidia-graphics-drivers-470 package in Ubuntu: Fix Released Status in nvidia-graphics-drivers-470-server package in Ubuntu: Fix Released Status in nvidia-graphics-drivers-515 package in Ubuntu: Fix Released Status in nvidia-graphics-drivers-515-server package in Ubuntu: Fix Released Status in nvidia-graphics-drivers-525 package in Ubuntu: Triaged Status in nvidia-graphics-drivers-525-server package in Ubuntu: Fix Released Status in fabric-manager-450 source package in Bionic: New Status in fabric-manager-470 source package in Bionic: New Status in fabric-manager-515 source package in Bionic: New Status in fabric-manager-525 source package in Bionic: New Status in libnvidia-nscq-450 source package in Bionic: New Status in libnvidia-nscq-470 source package in Bionic: New Status in libnvidia-nscq-515 source package in Bionic: New Status in libnvidia-nscq-525 source package in Bionic: New Status in linux-restricted-modules source package in 
Bionic: New Status in nvidia-graphics-drivers-450-server source package in Bionic: In Progress Status in nvidia-graphics-drivers-470 source package in Bionic: New Status in nvidia-graphics-drivers-470-server source package in Bionic: New Status in nvidia-graphics-drivers-515 source package in Bionic: New Status in nvidia-graphics-drivers-515-server source package in Bionic: New Status in nvidia-graphics-drivers-525 source package in Bionic: New Status in nvidia-graphics-drivers-525-server source package in Bionic: New Status in fabric-manager-450 source package in Focal: New Status in fabric-manager-470 source package in Focal: New Status in fabric-manager-515 source package in Focal: New Status in fabric-manager-525 source package in Focal: New Status in libnvidia-nscq-450 source package in Focal: New Status in libnvidia-nscq-470 source package in Focal: New Status in libnvidia-nscq-515 source package in Focal: New Status in libnvidia-nscq-525 source package in Focal: New Status in linux-restricted-modules source package in Focal: New Status in nvidia-graphics-drivers-450-server source package in Focal: In Progress Status in nvidia-graphics-drivers-470 source package in Focal: New Status in nvidia-graphics-drivers-470-server source package in Focal: New Status in nvidia-graphics-drivers-515 source package in Focal: New Status in nvidia-graphics-drivers-515-server source package in Focal: New Status in nvidia-graphics-drivers-525 source package in Focal: New Status in nvidia-graphics-drivers-525-server source package in Focal: New Status in fabric-manager-450 source package in Jammy: New Status in fabric-manager-470 source package in Jammy: New Status in fabric-manager-515 source package in Jammy: New Status in fabric-manager-525 source package in Jammy: New Status in libnvidia-nscq-450 source package in Jammy: New Status in libnvidia-nscq-470 source package in Jammy: New Status in libnvidia-nscq-515 source package in Jammy: New Status in libnvidia-nscq-525 source 
package in Jammy: New Status in linux-restricted-modules source package in Jammy: New Status in nvidia-graphics-drivers-450-server source package in Jammy: New Status in nvidia-graphics-drivers-470 source package in Jammy: New Status in nvidia-graphics-drivers-470-server source package in Jammy: New Status in nvidia-graphics-drivers-515 source package in Jammy: New Status in nvidia-graphics-drivers-515-server source package in Jammy: New Status in nvidia-graphics-drivers-525 source package in Jammy: New Status in nvidia-graphics-drivers-525-server source package in Jammy: New Status in fabric-manager-450 source package in Kinetic: New Status in fabric-manager-470 source package in Kinetic: New Status in fabric-manager-515 source package in Kinetic: New Status in fabric-manager-525 source package in Kinetic: New Status in libnvidia-nscq-450 source pack
[Kernel-packages] [Bug 2000778] Re: pmtu.sh in net from ubuntu_kernel_selftests crash SUT with K-5.19
Still failing on baltar.ppc64el.9 during the 2023.02.27 SRU cycle. The kuzzle and scobee (another arm64 server) passed. ** Tags added: sru-20230227 -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2000778 Title: pmtu.sh in net from ubuntu_kernel_selftests crash SUT with K-5.19 Status in ubuntu-kernel-tests: New Status in linux package in Ubuntu: Incomplete Status in linux source package in Kinetic: Incomplete Bug description: Issue found with Kinetic 5.19.0-27.28 and 5.19.0-28.29 in this cycle (20221114) on these SUTs * P9 baltar * ARM64 kuzzle * ARM64 howzit-kernel This should not be considered as a regression as the net test cannot be built in 5.19.0-24.25 Test log: ubuntu@baltar:~/autotest/client/tmp/ubuntu_kernel_selftests/src/linux/tools/testing/selftests/net$ sudo ./pmtu.sh TEST: ipv4: PMTU exceptions [ OK ] TEST: ipv4: PMTU exceptions - nexthop objects [ OK ] TEST: ipv6: PMTU exceptions [ OK ] TEST: ipv6: PMTU exceptions - nexthop objects [ OK ] TEST: ICMPv4 with DSCP and ECN: PMTU exceptions [ OK ] TEST: ICMPv4 with DSCP and ECN: PMTU exceptions - nexthop objects [ OK ] 'socat' command not found; skipping tests TEST: UDPv4 with DSCP and ECN: PMTU exceptions [SKIP] TEST: IPv4 over vxlan4: PMTU exceptions [ OK ] TEST: IPv4 over vxlan4: PMTU exceptions - nexthop objects [ OK ] TEST: IPv6 over vxlan4: PMTU exceptions [ OK ] TEST: IPv6 over vxlan4: PMTU exceptions - nexthop objects [ OK ] TEST: IPv4 over vxlan6: PMTU exceptions [ OK ] TEST: IPv4 over vxlan6: PMTU exceptions - nexthop objects [ OK ] TEST: IPv6 over vxlan6: PMTU exceptions [ OK ] TEST: IPv6 over vxlan6: PMTU exceptions - nexthop objects [ OK ] TEST: IPv4 over geneve4: PMTU exceptions [ OK ] TEST: IPv4 over geneve4: PMTU exceptions - nexthop objects [ OK ] TEST: IPv6 over geneve4: PMTU exceptions [ OK ] TEST: IPv6 over geneve4: PMTU exceptions - nexthop objects [ OK ] TEST: IPv4 over geneve6: PMTU 
exceptions[ OK ] TEST: IPv4 over geneve6: PMTU exceptions - nexthop objects [ OK ] TEST: IPv6 over geneve6: PMTU exceptions[ OK ] TEST: IPv6 over geneve6: PMTU exceptions - nexthop objects [ OK ] TEST: IPv4, bridged vxlan4: PMTU exceptions [ OK ] TEST: IPv4, bridged vxlan4: PMTU exceptions - nexthop objects [ OK ] TEST: IPv6, bridged vxlan4: PMTU exceptions [ OK ] TEST: IPv6, bridged vxlan4: PMTU exceptions - nexthop objects [ OK ] TEST: IPv4, bridged vxlan6: PMTU exceptions [ OK ] TEST: IPv4, bridged vxlan6: PMTU exceptions - nexthop objects [ OK ] TEST: IPv6, bridged vxlan6: PMTU exceptions [ OK ] TEST: IPv6, bridged vxlan6: PMTU exceptions - nexthop objects [ OK ] TEST: IPv4, bridged geneve4: PMTU exceptions[ OK ] TEST: IPv4, bridged geneve4: PMTU exceptions - nexthop objects [ OK ] TEST: IPv6, bridged geneve4: PMTU exceptions[ OK ] TEST: IPv6, bridged geneve4: PMTU exceptions - nexthop objects [ OK ] TEST: IPv4, bridged geneve6: PMTU exceptions[ OK ] TEST: IPv4, bridged geneve6: PMTU exceptions - nexthop objects [ OK ] TEST: IPv6, bridged geneve6: PMTU exceptions[ OK ] TEST: IPv6, bridged geneve6: PMTU exceptions - nexthop objects [ OK ] ovs_bridge not supported TEST: IPv4, OVS vxlan4: PMTU exceptions [SKIP] ovs_bridge not supported TEST: IPv6, OVS vxlan4: PMTU exceptions [SKIP] ovs_bridge not supported TEST: IPv4, OVS vxlan6: PMTU exceptions [SKIP] ovs_bridge not supported TEST: IPv6, OVS vxlan6: PMTU exceptions [SKIP] ovs_bridge not supported TEST: IPv4, OVS geneve4: PMTU exceptions[SKIP] ovs_bridge not supported TEST: IPv6, OVS geneve4: PMTU exceptions[SKIP] ovs_bridge not supported TEST: IPv4, OVS geneve6: PMTU exceptions[SKIP] ovs_bridge not supported TEST: IPv6, OVS geneve6: PMTU exceptions[SKIP] TEST: IPv4 over fou4: PMTU exceptions [ OK ] TEST: IPv4 over fou4: PMTU exceptions - nexthop objects [ OK ] TEST: IPv6 ove
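A minimal sketch for summarising a pmtu.sh log like the one quoted above, assuming the output keeps the `TEST: <name> [ OK ]` / `[SKIP]` shape shown in the log. The `[FAIL]` status, the regular expression, and the helper name `tally` are assumptions for illustration and are not part of the kernel selftests. A test that never prints a status (for example because the SUT crashed mid-run) is simply not counted.

```python
import re
from collections import Counter

# Matches entries such as "TEST: ipv4: PMTU exceptions [ OK ]" and also the
# variant without a space before the bracket ("... PMTU exceptions[SKIP]").
# "FAIL" is an assumed status; it does not appear in the log quoted above.
RESULT_RE = re.compile(
    r"TEST:\s*(?P<name>.+?)\s*\[\s*(?P<status>OK|SKIP|FAIL)\s*\]"
)


def tally(log_text):
    """Count how many tests ended in each status (hypothetical helper)."""
    return Counter(m.group("status") for m in RESULT_RE.finditer(log_text))


if __name__ == "__main__":
    sample = (
        "TEST: ipv4: PMTU exceptions [ OK ] "
        "'socat' command not found; skipping tests "
        "TEST: UDPv4 with DSCP and ECN: PMTU exceptions [SKIP] "
        "TEST: IPv4 over geneve4: PMTU exceptions[ OK ] "
        "TEST: IPv6 ove"  # truncated entry, no status, so not counted
    )
    for status, count in sorted(tally(sample).items()):
        print(status, count)
```

Comparing the tally from a passing machine with the tally from a machine that crashed gives a rough idea of how far the run got before the SUT went down.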
[Kernel-packages] [Bug 2003995] Re: Update the 525 and 525-server NVIDIA driver series in Bionic, Focal, Jammy, and Kinetic
No regressions found for either 515-server or 525-server. Both were tested as DKMS and as LRMs using the generic kernel in all releases (lunar could not be installed with lrm). Jammy was also tested with the linux-nvidia kernel and LRMs. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-restricted-modules in Ubuntu. https://bugs.launchpad.net/bugs/2003995 Title: Update the 525 and 525-server NVIDIA driver series in Bionic, Focal, Jammy, and Kinetic Status in fabric-manager-515 package in Ubuntu: Fix Released Status in fabric-manager-525 package in Ubuntu: Fix Released Status in libnvidia-nscq-515 package in Ubuntu: Fix Released Status in libnvidia-nscq-525 package in Ubuntu: Fix Released Status in linux-restricted-modules package in Ubuntu: New Status in linux-restricted-modules-hwe package in Ubuntu: New Status in nvidia-graphics-drivers-515-server package in Ubuntu: Triaged Status in nvidia-graphics-drivers-525 package in Ubuntu: Triaged Status in nvidia-graphics-drivers-525-server package in Ubuntu: Fix Released Status in fabric-manager-515 source package in Bionic: Triaged Status in fabric-manager-525 source package in Bionic: Triaged Status in libnvidia-nscq-515 source package in Bionic: Triaged Status in libnvidia-nscq-525 source package in Bionic: Triaged Status in linux-restricted-modules source package in Bionic: New Status in linux-restricted-modules-hwe source package in Bionic: New Status in nvidia-graphics-drivers-515-server source package in Bionic: Triaged Status in nvidia-graphics-drivers-525 source package in Bionic: Triaged Status in nvidia-graphics-drivers-525-server source package in Bionic: Triaged Status in fabric-manager-515 source package in Focal: Triaged Status in fabric-manager-525 source package in Focal: Triaged Status in libnvidia-nscq-515 source package in Focal: Triaged Status in libnvidia-nscq-525 source package in Focal: Triaged Status in linux-restricted-modules source 
package in Focal: New Status in linux-restricted-modules-hwe source package in Focal: New Status in nvidia-graphics-drivers-515-server source package in Focal: Triaged Status in nvidia-graphics-drivers-525 source package in Focal: Triaged Status in nvidia-graphics-drivers-525-server source package in Focal: Triaged Status in fabric-manager-515 source package in Jammy: Triaged Status in fabric-manager-525 source package in Jammy: Triaged Status in libnvidia-nscq-515 source package in Jammy: Triaged Status in libnvidia-nscq-525 source package in Jammy: Triaged Status in linux-restricted-modules source package in Jammy: New Status in linux-restricted-modules-hwe source package in Jammy: New Status in nvidia-graphics-drivers-515-server source package in Jammy: Triaged Status in nvidia-graphics-drivers-525 source package in Jammy: Triaged Status in nvidia-graphics-drivers-525-server source package in Jammy: Triaged Status in fabric-manager-515 source package in Kinetic: Triaged Status in fabric-manager-525 source package in Kinetic: Triaged Status in libnvidia-nscq-515 source package in Kinetic: Triaged Status in libnvidia-nscq-525 source package in Kinetic: Triaged Status in linux-restricted-modules source package in Kinetic: New Status in linux-restricted-modules-hwe source package in Kinetic: New Status in nvidia-graphics-drivers-515-server source package in Kinetic: Triaged Status in nvidia-graphics-drivers-525 source package in Kinetic: Triaged Status in nvidia-graphics-drivers-525-server source package in Kinetic: Triaged Bug description: [Impact] These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements. See the changelog entry below for a full list of changes and bugs. 
[Test Case] The following development and SRU process was followed: https://wiki.ubuntu.com/NVidiaUpdates Certification test suite must pass on a range of hardware: https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened. [Regression Potential] In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug. [Discussion] [Changelog] 525 * New upstream release (LP: #2003995): - Improved the reliability of suspend and resume on UEFI systems when using certain display panels. - Fixed a bug that could cause VK_ERROR_DEVICE_LOST when using VK_MEMORY_ALLOCATE_DEVICE_ADDRESS_CAPTURE_REPLAY_BIT to allocate memory. - Disabled Fixed Rate Link (FRL) when using passive DisplayPort to HDMI dongles, which are incompatible wit
[Kernel-packages] [Bug 2006620] Re: linux-aws-5.19 hibernation tasks sometimes fail to freeze
Here is the full syslog from which the portion in the bug description was extracted. ** Attachment added: "c5.12xlarge-3-syslog.log" https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/2006620/+attachment/5645600/+files/c5.12xlarge-3-syslog.log -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu. https://bugs.launchpad.net/bugs/2006620 Title: linux-aws-5.19 hibernation tasks sometimes fail to freeze Status in linux-aws package in Ubuntu: New Bug description: Hibernation on AWS instances with jammy/5.19.0-1019-aws sometimes fails due to the following failure to freeze: Feb 1 01:09:05 ip-172-31-54-178 kernel: [ 443.247854] PM: hibernation: hibernation entry Feb 1 01:09:05 ip-172-31-54-178 kernel: [ 443.347353] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'. Feb 1 01:09:05 ip-172-31-54-178 kernel: [ 443.347355] sched_clock: Marking unstable (442909362062, 1007864825)<-(443748056670, -400707172) Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.940489] Filesystems sync: 0.022 seconds Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.940492] Freezing user space processes ... (elapsed 0.001 seconds) done. Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.941611] OOM killer disabled. 
Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.943036] PM: hibernation: Marking nosave pages: [mem 0x-0x0fff] Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.943039] PM: hibernation: Marking nosave pages: [mem 0x0009f000-0x000f] Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.943041] PM: hibernation: Marking nosave pages: [mem 0xbffe8000-0x] Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.943950] PM: hibernation: Basic memory bitmaps created Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.943961] PM: hibernation: Preallocating image memory Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 630.782421] PM: hibernation: Allocated 9655951 pages for snapshot Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 630.782424] PM: hibernation: Allocated 38623804 kbytes in 186.83 seconds (206.73 MB/s) Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 630.782426] Freezing remaining freezable tasks ... Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.789826] Freezing of tasks failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0): Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792830] task:kswapd0 state:D stack:0 pid: 328 ppid: 2 flags:0x4000 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792833] Call Trace: Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792835] Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792837] __schedule+0x248/0x5d0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792842] schedule+0x58/0x100 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792844] io_schedule+0x46/0x80 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792846] blk_mq_get_tag+0x117/0x2e0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792852] ? 
destroy_sched_domains_rcu+0x40/0x40 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792857] __blk_mq_alloc_requests+0xc4/0x1e0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792859] blk_mq_get_new_requests+0xce/0x190 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792861] blk_mq_submit_bio+0x1e6/0x430 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792864] __submit_bio+0xf6/0x190 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792866] submit_bio_noacct_nocheck+0xc2/0x120 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792869] submit_bio_noacct+0x1c5/0x540 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792871] ? sio_write_complete+0x1f0/0x1f0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792875] submit_bio+0x47/0xf0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792877] __swap_writepage+0x157/0x570 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792879] swap_writepage+0x2f/0x80 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792880] pageout+0xe2/0x2f0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792883] shrink_page_list+0x60b/0xc80 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792885] shrink_inactive_list+0x1bc/0x4d0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792886] shrink_lruvec+0x2f5/0x450 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792888] shrink_node_memcgs+0x166/0x1d0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792890] shrink_node+0x156/0x5a0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792891] ? __schedule+0x250/0x5d0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792893] balance_pgdat+0x37b/0x880 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792894] ? zone_watermark_ok_safe+0x4f/0x100 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792899] ? balance_pgdat+0x880/0x880 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792900] kswapd+0x10c/0x1c0 Feb 1 01:12:33 ip-
[Kernel-packages] [Bug 2006620] [NEW] linux-aws-5.19 hibernation tasks sometimes fail to freeze
Public bug reported: Hibernation on AWS instances with jammy/5.19.0-1019-aws sometimes fails due to the following failure to freeze: Feb 1 01:09:05 ip-172-31-54-178 kernel: [ 443.247854] PM: hibernation: hibernation entry Feb 1 01:09:05 ip-172-31-54-178 kernel: [ 443.347353] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'. Feb 1 01:09:05 ip-172-31-54-178 kernel: [ 443.347355] sched_clock: Marking unstable (442909362062, 1007864825)<-(443748056670, -400707172) Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.940489] Filesystems sync: 0.022 seconds Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.940492] Freezing user space processes ... (elapsed 0.001 seconds) done. Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.941611] OOM killer disabled. Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.943036] PM: hibernation: Marking nosave pages: [mem 0x-0x0fff] Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.943039] PM: hibernation: Marking nosave pages: [mem 0x0009f000-0x000f] Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.943041] PM: hibernation: Marking nosave pages: [mem 0xbffe8000-0x] Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.943950] PM: hibernation: Basic memory bitmaps created Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 443.943961] PM: hibernation: Preallocating image memory Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 630.782421] PM: hibernation: Allocated 9655951 pages for snapshot Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 630.782424] PM: hibernation: Allocated 38623804 kbytes in 186.83 seconds (206.73 MB/s) Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 630.782426] Freezing remaining freezable tasks ... 
Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.789826] Freezing of tasks failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0): Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792830] task:kswapd0 state:D stack:0 pid: 328 ppid: 2 flags:0x4000 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792833] Call Trace: Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792835] Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792837] __schedule+0x248/0x5d0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792842] schedule+0x58/0x100 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792844] io_schedule+0x46/0x80 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792846] blk_mq_get_tag+0x117/0x2e0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792852] ? destroy_sched_domains_rcu+0x40/0x40 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792857] __blk_mq_alloc_requests+0xc4/0x1e0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792859] blk_mq_get_new_requests+0xce/0x190 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792861] blk_mq_submit_bio+0x1e6/0x430 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792864] __submit_bio+0xf6/0x190 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792866] submit_bio_noacct_nocheck+0xc2/0x120 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792869] submit_bio_noacct+0x1c5/0x540 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792871] ? 
sio_write_complete+0x1f0/0x1f0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792875] submit_bio+0x47/0xf0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792877] __swap_writepage+0x157/0x570 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792879] swap_writepage+0x2f/0x80 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792880] pageout+0xe2/0x2f0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792883] shrink_page_list+0x60b/0xc80 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792885] shrink_inactive_list+0x1bc/0x4d0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792886] shrink_lruvec+0x2f5/0x450 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792888] shrink_node_memcgs+0x166/0x1d0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792890] shrink_node+0x156/0x5a0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792891] ? __schedule+0x250/0x5d0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792893] balance_pgdat+0x37b/0x880 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792894] ? zone_watermark_ok_safe+0x4f/0x100 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792899] ? balance_pgdat+0x880/0x880 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792900] kswapd+0x10c/0x1c0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792901] ? balance_pgdat+0x880/0x880 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792903] kthread+0xd1/0xf0 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792906] ? kthread_complete_and_exit+0x20/0x20 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792909] ret_from_fork+0x22/0x30 Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792913] Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792921] Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792922] Restarting kernel threads ... done. Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 651.516499] PM: hibernation: Basic memory bitmaps freed Feb 1 0
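A minimal sketch for pulling the freeze failure out of a syslog like the one above, assuming the kernel messages keep the `Freezing of tasks failed after ... seconds (... tasks refusing to freeze, wq_busy=...)` and `task:<name> state:<state>` shapes shown in this report. The helper name `summarize_freeze_failure` and the returned dictionary layout are hypothetical.

```python
import re

# Shape taken from the log line quoted in this bug report.
FREEZE_RE = re.compile(
    r"Freezing of tasks failed after (?P<secs>[\d.]+) seconds "
    r"\((?P<count>\d+) tasks refusing to freeze, wq_busy=(?P<wq>\d+)\)"
)
# Task header printed before each call trace, e.g. "task:kswapd0 state:D".
TASK_RE = re.compile(r"task:(?P<name>\S+)\s+state:(?P<state>\S+)")


def summarize_freeze_failure(log_text):
    """Return details of the first freeze failure, or None if there is none."""
    failure = FREEZE_RE.search(log_text)
    if failure is None:
        return None
    # Only look at task headers that appear after the failure message.
    tasks = [
        (t.group("name"), t.group("state"))
        for t in TASK_RE.finditer(log_text, failure.end())
    ]
    return {
        "seconds": float(failure.group("secs")),
        "refusing": int(failure.group("count")),
        "wq_busy": int(failure.group("wq")),
        "tasks": tasks,
    }


if __name__ == "__main__":
    sample = (
        "kernel: [ 650.789826] Freezing of tasks failed after 20.007 seconds "
        "(1 tasks refusing to freeze, wq_busy=0): "
        "Feb 1 01:12:33 ip-172-31-54-178 kernel: [ 650.792830] "
        "task:kswapd0 state:D stack:0 pid: 328 ppid: 2 flags:0x4000"
    )
    print(summarize_freeze_failure(sample))
```

Run over several syslogs, this makes it easy to check whether kswapd0 is always the task that refuses to freeze or whether other tasks show up as well.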
[Kernel-packages] [Bug 1993665] Re: Update the 470-server NVIDIA driver
The A100 is down with some hardware issues and there is no ETA when it will be up again. Given that the testing passed on the DGX2 and the A100 is having hardware issues which quite likely impacted the testing, I'm going to consider the kinetic testing as verified. ** Tags removed: verification-failed-kinetic ** Tags added: verification-done-kinetic -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-graphics-drivers-470-server in Ubuntu. https://bugs.launchpad.net/bugs/1993665 Title: Update the 470-server NVIDIA driver Status in fabric-manager-470 package in Ubuntu: In Progress Status in libnvidia-nscq-470 package in Ubuntu: In Progress Status in nvidia-graphics-drivers-470-server package in Ubuntu: In Progress Status in fabric-manager-470 source package in Bionic: Fix Released Status in libnvidia-nscq-470 source package in Bionic: Fix Released Status in nvidia-graphics-drivers-470-server source package in Bionic: Fix Released Status in fabric-manager-470 source package in Focal: Fix Released Status in libnvidia-nscq-470 source package in Focal: Fix Released Status in nvidia-graphics-drivers-470-server source package in Focal: Fix Released Status in fabric-manager-470 source package in Jammy: Fix Released Status in libnvidia-nscq-470 source package in Jammy: Fix Released Status in nvidia-graphics-drivers-470-server source package in Jammy: Fix Released Status in fabric-manager-470 source package in Kinetic: In Progress Status in libnvidia-nscq-470 source package in Kinetic: In Progress Status in nvidia-graphics-drivers-470-server source package in Kinetic: In Progress Bug description: [Impact] These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements. See the changelog entry below for a full list of changes and bugs. 
[Test Case] The following development and SRU process was followed: https://wiki.ubuntu.com/NVidiaUpdates Certification test suite must pass on a range of hardware: https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened. [Regression Potential] In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug. [Discussion] [Changelog] To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/fabric-manager-470/+bug/1993665/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1993665] Re: Update the 470-server NVIDIA driver
Re-running through the testing on our DGX2 now passes for both DKMS and LRM. I will need to retry the testing on A100 again and see if I missed something like the fabricmanager not being ready yet. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-graphics-drivers-470-server in Ubuntu. https://bugs.launchpad.net/bugs/1993665 Title: Update the 470-server NVIDIA driver Status in fabric-manager-470 package in Ubuntu: In Progress Status in libnvidia-nscq-470 package in Ubuntu: In Progress Status in nvidia-graphics-drivers-470-server package in Ubuntu: In Progress Status in fabric-manager-470 source package in Bionic: Fix Released Status in libnvidia-nscq-470 source package in Bionic: Fix Released Status in nvidia-graphics-drivers-470-server source package in Bionic: Fix Released Status in fabric-manager-470 source package in Focal: Fix Released Status in libnvidia-nscq-470 source package in Focal: Fix Released Status in nvidia-graphics-drivers-470-server source package in Focal: Fix Released Status in fabric-manager-470 source package in Jammy: Fix Released Status in libnvidia-nscq-470 source package in Jammy: Fix Released Status in nvidia-graphics-drivers-470-server source package in Jammy: Fix Released Status in fabric-manager-470 source package in Kinetic: In Progress Status in libnvidia-nscq-470 source package in Kinetic: In Progress Status in nvidia-graphics-drivers-470-server source package in Kinetic: In Progress Bug description: [Impact] These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements. See the changelog entry below for a full list of changes and bugs. 
[Test Case] The following development and SRU process was followed: https://wiki.ubuntu.com/NVidiaUpdates Certification test suite must pass on a range of hardware: https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened. [Regression Potential] In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug. [Discussion] [Changelog]
[Kernel-packages] [Bug 1993665] Re: Update the 470-server NVIDIA driver
Verification on kinetic is incomplete. Things do work on a cloud instance with a single gpgpu. In these cases, both the DKMS and LRM versions of the driver work with the cuda samples test. Problems are encountered when running on either the DGX2 or A100 systems. For the A100, I have not been able to get either the 470.141.03 (in -release) or 470.141.10 (in -proposed) drivers to work. The 470.141.03 driver did pass the cuda tests on the DGX2; testing with 470.141.10 is still in progress. Both the DGX2 and A100 require the fabric-manager package. This could be where the problem lies, or it could be something in the driver that is only exposed by these systems. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-graphics-drivers-470-server in Ubuntu. https://bugs.launchpad.net/bugs/1993665 Title: Update the 470-server NVIDIA driver Status in fabric-manager-470 package in Ubuntu: In Progress Status in libnvidia-nscq-470 package in Ubuntu: In Progress Status in nvidia-graphics-drivers-470-server package in Ubuntu: In Progress Status in fabric-manager-470 source package in Bionic: Fix Released Status in libnvidia-nscq-470 source package in Bionic: Fix Released Status in nvidia-graphics-drivers-470-server source package in Bionic: Fix Released Status in fabric-manager-470 source package in Focal: Fix Released Status in libnvidia-nscq-470 source package in Focal: Fix Released Status in nvidia-graphics-drivers-470-server source package in Focal: Fix Released Status in fabric-manager-470 source package in Jammy: Fix Released Status in libnvidia-nscq-470 source package in Jammy: Fix Released Status in nvidia-graphics-drivers-470-server source package in Jammy: Fix Released Status in fabric-manager-470 source package in Kinetic: In Progress Status in libnvidia-nscq-470 source package in Kinetic: In Progress Status in nvidia-graphics-drivers-470-server source package in Kinetic: In Progress Bug description: [Impact] 
These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements. See the changelog entry below for a full list of changes and bugs. [Test Case] The following development and SRU process was followed: https://wiki.ubuntu.com/NVidiaUpdates Certification test suite must pass on a range of hardware: https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened. [Regression Potential] In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug. [Discussion] [Changelog]
[Kernel-packages] [Bug 1993665] Re: Update the 470-server NVIDIA driver
Tested bionic, focal and jammy on VMs and a DGX2. All cuda tests passed. There is no updated kinetic driver, so unable to test there. ** Tags added: verification-done-bionic verification-done-focal verification-done-jammy verification-failed-kinetic -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-graphics-drivers-470-server in Ubuntu. https://bugs.launchpad.net/bugs/1993665 Title: Update the 470-server NVIDIA driver Status in fabric-manager-470 package in Ubuntu: In Progress Status in libnvidia-nscq-470 package in Ubuntu: In Progress Status in nvidia-graphics-drivers-470-server package in Ubuntu: In Progress Status in fabric-manager-470 source package in Bionic: In Progress Status in libnvidia-nscq-470 source package in Bionic: In Progress Status in nvidia-graphics-drivers-470-server source package in Bionic: In Progress Status in fabric-manager-470 source package in Focal: In Progress Status in libnvidia-nscq-470 source package in Focal: In Progress Status in nvidia-graphics-drivers-470-server source package in Focal: In Progress Status in fabric-manager-470 source package in Jammy: In Progress Status in libnvidia-nscq-470 source package in Jammy: In Progress Status in nvidia-graphics-drivers-470-server source package in Jammy: In Progress Status in fabric-manager-470 source package in Kinetic: In Progress Status in libnvidia-nscq-470 source package in Kinetic: In Progress Status in nvidia-graphics-drivers-470-server source package in Kinetic: In Progress Bug description: [Impact] These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements. See the changelog entry below for a full list of changes and bugs. 
[Test Case] The following development and SRU process was followed: https://wiki.ubuntu.com/NVidiaUpdates Certification test suite must pass on a range of hardware: https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened. [Regression Potential] In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug. [Discussion] [Changelog]
[Kernel-packages] [Bug 1991676] Re: Package grub-efi-arm64-signed 1.173.2~18.04.1+2.04-1ubuntu47.4 from bionic-proposed fails to install/upgrade (grub-install: error: efibootmgr: not found.)
@juliank, ah, I found another detail. This appears to only break when the package is updated in the ADT testbed. My assumption is that if the latest package version is already in the base image, there is no package update and therefore no breakage. For example: [1] older image, fails: https://autopkgtest.ubuntu.com/results/autopkgtest-bionic/bionic/arm64/d/dpdk/20221027_190645_aae03@/log.gz [2] newer image, passes: https://autopkgtest.ubuntu.com/results/autopkgtest-bionic/bionic/arm64/d/dpdk/20221028_185258_2a00f@/log.gz The second run occurred about a day later. I can't tell if this is using a new image, but when I inspected the artifacts from the run, I did see `grub-efi-arm64-bin 2.04-1ubuntu47.4`, which is the latest and the version that [1] tried to upgrade to. So I guess we generally avoid this by refreshing the ADT base images. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-hwe-5.4 in Ubuntu. https://bugs.launchpad.net/bugs/1991676 Title: Package grub-efi-arm64-signed 1.173.2~18.04.1+2.04-1ubuntu47.4 from bionic-proposed fails to install/upgrade (grub-install: error: efibootmgr: not found.) Status in ubuntu-kernel-tests: New Status in grub2-signed package in Ubuntu: Invalid Status in linux-hwe-5.4 package in Ubuntu: Confirmed Status in grub2-signed source package in Bionic: Triaged Status in linux-hwe-5.4 source package in Bionic: Confirmed Bug description: The ADT tests for arm64 kernels in Bionic are failing during the setup phase with the following errors: Setting up grub-efi-arm64-signed (1.173.2~18.04.1+2.04-1ubuntu47.4) ... Installing for arm64-efi platform. grub-install: error: efibootmgr: not found. dpkg: error processing package grub-efi-arm64-signed (--configure): installed grub-efi-arm64-signed package post-installation script subprocess returned error exit status 1 Setting up libx11-6:arm64 (2:1.6.4-3ubuntu0.5) ... Processing triggers for man-db (2.8.3-2ubuntu0.1) ... 
Processing triggers for libc-bin (2.27-3ubuntu1.6) ... Errors were encountered while processing: grub-efi-arm64-signed E: Sub-process /usr/bin/dpkg returned an error code (1) blame: badpkg: testbed setup commands failed with status 100 autopkgtest [15:12:03]: ERROR: erroneous package: testbed setup commands failed with status 100 ADT test log: https://autopkgtest.ubuntu.com/results/autopkgtest-bionic/bionic/arm64/l/linux-hwe-5.4/20220930_151219_13ac3@/log.gz To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1991676/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1991676] Re: Package grub-efi-arm64-signed 1.173.2~18.04.1+2.04-1ubuntu47.4 from bionic-proposed fails to install/upgrade (grub-install: error: efibootmgr: not found.)
@juliank Hello, I see that you picked up https://bugs.launchpad.net/ubuntu/+source/linux-hwe-5.4/+bug/1991676. I just want to mention, so that you are aware, that this is blocking most, if not all, kernel ADT testing on bionic arm64. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-hwe-5.4 in Ubuntu. https://bugs.launchpad.net/bugs/1991676 Title: Package grub-efi-arm64-signed 1.173.2~18.04.1+2.04-1ubuntu47.4 from bionic-proposed fails to install/upgrade (grub-install: error: efibootmgr: not found.) Status in ubuntu-kernel-tests: New Status in grub2-signed package in Ubuntu: Invalid Status in linux-hwe-5.4 package in Ubuntu: Confirmed Status in grub2-signed source package in Bionic: Triaged Status in linux-hwe-5.4 source package in Bionic: Confirmed Bug description: The ADT tests for arm64 kernels in Bionic are failing during the setup phase with the following errors: Setting up grub-efi-arm64-signed (1.173.2~18.04.1+2.04-1ubuntu47.4) ... Installing for arm64-efi platform. grub-install: error: efibootmgr: not found. dpkg: error processing package grub-efi-arm64-signed (--configure): installed grub-efi-arm64-signed package post-installation script subprocess returned error exit status 1 Setting up libx11-6:arm64 (2:1.6.4-3ubuntu0.5) ... Processing triggers for man-db (2.8.3-2ubuntu0.1) ... Processing triggers for libc-bin (2.27-3ubuntu1.6) ... 
  Errors were encountered while processing:
   grub-efi-arm64-signed
  E: Sub-process /usr/bin/dpkg returned an error code (1)
  blame:
  badpkg: testbed setup commands failed with status 100
  autopkgtest [15:12:03]: ERROR: erroneous package: testbed setup commands failed with status 100

  ADT test log:
  https://autopkgtest.ubuntu.com/results/autopkgtest-bionic/bionic/arm64/l/linux-hwe-5.4/20220930_151219_13ac3@/log.gz

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1991676/+subscriptions
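Editorial note: the failure signature quoted in this report ("Errors were encountered while processing:" followed by the package name) can be picked out of a log mechanically. The following is a minimal sketch, not part of the original report. It assumes the raw apt/dpkg layout in which each failing package sits on its own line indented by whitespace, with no per-line timestamp prefix; the function name `failed_packages` and the `SAMPLE` text are made up for illustration.

```python
from typing import List

# Marker line that apt prints before the list of packages dpkg failed on.
MARKER = "Errors were encountered while processing:"


def failed_packages(log_text: str) -> List[str]:
    """Return the package names listed after the dpkg error marker."""
    packages: List[str] = []
    in_block = False
    for line in log_text.splitlines():
        if line.strip() == MARKER:
            in_block = True
            continue
        if in_block:
            # Assumption: failing packages are indented; the first
            # non-indented line (e.g. "E: Sub-process ...") ends the list.
            if line.startswith(" ") and line.strip():
                packages.append(line.strip())
            else:
                in_block = False
    return packages


# Hypothetical excerpt modelled on the log quoted in this bug.
SAMPLE = """\
Processing triggers for libc-bin (2.27-3ubuntu1.6) ...
Errors were encountered while processing:
 grub-efi-arm64-signed
E: Sub-process /usr/bin/dpkg returned an error code (1)
"""

if __name__ == "__main__":
    print(failed_packages(SAMPLE))
```

Logs that carry a leading timestamp on every line would need that prefix stripped before being passed to the function.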
[Kernel-packages] [Bug 1988592] [NEW] [5.4.1089, arm64] eBPF opensnoop does not display PATH
Public bug reported:

Hi.

First, I hope you and your relatives are doing well.

The kernel currently used on AKS arm64 (i.e. 5.4.0-1089-azure) suffers from this problem:
https://github.com/iovisor/bcc/issues/2253

As a consequence, opensnoop does not display PATH:

# Run the following from Canonical:UbuntuServer:18_04-daily-lts-arm64:18.04.202208290
$ uname -a
Linux francis-vm-arm64-ubuntu18vm 5.4.0-1089-azure #94~18.04.1-Ubuntu SMP Fri Aug 5 12:36:48 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
$ lsb_release -rd
Description: Ubuntu 18.04.6 LTS
Release: 18.04
$ git clone --recurse-submodules https://github.com/iovisor/bcc
Linux francis-vm-arm64-ubuntu18vm 5.4.0-1089-azure #94~18.04.1-Ubuntu SMP Fri Aug 5 12:36:48 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
$ sudo sh -c 'apt update && apt install -qy clang-10 llvm-10 make gcc pkg-config libelf-dev libz-dev'
...
$ cd bcc/libbpf-tools
$ CLANG=clang-10 LLVM_STRIP=llvm-strip-10 make -j opensnoop
...
  BINARY opensnoop
$ sudo ./opensnoop
PID    COMM        FD  ERR  PATH
1672   python3      3    0
9746   opensnoop   20    0
1672   python3      3    0
1672   python3      3    0
1672   python3     -1    2
1672   python3      3    0
1      systemd     18    0
1672   python3      6    0
1672   python3      3    0
1672   python3      3    0
1672   python3      3    0
1672   python3      3    0
^C

As you can see, nothing is printed in the PATH column, whereas the normal behavior is to print the path of the opened file:

$ uname -a
Linux pwmachine 5.15.0-46-generic #49~20.04.1-Ubuntu SMP Thu Aug 4 19:15:44 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ sudo ./opensnoop
PID    COMM        FD  ERR  PATH
2704   systemd     23    0  virtual
2704   systemd     22    0  misc
2704   systemd     23    0  fuse
2704   systemd     22    0  /sys/devices/virtual/misc/fuse/uevent
2704   systemd     22    0  /run/udev/data/c10:229
2704   systemd     22    0  /proc/2704/status
2704   systemd     22    0  /proc/2704/status
2704   systemd     22    0  /proc/2704/status
^C

This bug was fixed by the upstream patch:
https://github.com/torvalds/linux/commit/6ae08ae3dea2cfa03dd3665a3c8475c2d429ef47

Sadly, this patch was not backported, so it is not present in stable kernels.
I backported the patches myself (see attachment) and was able to build the kernel package with the following command: sudo LANG=C $(dpkg-architecture -aarm64) CROSS_COMPILE=aarch64-linux-gnu- fakeroot debian/rules binary skipdbg=false Sadly, I was not able to successfully boot it on Azure, either by installing the package or using kexec. I suspect this is because my image is not signed. Best regards and thank you in advance. ** Affects: linux-azure (Ubuntu) Importance: Undecided Status: New ** Patch added: "Concatenation of backported upstream patches" https://bugs.launchpad.net/bugs/1988592/+attachment/5613332/+files/concat.patch ** Description changed: Hi. - FIrst, I hope you are fine and the same for your relatives. The actual kernel used on AKS arm64 (i.e. 5.4.1089) suffers from this problem: https://github.com/iovisor/bcc/issues/2253 As a consequence, opensnoop does not display PATH: # Run the following from Canonical:UbuntuServer:18_04-daily-lts-arm64:18.04.202208290 $ uname -a Linux francis-vm-arm64-ubuntu18vm 5.4.0-1089-azure #94~18.04.1-Ubuntu SMP Fri Aug 5 12:36:48 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux $ lsb_release -rd Description:Ubuntu 18.04.6 LTS Release:18.04 $ git clone --recurse-submodules https://github.com/iovisor/bcc Linux francis-vm-arm64-ubuntu18vm 5.4.0-1089-azure #94~18.04.1-Ubuntu SMP Fri Aug 5 12:36:48 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux $ sudo sh -c 'apt update && apt install -qy clang-10 llvm-10 make gcc pkg-config libelf-dev libz-dev' ... $ cd bcc/libbpf-tools $ CLANG=clang-10 LLVM_STRIP=llvm-strip-10 make -j opensnoop ... 
BINARY opensnoop $ sudo ./opensnoop PIDCOMM FD ERR PATH - 1672 python33 0 - 9746 opensnoop 20 0 - 1672 python33 0 - 1672 python33 0 - 1672 python3 -1 2 - 1672 python33 0 - 1 systemd 18 0 - 1672 python36 0 - 1672 python33 0 - 1672 python33 0 - 1672 python33 0 - 1672 python33 0 + 1672 python33 0 + 9746 opensnoop 20 0 + 1672 python33 0 + 1672 python33 0 + 1672 python3 -1 2 + 1672 python33 0 + 1 systemd 18 0 + 1672 python36 0 + 1672 python33 0 + 1672 python33 0 + 1672 python33 0 + 1672 python33 0 1672 python33
[Kernel-packages] [Bug 1923114] Re: ubuntu_kernel_selftests: ./cpu-on-off-test.sh: line 94: echo: write error: Device or resource busy
** Tags added: 5.4

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-azure-4.15 in Ubuntu.
https://bugs.launchpad.net/bugs/1923114

Title:
  ubuntu_kernel_selftests: ./cpu-on-off-test.sh: line 94: echo: write error: Device or resource busy

Status in ubuntu-kernel-tests: In Progress
Status in linux-azure package in Ubuntu: New
Status in linux-azure-4.15 package in Ubuntu: New
Status in linux-azure source package in Trusty: New
Status in linux-azure-4.15 source package in Trusty: New
Status in linux-azure source package in Xenial: New
Status in linux-azure-4.15 source package in Xenial: New
Status in linux-azure source package in Bionic: New
Status in linux-azure-4.15 source package in Bionic: New
Status in linux-azure source package in Groovy: New
Status in linux-azure-4.15 source package in Groovy: New

Bug description:
  Test cpu-hotplug from ubuntu_kernel_selftests failed with bionic:linux-azure-4.15 running on a Basic A2 with 2 cores (besides other instance types):

  selftests: cpu-on-off-test.sh
  pid 28041's current affinity mask: 3
  pid 28041's new affinity mask: 1
  CPU online/offline summary:
  present_cpus = 0-1 present_max = 1
  Cpus in online state: 0-1
  Cpus in offline state: 0
  Limited scope test: one hotplug cpu
  (leaves cpu in the original state):
  online to offline to online: cpu 1
  not ok 1..1 selftests: cpu-on-off-test.sh [FAIL]
  ./cpu-on-off-test.sh: line 94: echo: write error: Device or resource busy
  offline_cpu_expect_success 1: unexpected fail

  http://10.246.72.46/4.15.0-1112.124~16.04.1-azure/xenial-linux-azure-azure-amd64-4.15.0-Basic_A2-ubuntu_kernel_selftests/ubuntu_kernel_selftests/results/ubuntu_kernel_selftests.cpu-hotplug/debug/ubuntu_kernel_selftests.cpu-hotplug.DEBUG.html

  The problem happens at "autotest-client-tests/ubuntu_kernel_selftests/cpu-on-off-test.sh" when executing:

  echo 0 > $SYSFS/devices/system/cpu/cpu$1/online

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1923114/+subscriptions
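Editorial note: the failing step in the report above is a plain write of "0" to a CPU's sysfs `online` file, and "Device or resource busy" is the message for errno EBUSY. The following is a minimal sketch of that write with the busy case reported separately from other errors. It is not taken from cpu-on-off-test.sh; the function name `set_cpu_online`, its return values, and the default `/sys` mount point are assumptions made for illustration.

```python
import errno
import os


def set_cpu_online(cpu: int, online: bool, sysfs_root: str = "/sys") -> str:
    """Write 1 or 0 to the CPU's sysfs 'online' file.

    Returns "ok" on success and "busy" when the kernel refuses the
    request with EBUSY ("Device or resource busy"). Any other error
    is re-raised unchanged.
    """
    path = os.path.join(
        sysfs_root, "devices", "system", "cpu", f"cpu{cpu}", "online"
    )
    try:
        # The write is flushed when the file is closed at the end of the
        # with block, so a kernel-side error surfaces inside this try.
        with open(path, "w") as f:
            f.write("1" if online else "0")
    except OSError as exc:
        if exc.errno == errno.EBUSY:
            return "busy"
        raise
    return "ok"
```

On a real system this needs root, and taking a CPU offline changes machine state, so `sysfs_root` can be pointed at a scratch directory when only the control flow is being exercised.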
[Kernel-packages] [Bug 1975509] Re: Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal, Impish, Jammy, and Kinetic
Testing of nvidia-fabricmanager-510 and libnvidia-nscq-510 has been successfully performed again against the packages in -proposed. These are good to release from a testing perspective. ** Tags removed: verification-needed verification-needed-bionic verification-needed-focal verification-needed-impish verification-needed-jammy ** Tags added: verification-done verification-done-bionic verification-done-focal verification-done-impish verification-done-jammy -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-restricted-modules in Ubuntu. https://bugs.launchpad.net/bugs/1975509 Title: Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal, Impish, Jammy, and Kinetic Status in fabric-manager-510 package in Ubuntu: Fix Committed Status in libnvidia-nscq-510 package in Ubuntu: Fix Committed Status in linux-restricted-modules package in Ubuntu: Confirmed Status in nvidia-graphics-drivers-510-server package in Ubuntu: Fix Committed Status in fabric-manager-510 source package in Bionic: Fix Committed Status in libnvidia-nscq-510 source package in Bionic: Fix Committed Status in linux-restricted-modules source package in Bionic: Confirmed Status in nvidia-graphics-drivers-510-server source package in Bionic: Fix Released Status in fabric-manager-510 source package in Focal: Fix Committed Status in libnvidia-nscq-510 source package in Focal: Fix Committed Status in linux-restricted-modules source package in Focal: Confirmed Status in nvidia-graphics-drivers-510-server source package in Focal: Fix Released Status in fabric-manager-510 source package in Impish: Fix Committed Status in libnvidia-nscq-510 source package in Impish: Fix Committed Status in linux-restricted-modules source package in Impish: Confirmed Status in nvidia-graphics-drivers-510-server source package in Impish: Fix Released Status in fabric-manager-510 source package in Jammy: Fix Committed Status in libnvidia-nscq-510 source package in 
Jammy: Fix Committed
Status in linux-restricted-modules source package in Jammy: Confirmed
Status in nvidia-graphics-drivers-510-server source package in Jammy: Fix Released
Status in fabric-manager-510 source package in Kinetic: Fix Committed
Status in libnvidia-nscq-510 source package in Kinetic: Fix Committed
Status in linux-restricted-modules source package in Kinetic: Confirmed
Status in nvidia-graphics-drivers-510-server source package in Kinetic: Fix Committed

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]
  === 510 kinetic/jammy/impish/focal/bionic ===

  * New upstream release (LP: #1975509):
    - When calculating the address of grid barrier allocated for a CUDA stream, there was an off-by-one error. The address calculation is corrected in this release.
    - An issue that caused an AC cycle test to fail with "AssertionError: NVLink links with inappropriate status found" is resolved.
    - An issue that caused NX 11 to become nonresponsive during a graphics operation is resolved.
    - Linking issues were observed when using libnvfm.so. Now and other depend tools use dynamic linking with libstdc++ and libgcc.
    - An intermittent error CUDA_ERROR_NVLINK_UNCORRECTABLE caused by some non-fatal nvlink interrupts is resolved.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/fabric-manager-510/+bug/1975509/+subscriptions
[Kernel-packages] [Bug 1978475] Re: Docker container ports cannot be allocated
Hello Sebastian, I've been unable to reproduce this issue with the 5.13.0-1029-aws kernel and the docker-compose example available from [1]. Are you able to provide complete steps to reproduce? [1] - https://docs.docker.com/compose/gettingstarted/ Thanks -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws-5.13 in Ubuntu. https://bugs.launchpad.net/bugs/1978475 Title: Docker container ports cannot be allocated Status in linux-aws-5.13 package in Ubuntu: Confirmed Bug description: This is a follow-up bug to https://bugs.launchpad.net/ubuntu/+source/linux-aws-5.13/+bug/1977919 I can confirm that the problem is indeed not fully fixed. @electricdaemon said: > Test kernel posted fixes crash but has another bug with unkillable stuck defunct docker-proxy service causing more issues. Bug is not solved. Tested on Linux AWS Lightsail instance. What I'm seeing is that docker-compose stacks either don't start at all or only start partially. In both cases the affected containers cannot start due to their host port being already allocated. I can say with absolute certainty that the ports on the host are dedicated to container applications and no other service is actually bound to the affected port numbers. 
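Editorial note: the statement that no other service is bound to the affected ports can be checked from the host by attempting the bind directly. The following is a minimal sketch, not something the reporter ran. The function name `port_is_free` is made up for illustration, and the port number 9001 mentioned in the usage note is taken from the compose configuration quoted in this bug.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically EADDRINUSE when another socket holds the port.
            return False
    # The socket is closed on leaving the with block, releasing the port.
    return True
```

For example, `port_is_free(9001)` run on the affected host shows whether the kernel still considers 0.0.0.0:9001 taken. A result of True only shows that no host socket holds the port at that moment; it says nothing about Docker's own record of allocated ports.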
# uname -a Linux ip-10-0-69-193 5.13.0-1029-aws #32~20.04.1-Ubuntu SMP Thu Jun 9 13:03:13 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux # apt-cache policy docker containerd docker: Installed: (none) Candidate: 1.5-2 Version table: 1.5-2 500 500 http://eu-central-1.ec2.archive.ubuntu.com/ubuntu focal/universe amd64 Packages containerd: Installed: (none) Candidate: 1.5.9-0ubuntu1~20.04.4 Version table: 1.5.9-0ubuntu1~20.04.4 500 500 http://eu-central-1.ec2.archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages 500 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages 1.3.3-0ubuntu2 500 500 http://eu-central-1.ec2.archive.ubuntu.com/ubuntu focal/main amd64 Packages # docker-compose --version docker-compose version 1.29.2, build 5becea4c root@ip-10-0-69-193:/opt/myapp8/myappserv/int# docker-compose up -d Creating network "myappserv-int_default" with the default driver Creating myapp-migrator-int ... done Creating myapp-dealer-int ... Creating myapp-offer-int... Creating myapp-customer-int ... Creating myapp-customer-int ... error Creating myapp-dealer-int ... done Creating myapp-offer-int... done : port is already allocated ERROR: for customer Cannot start service customer: driver failed programming external connectivity on endpoint myapp8-customer-int (fe4112364528b0e7d192c793929c579e8a81af715118c8f83ad7e65e7397f3be): Bind for 0.0.0.0:9001 failed: port is already allocated ERROR: Encountered errors while bringing up the project. root@ip-10-0-69-193:/opt/myapp8/myappserv/int# docker-compose down Stopping myapp8-offer-int ... done Stopping myapp8-dealer-int ... done Removing myapp8-customer-int ... done Removing myapp8-offer-int... done Removing myapp8-dealer-int ... done Removing myapp8-migrator-int ... done Removing network myappserv-int_default root@ip-10-0-69-193:/opt/myapp8/myappserv/int# docker-compose up -d Creating network "myappserv-int_default" with the default driver Creating myapp8-migrator-int ... done Creating myapp8-offer-int... 
Creating myapp8-customer-int ... Creating myapp8-customer-int ... error WARNING: Host is already in use by another container Creating myapp8-offer-int... done ERROR: for myapp8-customer-int Cannot start service customer: driver failed programming external connectivity on endpoint myapp8-customer-int (72fc08854cd278e63cd3234e7fb03c08cb045efdcfb9e42075a1250d893645d5): Bind for 0.0.0.0:9001 failed Creating myapp8-dealer-int ... done ERROR: for customer Cannot start service customer: driver failed programming external connectivity on endpoint myapp8-customer-int (72fc08854cd278e63cd3234e7fb03c08cb045efdcfb9e42075a1250d893645d5): Bind for 0.0.0.0:9001 failed: port is already allocated ERROR: Encountered errors while bringing up the project. # docker-compose config services: customer: container_name: myapp8-customer-int depends_on: migrator: condition: service_completed_successfully image: reg.mydomain.tld/myapp8/customer:430d4ca ports: - published: 9001 target: 9001 restart: always dealer: container_name: myapp8-dealer-int depends_on: migrator: condition: service_completed_successfully image: reg.mydomain.tld/myapp8/dealer:430d4ca ports: - published: 9002 target: 9002 restart: always migrator: container_name: myapp8-migrator-int image: reg.mydomain.tld/myapp8/migrator:430d4ca offer: container_name: my
[Kernel-packages] [Bug 1977919] Re: Docker container creation causes kernel oops on linux-aws 5.13.0.1028.31~20.04.22
All of the updated 5.13 kernels have now made it to the archive and into both the focal-updates and focal-security pockets. That list of kernels is: linux-aws-5.13 - 5.13.0-1029.32~20.04.1 linux-azure-5.13 - 5.13.0-1029.34~20.04.1 linux-gcp-5.13 - 5.13.0-1031.37~20.04.1 linux-oracle-5.13 - 5.13.0-1034.40~20.04.1 -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws-5.13 in Ubuntu. https://bugs.launchpad.net/bugs/1977919 Title: Docker container creation causes kernel oops on linux-aws 5.13.0.1028.31~20.04.22 Status in linux-aws-5.13 package in Ubuntu: Confirmed Status in linux-azure-5.13 package in Ubuntu: Confirmed Status in linux-gcp-5.13 package in Ubuntu: Confirmed Status in linux-intel-iotg-5.15 package in Ubuntu: Confirmed Status in linux-oracle-5.13 package in Ubuntu: Confirmed Status in linux-aws-5.13 source package in Focal: Fix Committed Status in linux-azure-5.13 source package in Focal: Fix Committed Status in linux-gcp-5.13 source package in Focal: Fix Committed Status in linux-intel-iotg-5.15 source package in Focal: Won't Fix Status in linux-oracle-5.13 source package in Focal: Fix Committed Bug description: Running the attached script on the latest AWS AMI for Ubuntu 20.04, I get a kernel panic and hard reset of the node. [ 12.314552] VFS: Close: file count is 0 [ 12.351090] [ cut here ] [ 12.351093] kernel BUG at include/linux/fs.h:3104! 
[ 12.355272] invalid opcode: [#1] SMP PTI [ 12.358963] CPU: 1 PID: 863 Comm: sed Not tainted 5.13.0-1028-aws #31~20.04.1-Ubuntu [ 12.366241] Hardware name: Amazon EC2 m5.large/, BIOS 1.0 10/16/2017 [ 12.371130] RIP: 0010:__fput+0x247/0x250 [ 12.374897] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 88 02 00 e9 b5 fe ff ff <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 31 db 48 [ 12.389075] RSP: 0018:b50280d9fd88 EFLAGS: 00010246 [ 12.393425] RAX: RBX: 000a801d RCX: 9152e0716000 [ 12.398679] RDX: 9152cf075280 RSI: 0001 RDI: [ 12.403879] RBP: b50280d9fdb0 R08: 0001 R09: 9152dfcba2c8 [ 12.409102] R10: b50280d9fd88 R11: 9152d04e9d10 R12: 9152d04e9d00 [ 12.414333] R13: 9152dfcba2c8 R14: 9152cf0752a0 R15: 9152dfc2e180 [ 12.419533] FS: () GS:9153ea90() knlGS: [ 12.426937] CS: 0010 DS: ES: CR0: 80050033 [ 12.431506] CR2: 556cf30250a8 CR3: bce10006 CR4: 007706e0 [ 12.436716] DR0: DR1: DR2: [ 12.441941] DR3: DR6: fffe0ff0 DR7: 0400 [ 12.447170] PKRU: 5554 [ 12.450355] Call Trace: [ 12.453408] [ 12.456296] fput+0xe/0x10 [ 12.459633] task_work_run+0x70/0xb0 [ 12.463157] do_exit+0x37b/0xaf0 [ 12.466570] do_group_exit+0x43/0xb0 [ 12.470142] __x64_sys_exit_group+0x18/0x20 [ 12.473989] do_syscall_64+0x61/0xb0 [ 12.477565] ? exit_to_user_mode_prepare+0x9b/0x1c0 [ 12.481734] ? do_user_addr_fault+0x1d0/0x650 [ 12.485665] ? irqentry_exit_to_user_mode+0x9/0x20 [ 12.489790] ? irqentry_exit+0x19/0x30 [ 12.493443] ? exc_page_fault+0x8f/0x170 [ 12.497199] ? asm_exc_page_fault+0x8/0x30 [ 12.501013] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 12.505289] RIP: 0033:0x7f80d42a1bd6 [ 12.508868] Code: Unable to access opcode bytes at RIP 0x7f80d42a1bac. 
[ 12.513783] RSP: 002b:7ffe924f9ed8 EFLAGS: 0246 ORIG_RAX: 00e7 [ 12.520897] RAX: ffda RBX: 7f80d45a4740 RCX: 7f80d42a1bd6 [ 12.526115] RDX: RSI: 003c RDI: [ 12.531328] RBP: R08: 00e7 R09: fe98 [ 12.536484] R10: 7f80d3d422a0 R11: 0246 R12: 7f80d45a4740 [ 12.541687] R13: 0002 R14: 7f80d45ad708 R15: [ 12.546916] [ 12.549829] Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc aufs overlay nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua crct10dif_pclmul ppdev crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd psmouse cryptd parport_pc input_leds parport ena serio_raw sch_fq_codel ipmi_devintf ipmi_msghandler msr drm ip_tables x_tables autofs4 [ 12.583913] ---[ end trace 77367fed4d782aa4 ]--- [ 12.587963] RIP: 0010:__fput+0x247/0x250 [ 12.591729] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 8
[Kernel-packages] [Bug 1977919] Re: Docker container creation causes kernel oops on linux-aws 5.13.0.1028.31~20.04.22
Updated kernels are in flight. The updated kernel packages and versions are: linux-aws-5.13- 5.13.0-1029.32~20.04.1 linux-azure-5.13 - 5.13.0-1029.34~20.04.1 linux-gcp-5.13- 5.13.0-1031.37~20.04.1 linux-oracle-5.13 - 5.13.0-1034.40~20.04.1 The azure and gcp kernels are already in focal-updates. The aws kernel is in focal-proposed and the oracle kernel should be there very soon. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws-5.13 in Ubuntu. https://bugs.launchpad.net/bugs/1977919 Title: Docker container creation causes kernel oops on linux-aws 5.13.0.1028.31~20.04.22 Status in linux-aws-5.13 package in Ubuntu: Confirmed Status in linux-azure-5.13 package in Ubuntu: Confirmed Status in linux-gcp-5.13 package in Ubuntu: Confirmed Status in linux-intel-iotg-5.15 package in Ubuntu: Confirmed Status in linux-oracle-5.13 package in Ubuntu: Confirmed Status in linux-aws-5.13 source package in Focal: Fix Committed Status in linux-azure-5.13 source package in Focal: Fix Committed Status in linux-gcp-5.13 source package in Focal: Fix Committed Status in linux-intel-iotg-5.15 source package in Focal: Won't Fix Status in linux-oracle-5.13 source package in Focal: Fix Committed Bug description: Running the attached script on the latest AWS AMI for Ubuntu 20.04, I get a kernel panic and hard reset of the node. [ 12.314552] VFS: Close: file count is 0 [ 12.351090] [ cut here ] [ 12.351093] kernel BUG at include/linux/fs.h:3104! 
[ 12.355272] invalid opcode: [#1] SMP PTI [ 12.358963] CPU: 1 PID: 863 Comm: sed Not tainted 5.13.0-1028-aws #31~20.04.1-Ubuntu [ 12.366241] Hardware name: Amazon EC2 m5.large/, BIOS 1.0 10/16/2017 [ 12.371130] RIP: 0010:__fput+0x247/0x250 [ 12.374897] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 88 02 00 e9 b5 fe ff ff <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 31 db 48 [ 12.389075] RSP: 0018:b50280d9fd88 EFLAGS: 00010246 [ 12.393425] RAX: RBX: 000a801d RCX: 9152e0716000 [ 12.398679] RDX: 9152cf075280 RSI: 0001 RDI: [ 12.403879] RBP: b50280d9fdb0 R08: 0001 R09: 9152dfcba2c8 [ 12.409102] R10: b50280d9fd88 R11: 9152d04e9d10 R12: 9152d04e9d00 [ 12.414333] R13: 9152dfcba2c8 R14: 9152cf0752a0 R15: 9152dfc2e180 [ 12.419533] FS: () GS:9153ea90() knlGS: [ 12.426937] CS: 0010 DS: ES: CR0: 80050033 [ 12.431506] CR2: 556cf30250a8 CR3: bce10006 CR4: 007706e0 [ 12.436716] DR0: DR1: DR2: [ 12.441941] DR3: DR6: fffe0ff0 DR7: 0400 [ 12.447170] PKRU: 5554 [ 12.450355] Call Trace: [ 12.453408] [ 12.456296] fput+0xe/0x10 [ 12.459633] task_work_run+0x70/0xb0 [ 12.463157] do_exit+0x37b/0xaf0 [ 12.466570] do_group_exit+0x43/0xb0 [ 12.470142] __x64_sys_exit_group+0x18/0x20 [ 12.473989] do_syscall_64+0x61/0xb0 [ 12.477565] ? exit_to_user_mode_prepare+0x9b/0x1c0 [ 12.481734] ? do_user_addr_fault+0x1d0/0x650 [ 12.485665] ? irqentry_exit_to_user_mode+0x9/0x20 [ 12.489790] ? irqentry_exit+0x19/0x30 [ 12.493443] ? exc_page_fault+0x8f/0x170 [ 12.497199] ? asm_exc_page_fault+0x8/0x30 [ 12.501013] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 12.505289] RIP: 0033:0x7f80d42a1bd6 [ 12.508868] Code: Unable to access opcode bytes at RIP 0x7f80d42a1bac. 
[ 12.513783] RSP: 002b:7ffe924f9ed8 EFLAGS: 0246 ORIG_RAX: 00e7 [ 12.520897] RAX: ffda RBX: 7f80d45a4740 RCX: 7f80d42a1bd6 [ 12.526115] RDX: RSI: 003c RDI: [ 12.531328] RBP: R08: 00e7 R09: fe98 [ 12.536484] R10: 7f80d3d422a0 R11: 0246 R12: 7f80d45a4740 [ 12.541687] R13: 0002 R14: 7f80d45ad708 R15: [ 12.546916] [ 12.549829] Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc aufs overlay nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua crct10dif_pclmul ppdev crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd psmouse cryptd parport_pc input_leds parport ena serio_raw sch_fq_codel ipmi_devintf ipmi_msghandler msr drm ip_tables x_tables autofs4 [ 12.583913] ---[ end trace 77367fed4d782aa4 ]--- [ 12.587963] RIP: 0010:__fput+0x247/0x250 [ 12.591729] Code: 00 48 85 ff 0f 84 8b fe ff
[Kernel-packages] [Bug 1975509] Re: Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal, Impish, Jammy, and Kinetic
The fabric-manager-510 and libnvidia-nscq-510 packages were tested across all series on an A100 system. All series passed the standard CUDA testing.

The packages tested were from
https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+packages?field.name_filter=-510&field.status_filter=published&field.series_filter=

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/1975509

Title:
  Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal, Impish, Jammy, and Kinetic

Status in fabric-manager-510 package in Ubuntu: New
Status in libnvidia-nscq-510 package in Ubuntu: New
Status in linux-restricted-modules package in Ubuntu: Confirmed
Status in nvidia-graphics-drivers-510-server package in Ubuntu: Fix Committed
Status in fabric-manager-510 source package in Bionic: New
Status in libnvidia-nscq-510 source package in Bionic: New
Status in linux-restricted-modules source package in Bionic: Confirmed
Status in nvidia-graphics-drivers-510-server source package in Bionic: Fix Released
Status in fabric-manager-510 source package in Focal: New
Status in libnvidia-nscq-510 source package in Focal: New
Status in linux-restricted-modules source package in Focal: Confirmed
Status in nvidia-graphics-drivers-510-server source package in Focal: Fix Released
Status in fabric-manager-510 source package in Impish: New
Status in libnvidia-nscq-510 source package in Impish: New
Status in linux-restricted-modules source package in Impish: Confirmed
Status in nvidia-graphics-drivers-510-server source package in Impish: Fix Released
Status in fabric-manager-510 source package in Jammy: New
Status in libnvidia-nscq-510 source package in Jammy: New
Status in linux-restricted-modules source package in Jammy: Confirmed
Status in nvidia-graphics-drivers-510-server source package in Jammy: Fix Released
Status in fabric-manager-510 source package in Kinetic: New
Status in libnvidia-nscq-510 source package in Kinetic: New
Status in linux-restricted-modules source package in Kinetic: Confirmed
Status in nvidia-graphics-drivers-510-server source package in Kinetic: Fix Committed

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]
  === 510 kinetic/jammy/impish/focal/bionic ===

  * New upstream release (LP: #1975509):
    - When calculating the address of grid barrier allocated for a CUDA stream, there was an off-by-one error. The address calculation is corrected in this release.
    - An issue that caused an AC cycle test to fail with "AssertionError: NVLink links with inappropriate status found" is resolved.
    - An issue that caused NX 11 to become nonresponsive during a graphics operation is resolved.
    - Linking issues were observed when using libnvfm.so. Now and other depend tools use dynamic linking with libstdc++ and libgcc.
    - An intermittent error CUDA_ERROR_NVLINK_UNCORRECTABLE caused by some non-fatal nvlink interrupts is resolved.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/fabric-manager-510/+bug/1975509/+subscriptions
[Kernel-packages] [Bug 1977919] Re: Docker container creation causes kernel oops on linux-aws 5.13.0.1028.31~20.04.22
Work on this issue continues. We have identified the following impacted kernels and versions:

focal linux-aws-5.13 5.13.0-1028.31~20.04.1
focal linux-azure-5.13 5.13.0-1028.33~20.04.1
focal linux-gcp-5.13 5.13.0-1030.36~20.04.1
focal linux-oracle-5.13 5.13.0-1033.39~20.04.1

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1977919

Title:
  Docker container creation causes kernel oops on linux-aws 5.13.0.1028.31~20.04.22

Status in linux-aws package in Ubuntu: Confirmed
Status in linux-gcp package in Ubuntu: Confirmed

Bug description:
Running the attached script on the latest AWS AMI for Ubuntu 20.04, I get a kernel panic and hard reset of the node.

[ 12.314552] VFS: Close: file count is 0
[ 12.351090] [ cut here ]
[ 12.351093] kernel BUG at include/linux/fs.h:3104!
[ 12.355272] invalid opcode: [#1] SMP PTI
[ 12.358963] CPU: 1 PID: 863 Comm: sed Not tainted 5.13.0-1028-aws #31~20.04.1-Ubuntu
[ 12.366241] Hardware name: Amazon EC2 m5.large/, BIOS 1.0 10/16/2017
[ 12.371130] RIP: 0010:__fput+0x247/0x250
[ 12.374897] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 88 02 00 e9 b5 fe ff ff <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 31 db 48
[ 12.389075] RSP: 0018:b50280d9fd88 EFLAGS: 00010246
[ 12.393425] RAX: RBX: 000a801d RCX: 9152e0716000
[ 12.398679] RDX: 9152cf075280 RSI: 0001 RDI:
[ 12.403879] RBP: b50280d9fdb0 R08: 0001 R09: 9152dfcba2c8
[ 12.409102] R10: b50280d9fd88 R11: 9152d04e9d10 R12: 9152d04e9d00
[ 12.414333] R13: 9152dfcba2c8 R14: 9152cf0752a0 R15: 9152dfc2e180
[ 12.419533] FS: () GS:9153ea90() knlGS:
[ 12.426937] CS: 0010 DS: ES: CR0: 80050033
[ 12.431506] CR2: 556cf30250a8 CR3: bce10006 CR4: 007706e0
[ 12.436716] DR0: DR1: DR2:
[ 12.441941] DR3: DR6: fffe0ff0 DR7: 0400
[ 12.447170] PKRU: 5554
[ 12.450355] Call Trace:
[ 12.453408]
[ 12.456296] fput+0xe/0x10
[ 12.459633] task_work_run+0x70/0xb0
[ 12.463157] do_exit+0x37b/0xaf0
[ 12.466570] do_group_exit+0x43/0xb0
[ 12.470142] __x64_sys_exit_group+0x18/0x20
[ 12.473989] do_syscall_64+0x61/0xb0
[ 12.477565] ? exit_to_user_mode_prepare+0x9b/0x1c0
[ 12.481734] ? do_user_addr_fault+0x1d0/0x650
[ 12.485665] ? irqentry_exit_to_user_mode+0x9/0x20
[ 12.489790] ? irqentry_exit+0x19/0x30
[ 12.493443] ? exc_page_fault+0x8f/0x170
[ 12.497199] ? asm_exc_page_fault+0x8/0x30
[ 12.501013] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 12.505289] RIP: 0033:0x7f80d42a1bd6
[ 12.508868] Code: Unable to access opcode bytes at RIP 0x7f80d42a1bac.
[ 12.513783] RSP: 002b:7ffe924f9ed8 EFLAGS: 0246 ORIG_RAX: 00e7
[ 12.520897] RAX: ffda RBX: 7f80d45a4740 RCX: 7f80d42a1bd6
[ 12.526115] RDX: RSI: 003c RDI:
[ 12.531328] RBP: R08: 00e7 R09: fe98
[ 12.536484] R10: 7f80d3d422a0 R11: 0246 R12: 7f80d45a4740
[ 12.541687] R13: 0002 R14: 7f80d45ad708 R15:
[ 12.546916]
[ 12.549829] Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc aufs overlay nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua crct10dif_pclmul ppdev crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd psmouse cryptd parport_pc input_leds parport ena serio_raw sch_fq_codel ipmi_devintf ipmi_msghandler msr drm ip_tables x_tables autofs4
[ 12.583913] ---[ end trace 77367fed4d782aa4 ]---
[ 12.587963] RIP: 0010:__fput+0x247/0x250
[ 12.591729] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 88 02 00 e9 b5 fe ff ff <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 31 db 48
[ 12.605796] RSP: 0018:b50280d9fd88 EFLAGS: 00010246
[ 12.610166] RAX: RBX: 000a801d RCX: 9152e0716000
[ 12.615417] RDX: 9152cf075280 RSI: 0001 RDI:
[ 12.620635] RBP: b50280d9fdb0 R08: 0001 R09: 9152dfcba2c8
[ 12.625878] R10: b50280d9fd88 R11: 9152d04e9d10 R12: 9152d04e9d00
[ 12.631121] R13: 9152dfcba2c8 R14: 9152cf0752a0 R15: 9152dfc2
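
To check whether a given machine runs one of the builds listed at the top of this notification, something like the following minimal sketch could be used. It assumes that `uname -r` reports `<upstream>-<abi>-<flavour>` (as in the `5.13.0-1028-aws` shown in the oops above) and that the ABI number is the part of the package version between the dash and the next dot. The names `release_for` and `is_impacted` are illustrative and not part of any Ubuntu tool, and only the exact builds listed are matched; other builds may or may not be affected.

```python
import platform

# Package names and versions copied from the notification above.
IMPACTED_PACKAGES = {
    "linux-aws-5.13": "5.13.0-1028.31~20.04.1",
    "linux-azure-5.13": "5.13.0-1028.33~20.04.1",
    "linux-gcp-5.13": "5.13.0-1030.36~20.04.1",
    "linux-oracle-5.13": "5.13.0-1033.39~20.04.1",
}


def release_for(package: str, version: str) -> str:
    """Derive the assumed `uname -r` string for a package/version pair."""
    flavour = package.split("-")[1]          # "linux-aws-5.13" -> "aws"
    upstream, rest = version.split("-", 1)   # "5.13.0", "1028.31~20.04.1"
    abi = rest.split(".", 1)[0]              # "1028"
    return f"{upstream}-{abi}-{flavour}"


# Assumed release strings for the listed builds, e.g. "5.13.0-1028-aws".
IMPACTED_RELEASES = {release_for(p, v) for p, v in IMPACTED_PACKAGES.items()}


def is_impacted(uname_release: str) -> bool:
    """Return True only if the release string matches a listed build exactly."""
    return uname_release in IMPACTED_RELEASES


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r` on Linux
    print(running, "listed as impacted" if is_impacted(running) else "not in the list")
```

Matching on the exact release string keeps the check conservative: it reports only what the notification states and makes no claim about earlier or later ABI numbers.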
[Kernel-packages] [Bug 1973034] Re: linux generic fails to boot on azure arm64 instance types
** Changed in: linux (Ubuntu)
       Status: Incomplete => Confirmed

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1973034

Title:
  linux generic fails to boot on azure arm64 instance types

Status in linux package in Ubuntu: Confirmed

Bug description:
Azure now has arm64 instances in a preview, for example Standard_D2pds_v5. These work with the b/linux-azure and f/linux-azure kernels, but fail to boot with linux-generic. Looks like a storage device issue (from serial console):

Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... [4.651830] Btrfs loaded, crc32c=crc32c-generic
[4.651830] Btrfs loaded, crc32c=crc32c-generic
Scanning for Btrfs filesystems
done.
Begin: Waiting for root file system ... Begin: Running /scripts/local-block ... mdadm: No devices listed in conf file were found.
done.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: error opening /dev/md?*: No such file or directory
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
done.
Gave up waiting for root file system device. Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT! UUID=b9c04583-2c65-4891-aa8e-a494eb94fd14 does not exist. Dropping to a shell!

---
ProblemType: Bug
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access '/dev/snd/': No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu27.23
Architecture: arm64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
CasperMD5CheckResult: skip
DistroRelease: Ubuntu 20.04
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lspci-vt:
 -+-[3c75:00]---02.0 Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
  \-[:00]-
Lsusb: Error: command ['lsusb'] failed with exit code 1:
Lsusb-t:
Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
MachineType: Microsoft Corporation Virtual Machine
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
 TERM=screen
 PATH=(custom, no user)
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcFB: 0 hyperv_fb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.13.0-1023-azure root=PARTUUID=379f8683-bd7e-48a0-8460-5c7f68b4b091 ro console=tty1 console=ttyAMA0 earlycon=pl011,0xeffec000 initcall_blacklist=arm_pmu_acpi_init panic=-1
ProcVersionSignature: Ubuntu 5.13.0-1023.27~20.04.1-azure 5.13.19
RelatedPackageVersions:
 linux-restricted-modules-5.13.0-1023-azure N/A
 linux-backports-modules-5.13.0-1023-azure N/A
 linux-firmware 1.187.30
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
Tags: focal uec-images
Uname: Linux 5.13.0-1023-azure aarch64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: True
dmi.bios.date: 02/07/2022
dmi.bios.release: 4.1
dmi.bios.vendor: Microsoft Corporation
dmi.bios.version: Hyper-V UEFI Release v4.1
dmi.board.asset.tag: None
dmi.board.name: Virtual Machine
dmi.board.vendor: Microsof
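
The initramfs hint in the report ("Boot args (cat /proc/cmdline)", "Check rootdelay=") can be followed with a small helper such as the sketch below. It is only an illustration: the function name `parse_cmdline` is hypothetical, the sample string is the `ProcKernelCmdLine` value from the apport data above (collected while running linux-azure, not the failing linux-generic boot), and quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {key: [values...]}.

    Flags without '=' (such as 'ro') map to [None]; keys given more
    than once (such as 'console') keep every value in order.
    Simplification: quoted values containing spaces are not supported.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")  # split on the first '=' only
        params.setdefault(key, []).append(value if sep else None)
    return params


# Value of ProcKernelCmdLine from the report above.
SAMPLE = (
    "BOOT_IMAGE=/boot/vmlinuz-5.13.0-1023-azure "
    "root=PARTUUID=379f8683-bd7e-48a0-8460-5c7f68b4b091 ro console=tty1 "
    "console=ttyAMA0 earlycon=pl011,0xeffec000 "
    "initcall_blacklist=arm_pmu_acpi_init panic=-1"
)

params = parse_cmdline(SAMPLE)
print("root device spec:", params["root"][-1])
print("rootdelay set:", "rootdelay" in params)
```

On a live system the same function could be fed the contents of `/proc/cmdline` instead of `SAMPLE`, for example from the initramfs shell's output copied off the serial console.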
[Kernel-packages] [Bug 1973034] Re: linux generic fails to boot on azure arm64 instance types
Artifacts were collected from a new VM running focal/linux-azure just prior to rebooting to linux-generic (which gets stuck at initramfs).

Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1973034] UdevDb.txt
apport information

** Attachment added: "UdevDb.txt"
   https://bugs.launchpad.net/bugs/1973034/+attachment/5588652/+files/UdevDb.txt

Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1973034] ProcInterrupts.txt
apport information

** Attachment added: "ProcInterrupts.txt"
   https://bugs.launchpad.net/bugs/1973034/+attachment/5588650/+files/ProcInterrupts.txt

Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1973034] acpidump.txt
apport information

** Attachment added: "acpidump.txt"
   https://bugs.launchpad.net/bugs/1973034/+attachment/5588654/+files/acpidump.txt

Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1973034] WifiSyslog.txt
apport information

** Attachment added: "WifiSyslog.txt"
   https://bugs.launchpad.net/bugs/1973034/+attachment/5588653/+files/WifiSyslog.txt

Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1973034] ProcCpuinfoMinimal.txt
apport information ** Attachment added: "ProcCpuinfoMinimal.txt" https://bugs.launchpad.net/bugs/1973034/+attachment/5588649/+files/ProcCpuinfoMinimal.txt -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1973034 Title: linux generic fails to boot on azure arm64 instance types Status in linux package in Ubuntu: Incomplete Bug description: Azure now has arm64 instances in a preview, for example Standard_D2pds_v5. These work with the b/linux-azure and f/linux-azure kernels, but fail to boot with linux-generic. Looks like a storage device issue (from serial console): Begin: Running /scripts/init-premount ... done. Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done. Begin: Running /scripts/local-premount ... [4.651830] Btrfs loaded, crc32c=crc32c-generic [4.651830] Btrfs loaded, crc32c=crc32c-generic Scanning for Btrfs filesystems done. Begin: Waiting for root file system ... Begin: Running /scripts/local-block ... mdadm: No devices listed in conf file were found. done. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. 
mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: error opening /dev/md?*: No such file or directory mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. done. Gave up waiting for root file system device. Common problems: - Boot args (cat /proc/cmdline) - Check rootdelay= (did the system wait long enough?) - Missing modules (cat /proc/modules; ls /dev) ALERT! UUID=b9c04583-2c65-4891-aa8e-a494eb94fd14 does not exist. Dropping to a shell! --- ProblemType: Bug AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access '/dev/snd/': No such file or directory AplayDevices: Error: [Errno 2] No such file or directory: 'aplay' ApportVersion: 2.20.11-0ubuntu27.23 Architecture: arm64 ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord' CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found. 
CasperMD5CheckResult: skip DistroRelease: Ubuntu 20.04 IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig' Lspci-vt: -+-[3c75:00]---02.0 Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function] \-[:00]- Lsusb: Error: command ['lsusb'] failed with exit code 1: Lsusb-t: Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1: MachineType: Microsoft Corporation Virtual Machine Package: linux (not installed) PciMultimedia: ProcEnviron: TERM=screen PATH=(custom, no user) LANG=C.UTF-8 SHELL=/bin/bash ProcFB: 0 hyperv_fb ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.13.0-1023-azure root=PARTUUID=379f8683-bd7e-48a0-8460-5c7f68b4b091 ro console=tty1 console=ttyAMA0 earlycon=pl011,0xeffec000 initcall_blacklist=arm_pmu_acpi_init panic=-1 ProcVersionSignature: Ubuntu 5.13.0-1023.27~20.04.1-azure 5.13.19 RelatedPackageVersions: linux-restricted-modules-5.13.0-1023-azure N/A linux-backports-modules-5.13.0-1023-azure N/A linux-firmware 1.187.30 RfKill: Error: [Errno 2] No such file or directory: 'rfkill' Tags: focal uec-images Uname: Linux 5.13.0-1023-azure aarch64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: N/A _MarkForUpload: True dmi.bios.date: 02/07/2022 dmi.bios.release: 4.1 dmi.bios.vendor: Microsoft Corporation dmi.bios.version: Hyper-V UEFI Release v4.
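The initramfs message above lists boot arguments as the first thing to check. As a rough sketch (not part of the bug report), the ProcKernelCmdLine value collected by apport can be split into parameters to see which root device the kernel was given and whether rootdelay= was set. The helper name `parse_cmdline` is made up for illustration, and the parsing ignores quoted values that contain spaces.

```python
def parse_cmdline(cmdline: str) -> dict[str, list[str]]:
    """Split a kernel command line into {name: [values]}; bare flags map to [""]."""
    params: dict[str, list[str]] = {}
    for token in cmdline.split():
        # Split on the first '=' only, so root=PARTUUID=... keeps its full value.
        name, _, value = token.partition("=")
        params.setdefault(name, []).append(value)
    return params


# ProcKernelCmdLine as reported by apport in this bug (collected under the azure kernel).
CMDLINE = (
    "BOOT_IMAGE=/boot/vmlinuz-5.13.0-1023-azure "
    "root=PARTUUID=379f8683-bd7e-48a0-8460-5c7f68b4b091 ro "
    "console=tty1 console=ttyAMA0 earlycon=pl011,0xeffec000 "
    "initcall_blacklist=arm_pmu_acpi_init panic=-1"
)

params = parse_cmdline(CMDLINE)
print("root      :", params.get("root"))
print("rootdelay :", params.get("rootdelay", ["(not set)"]))
print("consoles  :", params.get("console"))
```

On a running system the same function could be fed the contents of /proc/cmdline instead of the string copied from the report.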
[Kernel-packages] [Bug 1973034] ProcModules.txt
apport information ** Attachment added: "ProcModules.txt" https://bugs.launchpad.net/bugs/1973034/+attachment/5588651/+files/ProcModules.txt
[Kernel-packages] [Bug 1973034] CurrentDmesg.txt
apport information ** Attachment added: "CurrentDmesg.txt" https://bugs.launchpad.net/bugs/1973034/+attachment/5588646/+files/CurrentDmesg.txt
[Kernel-packages] [Bug 1973034] ProcCpuinfo.txt
apport information ** Attachment added: "ProcCpuinfo.txt" https://bugs.launchpad.net/bugs/1973034/+attachment/5588648/+files/ProcCpuinfo.txt
[Kernel-packages] [Bug 1973034] Lspci.txt
apport information ** Attachment added: "Lspci.txt" https://bugs.launchpad.net/bugs/1973034/+attachment/5588647/+files/Lspci.txt
[Kernel-packages] [Bug 1973034] Re: linux generic fails to boot on azure arm64 instance types
apport information ** Tags added: apport-collected focal uec-images ** Description changed: Azure now has arm64 instances in a preview, for example Standard_D2pds_v5. These work with the b/linux-azure and f/linux-azure kernels, but fail to boot with linux-generic. Looks like a storage device issue (from serial console): Begin: Running /scripts/init-premount ... done. Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done. Begin: Running /scripts/local-premount ... [4.651830] Btrfs loaded, crc32c=crc32c-generic [4.651830] Btrfs loaded, crc32c=crc32c-generic Scanning for Btrfs filesystems done. Begin: Waiting for root file system ... Begin: Running /scripts/local-block ... mdadm: No devices listed in conf file were found. done. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: error opening /dev/md?*: No such file or directory mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. 
mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. done. Gave up waiting for root file system device. Common problems: - Boot args (cat /proc/cmdline) - Check rootdelay= (did the system wait long enough?) - Missing modules (cat /proc/modules; ls /dev) ALERT! UUID=b9c04583-2c65-4891-aa8e-a494eb94fd14 does not exist. Dropping to a shell! + --- + ProblemType: Bug + AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access '/dev/snd/': No such file or directory + AplayDevices: Error: [Errno 2] No such file or directory: 'aplay' + ApportVersion: 2.20.11-0ubuntu27.23 + Architecture: arm64 + ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord' + CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found. 
+ CasperMD5CheckResult: skip + DistroRelease: Ubuntu 20.04 + IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig' + Lspci-vt: + -+-[3c75:00]---02.0 Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function] + \-[:00]- + Lsusb: Error: command ['lsusb'] failed with exit code 1: + Lsusb-t: + + Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1: + MachineType: Microsoft Corporation Virtual Machine + Package: linux (not installed) + PciMultimedia: + + ProcEnviron: + TERM=screen + PATH=(custom, no user) + LANG=C.UTF-8 + SHELL=/bin/bash + ProcFB: 0 hyperv_fb + ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.13.0-1023-azure root=PARTUUID=379f8683-bd7e-48a0-8460-5c7f68b4b091 ro console=tty1 console=ttyAMA0 earlycon=pl011,0xeffec000 initcall_blacklist=arm_pmu_acpi_init panic=-1 + ProcVersionSignature: Ubuntu 5.13.0-1023.27~20.04.1-azure 5.13.19 + RelatedPackageVersions: + linux-restricted-modules-5.13.0-1023-azure N/A + linux-backports-modules-5.13.0-1023-azure N/A + linux-firmware 1.187.30 + RfKill: Error: [Errno 2] No such file or directory: 'rfkill' + Tags: focal uec-images + Uname: Linux 5.13.0-1023-azure aarch64 + UpgradeStatus: No upgrade log present (probably fresh install) + UserGroups: N/A + _MarkForUpload: True + dmi.bios.date: 02/07/2022 + dmi.bios.release: 4.1 + dmi.bios.vendor: Microsoft Corporation + dmi.bios.version: Hyper-V UEFI Release v4.1 + dmi.board.asset.tag: None + dmi.board.name: Virtual Machine + dmi.board.vendor: Microsoft Corporation + dmi.board.version: Hyper-V UEFI Release v4.1 + dmi.chassis.asset.tag: 7783-7084-3265-9085-8269-3286-77 + dmi.chassis.type: 3 + dmi.chassis.vendor: Microsoft Corporation + dmi.chassis.version: Hyper-V UEFI Release v4.1 + dmi.modalias: dmi:bvnMicrosoftCorp
[Kernel-packages] [Bug 1973034] [NEW] linux generic fails to boot on azure arm64 instance types
Public bug reported: Azure now has arm64 instances in a preview, for example Standard_D2pds_v5. These work with the b/linux-azure and f/linux-azure kernels, but fail to boot with linux-generic. Looks like a storage device issue (from serial console): Begin: Running /scripts/init-premount ... done. Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done. Begin: Running /scripts/local-premount ... [4.651830] Btrfs loaded, crc32c=crc32c-generic [4.651830] Btrfs loaded, crc32c=crc32c-generic Scanning for Btrfs filesystems done. Begin: Waiting for root file system ... Begin: Running /scripts/local-block ... mdadm: No devices listed in conf file were found. done. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: error opening /dev/md?*: No such file or directory mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. 
mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. mdadm: No devices listed in conf file were found. done. Gave up waiting for root file system device. Common problems: - Boot args (cat /proc/cmdline) - Check rootdelay= (did the system wait long enough?) - Missing modules (cat /proc/modules; ls /dev) ALERT! UUID=b9c04583-2c65-4891-aa8e-a494eb94fd14 does not exist. Dropping to a shell! ** Affects: linux (Ubuntu) Importance: Undecided Status: Incomplete
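The "Missing modules (cat /proc/modules; ls /dev)" hint can be checked mechanically. The following is only a sketch: the helper names are invented, the sample text is made up rather than taken from the attached ProcModules.txt, and the two Hyper-V module names are an assumption about what a guest on this platform might need, since the report itself does not name a missing driver.

```python
def loaded_modules(proc_modules_text: str) -> set[str]:
    """Return module names from text in /proc/modules format (name is the first field)."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def missing_modules(proc_modules_text: str, wanted: list[str]) -> list[str]:
    """List the wanted modules that do not appear in the /proc/modules text."""
    present = loaded_modules(proc_modules_text)
    return [name for name in wanted if name not in present]


# Made-up sample in /proc/modules format; not the content of the bug's attachment.
SAMPLE = """\
btrfs 1560576 0 - Live 0x0000000000000000
raid6_pq 110592 1 btrfs, Live 0x0000000000000000
"""

# Assumed candidates for a Hyper-V guest; the bug report does not name them.
WANTED = ["hv_vmbus", "hv_storvsc"]

print(missing_modules(SAMPLE, WANTED))
```

A module that is absent from this list is not necessarily missing: drivers built into the kernel image never show up in /proc/modules, so the kernel config would need checking as well.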
[Kernel-packages] [Bug 1968062] Re: jammy/linux-aws hibernation timeout on xen instances
In this screenshot, the system appears to have resumed, since the login screen is shown along with the messages from the hibernation memory consumption utility. The first memory message was generated prior to the hibernation (it matches the message from the pre-hibernation image). The second message could have been generated before the hibernation or after the resume (there isn't enough data to know for sure). ** Attachment added: "Second screenshot after resume initiated" https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+attachment/5577701/+files/post-hibernate.12.jpg -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu. https://bugs.launchpad.net/bugs/1968062 Title: jammy/linux-aws hibernation timeout on xen instances Status in linux-aws package in Ubuntu: New Bug description: Hibernation testing of jammy/linux-aws 5.15.0-1003-aws is failing on all xen instance types (c3/c4/i3/m3/m4/r3/r4/t2). The failure happens while attempting to resume from the first attempt to hibernate. Tests on nitro instance types (c5/m5/r5/t3) all pass. After the resume, the system is inaccessible via ssh. The console screenshot does change, but the console log obtained from `aws ec2 get-console-output` does not. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
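One way to make the "console log does not change" observation repeatable is to compare two captures of the console output, one taken before hibernation and one after the resume attempt. The sketch below assumes both captures are already available as text (for example saved from `aws ec2 get-console-output`); the helper name `new_console_lines` and the sample strings are invented for illustration and are not output from the affected instances.

```python
def new_console_lines(before: str, after: str) -> list[str]:
    """Return the lines of `after` that follow its common prefix with `before`."""
    old = before.splitlines()
    new = after.splitlines()
    common = 0
    # Walk both captures while they agree line by line.
    while common < len(old) and common < len(new) and old[common] == new[common]:
        common += 1
    return new[common:]


# Hypothetical captures, not real output from the bug.
pre_hibernate = "boot line 1\nboot line 2\n"
post_resume_same = "boot line 1\nboot line 2\n"
post_resume_more = "boot line 1\nboot line 2\nresume line\n"

print(new_console_lines(pre_hibernate, post_resume_same))
print(new_console_lines(pre_hibernate, post_resume_more))
```

An empty result corresponds to the behaviour described in this bug. The comparison relies on the earlier capture still being a prefix of the later one; if the console buffer has wrapped in between, the result would need to be read with care.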
[Kernel-packages] [Bug 1968062] Re: jammy/linux-aws hibernation timeout on xen instances
This screenshot was taken a few minutes after the resume attempt. These ssm-amazon-agent messages repeat every 120 seconds with a new set. But this is all the progress we see from either the screenshot or the serial console. There are no new memory consumption messages indicating that the resume was complete. ** Attachment added: "Third screenshot after resume initiated" https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+attachment/5577703/+files/post-hibernate.16.jpg -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu. https://bugs.launchpad.net/bugs/1968062 Title: jammy/linux-aws hibernation timeout on xen instances Status in linux-aws package in Ubuntu: New Bug description: Hibernation testing of jammy/linux-aws 5.15.0-1003-aws is failing on all xen instance types (c3/c4/i3/m3/m4/r3/r4/t2). The failure happens while attempting to resume from the first attempt to hibernate. Testing on nitro instances types (c5/m5/r5/t3) all pass. After the resume, the system is inaccessible via ssh. The console screenshot does change, but the console log obtained from `aws ec2 get-console-output` does not. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1968062] Re: jammy/linux-aws hibernation timeout on xen instances
** Attachment added: "Last screenshot before hibernation" https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+attachment/5577676/+files/pre-hibernation.04.jpg -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu. https://bugs.launchpad.net/bugs/1968062 Title: jammy/linux-aws hibernation timeout on xen instances Status in linux-aws package in Ubuntu: New Bug description: Hibernation testing of jammy/linux-aws 5.15.0-1003-aws is failing on all xen instance types (c3/c4/i3/m3/m4/r3/r4/t2). The failure happens while attempting to resume from the first attempt to hibernate. Testing on nitro instances types (c5/m5/r5/t3) all pass. After the resume, the system is inaccessible via ssh. The console screenshot does change, but the console log obtained from `aws ec2 get-console-output` does not. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1968062] Re: jammy/linux-aws hibernation timeout on xen instances
** Attachment added: "First screenshot after resume initiated" https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+attachment/5577677/+files/post-hibernate.01.jpg -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu. https://bugs.launchpad.net/bugs/1968062 Title: jammy/linux-aws hibernation timeout on xen instances Status in linux-aws package in Ubuntu: New Bug description: Hibernation testing of jammy/linux-aws 5.15.0-1003-aws is failing on all xen instance types (c3/c4/i3/m3/m4/r3/r4/t2). The failure happens while attempting to resume from the first attempt to hibernate. Testing on nitro instances types (c5/m5/r5/t3) all pass. After the resume, the system is inaccessible via ssh. The console screenshot does change, but the console log obtained from `aws ec2 get-console-output` does not. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1968062] [NEW] jammy/linux-aws hibernation timeout on xen instances
Public bug reported: Hibernation testing of jammy/linux-aws 5.15.0-1003-aws is failing on all xen instance types (c3/c4/i3/m3/m4/r3/r4/t2). The failure happens while attempting to resume from the first attempt to hibernate. Testing on nitro instances types (c5/m5/r5/t3) all pass. After the resume, the system is inaccessible via ssh. The console screenshot does change, but the console log obtained from `aws ec2 get-console-output` does not. ** Affects: linux-aws (Ubuntu) Importance: Undecided Status: New ** Attachment added: "serial console log" https://bugs.launchpad.net/bugs/1968062/+attachment/5577675/+files/aws-jammy-all-c3.8xlarge-9-1.txt -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu. https://bugs.launchpad.net/bugs/1968062 Title: jammy/linux-aws hibernation timeout on xen instances Status in linux-aws package in Ubuntu: New Bug description: Hibernation testing of jammy/linux-aws 5.15.0-1003-aws is failing on all xen instance types (c3/c4/i3/m3/m4/r3/r4/t2). The failure happens while attempting to resume from the first attempt to hibernate. Testing on nitro instances types (c5/m5/r5/t3) all pass. After the resume, the system is inaccessible via ssh. The console screenshot does change, but the console log obtained from `aws ec2 get-console-output` does not. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1960871] [NEW] linux-modules-extra-* fails to install due to dependency on unsigned package
Public bug reported: Several SRU tests are failing the test setup due to failure to install the modules-extra package: * Command: yes "" | DEBIAN_FRONTEND=noninteractive apt-get install --yes --force-yes automake bison build-essential byacc flex git keyutils libacl1-dev libaio-dev libcap-dev libmm-dev libnuma-dev libsctp-dev libselinux1-dev libssl-dev libtirpc-dev pkg-config quota xfslibs-dev xfsprogs gcc linux-modules-extra-4.15.0-1120-aws Exit status: 100 Duration: 0.908210039139 stdout: Reading package lists... Building dependency tree... Reading state information... xfsprogs is already the newest version (4.9.0+nmu1ubuntu2). xfsprogs set to manually installed. git is already the newest version (1:2.17.1-1ubuntu0.9). git set to manually installed. Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming. The following information may help to resolve the situation: The following packages have unmet dependencies: linux-modules-extra-4.15.0-1120-aws : Depends: linux-image-unsigned-4.15.0-1120-aws but it is not going to be installed stderr: W: --force-yes is deprecated, use one of the options starting with --allow instead. E: Unable to correct problems, you have held broken packages. ** Affects: linux-aws (Ubuntu) Importance: Undecided Status: New -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu. 
https://bugs.launchpad.net/bugs/1960871 Title: linux-modules-extra-* fails to install due to dependency on unsigned package Status in linux-aws package in Ubuntu: New Bug description: Several SRU tests are failing the test setup due to failure to install the modules-extra package: * Command: yes "" | DEBIAN_FRONTEND=noninteractive apt-get install --yes --force-yes automake bison build-essential byacc flex git keyutils libacl1-dev libaio-dev libcap-dev libmm-dev libnuma-dev libsctp-dev libselinux1-dev libssl-dev libtirpc-dev pkg-config quota xfslibs-dev xfsprogs gcc linux-modules-extra-4.15.0-1120-aws Exit status: 100 Duration: 0.908210039139 stdout: Reading package lists... Building dependency tree... Reading state information... xfsprogs is already the newest version (4.9.0+nmu1ubuntu2). xfsprogs set to manually installed. git is already the newest version (1:2.17.1-1ubuntu0.9). git set to manually installed. Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming. The following information may help to resolve the situation: The following packages have unmet dependencies: linux-modules-extra-4.15.0-1120-aws : Depends: linux-image-unsigned-4.15.0-1120-aws but it is not going to be installed stderr: W: --force-yes is deprecated, use one of the options starting with --allow instead. E: Unable to correct problems, you have held broken packages. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1960871/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
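When triaging a setup failure like the one in the report above, it can help to pull the unmet dependency pairs out of the captured apt-get output automatically. The following is only a minimal sketch: the function name `unmet_dependencies` and the regular expression are assumptions made for illustration, and the pattern covers only the simple "Depends: X but it is not going to be installed" wording quoted in this bug, not versioned or alternative dependencies.

```python
import re

# Hypothetical helper, not part of any Ubuntu test tooling.
# Matches only the plain wording seen in this bug report.
UNMET_RE = re.compile(
    r"(?P<package>\S+) : Depends: (?P<dependency>\S+) "
    r"but it is not going to be installed"
)


def unmet_dependencies(apt_output: str) -> list[tuple[str, str]]:
    """Return (package, missing dependency) pairs found in apt-get output."""
    return [
        (match.group("package"), match.group("dependency"))
        for match in UNMET_RE.finditer(apt_output)
    ]


# Excerpt of the stdout quoted in the bug description.
SAMPLE_STDOUT = (
    "The following packages have unmet dependencies: "
    "linux-modules-extra-4.15.0-1120-aws : Depends: "
    "linux-image-unsigned-4.15.0-1120-aws but it is not going to be installed"
)

for package, dependency in unmet_dependencies(SAMPLE_STDOUT):
    print(f"{package} needs {dependency}")
```

A test harness could log these pairs next to the exit status so that a missing linux-image-unsigned package is visible without reading the full stdout.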
[Kernel-packages] [Bug 1960094] Re: lxc/1:4.0.6-0ubuntu1~20.04.1 undefined symbol: strlcat in Focal
This is the result of pulling the lxc test sources from the git repo while using the lxc package from the archive. Currently, the archive has version 4.0.6 and the git repo has been updated to 4.0.12 as an upload is in progress (it's in the unapproved queue as this comment is being written). The result is a mismatch between the tests and the package, which causes the test failures. Switching to a different version of the test sources results in a passing test. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1960094 Title: lxc/1:4.0.6-0ubuntu1~20.04.1 undefined symbol: strlcat in Focal Status in linux package in Ubuntu: Incomplete Status in linux source package in Focal: New Bug description: There are failures in ubuntu_lxc regression tests on Focal/linux/5.4.0-99.112 sru cycle 2022.01.03 with the error lxc-create: symbol lookup error: lxc-create: undefined symbol: strlcat These errors did not appear on previous kernels in the same cycle and now have a few tests failing on all architectures and systems as of Feb 4th 2022 it seems. Log with details is attached in the comments. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1960094/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
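The mismatch described in the comment above (archive package at 4.0.6, test sources at 4.0.12) is the kind of thing a test setup step could check before running the suite. The sketch below is an assumption-laden illustration rather than anything the test harness actually does: the helper names are hypothetical, and the parsing handles only simple `[epoch:]upstream[-revision]` version strings with purely numeric upstream components.

```python
import re


def upstream_version(debian_version: str) -> str:
    """Strip the epoch and the Debian revision from a package version string."""
    # Drop an epoch such as "1:" if one is present.
    without_epoch = debian_version.split(":", 1)[-1]
    # Drop the revision, which is everything after the last hyphen.
    if "-" in without_epoch:
        without_epoch = without_epoch.rsplit("-", 1)[0]
    return without_epoch


def version_key(version: str) -> tuple[int, ...]:
    """Turn '4.0.12' into (4, 0, 12) so that versions compare numerically."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def sources_match_package(package_version: str, test_source_version: str) -> bool:
    """True when the test sources have the same upstream version as the package."""
    return version_key(upstream_version(package_version)) == version_key(
        test_source_version
    )


# Versions taken from the bug title and the comment above.
print(upstream_version("1:4.0.6-0ubuntu1~20.04.1"))
print(sources_match_package("1:4.0.6-0ubuntu1~20.04.1", "4.0.12"))
```

Comparing numeric tuples rather than raw strings matters here because a plain string comparison would order "4.0.12" before "4.0.6".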
[Kernel-packages] [Bug 1960094] Re: lxc/1:4.0.6-0ubuntu1~20.04.1 undefined symbol: strlcat in Focal
I've retested two released kernels that passed the lxc test last cycle: * focal/azure 5.4.0-1068.71 * focal/azure-5-11 5.11.0-1028.31~20.04.1 Both tests now show the same testcase failures where they were passing before. Will start digging into any other changes in the environment. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1960094 Title: lxc/1:4.0.6-0ubuntu1~20.04.1 undefined symbol: strlcat in Focal Status in linux package in Ubuntu: New Status in lxc package in Ubuntu: Incomplete Status in linux source package in Focal: New Status in lxc source package in Focal: Incomplete Bug description: There are failures in ubuntu_lxc regression tests on Focal/linux/5.4.0-99.112 sru cycle 2022.01.03 with the error lxc-create: symbol lookup error: lxc-create: undefined symbol: strlcat These errors did not appear on previous kernels in the same cycle and now have a few tests failing on all architectures and systems as of Feb 4th 2022 it seems. Log with details is attached in the comments. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1960094/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1888992] Re: [Realtek ALC892, Green Headphone Out, Front] Not detecting/switching to front jack after plugging in headphones
Apologies, I was unable to access the machine for testing this until recently, so have been unable to run the test kernel in #21. Revisiting this problem again on the same machine with kernel 5.13.0-27 I still see it behaving as originally reported, so I expect neither the patch nor a proper fix has made its way to the ubuntu kernel. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1888992 Title: [Realtek ALC892, Green Headphone Out, Front] Not detecting/switching to front jack after plugging in headphones Status in linux package in Ubuntu: Expired Bug description: I boot my system without any sound devices plugged into any jacks; checking Gnome settings Sound, Output Device is "Dummy Output" (no other options available in drop-down list). I plug in headphones into the front jack and play some audio, but do not hear any sound from the headphones; checking Gnome settings Sound, the Output Device is still "Dummy Output", with no other options available. This is a regression from previous behaviour, when plugging in headphones into the front jack would automatically enable/switch to the front jack and its associated controller "Family 17h (Models 00h-0fh) HD Audio Controller" (this is the motherboard's onboard audio hardware). I can currently work around this each time I plug in the headphones by installing and running pavucontrol, and under Configuration selecting "Analogue Stereo Output (unplugged) (unavailable)" - this option becomes "Analogue Stereo Output" after I select it. This seems to be a regression in pulseaudio after an upgrade from 1:13.99.1-1ubuntu3.3 to 1:13.99.1-1ubuntu3.5. 
ProblemType: Bug DistroRelease: Ubuntu 20.04 Package: alsa-base 1.0.25+dfsg-0ubuntu5 ProcVersionSignature: Ubuntu 5.4.0-42.46-generic 5.4.44 Uname: Linux 5.4.0-42-generic x86_64 ApportVersion: 2.20.11-0ubuntu27.4 Architecture: amd64 AudioDevicesInUse: USERPID ACCESS COMMAND /dev/snd/controlC1: chinf 1741 F pulseaudio /dev/snd/pcmC1D0p: chinf 1741 F...m pulseaudio /dev/snd/controlC0: chinf 1741 F pulseaudio /dev/snd/timer: chinf 1741 f pulseaudio CasperMD5CheckResult: skip CurrentDesktop: ubuntu:GNOME Date: Sun Jul 26 12:46:52 2020 InstallationDate: Installed on 2018-10-29 (635 days ago) InstallationMedia: Ubuntu 18.10 "Cosmic Cuttlefish" - Release amd64 (20181017.3) PackageArchitecture: all SourcePackage: alsa-driver Symptom: audio Symptom_AlsaPlaybackTest: ALSA playback test through plughw:Generic successful Symptom_Card: Family 17h (Models 00h-0fh) HD Audio Controller - HD-Audio Generic Symptom_DevicesInUse: USERPID ACCESS COMMAND /dev/snd/controlC1: chinf 1741 F pulseaudio /dev/snd/controlC0: chinf 1741 F pulseaudio Symptom_Jack: Green Headphone Out, Front Symptom_PulsePlaybackTest: PulseAudio playback test successful Symptom_Type: No sound at all Title: [To Be Filled By O.E.M., Realtek ALC892, Green Headphone Out, Front] No sound at all UpgradeStatus: Upgraded to focal on 2020-05-04 (83 days ago) dmi.bios.date: 12/19/2018 dmi.bios.vendor: American Megatrends Inc. dmi.bios.version: P5.40 dmi.board.name: AB350 Pro4 dmi.board.vendor: ASRock dmi.chassis.asset.tag: To Be Filled By O.E.M. dmi.chassis.type: 3 dmi.chassis.vendor: To Be Filled By O.E.M. dmi.chassis.version: To Be Filled By O.E.M. dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrP5.40:bd12/19/2018:svnToBeFilledByO.E.M.:pnToBeFilledByO.E.M.:pvrToBeFilledByO.E.M.:rvnASRock:rnAB350Pro4:rvr:cvnToBeFilledByO.E.M.:ct3:cvrToBeFilledByO.E.M.: dmi.product.family: To Be Filled By O.E.M. dmi.product.name: To Be Filled By O.E.M. dmi.product.sku: To Be Filled By O.E.M. dmi.product.version: To Be Filled By O.E.M. 
dmi.sys.vendor: To Be Filled By O.E.M. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1888992/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1928888] Re: test_utils_testsuite from ubuntu_qrt_apparmor linux ADT test failure with linux/5.11.0-18.19
Tests are now passing. ** Changed in: ubuntu-kernel-tests Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/192 Title: test_utils_testsuite from ubuntu_qrt_apparmor linux ADT test failure with linux/5.11.0-18.19 Status in QA Regression Testing: Fix Released Status in ubuntu-kernel-tests: Invalid Status in linux package in Ubuntu: Invalid Bug description: This is a scripted bug report about ADT failures while running linux tests for linux/5.11.0-18.19 on hirsute. Whether this is caused by the dep8 tests of the tested source or the kernel has yet to be determined. Not a regression. Found to occur previously on hirsute/linux 5.11.0-14.15 Testing failed on: amd64: https://autopkgtest.ubuntu.com/results/autopkgtest-hirsute/hirsute/amd64/l/linux/20210515_005957_75e5a@/log.gz arm64: https://autopkgtest.ubuntu.com/results/autopkgtest-hirsute/hirsute/arm64/l/linux/20210513_203508_96fd3@/log.gz ppc64el: https://autopkgtest.ubuntu.com/results/autopkgtest-hirsute/hirsute/ppc64el/l/linux/20210513_163708_c0203@/log.gz s390x: https://autopkgtest.ubuntu.com/results/autopkgtest-hirsute/hirsute/s390x/l/linux/20210513_144454_54b04@/log.gz test_zz_cleanup_source_tree (__main__.ApparmorTestsuites) Cleanup downloaded source ... 
ok == FAIL: test_utils_testsuite (__main__.ApparmorTestsuites) Run utils (make check) -- Traceback (most recent call last): File "/tmp/autopkgtest.gBRfIs/build.V37/src/autotest/client/tmp/ubuntu_qrt_apparmor/src/qa-regression-testing/scripts/./test-apparmor.py", line 1841, in test_utils_testsuite self.assertEqual(expected, rc, result + report) AssertionError: 0 != 2 : Got exit code 2, expected 0 ERROR: capability CAP_CHECKPOINT_RESTORE not found in severity.db make: *** [Makefile:81: check_severity_db] Error 1 == FAIL: test_utils_testsuite3 (__main__.ApparmorTestsuites) Run utils (make check with python3) -- Traceback (most recent call last): File "/tmp/autopkgtest.gBRfIs/build.V37/src/autotest/client/tmp/ubuntu_qrt_apparmor/src/qa-regression-testing/scripts/./test-apparmor.py", line 1862, in test_utils_testsuite3 self.assertEqual(expected, rc, result + report) AssertionError: 0 != 2 : Got exit code 2, expected 0 ERROR: capability CAP_CHECKPOINT_RESTORE not found in severity.db make: *** [Makefile:81: check_severity_db] Error 1 -- Ran 58 tests in 1448.768s FAILED (failures=2) 23:36:54 INFO | END ERROR ubuntu_qrt_apparmor.test-apparmor.py ubuntu_qrt_apparmor.test-apparmor.pytimestamp=1621035414localtime=May 14 23:36:54 To manage notifications about this bug go to: https://bugs.launchpad.net/qa-regression-testing/+bug/192/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1939673] Re: Update the 470 and the 470-server NVIDIA drivers
Retested with the latest kernels as of Sept 7, 2021. ubuntu-drivers now lists the 470-server driver as an available option. For both Focal and Hirsute, I first had to enable -proposed. For Bionic, it worked with the 4.15.0-156.163 kernel that was released this week. Tested combinations (release: kernel): Bionic: 4.15.0-156.163; Focal: 5.4.0-85.95 (from focal-proposed); Hirsute: 5.11.0-35.37 (from hirsute-proposed). Tested each driver with a cuda workload and used nvidia-smi to verify the driver functions as expected after install. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-graphics-drivers-470 in Ubuntu. https://bugs.launchpad.net/bugs/1939673 Title: Update the 470 and the 470-server NVIDIA drivers Status in nvidia-graphics-drivers-470 package in Ubuntu: Fix Released Status in nvidia-graphics-drivers-470-server package in Ubuntu: Triaged Status in nvidia-graphics-drivers-470 source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-470-server source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-470 source package in Focal: Fix Committed Status in nvidia-graphics-drivers-470-server source package in Focal: Fix Committed Status in nvidia-graphics-drivers-470 source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-470-server source package in Hirsute: Fix Committed Bug description: Update the 470 (UDA) and 470-server (ERD) NVIDIA series in Bionic, Focal, Hirsute. [Impact] These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements. See the changelog entry below for a full list of changes and bugs. 
[Test Case] The following development and SRU process was followed: https://wiki.ubuntu.com/NVidiaUpdates Certification test suite must pass on a range of hardware: https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened. [Regression Potential] In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug. [Discussion] [Changelog] == 470.57.02 (470-server) == * debian/nvidia_supported, debian/rules (LP: #1939673): - Use the json database file to generate the modaliases. - Fix aliases generation. We were not matching some of the aliases because of some missing zeroes in the subdevice and subvendor ids. * debian/pm-aliases-gen: - Fix aliases generation for runtimepm. We were not matching some of the aliases because of some missing zeroes in the subdevice and subvendor ids. == 470.63.01 (470) == * New upstream release (LP: #1939673): - Added support for the following GPUs: NVIDIA RTX A2000 - Fixed a Vulkan performance regression that affected rFactor2. * debian/nvidia_supported, debian/rules: - Use the json database file to generate the modaliases. - Fix aliases generation. We were not matching some of the aliases because of some missing zeroes in the subdevice and subvendor ids. * debian/pm-aliases-gen: - Fix aliases generation for runtimepm. We were not matching some of the aliases because of some missing zeroes in the subdevice and subvendor ids. 
[Kernel-packages] [Bug 1939673] Re: Update the 470 and the 470-server NVIDIA drivers
While the 470-server drivers installed and passed the cuda testing across bionic/focal/hirsute, "ubuntu-drivers" does not identify this as an option for any release. The response I see is: $ sudo ubuntu-drivers list --gpgpu WARNING:root:_pkg_get_support nvidia-driver-390: package has invalid Support Legacyheader, cannot determine support level nvidia-driver-450-server, (kernel modules provided by linux-modules-nvidia-450-server-generic) nvidia-driver-390, (kernel modules provided by linux-modules-nvidia-390-generic) nvidia-driver-418-server, (kernel modules provided by linux-modules-nvidia-418-server-generic) nvidia-driver-470, (kernel modules provided by linux-modules-nvidia-470-generic) nvidia-driver-460-server, (kernel modules provided by linux-modules-nvidia-460-server-generic) nvidia-driver-460, (kernel modules provided by linux-modules-nvidia-460-generic) This is from a hirsute host running linux-generic using a "Tesla V100-SXM2-16GB". -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-graphics-drivers-470 in Ubuntu. https://bugs.launchpad.net/bugs/1939673 Title: Update the 470 and the 470-server NVIDIA drivers Status in nvidia-graphics-drivers-470 package in Ubuntu: Triaged Status in nvidia-graphics-drivers-470-server package in Ubuntu: Triaged Status in nvidia-graphics-drivers-470 source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-470-server source package in Bionic: Fix Released Status in nvidia-graphics-drivers-470 source package in Focal: Fix Committed Status in nvidia-graphics-drivers-470-server source package in Focal: Fix Released Status in nvidia-graphics-drivers-470 source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-470-server source package in Hirsute: Fix Released Bug description: Update the 470 (UDA) and 470-server (ERD) NVIDIA series in Bionic, Focal, Hirsute. 
[Impact] These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements. See the changelog entry below for a full list of changes and bugs. [Test Case] The following development and SRU process was followed: https://wiki.ubuntu.com/NVidiaUpdates Certification test suite must pass on a range of hardware: https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened. [Regression Potential] In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug. [Discussion] [Changelog] == 470.57.02 (470-server) == * debian/nvidia_supported, debian/rules (LP: #1939673): - Use the json database file to generate the modaliases. - Fix aliases generation. We were not matching some of the aliases because of some missing zeroes in the subdevice and subvendor ids. * debian/pm-aliases-gen: - Fix aliases generation for runtimepm. We were not matching some of the aliases because of some missing zeroes in the subdevice and subvendor ids. == 470.63.01 (470) == * New upstream release (LP: #1939673): - Added support for the following GPUs: NVIDIA RTX A2000 - Fixed a Vulkan performance regression that affected rFactor2. * debian/nvidia_supported, debian/rules: - Use the json database file to generate the modaliases. - Fix aliases generation. We were not matching some of the aliases because of some missing zeroes in the subdevice and subvendor ids. * debian/pm-aliases-gen: - Fix aliases generation for runtimepm. We were not matching some of the aliases because of some missing zeroes in the subdevice and subvendor ids. 
To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-470/+bug/1939673/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
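The check described in the comment above, whether `ubuntu-drivers list --gpgpu` offers the 470-server driver, can be scripted against captured output. This is a rough sketch under stated assumptions: the helper name `listed_drivers` is hypothetical, and the sample assumes the tool prints one driver per line, as the flattened listing in the comment suggests. It parses saved text only and does not invoke `ubuntu-drivers` itself.

```python
import re


def listed_drivers(output: str) -> set[str]:
    """Collect nvidia-driver-* names that start a line of the captured output."""
    drivers = set()
    for line in output.splitlines():
        # Warning lines start with "WARNING:" and so do not match here.
        match = re.match(r"\s*(nvidia-driver-[\w-]+)", line)
        if match:
            drivers.add(match.group(1))
    return drivers


# Output quoted in the comment, with line breaks assumed.
SAMPLE_OUTPUT = """\
WARNING:root:_pkg_get_support nvidia-driver-390: package has invalid Support Legacyheader, cannot determine support level
nvidia-driver-450-server, (kernel modules provided by linux-modules-nvidia-450-server-generic)
nvidia-driver-390, (kernel modules provided by linux-modules-nvidia-390-generic)
nvidia-driver-418-server, (kernel modules provided by linux-modules-nvidia-418-server-generic)
nvidia-driver-470, (kernel modules provided by linux-modules-nvidia-470-generic)
nvidia-driver-460-server, (kernel modules provided by linux-modules-nvidia-460-server-generic)
nvidia-driver-460, (kernel modules provided by linux-modules-nvidia-460-generic)
"""

drivers = listed_drivers(SAMPLE_OUTPUT)
print("nvidia-driver-470-server" in drivers)
```

Matching only at the start of each line keeps the package name mentioned inside the warning from being counted as an offered driver.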
[Kernel-packages] [Bug 1936577] Re: Introduce the 470-server series
Testing has completed for this 470-server driver across bionic, focal and hirsute. The LRM version of the driver was tested against the linux generic kernels currently in -proposed: 4.15.0-152-generic 5.4.0-81-generic 5.11.0-25-generic Testing consisted of running a basic set of the cuda samples (from 11.0). Additional testing on bionic and focal was done with the Data Center GPU Manager to exercise nvidia-fabricmanager and libnvidia-nscq-470. ** Tags removed: verification-needed-bionic verification-needed-focal verification-needed-hirsute ** Tags added: verification-done-bionic verification-done-focal verification-done-hirsute -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-restricted-modules in Ubuntu. https://bugs.launchpad.net/bugs/1936577 Title: Introduce the 470-server series Status in linux-restricted-modules package in Ubuntu: Triaged Status in nvidia-graphics-drivers-460-server package in Ubuntu: Fix Committed Status in linux-restricted-modules source package in Bionic: Triaged Status in nvidia-graphics-drivers-460-server source package in Bionic: Fix Committed Status in linux-restricted-modules source package in Focal: Triaged Status in nvidia-graphics-drivers-460-server source package in Focal: Fix Committed Status in linux-restricted-modules source package in Hirsute: Triaged Status in nvidia-graphics-drivers-460-server source package in Hirsute: Fix Committed Bug description: Introduce the 470 NVIDIA ERD (-server) series in Bionic, Focal, Hirsute. [Impact] These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements. See the changelog entry below for a full list of changes and bugs. 
[Test Case] The following development and SRU process was followed: https://wiki.ubuntu.com/NVidiaUpdates Certification test suite must pass on a range of hardware: https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened. [Regression Potential] In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug. [Discussion] [Changelog] == 470.57.02 (470-server) == * Initial release (LP: #1936577). To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-restricted-modules/+bug/1936577/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1876687] Re: function traceon/off triggers in ftace from ubuntu_kernel_selftests failed on B/F
Failed on bionic:linux generic amd64 host spitfire sru-20210621. ** Summary changed: - function traceon/off triggers in ftace from ubuntu_kernel_selftests failed on Focal + function traceon/off triggers in ftace from ubuntu_kernel_selftests failed on B/F ** Tags added: sru-20210621 -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1876687 Title: function traceon/off triggers in ftace from ubuntu_kernel_selftests failed on B/F Status in ubuntu-kernel-tests: New Status in linux package in Ubuntu: Incomplete Bug description: Issue found on Focal 5.4.0-29.33 with node amaura (passed on rizzo, rizzo failed with other failures) # [27] ftrace - test for function traceon/off triggers [FAIL] Need to retest on amaura to check if this is just a glitch. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1876687/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1931131] Re: Update the 465 and the 460 NVIDIA driver series
@albertomilone. I've completed the basic CUDA testing of the server drivers. As with the prior version, the 450-server and 460-server drivers passed across the generic kernels for bionic, focal, groovy, hirsute and impish. This applies to both the dkms and lrm installations. Impish was tested with the 5.11 kernel. For the 418-server driver, the dkms driver passed for bionic, focal and groovy but did not work with hirsute or impish (both with the 5.11 kernel). The LRM driver worked across all 5 series. This was the same behavior as the prior version of the driver. I plan on doing additional testing to get coverage on more platforms, but this passes the minimal testing criteria. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-graphics-drivers-460 in Ubuntu. https://bugs.launchpad.net/bugs/1931131 Title: Update the 465 and the 460 NVIDIA driver series Status in linux package in Ubuntu: In Progress Status in nvidia-graphics-drivers-390 package in Ubuntu: In Progress Status in nvidia-graphics-drivers-418-server package in Ubuntu: In Progress Status in nvidia-graphics-drivers-450-server package in Ubuntu: In Progress Status in nvidia-graphics-drivers-460 package in Ubuntu: In Progress Status in nvidia-graphics-drivers-460-server package in Ubuntu: New Status in nvidia-graphics-drivers-465 package in Ubuntu: In Progress Status in nvidia-settings package in Ubuntu: In Progress Status in linux source package in Bionic: In Progress Status in nvidia-graphics-drivers-390 source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-418-server source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-460 source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-460-server source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-465 source package in Bionic: Fix Committed 
Status in nvidia-settings source package in Bionic: Fix Committed Status in linux source package in Focal: In Progress Status in nvidia-graphics-drivers-390 source package in Focal: Fix Committed Status in nvidia-graphics-drivers-418-server source package in Focal: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Focal: Fix Committed Status in nvidia-graphics-drivers-460 source package in Focal: Fix Committed Status in nvidia-graphics-drivers-460-server source package in Focal: Fix Committed Status in nvidia-graphics-drivers-465 source package in Focal: Fix Committed Status in nvidia-settings source package in Focal: Fix Committed Status in linux source package in Groovy: In Progress Status in nvidia-graphics-drivers-390 source package in Groovy: Fix Committed Status in nvidia-graphics-drivers-418-server source package in Groovy: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Groovy: Fix Committed Status in nvidia-graphics-drivers-460 source package in Groovy: Fix Committed Status in nvidia-graphics-drivers-460-server source package in Groovy: Fix Committed Status in nvidia-graphics-drivers-465 source package in Groovy: Fix Committed Status in nvidia-settings source package in Groovy: Fix Committed Status in linux source package in Hirsute: In Progress Status in nvidia-graphics-drivers-390 source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-418-server source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-460 source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-460-server source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-465 source package in Hirsute: Fix Committed Status in nvidia-settings source package in Hirsute: Fix Committed Bug description: Update the 465 and the 460 NVIDIA driver series, and add support for Linux 5.13 to all 
the driver series. [Impact] These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements. See the changelog entry below for a full list of changes and bugs. [Test Case] The following development and SRU process was followed: https://wiki.ubuntu.com/NVidiaUpdates Certification test suite must pass on a range of hardware: https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened. [Regression Potential] In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug. [Discussion] [Changelog]
[Kernel-packages] [Bug 1934424] [NEW] kernel NULL pointer dereference during xen hibernation
Public bug reported: Encountered the following panic while doing hibernation/resume testing with linux-aws 5.8 on Focal on an m3.xlarge (xen) instance type: [ 594.291317] ACPI: Hardware changed while hibernated, success doubtful! [ 594.411609] BUG: kernel NULL pointer dereference, address: 01f4 [ 594.424658] #PF: supervisor write access in kernel mode [ 594.424660] #PF: error_code(0x0002) - not-present page [ 594.424661] PGD 0 P4D 0 [ 594.424665] Oops: 0002 [#1] SMP PTI [ 594.424668] CPU: 3 PID: 362 Comm: systemd-timesyn Not tainted 5.8.0-1036-aws #38~20.04.1-Ubuntu [ 594.424669] Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006 [ 594.424675] RIP: 0010:_raw_spin_lock_irqsave+0x23/0x40 [ 594.424678] Code: 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 9c 58 0f 1f 44 00 00 49 89 c4 fa 66 0f 1f 44 00 00 31 c0 ba 01 00 00 00 0f b1 17 75 07 4c 89 e0 41 5c 5d c3 89 c6 e8 e9 d1 56 ff 66 90 [ 594.424679] RSP: 0018:c94e3848 EFLAGS: 00010046 [ 594.424680] RAX: RBX: 8883bcc0d000 RCX: 0e02 [ 594.424681] RDX: 0001 RSI: RDI: 01f4 [ 594.424682] RBP: c94e3850 R08: 8883b90b5ec0 R09: 005a [ 594.424683] R10: c94e3910 R11: R12: 0206 [ 594.424684] R13: ea000ee42d40 R14: R15: 0001 [ 594.424686] FS: 7f65ba055980() GS:8883c0ac() knlGS: [ 594.424687] CS: 0010 DS: ES: CR0: 80050033 [ 594.424688] CR2: 01f4 CR3: 0003b99f0001 CR4: 001606e0 [ 594.424692] Call Trace: [ 594.424699] xennet_start_xmit+0x158/0x570 [ 594.424704] dev_hard_start_xmit+0x91/0x1f0 [ 594.424706] ? validate_xmit_skb+0x300/0x340 [ 594.424710] sch_direct_xmit+0x113/0x340 [ 594.424712] __dev_queue_xmit+0x57c/0x8e0 [ 594.424714] ? neigh_add_timer+0x37/0x60 [ 594.424716] dev_queue_xmit+0x10/0x20 [ 594.424717] neigh_resolve_output+0x112/0x1c0 [ 594.424721] ip_finish_output2+0x19b/0x590 [ 594.424723] __ip_finish_output+0xc8/0x1e0 [ 594.424725] ip_finish_output+0x2d/0xb0 [ 594.424728] ip_output+0x7a/0xf0 [ 594.424730] ? 
__ip_finish_output+0x1e0/0x1e0 [ 594.424732] ip_local_out+0x3d/0x50 [ 594.424734] ip_send_skb+0x19/0x40 [ 594.424737] udp_send_skb.isra.0+0x165/0x390 [ 594.424739] udp_sendmsg+0xb0e/0xd50 [ 594.424742] ? ip_reply_glue_bits+0x50/0x50 [ 594.424747] ? delete_from_swap_cache+0x6a/0x90 [ 594.424750] ? _cond_resched+0x19/0x30 [ 594.424754] ? aa_sk_perm+0x43/0x1b0 [ 594.424757] inet_sendmsg+0x65/0x70 [ 594.424759] ? security_socket_sendmsg+0x35/0x50 [ 594.424760] ? inet_sendmsg+0x65/0x70 [ 594.424764] sock_sendmsg+0x5e/0x70 [ 594.424766] __sys_sendto+0x113/0x190 [ 594.424770] ? __secure_computing+0x42/0xe0 [ 594.424774] ? syscall_trace_enter+0x10d/0x280 [ 594.424777] __x64_sys_sendto+0x29/0x30 [ 594.424781] do_syscall_64+0x49/0xc0 [ 594.424783] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 594.424785] RIP: 0033:0x7f65baee4844 [ 594.424788] Code: 42 3f f7 ff 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 30 89 ef 48 89 44 24 08 e8 68 3f f7 ff 48 8b [ 594.424789] RSP: 002b:7ffe9b5fd3a0 EFLAGS: 0293 ORIG_RAX: 002c [ 594.424790] RAX: ffda RBX: 7ffe9b5fd4e0 RCX: 7f65baee4844 [ 594.424791] RDX: 0030 RSI: 7ffe9b5fd3f0 RDI: 0010 [ 594.424792] RBP: R08: 560426541678 R09: 0010 [ 594.424793] R10: 0040 R11: 0293 R12: [ 594.424794] R13: 7ffe9b5fd3e4 R14: 0068 R15: [ 594.424796] Modules linked in: btrfs blake2b_generic xor raid6_pq ufs msdos xfs libcrc32c dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd psmouse cryptd input_leds glue_helper serio_raw floppy sch_fq_codel drm ip_tables x_tables autofs4 [ 594.424813] CR2: 01f4 [ 594.424821] ---[ end trace bb5f35055c1a8060 ]--- [ 594.424822] RIP: 0010:_raw_spin_lock_irqsave+0x23/0x40 [ 594.424824] Code: 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 9c 58 0f 1f 44 00 00 49 89 c4 fa 66 0f 1f 44 00 00 31 c0 ba 01 00 00 00 0f b1 17 75 07 4c 89 e0 41 5c 5d c3 89 c6 e8 e9 d1 56 
ff 66 90 [ 594.424824] RSP: 0018:c94e3848 EFLAGS: 00010046 [ 594.424825] RAX: RBX: 8883bcc0d000 RCX: 0e02 [ 594.424826] RDX: 0001 RSI: RDI: 01f4 [ 594.424826] RBP: c94e3850 R08: 8883b90b5ec0 R09: 005a [ 594.424827] R10: c94e3910 R11: R12: 0206 [ 594.424827] R13: ea000ee42d40 R14: R15: 0001 [ 594.424828] FS: 7f65ba055980() GS:8883c0ac0
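For triage, the call trace in a report like the one above can be pulled out of the flattened dmesg text mechanically. The following is a minimal sketch, assuming the `symbol+0xoff/0xlen` frame format seen in this oops; the function name `call_trace` and the returned tuple layout are illustrative choices, not part of any existing tool. Frames prefixed with `?` are ones the kernel's unwinder marks as unreliable, so they are flagged separately.

```python
import re

# A frame looks like "] xennet_start_xmit+0x158/0x570" or
# "] ? validate_xmit_skb+0x300/0x340" right after the dmesg timestamp.
# "RIP: 0010:sym+0x..." lines are deliberately not matched, because the
# symbol there does not directly follow the closing bracket.
FRAME_RE = re.compile(r"\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def call_trace(oops_text):
    """Return a list of (symbol, reliable) tuples in the order they appear."""
    frames = []
    for match in FRAME_RE.finditer(oops_text):
        symbol = match.group(2)
        reliable = match.group(1) is None  # no leading "?" marker
        frames.append((symbol, reliable))
    return frames


if __name__ == "__main__":
    sample = (
        "[ 594.424692] Call Trace: [ 594.424699] xennet_start_xmit+0x158/0x570 "
        "[ 594.424704] dev_hard_start_xmit+0x91/0x1f0 "
        "[ 594.424706] ? validate_xmit_skb+0x300/0x340"
    )
    for symbol, reliable in call_trace(sample):
        print(symbol, "" if reliable else "(unreliable)")
```

In this report the first reliable frame is xennet_start_xmit, which is consistent with the fault being hit from the Xen network frontend's transmit path after resume.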
[Kernel-packages] [Bug 1923191] Re: cpuhotplug from ubuntu_ltp failed for cpuhotplug02 cpuhotplug03 cpuhotplug04 cpuhotplug06
** Tags added: 5.4 sru-20210510 -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-azure in Ubuntu. https://bugs.launchpad.net/bugs/1923191 Title: cpuhotplug from ubuntu_ltp failed for cpuhotplug02 cpuhotplug03 cpuhotplug04 cpuhotplug06 Status in ubuntu-kernel-tests: New Status in linux-azure package in Ubuntu: New Status in linux-azure-4.15 package in Ubuntu: New Status in linux-azure source package in Trusty: New Status in linux-azure-4.15 source package in Trusty: New Status in linux-azure source package in Xenial: New Status in linux-azure-4.15 source package in Xenial: New Status in linux-azure source package in Bionic: New Status in linux-azure-4.15 source package in Bionic: New Bug description: cpuhotplug02: Name: cpuhotplug02 Date: Tue Mar 30 13:32:56 UTC 2021 Desc: What happens to a process when its CPU is offlined? CPU is 1 sh: echo: I/O error cpuhotplug02 1 TFAIL: process did not change from CPU 1 tag=cpuhotplug02 stime=161776 dur=5 exit=exited stat=1 core=no cu=2 cs=2 startup='Tue Mar 30 13:33:06 2021' cpuhotplug03: Name: cpuhotplug03 Date: Tue Mar 30 13:33:06 UTC 2021 Desc: Do tasks get scheduled to a newly on-lined CPU? CPU is 1 sh: echo: I/O error cpuhotplug03 1 TBROK: CPU1 cannot be offlined USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 20613 0.0 0.0 4636 864 ? R 13:33 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop root 20614 0.0 0.0 4636 812 ? R 13:33 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop root 20615 0.0 0.0 4636 812 ? R 13:33 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop root 20616 0.0 0.0 4636 880 ? R 13:33 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop root 20620 0.0 0.0 14864 1092 ? 
S 13:33 0:00 grep cpuhotplug_do_spin_loop cpuhotplug03 1 TINFO: Onlining CPU 1 1 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop 1 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop 0 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop 0 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop cpuhotplug03 1 TPASS: 2 cpuhotplug_do_spin_loop processes found on CPU1 tag=cpuhotplug03 stime=161786 dur=2 exit=exited stat=2 core=no cu=242 cs=38 startup='Tue Mar 30 13:33:08 2021' cpuhotplug04: Name: cpuhotplug04 Date: Tue Mar 30 13:33:08 UTC 2021 Desc: Does it prevent us from offlining the last CPU? sh: echo: I/O error cpuhotplug04 1 TFAIL: Could not offline cpu1 tag=cpuhotplug04 stime=161788 dur=0 exit=exited stat=1 core=no cu=4 cs=4 startup='Tue Mar 30 13:33:08 2021' cpuhotplug06 Name: cpuhotplug06 Date: Tue Mar 30 13:33:08 UTC 2021 Desc: Does top work properly when CPU hotplug events occur? CPU is 1 sh: echo: I/O error cpuhotplug06 1 TBROK: CPU1 cannot be offlined 20913 ? 00:00:00 top tag=cpuhotplug06 stime=161788 dur=1 exit=exited stat=2 core=no cu=2 cs=3 startup='Tue Mar 30 13:33:15 2021' http://10.246.72.46/4.15.0-1112.125-azure/bionic-linux-azure-4.15-azure-amd64-4.15.0-Basic_A2-ubuntu_ltp/ubuntu_ltp/results/ubuntu_ltp.cpuhotplug/debug/ubuntu_ltp.cpuhotplug.DEBUG.html To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1923191/+subscriptions
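The LTP output quoted in this report mixes verdict lines (TPASS, TFAIL, TBROK) with ps listings and tag metadata, which makes it hard to see at a glance which tests failed. A minimal sketch of a summarizer follows, assuming the `<test> <n> <VERDICT>:` line shape shown in the log above; the function name `summarize_ltp` is hypothetical and informational TINFO lines are intentionally ignored.

```python
import re

# Matches verdict records such as "cpuhotplug02 1 TFAIL: ..." anywhere in
# the text, so it also works on logs whose line breaks were flattened.
VERDICT_RE = re.compile(r"\b(\w+)\s+\d+\s+(TPASS|TFAIL|TBROK|TCONF):")


def summarize_ltp(log_text):
    """Map each test name to the list of verdicts it reported, in order."""
    summary = {}
    for match in VERDICT_RE.finditer(log_text):
        test_name, verdict = match.group(1), match.group(2)
        summary.setdefault(test_name, []).append(verdict)
    return summary


if __name__ == "__main__":
    log = (
        "cpuhotplug02 1 TFAIL: process did not change from CPU 1 "
        "tag=cpuhotplug02 stime=161776 "
        "cpuhotplug03 1 TBROK: CPU1 cannot be offlined "
        "cpuhotplug03 1 TINFO: Onlining CPU 1 "
        "cpuhotplug03 1 TPASS: 2 cpuhotplug_do_spin_loop processes found on CPU1"
    )
    for name, verdicts in summarize_ltp(log).items():
        print(name, verdicts)
```

Applied to the full log, this would list cpuhotplug02 and cpuhotplug04 as TFAIL and cpuhotplug03 and cpuhotplug06 as TBROK, matching the four tests named in the bug title.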
[Kernel-packages] [Bug 1925522] Re: Introduce the 465 driver series, fabric-manager, and libnvidia-nscq
I've completed additional testing on bionic on a dgx2: nvidia-graphics-drivers-450-server: Both the bionic lrm and dkms versions were tested successfully via the cuda samples test (version 11.0). fabric-manager-450 and libnvidia-nscq-450: These passed using the cuda samples test and 'dcgmi discovery -l' on a dgx-2 host using bionic. I also retested the focal lrm version nvidia-graphics-drivers-450-server on a gcp cloud VM. It now installs with the kernel and lrm packages in focal-proposed. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-settings in Ubuntu. https://bugs.launchpad.net/bugs/1925522 Title: Introduce the 465 driver series, fabric-manager, and libnvidia-nscq Status in fabric-manager-450 package in Ubuntu: Fix Released Status in fabric-manager-460 package in Ubuntu: Fix Released Status in libnvidia-nscq-450 package in Ubuntu: Fix Released Status in libnvidia-nscq-460 package in Ubuntu: Fix Released Status in linux-restricted-modules package in Ubuntu: In Progress Status in nvidia-graphics-drivers-450-server package in Ubuntu: In Progress Status in nvidia-graphics-drivers-460 package in Ubuntu: In Progress Status in nvidia-graphics-drivers-465 package in Ubuntu: Fix Released Status in nvidia-settings package in Ubuntu: In Progress Status in fabric-manager-450 source package in Bionic: Fix Committed Status in fabric-manager-460 source package in Bionic: Fix Committed Status in libnvidia-nscq-450 source package in Bionic: Fix Committed Status in libnvidia-nscq-460 source package in Bionic: Fix Committed Status in linux-restricted-modules source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-460 source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-465 source package in Bionic: Fix Committed Status in nvidia-settings source package in Bionic: Fix Committed Status in 
fabric-manager-450 source package in Focal: Fix Committed Status in fabric-manager-460 source package in Focal: Fix Committed Status in libnvidia-nscq-450 source package in Focal: Fix Committed Status in libnvidia-nscq-460 source package in Focal: Fix Committed Status in linux-restricted-modules source package in Focal: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Focal: Fix Committed Status in nvidia-graphics-drivers-460 source package in Focal: Fix Committed Status in nvidia-graphics-drivers-465 source package in Focal: Fix Committed Status in nvidia-settings source package in Focal: Fix Committed Status in fabric-manager-450 source package in Groovy: Fix Committed Status in fabric-manager-460 source package in Groovy: Fix Committed Status in libnvidia-nscq-450 source package in Groovy: Fix Committed Status in libnvidia-nscq-460 source package in Groovy: Fix Committed Status in linux-restricted-modules source package in Groovy: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Groovy: Fix Committed Status in nvidia-graphics-drivers-460 source package in Groovy: Fix Committed Status in nvidia-graphics-drivers-465 source package in Groovy: Fix Committed Status in nvidia-settings source package in Groovy: Fix Committed Status in fabric-manager-450 source package in Hirsute: Fix Committed Status in fabric-manager-460 source package in Hirsute: Fix Committed Status in libnvidia-nscq-450 source package in Hirsute: Fix Committed Status in libnvidia-nscq-460 source package in Hirsute: Fix Committed Status in linux-restricted-modules source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-460 source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-465 source package in Hirsute: Fix Committed Status in nvidia-settings source package in Hirsute: Fix Committed Bug description: Introduce the new 
NVIDIA 465 driver series, fabric-manager and libnvidia-nscq. Also migrate the UDA 450 series to the 460 series. [Impact] These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements. See the changelog entry below for a full list of changes and bugs. [Test Case] The following development and SRU process was followed: https://wiki.ubuntu.com/NVidiaUpdates Certification test suite must pass on a range of hardware: https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened. [Regression Potential] In order to mitigate the regression potential, the r
[Kernel-packages] [Bug 1925522] Re: Introduce the 465 driver series, fabric-manager, and libnvidia-nscq
fabric-manager-460 and libnvidia-nscq-460: These passed using the cuda samples test and 'dcgmi discovery -l' test on a dgx-2 host using bionic and focal. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-settings in Ubuntu. https://bugs.launchpad.net/bugs/1925522 Title: Introduce the 465 driver series, fabric-manager, and libnvidia-nscq Status in fabric-manager-450 package in Ubuntu: Fix Released Status in fabric-manager-460 package in Ubuntu: Fix Released Status in libnvidia-nscq-450 package in Ubuntu: Fix Released Status in libnvidia-nscq-460 package in Ubuntu: Fix Released Status in linux-restricted-modules package in Ubuntu: In Progress Status in nvidia-graphics-drivers-450-server package in Ubuntu: In Progress Status in nvidia-graphics-drivers-460 package in Ubuntu: In Progress Status in nvidia-graphics-drivers-465 package in Ubuntu: Fix Released Status in nvidia-settings package in Ubuntu: In Progress Status in fabric-manager-450 source package in Bionic: Fix Committed Status in fabric-manager-460 source package in Bionic: Fix Committed Status in libnvidia-nscq-450 source package in Bionic: Fix Committed Status in libnvidia-nscq-460 source package in Bionic: Fix Committed Status in linux-restricted-modules source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-460 source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-465 source package in Bionic: Fix Committed Status in nvidia-settings source package in Bionic: Fix Committed Status in fabric-manager-450 source package in Focal: Fix Committed Status in fabric-manager-460 source package in Focal: Fix Committed Status in libnvidia-nscq-450 source package in Focal: Fix Committed Status in libnvidia-nscq-460 source package in Focal: Fix Committed Status in linux-restricted-modules source package in Focal: Fix Committed Status in 
nvidia-graphics-drivers-450-server source package in Focal: Fix Committed Status in nvidia-graphics-drivers-460 source package in Focal: Fix Committed Status in nvidia-graphics-drivers-465 source package in Focal: Fix Committed Status in nvidia-settings source package in Focal: Fix Committed Status in fabric-manager-450 source package in Groovy: Fix Committed Status in fabric-manager-460 source package in Groovy: Fix Committed Status in libnvidia-nscq-450 source package in Groovy: Fix Committed Status in libnvidia-nscq-460 source package in Groovy: Fix Committed Status in linux-restricted-modules source package in Groovy: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Groovy: Fix Committed Status in nvidia-graphics-drivers-460 source package in Groovy: Fix Committed Status in nvidia-graphics-drivers-465 source package in Groovy: Fix Committed Status in nvidia-settings source package in Groovy: Fix Committed Status in fabric-manager-450 source package in Hirsute: Fix Committed Status in fabric-manager-460 source package in Hirsute: Fix Committed Status in libnvidia-nscq-450 source package in Hirsute: Fix Committed Status in libnvidia-nscq-460 source package in Hirsute: Fix Committed Status in linux-restricted-modules source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-460 source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-465 source package in Hirsute: Fix Committed Status in nvidia-settings source package in Hirsute: Fix Committed Bug description: Introduce the new NVIDIA 465 driver series, fabric-manager and libnvidia-nscq. Also migrate the UDA 450 series to the 460 series. [Impact] These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements. See the changelog entry below for a full list of changes and bugs. 
[Test Case] The following development and SRU process was followed: https://wiki.ubuntu.com/NVidiaUpdates Certification test suite must pass on a range of hardware: https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened. [Regression Potential] In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug. [Discussion] [Changelog] == 450-server == * New upstream release (LP: #1925522). - Fixed an issue with the NSCQ library that caused clients such as DCGM to fail to load the library. Since the NSCQ library version needs to match the driver
[Kernel-packages] [Bug 1925522] Re: Introduce the 465 driver series, fabric-manager, and libnvidia-nscq
nvidia-graphics-drivers-450-server, 450.119.04 results: The DKMS driver was tested via the cuda samples test. This passed for bionic, focal and groovy, but failed for hirsute and impish. This failure was expected as the nvidia_uvm module doesn't currently load with the 5.11 kernel in hirsute and impish. The LRM driver was also tested via the cuda samples test. This passed for groovy, hirsute and impish, but failed on bionic and focal as there isn't yet a matching lrm package for the 450.119.04 driver. When these packages are available later in the kernel SRU cycle, it will be retested. fabric-manager-450 and libnvidia-nscq-450: These passed using the cuda samples test and 'dcgmi discovery -l' on a dgx-2 host using focal. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-settings in Ubuntu. https://bugs.launchpad.net/bugs/1925522 Title: Introduce the 465 driver series, fabric-manager, and libnvidia-nscq Status in fabric-manager-450 package in Ubuntu: Fix Released Status in fabric-manager-460 package in Ubuntu: Fix Released Status in libnvidia-nscq-450 package in Ubuntu: Fix Released Status in libnvidia-nscq-460 package in Ubuntu: Fix Released Status in linux-restricted-modules package in Ubuntu: In Progress Status in nvidia-graphics-drivers-450-server package in Ubuntu: In Progress Status in nvidia-graphics-drivers-460 package in Ubuntu: In Progress Status in nvidia-graphics-drivers-465 package in Ubuntu: Fix Released Status in nvidia-settings package in Ubuntu: In Progress Status in fabric-manager-450 source package in Bionic: Fix Committed Status in fabric-manager-460 source package in Bionic: Fix Committed Status in libnvidia-nscq-450 source package in Bionic: Fix Committed Status in libnvidia-nscq-460 source package in Bionic: Fix Committed Status in linux-restricted-modules source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Bionic: Fix Committed 
Status in nvidia-graphics-drivers-460 source package in Bionic: Fix Committed Status in nvidia-graphics-drivers-465 source package in Bionic: Fix Committed Status in nvidia-settings source package in Bionic: Fix Committed Status in fabric-manager-450 source package in Focal: Fix Committed Status in fabric-manager-460 source package in Focal: Fix Committed Status in libnvidia-nscq-450 source package in Focal: Fix Committed Status in libnvidia-nscq-460 source package in Focal: Fix Committed Status in linux-restricted-modules source package in Focal: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Focal: Fix Committed Status in nvidia-graphics-drivers-460 source package in Focal: Fix Committed Status in nvidia-graphics-drivers-465 source package in Focal: Fix Committed Status in nvidia-settings source package in Focal: Fix Committed Status in fabric-manager-450 source package in Groovy: Fix Committed Status in fabric-manager-460 source package in Groovy: Fix Committed Status in libnvidia-nscq-450 source package in Groovy: Fix Committed Status in libnvidia-nscq-460 source package in Groovy: Fix Committed Status in linux-restricted-modules source package in Groovy: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Groovy: Fix Committed Status in nvidia-graphics-drivers-460 source package in Groovy: Fix Committed Status in nvidia-graphics-drivers-465 source package in Groovy: Fix Committed Status in nvidia-settings source package in Groovy: Fix Committed Status in fabric-manager-450 source package in Hirsute: Fix Committed Status in fabric-manager-460 source package in Hirsute: Fix Committed Status in libnvidia-nscq-450 source package in Hirsute: Fix Committed Status in libnvidia-nscq-460 source package in Hirsute: Fix Committed Status in linux-restricted-modules source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-450-server source package in Hirsute: Fix Committed Status in 
nvidia-graphics-drivers-460 source package in Hirsute: Fix Committed Status in nvidia-graphics-drivers-465 source package in Hirsute: Fix Committed Status in nvidia-settings source package in Hirsute: Fix Committed Bug description: Introduce the new NVIDIA 465 driver series, fabric-manager and libnvidia-nscq. Also migrate the UDA 450 series to the 460 series. [Impact] These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements. See the changelog entry below for a full list of changes and bugs. [Test Case] The following development and SRU process was followed: https://wiki.ubuntu.com/NVidiaUpdates Certification test suite must pass on a range of hardware: https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu The QA team that executed the tests will be in charge of attaching the artifacts
[Kernel-packages] [Bug 1925407] Re: ubuntu_aufs_smoke_test failed on Hirsute RISCV (CONFIG_AUFS_FS is not set)
This is resolved in autotest-client-tests 5b2d46bf401e50a13be5c0641e0157d58c2a7669 ** Changed in: ubuntu-kernel-tests Status: New => Fix Released ** Changed in: linux-riscv (Ubuntu) Status: New => Won't Fix ** Changed in: linux-riscv (Ubuntu Hirsute) Status: New => Won't Fix -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-riscv in Ubuntu. https://bugs.launchpad.net/bugs/1925407 Title: ubuntu_aufs_smoke_test failed on Hirsute RISCV (CONFIG_AUFS_FS is not set) Status in ubuntu-kernel-tests: Fix Released Status in linux-riscv package in Ubuntu: Won't Fix Status in linux-riscv source package in Hirsute: Won't Fix Bug description: Issue found on 5.11.0-1005.5 RISCV kernel Test failed with: Running '/home/ubuntu/autotest/client/tests/ubuntu_aufs_smoke_test/ubuntu_aufs_smoke_test.sh' mount: /tmp/aufs/aufs-root: unknown filesystem type 'aufs'. aufs: mount: FAILED: ret=32 It looks like this is simply because the kernel config option was not enabled: ubuntu@riscv64-hirsute:~$ grep AUFS /boot/config-5.11.0-1005-generic # CONFIG_AUFS_FS is not set I think this is more a matter of deciding whether the option should be enabled. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1925407/+subscriptions
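The grep above is the whole diagnosis: the option is absent from the kernel config, so the mount cannot succeed. A test harness could make the same check before attempting the mount and skip instead of fail. The following is only a sketch of that idea, not the change that landed in autotest-client-tests; the function name `kernel_config_state` is hypothetical, and it assumes the usual `/boot/config-*` format of `CONFIG_X=value` and `# CONFIG_X is not set` lines.

```python
import re


def kernel_config_state(config_text, option):
    """Return the option's value ('y', 'm', ...) or 'not set'.

    An option that does not appear in the config at all is also
    reported as 'not set'.
    """
    set_re = re.compile(rf"^{re.escape(option)}=(.*)$")
    not_set_line = f"# {option} is not set"
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if line == not_set_line:
            return "not set"
        match = set_re.match(line)
        if match:
            return match.group(1)
    return "not set"


if __name__ == "__main__":
    # Inline sample mirroring the grep output quoted in the bug report.
    sample = "CONFIG_OVERLAY_FS=m\n# CONFIG_AUFS_FS is not set\n"
    state = kernel_config_state(sample, "CONFIG_AUFS_FS")
    if state == "not set":
        print("CONFIG_AUFS_FS is not set; the aufs smoke test would be skipped")
    else:
        print("CONFIG_AUFS_FS =", state)
```

On a real system the text would come from reading the config file for the running kernel, for example the /boot/config-5.11.0-1005-generic file shown in the report.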
[Kernel-packages] [Bug 1923062] Re: NVIDIA CVE-2021-1076 CVE-2021-1077
Also tested linux-modules-nvidia-460-server-generic-hwe-18.04 on bionic and linux-modules-nvidia-460-server-generic-hwe-20.04 on focal. They look good now. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-graphics-drivers-390 in Ubuntu. https://bugs.launchpad.net/bugs/1923062 Title: NVIDIA CVE-2021-1076 CVE-2021-1077 Status in nvidia-graphics-drivers-390 package in Ubuntu: In Progress Status in nvidia-graphics-drivers-418-server package in Ubuntu: In Progress Status in nvidia-graphics-drivers-450 package in Ubuntu: In Progress Status in nvidia-graphics-drivers-450-server package in Ubuntu: In Progress Status in nvidia-graphics-drivers-460 package in Ubuntu: In Progress Status in nvidia-graphics-drivers-460-server package in Ubuntu: In Progress Status in nvidia-graphics-drivers-390 source package in Bionic: In Progress Status in nvidia-graphics-drivers-418-server source package in Bionic: In Progress Status in nvidia-graphics-drivers-450 source package in Bionic: In Progress Status in nvidia-graphics-drivers-450-server source package in Bionic: In Progress Status in nvidia-graphics-drivers-460 source package in Bionic: In Progress Status in nvidia-graphics-drivers-460-server source package in Bionic: In Progress Status in nvidia-graphics-drivers-390 source package in Focal: In Progress Status in nvidia-graphics-drivers-418-server source package in Focal: In Progress Status in nvidia-graphics-drivers-450 source package in Focal: In Progress Status in nvidia-graphics-drivers-450-server source package in Focal: In Progress Status in nvidia-graphics-drivers-460 source package in Focal: In Progress Status in nvidia-graphics-drivers-460-server source package in Focal: In Progress Status in nvidia-graphics-drivers-390 source package in Groovy: In Progress Status in nvidia-graphics-drivers-418-server source package in Groovy: In Progress Status in nvidia-graphics-drivers-450 source package in Groovy: In 
Progress Status in nvidia-graphics-drivers-450-server source package in Groovy: In Progress Status in nvidia-graphics-drivers-460 source package in Groovy: In Progress Status in nvidia-graphics-drivers-460-server source package in Groovy: In Progress Bug description: Here is the list of the affected drivers: 418-server - CVE-2021-1076 460, 450, 460-server, 450-server - CVE-2021-1076 CVE-2021-1077 390 - CVE-2021-1076 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-390/+bug/1923062/+subscriptions
[Kernel-packages] [Bug 1923062] Re: NVIDIA CVE-2021-1076 CVE-2021-1077
Testing of the server drivers is also complete. All drivers were able to install as both dkms and l-r-m with no issues. A cuda workload and nvidia-smi was also tested for each. Note: no upgrade testing was attempted. Drivers were fully purged before testing the next driver.
[Kernel-packages] [Bug 1922387] Re: BUG: kernel NULL pointer dereference, address: 0000000000000050
This panic occurred while running the ubuntu_kernel_selftests suite. The last few log lines are:

13:33:20 DEBUG| [stdout] # selftests: ftrace: ftracetest
13:33:20 DEBUG| [stdout] # === Ftrace unit tests ===
13:33:28 DEBUG| [stdout] # [1] Basic trace file check [PASS]
13:37:04 DEBUG| [stdout] # [2] Basic test for tracers [PASS]
13:39:48 DEBUG| [stdout] # [3] Basic trace clock test [PASS]
13:39:56 DEBUG| [stdout] # [4] Basic event tracing check [PASS]
13:40:04 DEBUG| [stdout] # [5] Change the ringbuffer size [PASS]
13:40:20 DEBUG| [stdout] # [6] Snapshot and tracing setting [PASS]
13:40:35 DEBUG| [stdout] # [7] trace_pipe and trace_marker [PASS]
13:40:51 DEBUG| [stdout] # [8] Generic dynamic event - add/remove kprobe events [PASS]
13:41:07 DEBUG| [stdout] # [9] Generic dynamic event - add/remove synthetic events [PASS]
13:41:14 DEBUG| [stdout] # [10] Generic dynamic event - selective clear (compatibility) [PASS]
13:41:22 DEBUG| [stdout] # [11] Generic dynamic event - generic clear event [PASS]
13:41:46 DEBUG| [stdout] # [12] event tracing - enable/disable with event level files [PASS]
13:42:17 DEBUG| [stdout] # [13] event tracing - restricts events based on pid [PASS]
13:42:41 DEBUG| [stdout] # [14] event tracing - enable/disable with subsystem level files [PASS]
13:43:05 DEBUG| [stdout] # [15] event tracing - enable/disable with top level files [PASS]
13:43:14 DEBUG| [stdout] # [16] Test trace_printk from module [PASS]
13:43:56 DEBUG| [stdout] # [17] ftrace - function graph filters with stack tracer [PASS]
13:44:29 DEBUG| [stdout] # [18] ftrace - function graph filters [PASS]
13:45:49 DEBUG| [stdout] # [19] ftrace - function pid filters [PASS]
13:46:06 DEBUG| [stdout] # [20] ftrace - stacktrace filter command [PASS]
13:46:38 DEBUG| [stdout] # [21] ftrace - function trace with cpumask [PASS]
13:47:13 DEBUG| [stdout] # [22] ftrace - test for function event triggers [PASS]
13:47:21 DEBUG| [stdout] # [23] ftrace - function trace on module [PASS]
13:47:31 DEBUG| [stdout] # [24] ftrace - function profiling [PASS]
13:48:07 DEBUG| [stdout] # [25] ftrace - function profiler with function tracing [PASS]
13:48:25 DEBUG| [stdout] # [26] ftrace - test reading of set_ftrace_filter [PASS]
END OF MESSAGES

This job was run twice. The prior run also hung before completing, but we don't have a console log for that time period, so it's unclear whether it also panicked. Its last messages were:

04:44:27 DEBUG| [stdout] # selftests: timers: nsleep-lat
04:44:48 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME [OK]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC [OK]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC_RAW [UNSUPPORTED]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME_COARSE [UNSUPPORTED]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC_COARSE [UNSUPPORTED]
04:45:30 DEBUG| [stdout] # nsleep latency CLOCK_BOOTTIME [OK]
04:45:52 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME_ALARM [OK]
04:46:13 DEBUG| [stdout] # nsleep latency CLOCK_BOOTTIME_ALARM [OK]
04:46:34 DEBUG| [stdout] # nsleep latency CLOCK_TAI [OK]
04:46:34 DEBUG| [stdout] # # Pass 0 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0
04:46:34 DEBUG| [stdout] ok 3 selftests: timers: nsleep-lat
04:46:34 DEBUG| [stdout] # selftests: timers: set-timer-lat

-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1922387 Title: BUG: kernel NULL pointer dereference, address: 0000000000000050 Status in linux package in Ubuntu: New Status in linux source package in Focal: Confirmed Status in linux source package in Groovy: New Status in linux source package in Hirsute: New Bug description: I observed the following kernel panic with the 5.4.0-71.79-generic kernel while running kernel selftests: blanka login: [ 1671.958400] mmiotrace: Error taking CPU253 down: -28 [ 1672.118199] mmiotrace: Error taking CPU254 down: -28 [ 1672.230306] mmiotrace: Error taking CPU255 down: -28 [ 2503.359753] BUG: kernel NULL pointer dereference, address: 0000000000000050 [ 2503.367527] #PF: supervisor read access in kernel mode [ 2503.373257] #PF: error_code(0x) - not-present page [ 2503.378989] PGD 0 P4D 0 [ 2503.381812] Oops: [#1] SMP NOPTI [ 2503.385896] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE 5.4.0-71-generic #79-Ubuntu [ 2503.395795] Hardware name: NVIDIA DGXA100 920-23687-2530-000/DGXA100, BIOS 0.33 01/19/2021 [ 2503.405027] RIP: 0010:trace_event_raw_event_wbt_timer+0x6f/0x100 [ 2503.411728] Code: 59 80 e5 02 0f 85 8f 00 00 00 4c 89 e6 ba 34 00 00 00 48 8d 7d a0 e8 d0 a4 ca ff 49 89 c4 48 85 c0 74 37 49 8b 87 b8 03 00 00 <48> 8b 70 50 48 85 f6 74 45 49 8d 7c 24 08 ba 20 00 00 00 e8 59 91 [ 2503.432683] RSP: 0018:a8d6c0003d90 EFLAGS: 00010286 [ 2503.438513] RAX: RBX: RCX: 8100 [ 2503.446474] RDX: 9968a228f418 RSI: 00
[Kernel-packages] [Bug 1922387] Re: BUG: kernel NULL pointer dereference, address: 0000000000000050
This panic occurred while running the ubuntu_kernel_selftests suite. The last few log lines are:

13:33:20 DEBUG| [stdout] # selftests: ftrace: ftracetest
13:33:20 DEBUG| [stdout] # === Ftrace unit tests ===
13:33:28 DEBUG| [stdout] # [1] Basic trace file check [PASS]
13:37:04 DEBUG| [stdout] # [2] Basic test for tracers [PASS]
13:39:48 DEBUG| [stdout] # [3] Basic trace clock test [PASS]
13:39:56 DEBUG| [stdout] # [4] Basic event tracing check [PASS]
13:40:04 DEBUG| [stdout] # [5] Change the ringbuffer size [PASS]
13:40:20 DEBUG| [stdout] # [6] Snapshot and tracing setting [PASS]
13:40:35 DEBUG| [stdout] # [7] trace_pipe and trace_marker [PASS]
13:40:51 DEBUG| [stdout] # [8] Generic dynamic event - add/remove kprobe events [PASS]
13:41:07 DEBUG| [stdout] # [9] Generic dynamic event - add/remove synthetic events [PASS]
13:41:14 DEBUG| [stdout] # [10] Generic dynamic event - selective clear (compatibility) [PASS]
13:41:22 DEBUG| [stdout] # [11] Generic dynamic event - generic clear event [PASS]
13:41:46 DEBUG| [stdout] # [12] event tracing - enable/disable with event level files [PASS]
13:42:17 DEBUG| [stdout] # [13] event tracing - restricts events based on pid [PASS]
13:42:41 DEBUG| [stdout] # [14] event tracing - enable/disable with subsystem level files [PASS]
13:43:05 DEBUG| [stdout] # [15] event tracing - enable/disable with top level files [PASS]
13:43:14 DEBUG| [stdout] # [16] Test trace_printk from module [PASS]
13:43:56 DEBUG| [stdout] # [17] ftrace - function graph filters with stack tracer [PASS]
13:44:29 DEBUG| [stdout] # [18] ftrace - function graph filters [PASS]
13:45:49 DEBUG| [stdout] # [19] ftrace - function pid filters [PASS]
13:46:06 DEBUG| [stdout] # [20] ftrace - stacktrace filter command [PASS]
13:46:38 DEBUG| [stdout] # [21] ftrace - function trace with cpumask [PASS]
13:47:13 DEBUG| [stdout] # [22] ftrace - test for function event triggers [PASS]
13:47:21 DEBUG| [stdout] # [23] ftrace - function trace on module [PASS]
13:47:31 DEBUG| [stdout] # [24] ftrace - function profiling [PASS]
13:48:07 DEBUG| [stdout] # [25] ftrace - function profiler with function tracing [PASS]
13:48:25 DEBUG| [stdout] # [26] ftrace - test reading of set_ftrace_filter [PASS]
END OF MESSAGES

This job was run twice. The prior run also hung before completing, but we don't have a console log for that time period, so it's unclear whether it also panicked. Its last messages were:

04:44:27 DEBUG| [stdout] # selftests: timers: nsleep-lat
04:44:48 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME [OK]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC [OK]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC_RAW [UNSUPPORTED]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME_COARSE [UNSUPPORTED]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC_COARSE [UNSUPPORTED]
04:45:30 DEBUG| [stdout] # nsleep latency CLOCK_BOOTTIME [OK]
04:45:52 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME_ALARM [OK]
04:46:13 DEBUG| [stdout] # nsleep latency CLOCK_BOOTTIME_ALARM [OK]
04:46:34 DEBUG| [stdout] # nsleep latency CLOCK_TAI [OK]
04:46:34 DEBUG| [stdout] # # Pass 0 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0
04:46:34 DEBUG| [stdout] ok 3 selftests: timers: nsleep-lat
04:46:34 DEBUG| [stdout] # selftests: timers: set-timer-lat

The job can be found here: http://10.246.72.4:8080/view/nvidia%20a100%20-%20blanka/job/focal-linux-generic-amd64-5.4.0-blanka-ubuntu_kernel_selftests/

-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1922387 Title: BUG: kernel NULL pointer dereference, address: 0000000000000050 Status in linux package in Ubuntu: New Status in linux source package in Focal: Confirmed Status in linux source package in Groovy: New Status in linux source package in Hirsute: New Bug description: I observed the following kernel panic with the 5.4.0-71.79-generic kernel while running kernel selftests: blanka login: [ 1671.958400] mmiotrace: Error taking CPU253 down: -28 [ 1672.118199] mmiotrace: Error taking CPU254 down: -28 [ 1672.230306] mmiotrace: Error taking CPU255 down: -28 [ 2503.359753] BUG: kernel NULL pointer dereference, address: 0000000000000050 [ 2503.367527] #PF: supervisor read access in kernel mode [ 2503.373257] #PF: error_code(0x) - not-present page [ 2503.378989] PGD 0 P4D 0 [ 2503.381812] Oops: [#1] SMP NOPTI [ 2503.385896] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE 5.4.0-71-generic #79-Ubuntu [ 2503.395795] Hardware name: NVIDIA DGXA100 920-23687-2530-000/DGXA100, BIOS 0.33 01/19/2021 [ 2503.405027] RIP: 0010:trace_event_raw_event_wbt_timer+0x6f/0x100 [ 2503.411728] Code: 59 80 e5 02 0f 85 8f 00 00 00 4c 89 e6 ba 34 00 00 00
[Kernel-packages] [Bug 1918226] Re: testbed auxverb failed with exit code 255 with linux on Groovy ADT failure
I also looked at this and compared with older kernels. This test has never progressed past the "Kretprobe dynamic event with maxactive" test on 5.8, but it has on 5.4:

```
18:41:31 DEBUG| [stdout] # [43] Kprobe event parser error log check [PASS]
18:41:32 DEBUG| [stdout] # [44] Kretprobe dynamic event with arguments [PASS]
18:41:33 DEBUG| [stdout] # [45] Kretprobe dynamic event with maxactive [PASS]
18:41:49 DEBUG| [stdout] # [46] Register/unregister many kprobe events [PASS]
18:41:49 DEBUG| [stdout] # [47] Kprobe dynamic event - adding and removing [PASS]
18:41:50 DEBUG| [stdout] # [48] Uprobe event parser error log check [PASS]
18:41:50 DEBUG| [stdout] # [49] test for the preemptirqsoff tracer [UNSUPPORTED]
18:42:32 DEBUG| [stdout] # [50] Meta-selftest [PASS]
```

I suspect that on 5.8 the next test is causing a hang and ssh dies. The autopkgtest infrastructure sees this and tries to provide some debugging info (the console log and VM details).

-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1918226 Title: testbed auxverb failed with exit code 255 with linux on Groovy ADT failure Status in linux package in Ubuntu: Invalid Status in linux source package in Groovy: Confirmed Bug description: Testing failed on: arm64: https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-groovy/groovy/arm64/l/linux/20210308_234307_e716b@/log.gz Looks to be a flaky test with this error occurring frequently. This is not a regression. Found a previously reported (expired) bug with the same error: https://launchpad.net/bugs/1549425 [ 6606.751232] audit: backlog limit Creating nova instance adt-groovy-arm64-linux-20210308-143145 from image adt/ubuntu-groovy-arm64-server-20210308.img (UUID 1cfb02a6-bf88-46bb-a8f1-3c6589cd939e)... 
Creating nova instance adt-groovy-arm64-linux-20210308-143145 from image adt/ubuntu-groovy-arm64-server-20210308.img (UUID 1cfb02a6-bf88-46bb-a8f1-3c6589cd939e)... autopkgtest [23:42:55]: ERROR: testbed failure: testbed auxverb failed with exit code 255
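When comparing how far ftracetest got on different kernels, as in the 5.4 versus 5.8 comparison for this bug, it can help to pull the last reported test out of each log automatically. The following is only a minimal sketch in Python: the function names are hypothetical, and it assumes the log has one result per line in the `HH:MM:SS DEBUG| [stdout] # [N] name [STATUS]` shape quoted in this thread.

```python
import re

# Matches result lines such as:
#   18:41:33 DEBUG| [stdout] # [45] Kretprobe dynamic event with maxactive [PASS]
# Assumption: one result per line, ending in a bracketed upper-case status.
RESULT_RE = re.compile(r"#\s*\[(\d+)\]\s+(.*?)\s*\[([A-Z]+)\]\s*$")


def parse_results(log_text):
    """Return (number, name, status) for every ftracetest result line found."""
    results = []
    for line in log_text.splitlines():
        match = RESULT_RE.search(line)
        if match:
            results.append((int(match.group(1)), match.group(2), match.group(3)))
    return results


def last_reported_test(log_text):
    """Return the final result line in the log, or None if there is none."""
    results = parse_results(log_text)
    return results[-1] if results else None


if __name__ == "__main__":
    sample = (
        "18:41:32 DEBUG| [stdout] # [44] Kretprobe dynamic event with arguments [PASS]\n"
        "18:41:33 DEBUG| [stdout] # [45] Kretprobe dynamic event with maxactive [PASS]\n"
    )
    print(last_reported_test(sample))
```

If a run stops right after a given result, the test that follows it in a log from a working kernel is the first candidate to check, though test numbering may differ between kernel versions.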
[Kernel-packages] [Bug 1805806] Re: test_maps in ubuntu_bpf failed with "Failed sockmap unexpected timeout" on D ARM64
Seen with linux-aws 5.4.0-1038.40~18.04.1. ** Tags added: sru-20210125 -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1805806 Title: test_maps in ubuntu_bpf failed with "Failed sockmap unexpected timeout" on D ARM64 Status in ubuntu-kernel-tests: Triaged Status in linux package in Ubuntu: Invalid Bug description: This issue can be found on 2 different ARM64 node, TunderX Cavium node "starmie" and Moonshot "ms10-34-mcdivittB0-kernel" Running test_maps bpf test.. Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 
tasks to 'test_hashmap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_sizes' Fork 100 tasks to 'test_hashmap_walk' Fork 100 tasks to 'test_arraymap' Fork 100 tasks to 'test_arraymap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 100 tasks to 'test_hashmap_sizes' Fork 100 tasks to 'test_hashmap_walk' Fork 100 tasks to 'test_arraymap' Fork 100 tasks to 'test_arraymap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 100 tasks to 'test_hashmap_sizes' Fork 100 tasks to 'test_hashmap_walk' Fork 100 tasks to 'test_arraymap' Fork 100 tasks to 'test_arraymap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 100 tasks to 'test_hashmap_sizes' Fork 100 tasks to 'test_hashmap_walk' Fork 100 tasks to 'test_arraymap' Fork 100 tasks to 'test_arraymap_percpu' Fork 1024 tasks to 'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 100 tasks to 'test_hashmap_sizes' Fork 100 tasks to 'test_hashmap_walk' Fork 100 tasks to 'test_arraymap' Fork 100 tasks to 'test_arraymap_percpu' Fork 1024 tasks to 
'test_update_delete' Fork 1024 tasks to 'test_update_delete' Fork 100 tasks to 'test_hashmap' Fork 100 tasks to 'test_hashmap_percpu' Fork 100 tasks to 'test_hashmap_sizes' Fork 100 tasks to 'test_hashmap_walk' Fork 100 tasks to 'test_arraymap' Fork 100 tasks to 'test_arraymap_percpu' Failed sockmap unexpected timeout ProblemType: Bug DistroRelease: Ubuntu 18.10 Package: linux-image-4.18.0-11-generic 4.18.0-11.12 ProcVersionSignature: User Name 4.18.0-11.12-generic 4.18.12 Uname: Linux 4.18.0-11-generic aarch64 AlsaDevices: total 0 crw-rw 1 root
[Kernel-packages] [Bug 1844493] Re: ubuntu_sysdig_smoke_test failed on 5.3 / 5.4 / 5.6 /5.8 kernels
** Tags added: sru-20210125 -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-gcp in Ubuntu. https://bugs.launchpad.net/bugs/1844493 Title: ubuntu_sysdig_smoke_test failed on 5.3 / 5.4 / 5.6 /5.8 kernels Status in ubuntu-kernel-tests: Triaged Status in linux package in Ubuntu: Incomplete Status in linux-aws package in Ubuntu: New Status in linux-azure package in Ubuntu: New Status in linux-gcp package in Ubuntu: New Status in linux source package in Eoan: Incomplete Status in linux-aws source package in Eoan: New Status in linux-azure source package in Eoan: New Status in linux-gcp source package in Eoan: New Status in linux source package in Focal: New Status in linux-aws source package in Focal: New Status in linux-azure source package in Focal: New Status in linux-gcp source package in Focal: New Bug description: Test failed with: FAILED (trace at least 25 reads of /dev/zero by dd) FAILED (trace at least 25 writes to /dev/null by dd) Steps: sudo apt-get install git python-minimal python-yaml gdb -y git clone --depth=1 git://kernel.ubuntu.com/ubuntu/autotest-client-tests git clone --depth=1 git://kernel.ubuntu.com/ubuntu/autotest rm -fr autotest/client/tests ln -sf ~/autotest-client-tests autotest/client/tests AUTOTEST_PATH=/home/ubuntu/autotest sudo -E autotest/client/autotest-local --verbose autotest/client/tests/ubuntu_sysdig_smoke_test/control Test output: == sysdig smoke test to trace dd, cat, read and writes == Limiting raw capture file to 16384 blocks Try 1 of 10 Sysdig capture started after 1 seconds wait Raw capture file is 16 Mbytes Converted events file is 18 Mbytes Try 2 of 10 Sysdig capture started after 1 seconds wait Raw capture file is 16 Mbytes Converted events file is 22 Mbytes Try 3 of 10 Sysdig capture started after 1 seconds wait Raw capture file is 16 Mbytes Converted events file is 21 Mbytes Try 4 of 10 Sysdig capture started after 1 seconds wait Raw capture file is 16 Mbytes 
Converted events file is 21 Mbytes Try 5 of 10 Sysdig capture started after 1 seconds wait Raw capture file is 16 Mbytes Converted events file is 21 Mbytes Try 6 of 10 Sysdig capture started after 1 seconds wait Raw capture file is 16 Mbytes Converted events file is 21 Mbytes Try 7 of 10 Sysdig capture started after 1 seconds wait Raw capture file is 16 Mbytes Converted events file is 21 Mbytes Try 8 of 10 Sysdig capture started after 1 seconds wait Raw capture file is 16 Mbytes Converted events file is 21 Mbytes Try 9 of 10 Sysdig capture started after 1 seconds wait Raw capture file is 16 Mbytes Converted events file is 21 Mbytes Try 10 of 10 Sysdig capture started after 1 seconds wait Raw capture file is 16 Mbytes Converted events file is 21 Mbytes Found: 279845 sysdig events 29882 context switches 0 reads from /dev/zero by dd 0 writes to /dev/null by dd PASSED (trace at least 25 context switches) FAILED (trace at least 25 reads of /dev/zero by dd) FAILED (trace at least 25 writes to /dev/null by dd) Summary: 1 passed, 2 failed ProblemType: Bug DistroRelease: Ubuntu 19.10 Package: linux-image-5.3.0-10-generic 5.3.0-10.11 ProcVersionSignature: User Name 5.3.0-10.11-generic 5.3.0-rc8 Uname: Linux 5.3.0-10-generic x86_64 AlsaDevices: total 0 crw-rw 1 root audio 116, 1 Sep 18 08:10 seq crw-rw 1 root audio 116, 33 Sep 18 08:10 timer AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay' ApportVersion: 2.20.11-0ubuntu7 Architecture: amd64 ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord' AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1: CurrentDmesg: Date: Wed Sep 18 08:18:54 2019 IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig' MachineType: Intel Corporation S1200RP PciMultimedia: ProcFB: 0 mgag200drmfb ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.3.0-10-generic root=UUID=b0d2ae4e-12dd-423e-acea-272ee8b2a893 ro 
RelatedPackageVersions: linux-restricted-modules-5.3.0-10-generic N/A linux-backports-modules-5.3.0-10-generic N/A linux-firmware1.182 RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill' SourcePackage: linux UpgradeStatus: No upgrade log present (probably fresh install) dmi.bios.date: 07/01/2015 dmi.bios.vendor: Intel Corp. dmi.bios.version: S1200RP.86B.03.02.0003.070120151022 dmi.board.asset.tag: dmi.board.name: S1200RP dmi.board.vendor: Intel Corporation dmi.board.version: G62254-407 dmi.chassis.asset.tag: dmi.chassis.type: 17
[Kernel-packages] [Bug 1905728] Re: Found insecure W+X mapping at address on Groovy RISCV
** Attachment added: "dmesg-5.8.0-17-generic" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1905728/+attachment/5464855/+files/dmesg -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1905728 Title: Found insecure W+X mapping at address on Groovy RISCV Status in linux package in Ubuntu: Confirmed Bug description: Issue found on 5.8.0-10-generic riscv Message reported on boot. [ 13.483103] [ cut here ] [ 13.483711] riscv/mm: Found insecure W+X mapping at address (ptrval)/0xffdff800 [ 13.484542] WARNING: CPU: 5 PID: 1 at arch/riscv/mm/ptdump.c:200 note_page+0x24c/0x252 [ 13.485175] Modules linked in: [ 13.485606] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.8.0-10-generic #12-Ubuntu [ 13.486091] epc: ffe000208f18 ra : ffe000208f18 sp : ffe1f5bfbb30 [ 13.486471] gp : ffe001728ee0 tp : ffe1f5bf5080 t0 : ffe00173ed88 [ 13.486850] t1 : ffe00173ed20 t2 : 0001fecbe000 s0 : ffe1f5bfbb80 [ 13.487250] s1 : ffe1f5bfbe10 a0 : 0053 a1 : 0020 [ 13.487633] a2 : ffe1f5bfb870 a3 : a4 : ffe0016200f8 [ 13.488040] a5 : ffe0016200f8 a6 : 00b5 a7 : ffe0006f2806 [ 13.488421] s2 : ffdff8001000 s3 : s4 : 0004 [ 13.488800] s5 : s6 : s7 : ffe1f5bfbd20 [ 13.489322] s8 : ffdff8001000 s9 : ffe00172a148 s10: ffdff8002000 [ 13.489738] s11: ffe000c16e20 t3 : 0003cec0 t4 : 0003cec0 [ 13.490119] t5 : t6 : ffe001739462 [ 13.490406] status: 0120 badaddr: cause: 0003 [ 13.490849] ---[ end trace 607c551edff1ef12 ]--- Please find attachment for the boot dmesg log. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1905728/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1905728] Re: Found insecure W+X mapping at address on Groovy RISCV
Still seeing this with the 5.8.0-17-generic riscv kernel on groovy. See attached dmesg. ** Changed in: linux (Ubuntu) Status: Expired => Confirmed
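To check whether a newer kernel still emits the warning reported in this bug, a saved dmesg can be scanned for the message text. This is a rough sketch in Python, not part of any existing test suite: the function name is hypothetical, and it assumes the message wording matches the "Found insecure W+X mapping at address" line quoted in the bug description.

```python
import re

# Assumption: the kernel prints the warning with this exact wording,
# followed by the address field, as in the dmesg quoted in this bug.
WX_RE = re.compile(r"Found insecure W\+X mapping at address (\S+)")


def find_wx_warnings(dmesg_text):
    """Return the address field of every W+X mapping warning in the text."""
    return [match.group(1) for match in WX_RE.finditer(dmesg_text)]


if __name__ == "__main__":
    sample = (
        "[ 13.483711] riscv/mm: Found insecure W+X mapping at address "
        "(ptrval)/0xffdff800\n"
    )
    print(find_wx_warnings(sample))
```

An empty list means the saved log contains no such warning; it says nothing about mappings the kernel did not report.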
[Kernel-packages] [Bug 1748103] Re: apic test in kvm-unit-test failed with timeout
Seen with linux-oracle 4.15.0-1065.73~16.04.1. ** Tags added: sru-20210125 -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-azure in Ubuntu. https://bugs.launchpad.net/bugs/1748103 Title: apic test in kvm-unit-test failed with timeout Status in ubuntu-kernel-tests: In Progress Status in linux package in Ubuntu: Incomplete Status in linux-azure package in Ubuntu: New Status in linux-azure-edge package in Ubuntu: New Status in linux source package in Xenial: New Status in linux-azure source package in Xenial: New Status in linux-azure-edge source package in Xenial: New Status in linux source package in Bionic: New Status in linux-azure source package in Bionic: New Status in linux-azure-edge source package in Bionic: New Bug description: Per Joshua's comment in bug 1719524, "Nested KVM can only be tried on instance sizes with nested Hypervisor support: Ev3 and Dv3." The instance here is an E4v3, and I can start KVM on it, but the apic test times out on it. Steps: 1. git clone --depth=1 https://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git 2. cd kvm-unit-tests; ./configure; make 3. 
Run the apic test as root: # TESTNAME=apic TIMEOUT=30 ACCEL= ./x86/run x86/apic.flat -smp 2 -cpu qemu64,+x2apic,+tsc-deadline timeout -k 1s --foreground 30 /usr/bin/qemu-system-x86_64 -nodefaults -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -machine accel=kvm -kernel x86/apic.flat -smp 2 -cpu qemu64,+x2apic,+tsc-deadline # -initrd /tmp/tmp.onXtr5JVp7 enabling apic enabling apic paging enabled cr0 = 80010011 cr3 = 459000 cr4 = 20 apic version: 1050014 PASS: apic existence PASS: xapic id matches cpuid PASS: writeable xapic id PASS: non-writeable x2apic id PASS: sane x2apic id FAIL: x2apic id matches cpuid PASS: correct xapic id after reset PASS: apic_disable: Local apic enabled PASS: apic_disable: CPUID.1H:EDX.APIC[bit 9] is set PASS: apic_disable: Local apic disabled PASS: apic_disable: CPUID.1H:EDX.APIC[bit 9] is clear PASS: apic_disable: Local apic enabled PASS: apic_disable: CPUID.1H:EDX.APIC[bit 9] is set x2apic enabled PASS: x2apic enabled to invalid state PASS: x2apic enabled to apic enabled PASS: disabled to invalid state PASS: disabled to x2apic enabled PASS: apic enabled to invalid state PASS: apicbase: relocate apic PASS: apicbase: reserved physaddr bits PASS: apicbase: reserved low bits PASS: self ipi starting broadcast (x2apic) PASS: APIC physical broadcast address PASS: APIC physical broadcast shorthand PASS: nmi-after-sti qemu-system-x86_64: terminating on signal 15 from pid 7246 ProblemType: Bug DistroRelease: Ubuntu 16.04 Package: linux-image-4.14.0-1004-azure-edge 4.14.0-1004.4 ProcVersionSignature: User Name 4.14.0-1004.4-username-edge 4.14.14 Uname: Linux 4.14.0-1004-azure-edge x86_64 ApportVersion: 2.20.1-0ubuntu2.15 Architecture: amd64 Date: Thu Feb 8 06:00:55 2018 ProcEnviron: TERM=xterm-256color PATH=(custom, no user) XDG_RUNTIME_DIR= LANG=en_US.UTF-8 SHELL=/bin/bash SourcePackage: linux-azure-edge UpgradeStatus: No upgrade log present (probably fresh install) --- ApportVersion: 
2.20.1-0ubuntu2.15 Architecture: amd64 DistroRelease: Ubuntu 16.04 Package: linux-azure-edge PackageArchitecture: amd64 ProcEnviron: TERM=xterm-256color PATH=(custom, no user) XDG_RUNTIME_DIR= LANG=en_US.UTF-8 SHELL=/bin/bash ProcVersionSignature: User Name 4.13.0-1009.12-username 4.13.13 Tags: xenial uec-images Uname: Linux 4.13.0-1009-azure x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: adm audio cdrom dialout dip floppy libvirtd lxd netdev plugdev sudo video _MarkForUpload: True To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1748103/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
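The apic output quoted in this bug mixes per-check PASS/FAIL lines with a final QEMU termination message. A small helper can separate the two, so that a timeout is not confused with an ordinary check failure. This is a minimal Python sketch with a hypothetical function name. It assumes one message per line in the real output, and it assumes that "terminating on signal 15" means the `timeout` wrapper shown in the command line killed QEMU.

```python
def summarize_kvm_unit_test(output):
    """Split kvm-unit-tests console output into passed, failed and timeout info."""
    passed, failed = [], []
    timed_out = False
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if line.startswith("PASS: "):
            passed.append(line[len("PASS: "):])
        elif line.startswith("FAIL: "):
            failed.append(line[len("FAIL: "):])
        elif "terminating on signal 15" in line:
            # Assumption: SIGTERM here came from the `timeout` wrapper.
            timed_out = True
    return {"passed": passed, "failed": failed, "timed_out": timed_out}


if __name__ == "__main__":
    sample = "\n".join([
        "PASS: sane x2apic id",
        "FAIL: x2apic id matches cpuid",
        "PASS: nmi-after-sti",
        "qemu-system-x86_64: terminating on signal 15 from pid 7246",
    ])
    print(summarize_kvm_unit_test(sample))
```

For the run in this report, such a summary would show one failed check (x2apic id matches cpuid) in addition to the timeout, and the last passing check before the kill was nmi-after-sti.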
[Kernel-packages] [Bug 1831449] Re: memory in ubuntu_kvm_unit_tests fails
Seen with linux-oracle 4.15.0-1065.73~16.04.1.

** Tags added: sru-20210125

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-oracle in Ubuntu.
https://bugs.launchpad.net/bugs/1831449

Title:
  memory in ubuntu_kvm_unit_tests fails

Status in ubuntu-kernel-tests:
  Triaged
Status in linux-kvm package in Ubuntu:
  New
Status in linux-oracle package in Ubuntu:
  New

Bug description:
  Need to run this on oracle manually to get the full output:

  TESTNAME=memory TIMEOUT=90s ACCEL= ./x86/run x86/memory.flat -smp 1 -cpu host
  FAIL memory (8 tests, 2 unexpected failures)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1831449/+subscriptions

[Kernel-packages] [Bug 1748105] Re: port80 test in ubuntu_kvm_unit_tests failed with timeout
Seen with linux-oracle 4.15.0-1065.73~16.04.1.

** Tags added: sru-20210125

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1748105

Title:
  port80 test in ubuntu_kvm_unit_tests failed with timeout

Status in ubuntu-kernel-tests:
  Triaged
Status in linux package in Ubuntu:
  Incomplete
Status in linux-azure package in Ubuntu:
  Confirmed
Status in linux-azure-edge package in Ubuntu:
  Confirmed
Status in linux-kvm package in Ubuntu:
  Confirmed
Status in linux-oracle-5.0 package in Ubuntu:
  Confirmed

Bug description:
  Per Joshua's comment in bug 1719524: "Nested KVM can only be tried on
  instance sizes with nested Hypervisor support: Ev3 and Dv3." The
  instance here is an E4v3, and I can start a KVM on it, but the port80
  test times out.

  Steps:
   1. git clone --depth=1 https://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git
   2. cd kvm-unit-tests; ./configure; make
   3. Run the port80 test as root:

  # TESTNAME=port80 TIMEOUT=90s ACCEL= ./x86/run x86/port80.flat -smp 1
  timeout -k 1s --foreground 90s /usr/bin/qemu-system-x86_64 -nodefaults -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -machine accel=kvm -kernel x86/port80.flat -smp 1 # -initrd /tmp/tmp.3p9PWc2SRi
  enabling apic
  begining port 0x80 write test
  qemu-system-x86_64: terminating on signal 15 from pid 7790

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.14.0-1004-azure-edge 4.14.0-1004.4
  ProcVersionSignature: User Name 4.14.0-1004.4-username-edge 4.14.14
  Uname: Linux 4.14.0-1004-azure-edge x86_64
  ApportVersion: 2.20.1-0ubuntu2.15
  Architecture: amd64
  Date: Thu Feb 8 06:13:18 2018
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  SourcePackage: linux-azure-edge
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1748105/+subscriptions

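The "terminating on signal 15" line in the port80 log comes from the `timeout -k 1s --foreground 90s` wrapper: it sends SIGTERM once the time limit expires and follows up with SIGKILL one second later if the process is still alive. A minimal Python sketch of that behaviour, assuming a POSIX host; the function name `run_with_timeout` and the sleeping child used as a stand-in for a hung test are hypothetical, and this is not how kvm-unit-tests itself is implemented:

```python
import signal
import subprocess
import sys


def run_with_timeout(cmd, timeout_s, kill_after_s=1.0):
    """Run cmd; send SIGTERM after timeout_s, then SIGKILL kill_after_s later.

    Returns the child's return code. On POSIX this is the negative signal
    number when the child was terminated by a signal.
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.terminate()  # SIGTERM, i.e. signal 15
    try:
        return proc.wait(timeout=kill_after_s)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL, mirroring the `-k 1s` fallback
        return proc.wait()


if __name__ == "__main__":
    # Hypothetical stand-in for a hung test: a child that sleeps for 30 s.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    rc = run_with_timeout(hung, timeout_s=0.5)
    print("terminated by SIGTERM:", rc == -signal.SIGTERM)
```

A return code of `-signal.SIGTERM` therefore means the command ran out of time rather than failing on its own, which matches what the port80 log shows.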
[Kernel-packages] [Bug 1827979] Re: pcid in ubuntu_kvm_unit_tests failed on B-KVM / X-4.15-oracle / B-oracle-5.3
Seen with linux-oracle 4.15.0-1065.73~16.04.1.

** Tags added: sru-20210125

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1827979

Title:
  pcid in ubuntu_kvm_unit_tests failed on B-KVM / X-4.15-oracle /
  B-oracle-5.3

Status in ubuntu-kernel-tests:
  New
Status in linux-kvm package in Ubuntu:
  New

Bug description:
  FAIL pcid (3 tests, 1 unexpected failures)

  # TESTNAME=pcid TIMEOUT=90s ACCEL= ./x86/run x86/pcid.flat -smp 1 -cpu qemu64,+pcid
  timeout -k 1s --foreground 90s /usr/bin/qemu-system-x86_64 -nodefaults -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -machine accel=kvm -kernel x86/pcid.flat -smp 1 -cpu qemu64,+pcid # -initrd /tmp/tmp.a4xQyF7juj
  qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.pcid [bit 17]
  qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:ECX.svm [bit 2]
  enabling apic
  PASS: CPUID consistency
  PASS: Test on PCID when disabled
  FAIL: Test on INVPCID when disabled
  SUMMARY: 3 tests, 1 unexpected failures

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-1032-kvm 4.15.0-1032.32
  ProcVersionSignature: User Name 4.15.0-1032.32-kvm 4.15.18
  Uname: Linux 4.15.0-1032-kvm x86_64
  ApportVersion: 2.20.9-0ubuntu7.6
  Architecture: amd64
  Date: Tue May 7 03:31:38 2019
  SourcePackage: linux-kvm
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1827979/+subscriptions

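When triaging logs like the pcid one above, it can help to tally the `PASS:` / `FAIL:` lines and check them against the `SUMMARY:` line. A minimal Python sketch, assuming only the line prefixes visible in the pasted output; the helper names `tally` and `parse_summary` are hypothetical, and other result prefixes the test suite may emit are deliberately ignored here:

```python
import re


def tally(log_text):
    """Return (total, failed_names) counted from PASS:/FAIL: lines."""
    passed = 0
    failed = []
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("PASS:"):
            passed += 1
        elif line.startswith("FAIL:"):
            failed.append(line[len("FAIL:"):].strip())
    return passed + len(failed), failed


def parse_summary(log_text):
    """Return (tests, unexpected_failures) from the SUMMARY line, or None."""
    match = re.search(r"SUMMARY: (\d+) tests, (\d+) unexpected failures", log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Excerpt copied from the pcid log in the bug description.
SAMPLE = """\
enabling apic
PASS: CPUID consistency
PASS: Test on PCID when disabled
FAIL: Test on INVPCID when disabled
SUMMARY: 3 tests, 1 unexpected failures
"""

if __name__ == "__main__":
    total, failed = tally(SAMPLE)
    print(total, failed, parse_summary(SAMPLE))
```

A log that was cut short by the timeout wrapper has no `SUMMARY:` line, so `parse_summary` returns `None` for it, while `tally` still reports the results printed up to that point.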
[Kernel-packages] [Bug 1837035] Re: memcg_stat_rss from controllers in ubuntu_ltp failed
Seen with linux-oracle 4.15.0-1065.73.

** Tags added: sru-20210125

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1837035

Title:
  memcg_stat_rss from controllers in ubuntu_ltp failed

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Confirmed
Status in linux-aws package in Ubuntu:
  New

Bug description:
  This issue was spotted on an i386 node "pepe" with the Disco kernel.
  It failed with:

  memcg_process: shmget() failed: Invalid argument
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 168: kill: No such process
  memcg_stat_rss 4 TFAIL: Process 1845 exited with 1 after warm up

  <<>>
  tag=memcg_stat_rss stime=1563448062
  cmdline="memcg_stat_rss.sh"
  contacts=""
  analysis=exit
  <<>>
  memcg_stat_rss 1 TINFO: Starting test 1
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 1 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 1 TINFO: Running memcg_process --mmap-anon -s 135168
  memcg_stat_rss 1 TINFO: Warming up pid: 1784
  memcg_stat_rss 1 TINFO: Process is still here after warm up: 1784
  memcg_stat_rss 1 TPASS: rss is 135168 as expected
  memcg_stat_rss 2 TINFO: Starting test 2
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 2 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 2 TINFO: Running memcg_process --mmap-file -s 4096
  memcg_stat_rss 2 TINFO: Warming up pid: 1804
  memcg_stat_rss 2 TINFO: Process is still here after warm up: 1804
  memcg_stat_rss 2 TPASS: rss is 0 as expected
  memcg_stat_rss 3 TINFO: Starting test 3
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 3 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 3 TINFO: Running memcg_process --shm -k 3 -s 4096
  memcg_stat_rss 3 TINFO: Warming up pid: 1825
  memcg_stat_rss 3 TINFO: Process is still here after warm up: 1825
  memcg_stat_rss 3 TPASS: rss is 0 as expected
  memcg_stat_rss 4 TINFO: Starting test 4
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 4 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 4 TINFO: Running memcg_process --mmap-anon --mmap-file --shm -s 135168
  memcg_stat_rss 4 TINFO: Warming up pid: 1845
  memcg_process: shmget() failed: Invalid argument
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 168: kill: No such process
  memcg_stat_rss 4 TFAIL: Process 1845 exited with 1 after warm up
  memcg_stat_rss 5 TINFO: Starting test 5
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 5 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 5 TINFO: Running memcg_process --mmap-lock1 -s 135168
  memcg_stat_rss 5 TINFO: Warming up pid: 1858
  memcg_stat_rss 5 TINFO: Process is still here after warm up: 1858
  memcg_stat_rss 5 TPASS: rss is 135168 as expected
  memcg_stat_rss 6 TINFO: Starting test 6
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 6 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 6 TINFO: Running memcg_process --mmap-anon -s 135168
  memcg_stat_rss 6 TINFO: Warming up pid: 1878
  memcg_stat_rss 6 TINFO: Process is still here after warm up: 1878
  memcg_stat_rss 6 TPASS: rss is 135168 as expected
  memcg_stat_rss 7 TPASS: rss is 0 as expected
  memcg_stat_rss 8 TINFO: Starting test 7
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 8 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 8 TINFO: Running memcg_process --mmap-file -s 4096
  memcg_stat_rss 8 TINFO: Warming up pid: 1901
  memcg_stat_rss 8 TINFO: Process is still here after warm up: 1901
  memcg_stat_rss 8 TPASS: rss is 0 as expected
  memcg_stat_rss 9 TPASS: rss is 0 as expected
  memcg_stat_rss 10 TINFO: Starting test 8
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 10 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 10 TINFO: Running memcg_process --shm -k 8 -s 4096
  memcg_stat_rss 10 TINFO: Warming up pid: 1925
  memcg_stat_rss 10 TINFO: Process is still here after warm up: 1925
  memcg_stat_rss 10 TPASS: rss is 0 as expected
  memcg_stat_rss 11 TPASS: rss is 0 as expected
  memcg_stat_rss 12 TINFO: Starting test 9
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 12 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 12 TINFO: Running memcg_process --mmap-anon --mmap-file --shm -s 135168
  memcg_stat_rss 12 TINFO: Warming up pid: 1948
  memcg_process: shmget() failed: Invalid argument
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 168: kill: No such process
  memcg_stat_rss 12 TFAIL: Process 1948 exited with 1 after warm up
  memcg_stat_rss 13 TINFO: Starting test

[Kernel-packages] [Bug 1829995] Re: getaddrinfo_01 from ipv6_lib test suite in LTP failed
Seen with linux-oracle 4.15.0-1065.73.

** Tags added: oracle sru-20210125

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1829995

Title:
  getaddrinfo_01 from ipv6_lib test suite in LTP failed

Status in ubuntu-kernel-tests:
  Triaged
Status in linux package in Ubuntu:
  Incomplete
Status in linux-aws package in Ubuntu:
  New
Status in linux source package in Bionic:
  Incomplete
Status in linux-aws source package in Bionic:
  New
Status in linux source package in Eoan:
  New
Status in linux-aws source package in Eoan:
  New

Bug description:
  startup='Wed May 22 08:02:52 2019'
  getaddrinfo_01 1 TPASS : getaddrinfo IPv4 basic lookup
  getaddrinfo_01 2 TFAIL : getaddrinfo_01.c:140: getaddrinfo IPv4 canonical name ("curly.maas") doesn't match hostname ("curly")
  getaddrinfo_01 3 TFAIL : getaddrinfo_01.c:578: getaddrinfo IPv6 basic lookup ("curly") returns -5 ("No address associated with hostname")
  tag=getaddrinfo_01 stime=1558512172 dur=1 exit=exited stat=1 core=no cu=0 cs=0

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-50-generic 4.15.0-50.54
  ProcVersionSignature: User Name 4.15.0-50.54-generic 4.15.18
  Uname: Linux 4.15.0-50-generic x86_64
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116, 1 May 22 02:57 seq
   crw-rw 1 root audio 116, 33 May 22 02:57 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.6
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CurrentDmesg:
  Date: Wed May 22 08:04:30 2019
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
  MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
  PciMultimedia:
  ProcFB: 0 cirrusdrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-50-generic root=UUID=57e8-9e7f-40ee-934e-f1dce18323e5 ro
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-50-generic N/A
   linux-backports-modules-4.15.0-50-generic N/A
   linux-firmware 1.173.6
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 04/01/2014
  dmi.bios.vendor: SeaBIOS
  dmi.bios.version: Ubuntu-1.8.2-1ubuntu1
  dmi.chassis.type: 1
  dmi.chassis.vendor: QEMU
  dmi.chassis.version: pc-i440fx-xenial
  dmi.modalias: dmi:bvnSeaBIOS:bvrUbuntu-1.8.2-1ubuntu1:bd04/01/2014:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-xenial:cvnQEMU:ct1:cvrpc-i440fx-xenial:
  dmi.product.name: Standard PC (i440FX + PIIX, 1996)
  dmi.product.version: pc-i440fx-xenial
  dmi.sys.vendor: QEMU

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1829995/+subscriptions