[Kernel-packages] [Bug 2059978] Re: linux-aws-5.15 ADT test MISS because it's unable to find package

2024-04-04 Thread Francis Ginther
@paride: Yes, I've seen this with other kernels, mostly with the nvidia
drivers. I think all of the runs of the following since March 20 show
this problem:

https://autopkgtest.ubuntu.com/packages/n/nvidia-graphics-drivers-510-server/focal/amd64
https://autopkgtest.ubuntu.com/packages/n/nvidia-graphics-drivers-515/focal/amd64
https://autopkgtest.ubuntu.com/packages/n/nvidia-graphics-drivers-460-server/focal/amd64

As mentioned in MM, all of these started failing between March 11 and
March 20. All of these drivers are transitional packages (they don't
really do anything other than depend on the next driver in the series to
keep users on a supported driver), but I don't know if this is relevant
to the problem, as I've seen this with other packages too.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2059978

Title:
  linux-aws-5.15 ADT test MISS because it's unable to find package

Status in Auto Package Testing:
  Invalid
Status in autopkgtest package in Ubuntu:
  Confirmed
Status in linux package in Ubuntu:
  New

Bug description:
  SRU cycle 2024.03.04 Focal aws-5.15 ADT test linux-aws-5.15 for both
  amd64 and arm64 results in a MISS with the error message below. It
  seems the test was unable to locate the
  kernel-testing--linux-aws-5.15--modules-extra--preferred$ package, which
  led to other missing packages and the test erroring out with exit code 1.
  This test has been SKIPPED before, but seemingly for different reasons.

  I have also attached the full amd64 log output to this bug report.

  "339s Reading state information...
  339s E: Unable to locate package ^kernel-testing--linux-aws-5.15--modules-extra--preferred$
  339s E: Couldn't find any package by glob '^kernel-testing--linux-aws-5.15--modules-extra--preferred$'
  339s E: Couldn't find any package by regex '^kernel-testing--linux-aws-5.15--modules-extra--preferred$'
  339s Reading package lists...
  339s Building dependency tree...
  339s Reading state information...
  339s E: Unable to locate package ^linux-modules-extra-aws-5.15$
  339s E: Couldn't find any package by glob '^linux-modules-extra-aws-5.15$'
  339s E: Couldn't find any package by regex '^linux-modules-extra-aws-5.15$'
  339s autopkgtest [16:53:45]: rebooting testbed after setup commands that affected boot
  363s autopkgtest [16:54:09]: testbed running kernel: Linux 5.15.0-1057-aws #63~20.04.1-Ubuntu SMP Mon Mar 25 10:28:36 UTC 2024
  363s autopkgtest [16:54:09]:  apt-source linux-aws-5.15
  364s blame: linux-aws-5.15
  364s badpkg: rules extract failed with exit code 1
  364s autopkgtest [16:54:10]: ERROR: erroneous package: rules extract failed with exit code 1"
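
The package expressions that apt failed to resolve can be pulled out of a log like the excerpt above with a few lines of Python. This is only an illustrative sketch, not part of the original report: the function name `missing_packages` is hypothetical, and it assumes the error text keeps the "E: Unable to locate package" wording seen in this log.

```python
import re

# Matches apt's "E: Unable to locate package <expr>" error and captures <expr>.
# \s+ also spans a line break, so mail-wrapped log lines still match.
UNABLE_RE = re.compile(r"E: Unable to locate package\s+(\S+)")


def missing_packages(log_text):
    """Return the package expressions apt could not locate, in order, without duplicates."""
    found = []
    for match in UNABLE_RE.finditer(log_text):
        # The log shows anchored expressions such as ^name$; strip the anchors.
        name = match.group(1).strip("^$")
        if name not in found:
            found.append(name)
    return found


# Two lines taken from the log excerpt in this bug report.
sample = (
    "339s E: Unable to locate package ^kernel-testing--linux-aws-5.15--modules-extra--preferred$\n"
    "339s E: Unable to locate package ^linux-modules-extra-aws-5.15$\n"
)
print(missing_packages(sample))
```

Running this over the full attached log would show whether any package names beyond these two are affected.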

To manage notifications about this bug go to:
https://bugs.launchpad.net/auto-package-testing/+bug/2059978/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 2052640] Re: New NVIDIA release 470.239.06

2024-02-26 Thread Francis Ginther
I have done my typical CUDA based testing with this package using the
generic, nvidia and gcp kernels using bionic, focal, jammy and mantic
(amd64 only so far).

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-470 in Ubuntu.
https://bugs.launchpad.net/bugs/2052640

Title:
  New NVIDIA release 470.239.06

Status in fabric-manager-470 package in Ubuntu:
  In Progress
Status in libnvidia-nscq-470 package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-470 package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-470-server package in Ubuntu:
  In Progress
Status in fabric-manager-470 source package in Bionic:
  In Progress
Status in libnvidia-nscq-470 source package in Bionic:
  In Progress
Status in nvidia-graphics-drivers-470 source package in Bionic:
  In Progress
Status in nvidia-graphics-drivers-470-server source package in Bionic:
  In Progress
Status in fabric-manager-470 source package in Focal:
  In Progress
Status in libnvidia-nscq-470 source package in Focal:
  In Progress
Status in nvidia-graphics-drivers-470 source package in Focal:
  In Progress
Status in nvidia-graphics-drivers-470-server source package in Focal:
  In Progress
Status in fabric-manager-470 source package in Jammy:
  In Progress
Status in libnvidia-nscq-470 source package in Jammy:
  In Progress
Status in nvidia-graphics-drivers-470 source package in Jammy:
  In Progress
Status in nvidia-graphics-drivers-470-server source package in Jammy:
  In Progress
Status in fabric-manager-470 source package in Mantic:
  In Progress
Status in libnvidia-nscq-470 source package in Mantic:
  In Progress
Status in nvidia-graphics-drivers-470 source package in Mantic:
  In Progress
Status in nvidia-graphics-drivers-470-server source package in Mantic:
  In Progress
Status in fabric-manager-470 source package in Noble:
  In Progress
Status in libnvidia-nscq-470 source package in Noble:
  In Progress
Status in nvidia-graphics-drivers-470 source package in Noble:
  In Progress
Status in nvidia-graphics-drivers-470-server source package in Noble:
  In Progress

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  Note: as this is a legacy driver, the hardware available to the QA team
  might be limited, if not nonexistent. Tests on GKE might be suitable, as
  GKE still defaults to the 470 series.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/fabric-manager-470/+bug/2052640/+subscriptions




[Kernel-packages] [Bug 2029934] Re: arm64 AWS host hangs during modprobe nvidia on lunar and mantic

2024-01-26 Thread Francis Ginther
I can reproduce the failure on mantic with both the DKMS and LRM
drivers. Specifically, this is how I'm installing them:

for DKMS:
sudo DEBIAN_FRONTEND=noninteractive apt-get install -y nvidia-driver-535-server

for LRM:
sudo DEBIAN_FRONTEND=noninteractive apt-get install -y nvidia-headless-no-dkms-535-server linux-modules-nvidia-535-server-generic nvidia-utils-535-server

I'm intentionally not using `ubuntu-drivers` to isolate this testing to
just the installation and functioning of the drivers.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/2029934

Title:
  arm64 AWS host hangs during modprobe nvidia on lunar and mantic

Status in linux-aws package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-525 package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-525-server package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-535 package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-535-server package in Ubuntu:
  Confirmed

Bug description:
  Loading the nvidia driver dkms modules with "modprobe nvidia" will
  result in the host hanging and being completely unusable. This was
  reproduced using both the linux generic and linux-aws kernels on lunar
  and mantic using an AWS g5g.xlarge instance.

  To reproduce using the generic kernel:
  # Deploy an arm64 host with an nvidia gpu, such as an AWS g5g.xlarge.

  # Install the linux generic kernel from lunar-updates:
  $ sudo DEBIAN_FRONTEND=noninteractive apt-get install -y -o DPkg::Options::=--force-confold linux-generic

  # Boot to the linux-generic kernel (this can be accomplished by removing the existing kernel, in this case it was the linux-aws 6.2.0-1008-aws kernel)
  $ sudo DEBIAN_FRONTEND=noninteractive apt-get purge -y -o DPkg::Options::=--force-confold linux-aws linux-aws-headers-6.2.0-1008 linux-headers-6.2.0-1008-aws linux-headers-aws linux-image-6.2.0-1008-aws linux-image-aws linux-modules-6.2.0-1008-aws linux-headers-6.2.0-1008-aws linux-image-6.2.0-1008-aws linux-modules-6.2.0-1008-aws
  $ reboot

  # Install the Nvidia 535-server driver DKMS package:
  $ sudo DEBIAN_FRONTEND=noninteractive apt-get install -y nvidia-driver-535-server

  # Enable the driver
  $ sudo modprobe nvidia

  # At this point the system will hang and never return.
  # A reboot instead of a modprobe will result in a system that never boots up all the way. I was able to recover the console logs from such a system and found (the full captured log is attached):

  [1.964942] nvidia: loading out-of-tree module taints kernel.
  [1.965475] nvidia: module license 'NVIDIA' taints kernel.
  [1.965905] Disabling lock debugging due to kernel taint
  [1.980905] nvidia: module verification failed: signature and/or required key missing - tainting kernel
  [2.012067] nvidia-nvlink: Nvlink Core is being initialized, major device number 510
  [2.012715] 
  [   62.025143] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  [   62.025807] rcu:   3-...0: (14 ticks this GP) idle=c04c/1/0x4000 softirq=653/654 fqs=3301
  [   62.026516](detected by 0, t=15003 jiffies, g=-699, q=216 ncpus=4)
  [   62.027018] Task dump for CPU 3:
  [   62.027290] task:systemd-udevd   state:R  running task stack:0 pid:164   ppid:144flags:0x000e
  [   62.028066] Call trace:
  [   62.028273]  __switch_to+0xbc/0x100
  [   62.028567]  0x228
  Timed out for waiting the udev queue being empty.
  Timed out for waiting the udev queue being empty.
  [  242.045143] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  [  242.045655] rcu:   3-...0: (14 ticks this GP) idle=c04c/1/0x4000 softirq=653/654 fqs=12303
  [  242.046373](detected by 1, t=60008 jiffies, g=-699, q=937 ncpus=4)
  [  242.046874] Task dump for CPU 3:
  [  242.047146] task:systemd-udevd   state:R  running task stack:0 pid:164   ppid:144flags:0x000f
  [  242.047922] Call trace:
  [  242.048128]  __switch_to+0xbc/0x100
  [  242.048417]  0x228
  Timed out for waiting the udev queue being empty.
  Begin: Loading essential drivers ... [  384.001142] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [modprobe:215]
  [  384.001738] Modules linked in: nvidia(POE+) crct10dif_ce video polyval_ce polyval_generic drm_kms_helper ghash_ce syscopyarea sm4 sysfillrect sha2_ce sysimgblt sha256_arm64 sha1_ce drm nvme nvme_core ena nvme_common aes_neon_bs aes_neon_blk aes_ce_blk aes_ce_cipher
  [  384.003513] CPU: 2 PID: 215 Comm: modprobe Tainted: P   OE  6.2.0-26-generic #26-Ubuntu
  [  384.004210] Hardware name: Amazon EC2 g5g.xlarge/, BIOS 1.0 11/1/2018
  [  384.004715] pstate: 8045 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  [  384.005259] pc : smp_call_function_many_cond+0x1b4/0x4b4
  [  384.005683] lr : smp_call_function_many_cond+0x1d0/0x4b4
  [  384.006108] sp : 889a
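
The stall and lockup messages in a console log like the one above can be located with a short script, which helps when comparing captures from several boots. This is an illustrative sketch rather than part of the original report: the function name `hang_events` is hypothetical, and the pattern only covers the two message types that appear in this log.

```python
import re

# A kernel timestamp followed by one of the two hang indicators seen in this log.
# \s+ between "soft" and "lockup" tolerates a mail-wrapped line.
EVENT_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+"
    r"(rcu: INFO: rcu_preempt detected stalls|watchdog: BUG: soft\s+lockup)"
)


def hang_events(console_log):
    """Return (seconds_since_boot, kind) for each RCU stall or soft lockup message."""
    events = []
    for match in EVENT_RE.finditer(console_log):
        kind = "rcu_stall" if match.group(2).startswith("rcu") else "soft_lockup"
        events.append((float(match.group(1)), kind))
    return events


# Lines taken from the console log in this bug report.
sample = """\
[   62.025143] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  242.045143] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
Begin: Loading essential drivers ... [  384.001142] watchdog: BUG: soft
lockup - CPU#2 stuck for 22s! [modprobe:215]
"""
for seconds, kind in hang_events(sample):
    print(f"{seconds:10.3f}s  {kind}")
```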

[Kernel-packages] [Bug 2042564] Re: Performance regression in the 5.15 Ubuntu 20.04 kernel compared to 5.4 Ubuntu 20.04 kernel

2023-12-08 Thread Francis Ginther
We are still looking into this issue. While we can reproduce the test
case and see a difference in performance, the delta is not as
significant and our results have not been very consistent. I'm taking the
approach of setting up a more comprehensive test environment to run more
tests faster. Hopefully we can then go through an analysis and bisect
process with more meaningful results.
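
For comparing runs, the headline numbers can be read straight from fio's write summary line. The following is a minimal sketch, assuming the summary keeps the `write: IOPS=..., BW=...MiB/s` form seen in the bug description quoted below; the helper names `parse_write_summary` and `relative_drop` are hypothetical, and other units (for example GiB/s) are not handled.

```python
import re

# Matches fio's write summary, e.g. "write: IOPS=317k, BW=1237MiB/s (...)".
SUMMARY_RE = re.compile(r"write: IOPS=(\d+(?:\.\d+)?)(k?), BW=(\d+(?:\.\d+)?)MiB/s")


def parse_write_summary(line):
    """Return (iops, bandwidth_mib_s) from a fio write summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a fio write summary: {line!r}")
    # fio abbreviates thousands with a trailing "k".
    iops = float(match.group(1)) * (1000 if match.group(2) else 1)
    return iops, float(match.group(3))


def relative_drop(before, after):
    """Fraction by which `after` is lower than `before`."""
    return (before - after) / before


# Summary lines copied from the 5.4 and 5.15 runs in the bug description.
old = parse_write_summary("write: IOPS=317k, BW=1237MiB/s (1297MB/s)(320GiB/264837msec); 0 zone resets")
new = parse_write_summary("write: IOPS=155k, BW=605MiB/s (634MB/s)(320GiB/541949msec); 0 zone resets")

print(f"IOPS drop:      {relative_drop(old[0], new[0]):.1%}")
print(f"bandwidth drop: {relative_drop(old[1], new[1]):.1%}")
```

For the two runs quoted in the bug description this reports a drop of roughly half on both metrics. Collecting the same pair of numbers across repeated runs would show how much they vary between runs.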

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2042564

Title:
  Performance regression in the 5.15 Ubuntu 20.04 kernel compared to 5.4
  Ubuntu 20.04 kernel

Status in linux package in Ubuntu:
  New
Status in linux source package in Focal:
  New

Bug description:
  We in the Canonical Public Cloud team have received a report from our
  colleagues at Google regarding a potential performance regression with
  the 5.15 kernel vs the 5.4 kernel on Ubuntu 20.04. Their tests were
  performed using the linux-gkeop and linux-gkeop-5.15 kernels.

  I have verified with the generic Ubuntu 20.04 5.4 linux-generic and
  the Ubuntu 20.04 5.15 linux-generic-hwe-20.04 kernels.

  The tests were run using `fio`

  fio commands:

  * 4k initwrite: `fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdc`
  * 4k overwrite: `fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdc`

  My reproducer was to launch an Ubuntu 20.04 cloud image locally with qemu; the results are below:

  Using 5.4 kernel

  ```
  ubuntu@cloudimg:~$ uname --kernel-release
  5.4.0-164-generic

  ubuntu@cloudimg:~$ sudo fio --ioengine=libaio --blocksize=4k 
--readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 
--group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sda
  fiojob1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=libaio, iodepth=128
  ...
  fio-3.16
  Starting 8 processes
  Jobs: 8 (f=8): [W(8)][99.6%][w=925MiB/s][w=237k IOPS][eta 00m:01s] 
  fiojob1: (groupid=0, jobs=8): err= 0: pid=2443: Thu Nov  2 09:15:22 2023
write: IOPS=317k, BW=1237MiB/s (1297MB/s)(320GiB/264837msec); 0 zone resets
  slat (nsec): min=628, max=37820k, avg=7207.71, stdev=101058.61
  clat (nsec): min=457, max=56099k, avg=340.45, stdev=1707823.38
   lat (usec): min=23, max=56100, avg=3229.78, stdev=1705.80
  clat percentiles (usec):
   |  1.00th=[  775],  5.00th=[ 1352], 10.00th=[ 1647], 20.00th=[ 2024],
   | 30.00th=[ 2343], 40.00th=[ 2638], 50.00th=[ 2933], 60.00th=[ 3261],
   | 70.00th=[ 3654], 80.00th=[ 4146], 90.00th=[ 5014], 95.00th=[ 5932],
   | 99.00th=[ 8979], 99.50th=[10945], 99.90th=[18220], 99.95th=[22676],
   | 99.99th=[32113]
 bw (  MiB/s): min=  524, max= 1665, per=100.00%, avg=1237.72, stdev=20.42, 
samples=4232
 iops: min=134308, max=426326, avg=316855.16, stdev=5227.36, 
samples=4232
lat (nsec)   : 500=0.01%, 750=0.01%, 1000=0.01%
lat (usec)   : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01%
lat (usec)   : 250=0.05%, 500=0.54%, 750=0.37%, 1000=0.93%
lat (msec)   : 2=17.40%, 4=58.02%, 10=22.01%, 20=0.60%, 50=0.07%
lat (msec)   : 100=0.01%
cpu  : usr=3.29%, sys=7.45%, ctx=1262621, majf=0, minf=103
IO depths: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
   submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
   complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.1%
   issued rwts: total=0,83886080,0,8 short=0,0,0,0 dropped=0,0,0,0
   latency   : target=0, window=0, percentile=100.00%, depth=128

  Run status group 0 (all jobs):
WRITE: bw=1237MiB/s (1297MB/s), 1237MiB/s-1237MiB/s (1297MB/s-1297MB/s), 
io=320GiB (344GB), run=264837-264837msec

  Disk stats (read/write):
sda: ios=36/32868891, merge=0/50979424, ticks=5/27498602, in_queue=1183124, 
util=100.00%
  ```

  
  After upgrading to linux-generic-hwe-20.04 kernel and rebooting

  ```
  ubuntu@cloudimg:~$ uname --kernel-release
  5.15.0-88-generic

  ubuntu@cloudimg:~$ sudo fio --ioengine=libaio --blocksize=4k 
--readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 
--group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sda
  fiojob1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=libaio, iodepth=128
  ...
  fio-3.16
  Starting 8 processes
  Jobs: 1 (f=1): [_(7),W(1)][100.0%][w=410MiB/s][w=105k IOPS][eta 00m:00s]
  fiojob1: (groupid=0, jobs=8): err= 0: pid=1438: Thu Nov  2 09:46:49 2023
write: IOPS=155k, BW=605MiB/s (634MB/s)(320GiB/541949msec); 0 zone resets
  slat (nsec): min=660, max=325426k, avg=10351.04, stdev=232438.50
  clat (nsec): min=1100, max=782743k, avg=6595

[Kernel-packages] [Bug 2037417] Re: mantic images after 20230917 are failing to deploy with failure to mount root and kernel filesystems

2023-10-09 Thread Francis Ginther
The latest maas images from 20231008 are booting without issue:

ubuntu@akis:~$ lsb_release -sc
No LSB modules are available.
mantic
ubuntu@akis:~$ cat /etc/cloud/build.info 
build_name: server
serial: 20231008
ubuntu@akis:~$ uname -a
Linux akis 6.5.0-7-generic #7-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 29 09:14:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
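
When checking many deployed machines, the image serial can be compared programmatically. This is a minimal sketch, assuming /etc/cloud/build.info keeps the simple `key: value` layout shown above; the helper names `parse_build_info` and `serial_at_least` are hypothetical.

```python
def parse_build_info(text):
    """Parse 'key: value' lines, as in /etc/cloud/build.info, into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            info[key.strip()] = value.strip()
    return info


def serial_at_least(info, minimum):
    """Compare serials such as '20231008' or '20231006.1732' numerically, part by part."""
    def as_tuple(serial):
        return tuple(int(part) for part in serial.split("."))
    return as_tuple(info["serial"]) >= as_tuple(minimum)


# Contents of build.info as shown in the output above.
sample = "build_name: server\nserial: 20231008\n"
info = parse_build_info(sample)
print(info["serial"], serial_at_least(info, "20231008"))
```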

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2037417

Title:
  mantic images after 20230917 are failing to deploy with failure to
  mount root and kernel filesystems

Status in cloud-images:
  New
Status in maas-images:
  Confirmed
Status in The Ubuntu-power-systems project:
  Invalid
Status in Release Notes for Ubuntu:
  New
Status in linux package in Ubuntu:
  Invalid
Status in systemd package in Ubuntu:
  Invalid
Status in util-linux package in Ubuntu:
  Fix Released
Status in linux source package in Mantic:
  Invalid
Status in systemd source package in Mantic:
  Invalid
Status in util-linux source package in Mantic:
  Fix Released

Bug description:
  Mantic arm64 deploys started failing on Sept 18th with:

  [   41.913552] systemd[1]: Starting systemd-remount-fs.service - Remount Root and Kernel File Systems...
   Starting systemd-remount-f…t Root and Kernel File Systems...
  [   41.940748] systemd[1]: Starting systemd-udev-trigger.service - Coldplug All udev Devices...
   Starting systemd-udev-trig… - Coldplug All udev Devices...
  [   41.964758] systemd[1]: Started systemd-journald.service - Journal Service.
  [  OK  ] Started systemd-journald.service - Journal Service.
  [  OK  ] Mounted dev-hugepages.mount - Huge Pages File System.
  [  OK  ] Mounted dev-mqueue.mount…- POSIX Message Queue File System.
  [  OK  ] Mounted sys-kernel-debug.m…t - Kernel Debug File System.
  [  OK  ] Mounted sys-kernel-tracing…t - Kernel Trace File System.
  [  OK  ] Finished keyboard-setup.se…- Set the console keyboard layout.
  [  OK  ] Finished kmod-static-nodes…eate List of Static Device Nodes.
  [  OK  ] Finished lvm2-monitor.serv…ing dmeventd or progress polling.
  [  OK  ] Finished modprobe@configfs… - Load Kernel Module configfs.
  [  OK  ] Finished modprobe@dm_mod.s… - Load Kernel Module dm_mod.
  [  OK  ] Finished modprobe@drm.service - Load Kernel Module drm.
  [  OK  ] Finished modprobe@efi_psto… - Load Kernel Module efi_pstore.
  [  OK  ] Finished modprobe@fuse.service - Load Kernel Module fuse.
  [  OK  ] Finished modprobe@loop.service - Load Kernel Module loop.
  [  OK  ] Finished systemd-modules-l…ervice - Load Kernel Modules.
  [FAILED] Failed to start systemd-re…unt Root and Kernel File Systems.
  See 'systemctl status systemd-remount-fs.service' for details.

  After this, many other services and cloud-init fail. See the full
  kopter-0918.log. For comparison, a log from the prior day's test is
  also attached.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-images/+bug/2037417/+subscriptions




[Kernel-packages] [Bug 2037417] Re: mantic images after 20230917 are failing to deploy with failure to mount root and kernel filesystems

2023-10-06 Thread Francis Ginther
A special maas image built with util-linux 2.39.1-4ubuntu2 from
https://ppa.launchpadcontent.net/xnox/release-critical/ubuntu is looking
good. I have one machine deployed with this:

ubuntu@rumford:~$ uname -r
6.5.0-5-lowlatency
ubuntu@rumford:~$ apt-cache policy util-linux
util-linux:
  Installed: 2.39.1-4ubuntu2
  Candidate: 2.39.1-4ubuntu2
  Version table:
 *** 2.39.1-4ubuntu2 500
        500 https://ppa.launchpadcontent.net/xnox/release-critical/ubuntu mantic/main amd64 Packages
        100 /var/lib/dpkg/status
     2.39.1-4ubuntu1 500
        500 http://archive.ubuntu.com/ubuntu mantic/main amd64 Packages
ubuntu@rumford:~$ cat /etc/cloud/build.info 
build_name: server
serial: 20231006.1732

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2037417

Title:
  mantic images after 20230917 are failing to deploy with failure to
  mount root and kernel filesystems

Status in cloud-images:
  New
Status in maas-images:
  Confirmed
Status in The Ubuntu-power-systems project:
  Invalid
Status in Release Notes for Ubuntu:
  New
Status in linux package in Ubuntu:
  Invalid
Status in systemd package in Ubuntu:
  Invalid
Status in util-linux package in Ubuntu:
  Triaged
Status in linux source package in Mantic:
  Invalid
Status in systemd source package in Mantic:
  Invalid
Status in util-linux source package in Mantic:
  Triaged


To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-images/+bug/2037417/+subscriptions




[Kernel-packages] [Bug 2037417] Re: mantic images after 20230917 are failing to deploy with failure to mount root and kernel filesystems

2023-10-04 Thread Francis Ginther
** Project changed: linux => linux (Ubuntu)

** Changed in: linux (Ubuntu)
Milestone: None => ubuntu-23.10

** Also affects: linux (Ubuntu Mantic)
   Importance: Undecided
   Status: New

** Also affects: systemd (Ubuntu Mantic)
   Importance: Undecided
   Status: Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2037417

Title:
  mantic images after 20230917 are failing to deploy with failure to
  mount root and kernel filesystems

Status in maas-images:
  New
Status in The Ubuntu-power-systems project:
  Confirmed
Status in linux package in Ubuntu:
  New
Status in systemd package in Ubuntu:
  Confirmed
Status in linux source package in Mantic:
  New
Status in systemd source package in Mantic:
  Confirmed


To manage notifications about this bug go to:
https://bugs.launchpad.net/maas-images/+bug/2037417/+subscriptions




[Kernel-packages] [Bug 2034447] acpidump.txt

2023-09-06 Thread Francis Ginther
apport information

** Attachment added: "acpidump.txt"
   https://bugs.launchpad.net/bugs/2034447/+attachment/5697982/+files/acpidump.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2034447

Title:
  `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Seeing a panic on hidon (an Nvidia H100) after booting the
  5.15.0-85-generic kernel:

  [   58.935877] [ cut here ]
  [   58.935893] refcount_t: underflow; use-after-free.
  [   58.935920] WARNING: CPU: 207 PID: 2985 at lib/refcount.c:28 
refcount_warn_saturate+0xf7/0x150
  [   58.935943] Modules linked in: x86_pkg_temp_thermal(+) intel_powerclamp 
coretemp nls_iso8859_1 rapl irdma(+) i40e qat_4xxx(+) isst_if_mbox_pci 
intel_qat pmt_telemetry pmt_crashlog idxd(+) isst_if_mmio pmt_class 
isst_if_common authenc idxd_bus intel_th_gth mei_me intel_th_pci intel_th mei 
switchtec ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid 
sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ramoops 
reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs 
blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy 
async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 multipath linear 
mlx5_ib ib_uverbs ib_core ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm 
drm_kms_helper raid0 mlx5_core syscopyarea sysfillrect sysimgblt 
crct10dif_pclmul fb_sys_fops crc32_pclmul ixgbe cec mlxfw ghash_clmulni_intel 
aesni_intel psample crypto_simd xfrm_algo ice rc_core cryptd tls nvme i2c_i801 
dca xhci_pci intel_pmt drm
  [   58.936077]  pci_hyperv_intf i2c_ismt i2c_smbus
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936083]  mdio
  [   58.936096]  xhci_pci_renesas nvme_core wmi pinctrl_emmitsburg
  [   58.936106] CPU: 207 PID: 2985 Comm: systemd-udevd Not tainted 
5.15.0-85-generic #95-Ubuntu
  [   58.936115] Hardware name: NVIDIA DGXH100/DGXH100, BIOS 1.0.7 05/08/2023
  [   58.936119] RIP: 0010:refcount_warn_saturate+0xf7/0x150
  [   58.936130] Code: eb 9e 0f b6 1d 5e e6 b9 01 80 fb 01 0f 87 f4 63 6f 00 83 
e3 01 75 89 48 c7 c7 88 c3 23 9e c6 05 42 e6 b9 01 01 e8 d8 e4 6b 00 <0f> 0b e9 
6f ff ff ff 0f b6 1d 2d e6 b9 01 80 fb 01 0f 87 b1 63 6f
  [   58.936135] RSP: 0018:ff4d5d94b2c7fa28 EFLAGS: 00010282
  [   58.936142] RAX:  RBX:  RCX: 
0027
  [   58.936146] RDX: ff314dbbbf9e0588 RSI: 0001 RDI: 
ff314dbbbf9e0580
  [   58.936149] RBP: ff4d5d94b2c7fa30 R08: 0026 R09: 
ff4d5d94b2c7f9c0
  [   58.936153] R10: 0028 R11: 0001 R12: 

  [   58.936156] R13: ff314cbfdbcb6900 R14: ff314cbfdbcb67b8 R15: 
ff314cbfd24b4000
  [   58.936159] FS:  7fadd2f6c8c0() GS:ff314dbbbf9c() 
knlGS:
  [   58.936163] CS:  0010 DS:  ES:  CR0: 80050033
  [   58.936167] CR2: 7fadd243b584 CR3: 00012972c006 CR4: 
00771ee0
  [   58.936171] DR0:  DR1:  DR2: 

  [   58.936174] DR3:  DR6: fffe07f0 DR7: 
0400
  [   58.936177] PKRU: 5554
  [   58.936179] Call Trace:
  [   58.936184]  
  [   58.936188]  ? show_trace_log_lvl+0x1d6/0x2ea
  [   58.936204]  ? show_trace_log_lvl+0x1d6/0x2ea
  [   58.936212]  ? crypto_mod_put+0x6b/0x80
  [   58.936225]  ? show_regs.part.0+0x23/0x29
  [   58.936232]  ? show_regs.cold+0x8/0xd
  [   58.936239]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936246]  ? __warn+0x8c/0x100
  [   58.936255]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936263]  ? report_bug+0xa4/0xd0
  [   58.936274]  ? down_trylock+0x2e/0x40
  [   58.936285]  ? handle_bug+0x39/0x90
  [   58.936296]  ? exc_invalid_op+0x19/0x70
  [   58.936301]  ? asm_exc_invalid_op+0x1b/0x20
  [   58.936310]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936317]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936323]  crypto_mod_put+0x6b/0x80
  [   58.936329]  crypto_destroy_tfm+0x4e/0xa0
  [   58.936336]  pkcs1pad_exit_tfm+0x15/0x20
  [   58.936345]  crypto_akcipher_exit_tfm+0x13/0x20
  [   58.936352]  crypto_destroy_tfm+0x43/0xa0
  [   58.936358]  public_key_verify_signature+0x2dc/0x3c0
  [   58.936366]  ? find_asymmetric_key+0xd2/0x1d0
  [   58.936374]  ? kfree+0x1f7/0x250
  [   58.936385]  public_key_verify_signature_2+0x15/0x20
  [   58.936389]  verify_signature+0x37/0x60
  [   58.936393]  pkcs7_validate_trust_one.constprop.0+0x156/0x1e0
  [   58.936400]  pkcs7_validate_trust+0x4a/0xa0
  [   58.93

[Kernel-packages] [Bug 2034447] WifiSyslog.txt

2023-09-06 Thread Francis Ginther
apport information

** Attachment added: "WifiSyslog.txt"
   https://bugs.launchpad.net/bugs/2034447/+attachment/5697981/+files/WifiSyslog.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2034447

Title:
  `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Seeing a panic on hidon (an NVIDIA DGX H100) after booting the
  5.15.0-85-generic kernel:

  [   58.935877] ------------[ cut here ]------------
  [   58.935893] refcount_t: underflow; use-after-free.
  [   58.935920] WARNING: CPU: 207 PID: 2985 at lib/refcount.c:28 
refcount_warn_saturate+0xf7/0x150
  [   58.935943] Modules linked in: x86_pkg_temp_thermal(+) intel_powerclamp 
coretemp nls_iso8859_1 rapl irdma(+) i40e qat_4xxx(+) isst_if_mbox_pci 
intel_qat pmt_telemetry pmt_crashlog idxd(+) isst_if_mmio pmt_class 
isst_if_common authenc idxd_bus intel_th_gth mei_me intel_th_pci intel_th mei 
switchtec ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid 
sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ramoops 
reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs 
blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy 
async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 multipath linear 
mlx5_ib ib_uverbs ib_core ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm 
drm_kms_helper raid0 mlx5_core syscopyarea sysfillrect sysimgblt 
crct10dif_pclmul fb_sys_fops crc32_pclmul ixgbe cec mlxfw ghash_clmulni_intel 
aesni_intel psample crypto_simd xfrm_algo ice rc_core cryptd tls nvme i2c_i801 
dca xhci_pci intel_pmt drm
  [   58.936077]  pci_hyperv_intf i2c_ismt i2c_smbus
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936083]  mdio
  [   58.936096]  xhci_pci_renesas nvme_core wmi pinctrl_emmitsburg
  [   58.936106] CPU: 207 PID: 2985 Comm: systemd-udevd Not tainted 
5.15.0-85-generic #95-Ubuntu
  [   58.936115] Hardware name: NVIDIA DGXH100/DGXH100, BIOS 1.0.7 05/08/2023
  [   58.936119] RIP: 0010:refcount_warn_saturate+0xf7/0x150
  [   58.936130] Code: eb 9e 0f b6 1d 5e e6 b9 01 80 fb 01 0f 87 f4 63 6f 00 83 
e3 01 75 89 48 c7 c7 88 c3 23 9e c6 05 42 e6 b9 01 01 e8 d8 e4 6b 00 <0f> 0b e9 
6f ff ff ff 0f b6 1d 2d e6 b9 01 80 fb 01 0f 87 b1 63 6f
  [   58.936135] RSP: 0018:ff4d5d94b2c7fa28 EFLAGS: 00010282
  [   58.936142] RAX:  RBX:  RCX: 
0027
  [   58.936146] RDX: ff314dbbbf9e0588 RSI: 0001 RDI: 
ff314dbbbf9e0580
  [   58.936149] RBP: ff4d5d94b2c7fa30 R08: 0026 R09: 
ff4d5d94b2c7f9c0
  [   58.936153] R10: 0028 R11: 0001 R12: 

  [   58.936156] R13: ff314cbfdbcb6900 R14: ff314cbfdbcb67b8 R15: 
ff314cbfd24b4000
  [   58.936159] FS:  7fadd2f6c8c0() GS:ff314dbbbf9c() 
knlGS:
  [   58.936163] CS:  0010 DS:  ES:  CR0: 80050033
  [   58.936167] CR2: 7fadd243b584 CR3: 00012972c006 CR4: 
00771ee0
  [   58.936171] DR0:  DR1:  DR2: 

  [   58.936174] DR3:  DR6: fffe07f0 DR7: 
0400
  [   58.936177] PKRU: 5554
  [   58.936179] Call Trace:
  [   58.936184]  
  [   58.936188]  ? show_trace_log_lvl+0x1d6/0x2ea
  [   58.936204]  ? show_trace_log_lvl+0x1d6/0x2ea
  [   58.936212]  ? crypto_mod_put+0x6b/0x80
  [   58.936225]  ? show_regs.part.0+0x23/0x29
  [   58.936232]  ? show_regs.cold+0x8/0xd
  [   58.936239]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936246]  ? __warn+0x8c/0x100
  [   58.936255]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936263]  ? report_bug+0xa4/0xd0
  [   58.936274]  ? down_trylock+0x2e/0x40
  [   58.936285]  ? handle_bug+0x39/0x90
  [   58.936296]  ? exc_invalid_op+0x19/0x70
  [   58.936301]  ? asm_exc_invalid_op+0x1b/0x20
  [   58.936310]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936317]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936323]  crypto_mod_put+0x6b/0x80
  [   58.936329]  crypto_destroy_tfm+0x4e/0xa0
  [   58.936336]  pkcs1pad_exit_tfm+0x15/0x20
  [   58.936345]  crypto_akcipher_exit_tfm+0x13/0x20
  [   58.936352]  crypto_destroy_tfm+0x43/0xa0
  [   58.936358]  public_key_verify_signature+0x2dc/0x3c0
  [   58.936366]  ? find_asymmetric_key+0xd2/0x1d0
  [   58.936374]  ? kfree+0x1f7/0x250
  [   58.936385]  public_key_verify_signature_2+0x15/0x20
  [   58.936389]  verify_signature+0x37/0x60
  [   58.936393]  pkcs7_validate_trust_one.constprop.0+0x156/0x1e0
  [   58.936400]  pkcs7_validate_trust+0x4a/0xa0
  [   5

[Kernel-packages] [Bug 2034447] UdevDb.txt

2023-09-06 Thread Francis Ginther
apport information

** Attachment added: "UdevDb.txt"
   https://bugs.launchpad.net/bugs/2034447/+attachment/5697980/+files/UdevDb.txt

[Kernel-packages] [Bug 2034447] ProcModules.txt

2023-09-06 Thread Francis Ginther
apport information

** Attachment added: "ProcModules.txt"
   https://bugs.launchpad.net/bugs/2034447/+attachment/5697979/+files/ProcModules.txt

[Kernel-packages] [Bug 2034447] ProcCpuinfoMinimal.txt

2023-09-06 Thread Francis Ginther
apport information

** Attachment added: "ProcCpuinfoMinimal.txt"
   https://bugs.launchpad.net/bugs/2034447/+attachment/5697977/+files/ProcCpuinfoMinimal.txt

[Kernel-packages] [Bug 2034447] ProcInterrupts.txt

2023-09-06 Thread Francis Ginther
apport information

** Attachment added: "ProcInterrupts.txt"
   https://bugs.launchpad.net/bugs/2034447/+attachment/5697978/+files/ProcInterrupts.txt

[Kernel-packages] [Bug 2034447] ProcCpuinfo.txt

2023-09-06 Thread Francis Ginther
apport information

** Attachment added: "ProcCpuinfo.txt"
   https://bugs.launchpad.net/bugs/2034447/+attachment/5697976/+files/ProcCpuinfo.txt

[Kernel-packages] [Bug 2034447] Lsusb-v.txt

2023-09-06 Thread Francis Ginther
apport information

** Attachment added: "Lsusb-v.txt"
   https://bugs.launchpad.net/bugs/2034447/+attachment/5697975/+files/Lsusb-v.txt

[Kernel-packages] [Bug 2034447] Lspci-vt.txt

2023-09-06 Thread Francis Ginther
apport information

** Attachment added: "Lspci-vt.txt"
   
https://bugs.launchpad.net/bugs/2034447/+attachment/5697974/+files/Lspci-vt.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2034447

Title:
  `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Seeing a panic on hidon (an Nvidia H100) after booting the
  5.15.0-85-generic kernel:

  [   58.935877] ------------[ cut here ]------------
  [   58.935893] refcount_t: underflow; use-after-free.
  [   58.935920] WARNING: CPU: 207 PID: 2985 at lib/refcount.c:28 
refcount_warn_saturate+0xf7/0x150
  [   58.935943] Modules linked in: x86_pkg_temp_thermal(+) intel_powerclamp 
coretemp nls_iso8859_1 rapl irdma(+) i40e qat_4xxx(+) isst_if_mbox_pci 
intel_qat pmt_telemetry pmt_crashlog idxd(+) isst_if_mmio pmt_class 
isst_if_common authenc idxd_bus intel_th_gth mei_me intel_th_pci intel_th mei 
switchtec ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid 
sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ramoops 
reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs 
blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy 
async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 multipath linear 
mlx5_ib ib_uverbs ib_core ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm 
drm_kms_helper raid0 mlx5_core syscopyarea sysfillrect sysimgblt 
crct10dif_pclmul fb_sys_fops crc32_pclmul ixgbe cec mlxfw ghash_clmulni_intel 
aesni_intel psample crypto_simd xfrm_algo ice rc_core cryptd tls nvme i2c_i801 
dca xhci_pci intel_pmt drm
  [   58.936077]  pci_hyperv_intf i2c_ismt i2c_smbus
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936083]  mdio
  [   58.936096]  xhci_pci_renesas nvme_core wmi pinctrl_emmitsburg
  [   58.936106] CPU: 207 PID: 2985 Comm: systemd-udevd Not tainted 
5.15.0-85-generic #95-Ubuntu
  [   58.936115] Hardware name: NVIDIA DGXH100/DGXH100, BIOS 1.0.7 05/08/2023
  [   58.936119] RIP: 0010:refcount_warn_saturate+0xf7/0x150
  [   58.936130] Code: eb 9e 0f b6 1d 5e e6 b9 01 80 fb 01 0f 87 f4 63 6f 00 83 
e3 01 75 89 48 c7 c7 88 c3 23 9e c6 05 42 e6 b9 01 01 e8 d8 e4 6b 00 <0f> 0b e9 
6f ff ff ff 0f b6 1d 2d e6 b9 01 80 fb 01 0f 87 b1 63 6f
  [   58.936135] RSP: 0018:ff4d5d94b2c7fa28 EFLAGS: 00010282
  [   58.936142] RAX:  RBX:  RCX: 
0027
  [   58.936146] RDX: ff314dbbbf9e0588 RSI: 0001 RDI: 
ff314dbbbf9e0580
  [   58.936149] RBP: ff4d5d94b2c7fa30 R08: 0026 R09: 
ff4d5d94b2c7f9c0
  [   58.936153] R10: 0028 R11: 0001 R12: 

  [   58.936156] R13: ff314cbfdbcb6900 R14: ff314cbfdbcb67b8 R15: 
ff314cbfd24b4000
  [   58.936159] FS:  7fadd2f6c8c0() GS:ff314dbbbf9c() 
knlGS:
  [   58.936163] CS:  0010 DS:  ES:  CR0: 80050033
  [   58.936167] CR2: 7fadd243b584 CR3: 00012972c006 CR4: 
00771ee0
  [   58.936171] DR0:  DR1:  DR2: 

  [   58.936174] DR3:  DR6: fffe07f0 DR7: 
0400
  [   58.936177] PKRU: 5554
  [   58.936179] Call Trace:
  [   58.936184]  
  [   58.936188]  ? show_trace_log_lvl+0x1d6/0x2ea
  [   58.936204]  ? show_trace_log_lvl+0x1d6/0x2ea
  [   58.936212]  ? crypto_mod_put+0x6b/0x80
  [   58.936225]  ? show_regs.part.0+0x23/0x29
  [   58.936232]  ? show_regs.cold+0x8/0xd
  [   58.936239]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936246]  ? __warn+0x8c/0x100
  [   58.936255]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936263]  ? report_bug+0xa4/0xd0
  [   58.936274]  ? down_trylock+0x2e/0x40
  [   58.936285]  ? handle_bug+0x39/0x90
  [   58.936296]  ? exc_invalid_op+0x19/0x70
  [   58.936301]  ? asm_exc_invalid_op+0x1b/0x20
  [   58.936310]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936317]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936323]  crypto_mod_put+0x6b/0x80
  [   58.936329]  crypto_destroy_tfm+0x4e/0xa0
  [   58.936336]  pkcs1pad_exit_tfm+0x15/0x20
  [   58.936345]  crypto_akcipher_exit_tfm+0x13/0x20
  [   58.936352]  crypto_destroy_tfm+0x43/0xa0
  [   58.936358]  public_key_verify_signature+0x2dc/0x3c0
  [   58.936366]  ? find_asymmetric_key+0xd2/0x1d0
  [   58.936374]  ? kfree+0x1f7/0x250
  [   58.936385]  public_key_verify_signature_2+0x15/0x20
  [   58.936389]  verify_signature+0x37/0x60
  [   58.936393]  pkcs7_validate_trust_one.constprop.0+0x156/0x1e0
  [   58.936400]  pkcs7_validate_trust+0x4a/0xa0
  [   58.93

[Kernel-packages] [Bug 2034447] Lspci.txt

2023-09-06 Thread Francis Ginther
apport information

** Attachment added: "Lspci.txt"
   https://bugs.launchpad.net/bugs/2034447/+attachment/5697973/+files/Lspci.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2034447

Title:
  `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic

Status in linux package in Ubuntu:
  Incomplete

[Kernel-packages] [Bug 2034447] Re: `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic

2023-09-06 Thread Francis Ginther
apport information

** Tags added: apport-collected jammy uec-images

** Description changed:

  Seeing a panic on hidon (an Nvidia H100) after booting the
  5.15.0-85-generic kernel:
  
  [   58.935877] ------------[ cut here ]------------
  [   58.935893] refcount_t: underflow; use-after-free.
  [   58.935920] WARNING: CPU: 207 PID: 2985 at lib/refcount.c:28 
refcount_warn_saturate+0xf7/0x150
  [   58.935943] Modules linked in: x86_pkg_temp_thermal(+) intel_powerclamp 
coretemp nls_iso8859_1 rapl irdma(+) i40e qat_4xxx(+) isst_if_mbox_pci 
intel_qat pmt_telemetry pmt_crashlog idxd(+) isst_if_mmio pmt_class 
isst_if_common authenc idxd_bus intel_th_gth mei_me intel_th_pci intel_th mei 
switchtec ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid 
sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ramoops 
reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs 
blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy 
async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 multipath linear 
mlx5_ib ib_uverbs ib_core ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm 
drm_kms_helper raid0 mlx5_core syscopyarea sysfillrect sysimgblt 
crct10dif_pclmul fb_sys_fops crc32_pclmul ixgbe cec mlxfw ghash_clmulni_intel 
aesni_intel psample crypto_simd xfrm_algo ice rc_core cryptd tls nvme i2c_i801 
dca xhci_pci intel_pmt drm
  [   58.936077]  pci_hyperv_intf i2c_ismt i2c_smbus
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936080] QAT: Could not find a device on node 1
  [   58.936083]  mdio
  [   58.936096]  xhci_pci_renesas nvme_core wmi pinctrl_emmitsburg
  [   58.936106] CPU: 207 PID: 2985 Comm: systemd-udevd Not tainted 
5.15.0-85-generic #95-Ubuntu
  [   58.936115] Hardware name: NVIDIA DGXH100/DGXH100, BIOS 1.0.7 05/08/2023
  [   58.936119] RIP: 0010:refcount_warn_saturate+0xf7/0x150
  [   58.936130] Code: eb 9e 0f b6 1d 5e e6 b9 01 80 fb 01 0f 87 f4 63 6f 00 83 
e3 01 75 89 48 c7 c7 88 c3 23 9e c6 05 42 e6 b9 01 01 e8 d8 e4 6b 00 <0f> 0b e9 
6f ff ff ff 0f b6 1d 2d e6 b9 01 80 fb 01 0f 87 b1 63 6f
  [   58.936135] RSP: 0018:ff4d5d94b2c7fa28 EFLAGS: 00010282
  [   58.936142] RAX:  RBX:  RCX: 
0027
  [   58.936146] RDX: ff314dbbbf9e0588 RSI: 0001 RDI: 
ff314dbbbf9e0580
  [   58.936149] RBP: ff4d5d94b2c7fa30 R08: 0026 R09: 
ff4d5d94b2c7f9c0
  [   58.936153] R10: 0028 R11: 0001 R12: 

  [   58.936156] R13: ff314cbfdbcb6900 R14: ff314cbfdbcb67b8 R15: 
ff314cbfd24b4000
  [   58.936159] FS:  7fadd2f6c8c0() GS:ff314dbbbf9c() 
knlGS:
  [   58.936163] CS:  0010 DS:  ES:  CR0: 80050033
  [   58.936167] CR2: 7fadd243b584 CR3: 00012972c006 CR4: 
00771ee0
  [   58.936171] DR0:  DR1:  DR2: 

  [   58.936174] DR3:  DR6: fffe07f0 DR7: 
0400
  [   58.936177] PKRU: 5554
  [   58.936179] Call Trace:
  [   58.936184]  
  [   58.936188]  ? show_trace_log_lvl+0x1d6/0x2ea
  [   58.936204]  ? show_trace_log_lvl+0x1d6/0x2ea
  [   58.936212]  ? crypto_mod_put+0x6b/0x80
  [   58.936225]  ? show_regs.part.0+0x23/0x29
  [   58.936232]  ? show_regs.cold+0x8/0xd
  [   58.936239]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936246]  ? __warn+0x8c/0x100
  [   58.936255]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936263]  ? report_bug+0xa4/0xd0
  [   58.936274]  ? down_trylock+0x2e/0x40
  [   58.936285]  ? handle_bug+0x39/0x90
  [   58.936296]  ? exc_invalid_op+0x19/0x70
  [   58.936301]  ? asm_exc_invalid_op+0x1b/0x20
  [   58.936310]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936317]  ? refcount_warn_saturate+0xf7/0x150
  [   58.936323]  crypto_mod_put+0x6b/0x80
  [   58.936329]  crypto_destroy_tfm+0x4e/0xa0
  [   58.936336]  pkcs1pad_exit_tfm+0x15/0x20
  [   58.936345]  crypto_akcipher_exit_tfm+0x13/0x20
  [   58.936352]  crypto_destroy_tfm+0x43/0xa0
  [   58.936358]  public_key_verify_signature+0x2dc/0x3c0
  [   58.936366]  ? find_asymmetric_key+0xd2/0x1d0
  [   58.936374]  ? kfree+0x1f7/0x250
  [   58.936385]  public_key_verify_signature_2+0x15/0x20
  [   58.936389]  verify_signature+0x37/0x60
  [   58.936393]  pkcs7_validate_trust_one.constprop.0+0x156/0x1e0
  [   58.936400]  pkcs7_validate_trust+0x4a/0xa0
  [   58.936406]  verify_pkcs7_message_sig+0x83/0x120
  [   58.936418]  verify_pkcs7_signature+0x4f/0x80
  [   58.936424]  mod_verify_sig+0xb5/0xf0
  [   58.936435]  load_module+0x275/0xbc0
  [   58.936440]  ? kernel_read_file_from_fd+0x56/0xa0
  [   58.936450]  __do_sys_finit_module+0xbf/0x120
  [   58.936496]  __x64_sys_finit_module+0x18/0x20
  [   58.936504]  do

[Kernel-packages] [Bug 2034447] Re: `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic

2023-09-05 Thread Francis Ginther
Here's the full log from where that snippet was pulled.

** Attachment added: "hidon.log.1"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2034447/+attachment/5697793/+files/hidon.log.1

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2034447

Title:
  `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic

Status in linux package in Ubuntu:
  Incomplete

[Kernel-packages] [Bug 2034447] [NEW] `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic

2023-09-05 Thread Francis Ginther
Public bug reported:

Seeing a panic on hidon (an Nvidia H100) after booting the
5.15.0-85-generic kernel:

[   58.935877] ------------[ cut here ]------------
[   58.935893] refcount_t: underflow; use-after-free.
[   58.935920] WARNING: CPU: 207 PID: 2985 at lib/refcount.c:28 
refcount_warn_saturate+0xf7/0x150
[   58.935943] Modules linked in: x86_pkg_temp_thermal(+) intel_powerclamp 
coretemp nls_iso8859_1 rapl irdma(+) i40e qat_4xxx(+) isst_if_mbox_pci 
intel_qat pmt_telemetry pmt_crashlog idxd(+) isst_if_mmio pmt_class 
isst_if_common authenc idxd_bus intel_th_gth mei_me intel_th_pci intel_th mei 
switchtec ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid 
sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ramoops 
reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs 
blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy 
async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 multipath linear 
mlx5_ib ib_uverbs ib_core ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm 
drm_kms_helper raid0 mlx5_core syscopyarea sysfillrect sysimgblt 
crct10dif_pclmul fb_sys_fops crc32_pclmul ixgbe cec mlxfw ghash_clmulni_intel 
aesni_intel psample crypto_simd xfrm_algo ice rc_core cryptd tls nvme i2c_i801 
dca xhci_pci intel_pmt drm
[   58.936077]  pci_hyperv_intf i2c_ismt i2c_smbus
[   58.936080] QAT: Could not find a device on node 1
[   58.936080] QAT: Could not find a device on node 1
[   58.936080] QAT: Could not find a device on node 1
[   58.936080] QAT: Could not find a device on node 1
[   58.936080] QAT: Could not find a device on node 1
[   58.936080] QAT: Could not find a device on node 1
[   58.936083]  mdio
[   58.936096]  xhci_pci_renesas nvme_core wmi pinctrl_emmitsburg
[   58.936106] CPU: 207 PID: 2985 Comm: systemd-udevd Not tainted 
5.15.0-85-generic #95-Ubuntu
[   58.936115] Hardware name: NVIDIA DGXH100/DGXH100, BIOS 1.0.7 05/08/2023
[   58.936119] RIP: 0010:refcount_warn_saturate+0xf7/0x150
[   58.936130] Code: eb 9e 0f b6 1d 5e e6 b9 01 80 fb 01 0f 87 f4 63 6f 00 83 
e3 01 75 89 48 c7 c7 88 c3 23 9e c6 05 42 e6 b9 01 01 e8 d8 e4 6b 00 <0f> 0b e9 
6f ff ff ff 0f b6 1d 2d e6 b9 01 80 fb 01 0f 87 b1 63 6f
[   58.936135] RSP: 0018:ff4d5d94b2c7fa28 EFLAGS: 00010282
[   58.936142] RAX:  RBX:  RCX: 0027
[   58.936146] RDX: ff314dbbbf9e0588 RSI: 0001 RDI: ff314dbbbf9e0580
[   58.936149] RBP: ff4d5d94b2c7fa30 R08: 0026 R09: ff4d5d94b2c7f9c0
[   58.936153] R10: 0028 R11: 0001 R12: 
[   58.936156] R13: ff314cbfdbcb6900 R14: ff314cbfdbcb67b8 R15: ff314cbfd24b4000
[   58.936159] FS:  7fadd2f6c8c0() GS:ff314dbbbf9c() 
knlGS:
[   58.936163] CS:  0010 DS:  ES:  CR0: 80050033
[   58.936167] CR2: 7fadd243b584 CR3: 00012972c006 CR4: 00771ee0
[   58.936171] DR0:  DR1:  DR2: 
[   58.936174] DR3:  DR6: fffe07f0 DR7: 0400
[   58.936177] PKRU: 5554
[   58.936179] Call Trace:
[   58.936184]  
[   58.936188]  ? show_trace_log_lvl+0x1d6/0x2ea
[   58.936204]  ? show_trace_log_lvl+0x1d6/0x2ea
[   58.936212]  ? crypto_mod_put+0x6b/0x80
[   58.936225]  ? show_regs.part.0+0x23/0x29
[   58.936232]  ? show_regs.cold+0x8/0xd
[   58.936239]  ? refcount_warn_saturate+0xf7/0x150
[   58.936246]  ? __warn+0x8c/0x100
[   58.936255]  ? refcount_warn_saturate+0xf7/0x150
[   58.936263]  ? report_bug+0xa4/0xd0
[   58.936274]  ? down_trylock+0x2e/0x40
[   58.936285]  ? handle_bug+0x39/0x90
[   58.936296]  ? exc_invalid_op+0x19/0x70
[   58.936301]  ? asm_exc_invalid_op+0x1b/0x20
[   58.936310]  ? refcount_warn_saturate+0xf7/0x150
[   58.936317]  ? refcount_warn_saturate+0xf7/0x150
[   58.936323]  crypto_mod_put+0x6b/0x80
[   58.936329]  crypto_destroy_tfm+0x4e/0xa0
[   58.936336]  pkcs1pad_exit_tfm+0x15/0x20
[   58.936345]  crypto_akcipher_exit_tfm+0x13/0x20
[   58.936352]  crypto_destroy_tfm+0x43/0xa0
[   58.936358]  public_key_verify_signature+0x2dc/0x3c0
[   58.936366]  ? find_asymmetric_key+0xd2/0x1d0
[   58.936374]  ? kfree+0x1f7/0x250
[   58.936385]  public_key_verify_signature_2+0x15/0x20
[   58.936389]  verify_signature+0x37/0x60
[   58.936393]  pkcs7_validate_trust_one.constprop.0+0x156/0x1e0
[   58.936400]  pkcs7_validate_trust+0x4a/0xa0
[   58.936406]  verify_pkcs7_message_sig+0x83/0x120
[   58.936418]  verify_pkcs7_signature+0x4f/0x80
[   58.936424]  mod_verify_sig+0xb5/0xf0
[   58.936435]  load_module+0x275/0xbc0
[   58.936440]  ? kernel_read_file_from_fd+0x56/0xa0
[   58.936450]  __do_sys_finit_module+0xbf/0x120
[   58.936496]  __x64_sys_finit_module+0x18/0x20
[   58.936504]  do_syscall_64+0x59/0xc0
[   58.936510]  ? exit_to_user_mode_prepare+0x37/0xb0
[   58.936521]  ? syscall_exit_to_user_mode+0x35/0x50
[   58.936530]  ? __x64_sys_mmap+0x33/0x50
[   58.936539]  ? do_syscall_64+0x69/0xc0
[   

[Kernel-packages] [Bug 2026891] Re: linux-nvidia-6.2 on DGX servers: "WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540"

2023-07-14 Thread Francis Ginther
I built and tested a 6.2.0-1004-nvidia based kernel with this patch
applied and did not see the warning message on boot. I'll follow up
further with Ian on Monday.
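
A check like the one described above (booting a kernel and looking for the warning) could be scripted roughly as follows. This is only a sketch: the `find_warnings` helper and the sample text are illustrative assumptions, not part of any Launchpad or kernel tooling, and the regular expression only targets the `WARNING: CPU: ... PID: ... at file:line` header format seen in this bug.

```python
import re

# Matches the header of a kernel WARNING splat, for example:
#   WARNING: CPU: 0 PID: 0 at init/main.c:1065
WARNING_RE = re.compile(r"WARNING: CPU: (\d+) PID: (\d+) at (\S+):(\d+)")


def find_warnings(log_text):
    """Return (cpu, pid, source_file, line) for each WARNING header found."""
    return [
        (int(cpu), int(pid), path, int(line))
        for cpu, pid, path, line in WARNING_RE.findall(log_text)
    ]


# Hypothetical sample, shaped like the boot log quoted in this bug.
sample = """\
[    7.690487] Interrupts were enabled early
[    7.690490] WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540
[    7.690498] Modules linked in:
"""

print(find_warnings(sample))
```

An empty list from `find_warnings` on a captured boot log would correspond to "did not see the warning message on boot".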

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.2 in Ubuntu.
https://bugs.launchpad.net/bugs/2026891

Title:
  linux-nvidia-6.2 on DGX servers: "WARNING: CPU: 0 PID: 0 at
  init/main.c:1065 start_kernel+0x4da/0x540"

Status in linux-nvidia-6.2 package in Ubuntu:
  New

Bug description:
  We started testing the jammy/linux-nvidia-6.2 kernels on the nvidia
  servers (DGX-1/DGX-2/H100) and hit the following warning during boot:

  [7.690486] ------------[ cut here ]------------
  [7.690487] Interrupts were enabled early
  [7.690490] WARNING: CPU: 0 PID: 0 at init/main.c:1065 
start_kernel+0x4da/0x540
  [7.690498] Modules linked in:
  [7.690501] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.2.0-1004-nvidia 
#4~22.04.1-Ubuntu
  [7.690504] Hardware name: NVIDIA NVIDIA DGX-2/NVIDIA DGX-2, BIOS 0.29 
06/07/2021
  [7.690505] RIP: 0010:start_kernel+0x4da/0x540
  [7.690508] Code: ff 48 c7 c7 e8 26 f0 97 e8 b3 59 a8 fd 0f 0b e9 96 fd ff 
ff e8 a7 1d 04 00 e9 7c fe ff ff 48 c7 c7 18 27 f0 97 e8 96 59 a8 fd <0f> 0b e9 
ed fd ff ff 48 c7 c7 b0 26 f0 97 e8 83 59 a8 fd 0f 0b ff
  [7.690510] RSP: :98803f08 EFLAGS: 00010246
  [7.690512] RAX:  RBX:  RCX: 

  [7.690513] RDX:  RSI:  RDI: 

  [7.690514] RBP: 98803f20 R08:  R09: 

  [7.690515] R10:  R11:  R12: 
00e0
  [7.690516] R13: 5a1ccde0 R14: 5a1c7469 R15: 
5a1d7ee0
  [7.690518] FS:  () GS:96490060() 
knlGS:
  [7.690520] CS:  0010 DS:  ES:  CR0: 80050033
  [7.690521] CR2: 970bf000 CR3: 00ecd7810001 CR4: 
000606f0
  [7.690522] DR0:  DR1:  DR2: 

  [7.690523] DR3:  DR6: fffe0ff0 DR7: 
0400
  [7.690524] Call Trace:
  [7.690526]  
  [7.690529]  x86_64_start_kernel+0x102/0x180
  [7.690536]  secondary_startup_64_no_verify+0xe5/0xeb
  [7.690544]  
  [7.690544] ---[ end trace  ]---

  I also see pretty much the same thing on some Ampere-based arm64
  servers:

  [0.000519] ------------[ cut here ]------------
  [0.000521] Interrupts were enabled early
  [0.000525] WARNING: CPU: 0 PID: 0 at init/main.c:1065 
start_kernel+0x3ac/0x514
  [0.000531] Modules linked in:
  [0.000535] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.2.0-1004-nvidia 
#4~22.04.1-Ubuntu
  [0.000538] pstate: 6049 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  [0.000540] pc : start_kernel+0x3ac/0x514
  [0.000543] lr : start_kernel+0x3ac/0x514
  [0.000545] sp : dec5ff733e60
  [0.000546] x29: dec5ff733e60 x28: 0819aa09baac x27: 
403ffdd124e0
  [0.000549] x26: bfdf3788 x25: 9b6fc000 x24: 
001dba7b
  [0.000552] x23: 5ec57c98 x22: 0819ab2a x21: 
dec5ff749140
  [0.000555] x20: dec5ff73d9c0 x19: dec5ffbe4000 x18: 
dec5ff74a1c8
  [0.000558] x17:  x16:  x15: 

  [0.000560] x14:  x13: 0a796c7261652064 x12: 
656c62616e652065
  [0.000563] x11: 656820747563205b x10: 2d2d2d2d2d2d2d2d x9 : 

  [0.000565] x8 :  x7 :  x6 : 

  [0.000568] x5 :  x4 :  x3 : 

  [0.000571] x2 :  x1 :  x0 : 

  [0.000573] Call trace:
  [0.000574]  start_kernel+0x3ac/0x514
  [0.000577]  __primary_switched+0xc0/0xc8
  [0.000580] ---[ end trace  ]---

  The warning does not appear on an older thunderx2 server.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.2/+bug/2026891/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 2026891] Re: linux-nvidia-6.2 on DGX servers: "WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540"

2023-07-11 Thread Francis Ginther
I ran through several kernels on our DGX-2 server; only the latest
6.2.0-1004-nvidia kernel emitted the warning. Here are all the kernels I
tried:

Lunar 6.2.0-24.24 generic - PASS
Jammy 5.15.0-1028-nvidia - PASS
Jammy 5.19.0-46-generic - PASS
Jammy 5.19.0-1014-nvidia - PASS
Jammy 6.2.0-25-generic - PASS
Jammy 6.2.0-1003-nvidia - PASS
Jammy 6.2.0-1004-nvidia - FAIL
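
For anyone tabulating results like these, a minimal sketch of parsing the list above and picking out the failing kernels might look like this. The `parse_results` helper is a hypothetical name introduced for illustration, and the `series version - verdict` line layout is assumed from the list as written here.

```python
def parse_results(text):
    """Split lines of the form '<series> <kernel> - <PASS|FAIL>' into tuples."""
    results = []
    for line in text.strip().splitlines():
        # The verdict follows the last ' - ' separator; hyphens inside the
        # version string are not surrounded by spaces, so they are unaffected.
        label, _, verdict = line.rpartition(" - ")
        series, version = label.split(maxsplit=1)
        results.append((series, version, verdict.strip()))
    return results


RESULTS = """\
Lunar 6.2.0-24.24 generic - PASS
Jammy 5.15.0-1028-nvidia - PASS
Jammy 5.19.0-46-generic - PASS
Jammy 5.19.0-1014-nvidia - PASS
Jammy 6.2.0-25-generic - PASS
Jammy 6.2.0-1003-nvidia - PASS
Jammy 6.2.0-1004-nvidia - FAIL
"""

failing = [f"{s} {v}" for s, v, verdict in parse_results(RESULTS) if verdict == "FAIL"]
print(failing)
```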

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.2 in Ubuntu.
https://bugs.launchpad.net/bugs/2026891

Title:
  linux-nvidia-6.2 on DGX servers: "WARNING: CPU: 0 PID: 0 at
  init/main.c:1065 start_kernel+0x4da/0x540"

Status in linux-nvidia-6.2 package in Ubuntu:
  New

Bug description:
  We started testing the jammy/linux-nvidia-6.2 kernels on the nvidia
  servers (DGX-1/DGX-2/H100) and hit the following warning during boot:

  [7.690486] [ cut here ]
  [7.690487] Interrupts were enabled early
  [7.690490] WARNING: CPU: 0 PID: 0 at init/main.c:1065 
start_kernel+0x4da/0x540
  [7.690498] Modules linked in:
  [7.690501] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.2.0-1004-nvidia 
#4~22.04.1-Ubuntu
  [7.690504] Hardware name: NVIDIA NVIDIA DGX-2/NVIDIA DGX-2, BIOS 0.29 
06/07/2021
  [7.690505] RIP: 0010:start_kernel+0x4da/0x540
  [7.690508] Code: ff 48 c7 c7 e8 26 f0 97 e8 b3 59 a8 fd 0f 0b e9 96 fd ff 
ff e8 a7 1d 04 00 e9 7c fe ff ff 48 c7 c7 18 27 f0 97 e8 96 59 a8 fd <0f> 0b e9 
ed fd ff ff 48 c7 c7 b0 26 f0 97 e8 83 59 a8 fd 0f 0b ff
  [7.690510] RSP: :98803f08 EFLAGS: 00010246
  [7.690512] RAX:  RBX:  RCX: 

  [7.690513] RDX:  RSI:  RDI: 

  [7.690514] RBP: 98803f20 R08:  R09: 

  [7.690515] R10:  R11:  R12: 
00e0
  [7.690516] R13: 5a1ccde0 R14: 5a1c7469 R15: 
5a1d7ee0
  [7.690518] FS:  () GS:96490060() 
knlGS:
  [7.690520] CS:  0010 DS:  ES:  CR0: 80050033
  [7.690521] CR2: 970bf000 CR3: 00ecd7810001 CR4: 
000606f0
  [7.690522] DR0:  DR1:  DR2: 

  [7.690523] DR3:  DR6: fffe0ff0 DR7: 
0400
  [7.690524] Call Trace:
  [7.690526]  
  [7.690529]  x86_64_start_kernel+0x102/0x180
  [7.690536]  secondary_startup_64_no_verify+0xe5/0xeb
  [7.690544]  
  [7.690544] ---[ end trace  ]---

  I also see pretty much the same thing on some Ampere based arm64
  servers:

  [0.000519] [ cut here ]
  [0.000521] Interrupts were enabled early
  [0.000525] WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x3ac/0x514
  [0.000531] Modules linked in:
  [0.000535] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.2.0-1004-nvidia #4~22.04.1-Ubuntu
  [0.000538] pstate: 6049 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  [0.000540] pc : start_kernel+0x3ac/0x514
  [0.000543] lr : start_kernel+0x3ac/0x514
  [0.000545] sp : dec5ff733e60
  [0.000546] x29: dec5ff733e60 x28: 0819aa09baac x27: 403ffdd124e0
  [0.000549] x26: bfdf3788 x25: 9b6fc000 x24: 001dba7b
  [0.000552] x23: 5ec57c98 x22: 0819ab2a x21: dec5ff749140
  [0.000555] x20: dec5ff73d9c0 x19: dec5ffbe4000 x18: dec5ff74a1c8
  [0.000558] x17:  x16:  x15:
  [0.000560] x14:  x13: 0a796c7261652064 x12: 656c62616e652065
  [0.000563] x11: 656820747563205b x10: 2d2d2d2d2d2d2d2d x9 :
  [0.000565] x8 :  x7 :  x6 :
  [0.000568] x5 :  x4 :  x3 :
  [0.000571] x2 :  x1 :  x0 :
  [0.000573] Call trace:
  [0.000574]  start_kernel+0x3ac/0x514
  [0.000577]  __primary_switched+0xc0/0xc8
  [0.000580] ---[ end trace  ]---

  The warning does not appear on an older thunderx2 server.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.2/+bug/2026891/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 2026891] [NEW] linux-nvidia-6.2 on DGX servers: "WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540"

2023-07-11 Thread Francis Ginther
Public bug reported:

We started testing the jammy/linux-nvidia-6.2 kernels on the nvidia
servers (DGX-1/DGX-2/H100) and hit the following warning during boot:

[7.690486] [ cut here ]
[7.690487] Interrupts were enabled early
[7.690490] WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540
[7.690498] Modules linked in:
[7.690501] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.2.0-1004-nvidia #4~22.04.1-Ubuntu
[7.690504] Hardware name: NVIDIA NVIDIA DGX-2/NVIDIA DGX-2, BIOS 0.29 06/07/2021
[7.690505] RIP: 0010:start_kernel+0x4da/0x540
[7.690508] Code: ff 48 c7 c7 e8 26 f0 97 e8 b3 59 a8 fd 0f 0b e9 96 fd ff ff e8 a7 1d 04 00 e9 7c fe ff ff 48 c7 c7 18 27 f0 97 e8 96 59 a8 fd <0f> 0b e9 ed fd ff ff 48 c7 c7 b0 26 f0 97 e8 83 59 a8 fd 0f 0b ff
[7.690510] RSP: :98803f08 EFLAGS: 00010246
[7.690512] RAX:  RBX:  RCX: 
[7.690513] RDX:  RSI:  RDI: 
[7.690514] RBP: 98803f20 R08:  R09: 
[7.690515] R10:  R11:  R12: 00e0
[7.690516] R13: 5a1ccde0 R14: 5a1c7469 R15: 5a1d7ee0
[7.690518] FS:  () GS:96490060() knlGS:
[7.690520] CS:  0010 DS:  ES:  CR0: 80050033
[7.690521] CR2: 970bf000 CR3: 00ecd7810001 CR4: 000606f0
[7.690522] DR0:  DR1:  DR2: 
[7.690523] DR3:  DR6: fffe0ff0 DR7: 0400
[7.690524] Call Trace:
[7.690526]  
[7.690529]  x86_64_start_kernel+0x102/0x180
[7.690536]  secondary_startup_64_no_verify+0xe5/0xeb
[7.690544]  
[7.690544] ---[ end trace  ]---

I also see pretty much the same thing on some Ampere based arm64
servers:

[0.000519] [ cut here ]
[0.000521] Interrupts were enabled early
[0.000525] WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x3ac/0x514
[0.000531] Modules linked in:
[0.000535] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.2.0-1004-nvidia #4~22.04.1-Ubuntu
[0.000538] pstate: 6049 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[0.000540] pc : start_kernel+0x3ac/0x514
[0.000543] lr : start_kernel+0x3ac/0x514
[0.000545] sp : dec5ff733e60
[0.000546] x29: dec5ff733e60 x28: 0819aa09baac x27: 403ffdd124e0
[0.000549] x26: bfdf3788 x25: 9b6fc000 x24: 001dba7b
[0.000552] x23: 5ec57c98 x22: 0819ab2a x21: dec5ff749140
[0.000555] x20: dec5ff73d9c0 x19: dec5ffbe4000 x18: dec5ff74a1c8
[0.000558] x17:  x16:  x15: 
[0.000560] x14:  x13: 0a796c7261652064 x12: 656c62616e652065
[0.000563] x11: 656820747563205b x10: 2d2d2d2d2d2d2d2d x9 : 
[0.000565] x8 :  x7 :  x6 : 
[0.000568] x5 :  x4 :  x3 : 
[0.000571] x2 :  x1 :  x0 : 
[0.000573] Call trace:
[0.000574]  start_kernel+0x3ac/0x514
[0.000577]  __primary_switched+0xc0/0xc8
[0.000580] ---[ end trace  ]---

The warning does not appear on an older thunderx2 server.
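
When comparing boot logs across machines, it can help to pull the WARNING headers out programmatically and check whether they point at the same source location. The following is a minimal sketch, not part of the original report; the function names are hypothetical, and it assumes each WARNING header sits on a single line, as it does in raw `dmesg` output.

```python
import re

# Matches a kernel WARNING header on a single line, for example:
# "WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540"
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) (?P<symbol>[\w.]+)\+"
    r"(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def find_warnings(log_text):
    """Return one dict per WARNING header found in the log text."""
    return [m.groupdict() for m in WARNING_RE.finditer(log_text)]


def same_source_location(a, b):
    """True when two warnings come from the same file, line and function."""
    keys = ("file", "line", "symbol")
    return all(a[k] == b[k] for k in keys)


# The two headers quoted in this report (x86-64 DGX-2 and arm64 Ampere).
SAMPLE_LOG = """\
[7.690490] WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x4da/0x540
[0.000525] WARNING: CPU: 0 PID: 0 at init/main.c:1065 start_kernel+0x3ac/0x514
"""

if __name__ == "__main__":
    x86, arm = find_warnings(SAMPLE_LOG)
    print(same_source_location(x86, arm))
```

Only the code offsets differ between the two architectures; the file, line and function are identical, which is consistent with both platforms hitting the same check in `start_kernel`.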

** Affects: linux-nvidia-6.2 (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.2 in Ubuntu.
https://bugs.launchpad.net/bugs/2026891

Title:
  linux-nvidia-6.2 on DGX servers: "WARNING: CPU: 0 PID: 0 at
  init/main.c:1065 start_kernel+0x4da/0x540"

Status in linux-nvidia-6.2 package in Ubuntu:
  New


[Kernel-packages] [Bug 2024675] Re: NVIDIA CVE-2023-25515, CVE-2023-25516

2023-06-27 Thread Francis Ginther
Automated testing of the DKMS drivers (450-server, 470-server,
525-server, 470, 525 and 535) has completed across bionic, focal, jammy,
kinetic and lunar. This was performed with:

 * Deploy host with gpgpu
 * Install latest `linux-generic` kernel
 * Install driver from ppa using `nvidia-driver-${DRIVER_NUMBER}` package
 * Reboot
 * Install cuda
 * Execute select cuda samples
 * Verify nvidia-smi output matches the expected DRIVER_NUMBER and version.
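
The final verification step can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual test script: the helper names are hypothetical, DRIVER_NUMBER is assumed to take values such as `525` or `470-server`, and the version is read with the `--query-gpu=driver_version` option of `nvidia-smi`.

```python
import subprocess


def driver_branch(driver_number):
    """'470-server' -> '470'; '525' -> '525'."""
    return str(driver_number).split("-", 1)[0]


def driver_matches(reported_version, driver_number):
    """True when a version such as '525.116.04' belongs to the given branch."""
    major = reported_version.strip().split(".", 1)[0]
    return major == driver_branch(driver_number)


def query_driver_version():
    """Ask nvidia-smi for the loaded driver version (first GPU only)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return out.splitlines()[0].strip()


if __name__ == "__main__":
    # Pure string check; no GPU is needed for this part.
    print(driver_matches("525.116.04", "525"))
```

Keeping the comparison separate from the `nvidia-smi` call means the matching logic can be exercised on hosts without a GPU.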

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-470 in Ubuntu.
https://bugs.launchpad.net/bugs/2024675

Title:
  NVIDIA CVE-2023-25515, CVE-2023-25516

Status in fabric-manager-450 package in Ubuntu:
  In Progress
Status in fabric-manager-470 package in Ubuntu:
  Triaged
Status in fabric-manager-525 package in Ubuntu:
  Triaged
Status in libnvidia-nscq-450 package in Ubuntu:
  Triaged
Status in libnvidia-nscq-470 package in Ubuntu:
  Triaged
Status in libnvidia-nscq-525 package in Ubuntu:
  Triaged
Status in nvidia-graphics-drivers-450-server package in Ubuntu:
  Triaged
Status in nvidia-graphics-drivers-470 package in Ubuntu:
  Triaged
Status in nvidia-graphics-drivers-470-server package in Ubuntu:
  Triaged
Status in nvidia-graphics-drivers-525 package in Ubuntu:
  Triaged
Status in nvidia-graphics-drivers-525-server package in Ubuntu:
  Triaged
Status in nvidia-graphics-drivers-530 package in Ubuntu:
  Triaged
Status in fabric-manager-450 source package in Focal:
  New
Status in fabric-manager-470 source package in Focal:
  New
Status in fabric-manager-525 source package in Focal:
  New
Status in libnvidia-nscq-450 source package in Focal:
  New
Status in libnvidia-nscq-470 source package in Focal:
  New
Status in libnvidia-nscq-525 source package in Focal:
  New
Status in nvidia-graphics-drivers-450-server source package in Focal:
  New
Status in nvidia-graphics-drivers-470 source package in Focal:
  New
Status in nvidia-graphics-drivers-470-server source package in Focal:
  New
Status in nvidia-graphics-drivers-525 source package in Focal:
  New
Status in nvidia-graphics-drivers-525-server source package in Focal:
  New
Status in nvidia-graphics-drivers-530 source package in Focal:
  New
Status in fabric-manager-450 source package in Jammy:
  New
Status in fabric-manager-470 source package in Jammy:
  New
Status in fabric-manager-525 source package in Jammy:
  New
Status in libnvidia-nscq-450 source package in Jammy:
  New
Status in libnvidia-nscq-470 source package in Jammy:
  New
Status in libnvidia-nscq-525 source package in Jammy:
  New
Status in nvidia-graphics-drivers-450-server source package in Jammy:
  New
Status in nvidia-graphics-drivers-470 source package in Jammy:
  New
Status in nvidia-graphics-drivers-470-server source package in Jammy:
  New
Status in nvidia-graphics-drivers-525 source package in Jammy:
  New
Status in nvidia-graphics-drivers-525-server source package in Jammy:
  New
Status in nvidia-graphics-drivers-530 source package in Jammy:
  New
Status in fabric-manager-450 source package in Kinetic:
  New
Status in fabric-manager-470 source package in Kinetic:
  New
Status in fabric-manager-525 source package in Kinetic:
  New
Status in libnvidia-nscq-450 source package in Kinetic:
  New
Status in libnvidia-nscq-470 source package in Kinetic:
  New
Status in libnvidia-nscq-525 source package in Kinetic:
  New
Status in nvidia-graphics-drivers-450-server source package in Kinetic:
  New
Status in nvidia-graphics-drivers-470 source package in Kinetic:
  New
Status in nvidia-graphics-drivers-470-server source package in Kinetic:
  New
Status in nvidia-graphics-drivers-525 source package in Kinetic:
  New
Status in nvidia-graphics-drivers-525-server source package in Kinetic:
  New
Status in nvidia-graphics-drivers-530 source package in Kinetic:
  New
Status in fabric-manager-450 source package in Lunar:
  New
Status in fabric-manager-470 source package in Lunar:
  New
Status in fabric-manager-525 source package in Lunar:
  New
Status in libnvidia-nscq-450 source package in Lunar:
  New
Status in libnvidia-nscq-470 source package in Lunar:
  New
Status in libnvidia-nscq-525 source package in Lunar:
  New
Status in nvidia-graphics-drivers-450-server source package in Lunar:
  New
Status in nvidia-graphics-drivers-470 source package in Lunar:
  New
Status in nvidia-graphics-drivers-470-server source package in Lunar:
  New
Status in nvidia-graphics-drivers-525 source package in Lunar:
  New
Status in nvidia-graphics-drivers-525-server source package in Lunar:
  New
Status in nvidia-graphics-drivers-530 source package in Lunar:
  New

Bug description:
  CVE-2023-25515, CVE-2023-25516

  https://nvidia.custhelp.com/app/answers/detail/a_id/5468

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/fabric-manager-450/+bug/2024675/+subscriptions



[Kernel-packages] [Bug 2023986] Re: Drivers not working using kernel linux-image-6.2.0-1003-oracle

2023-06-17 Thread Francis Ginther
Thanks to everyone supplying their logs. I'm still looking through these
to try to understand what's going on here.

For most who hit this issue, the solution would be to interrupt the boot
loader, boot back into the generic kernel, and then remove the oracle and
lowlatency kernels.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-signed-oracle in Ubuntu.
https://bugs.launchpad.net/bugs/2023986

Title:
  Drivers not working using kernel linux-image-6.2.0-1003-oracle

Status in linux-signed-oracle package in Ubuntu:
  Confirmed

Bug description:
  My Ubuntu 23.04 installed two kernels linux-image-6.2.0-1003-oracle
  and linux-image-6.2.0-1003-lowlatency using software updater along
  with other updates. After install and restart, most drivers including
  wifi, bluetooth, touchpad and ethernet stopped working. Rebooting
  using 6.2.0-1003-lowlatency or 6.2.0-20-generic solved the driver
  issues.

  ProblemType: Bug
  DistroRelease: Ubuntu 23.04
  Package: linux-image-6.2.0-1003-oracle 6.2.0-1003.3
  ProcVersionSignature: Ubuntu 6.2.0-1003.3-lowlatency 6.2.6
  Uname: Linux 6.2.0-1003-lowlatency x86_64
  ApportVersion: 2.26.1-0ubuntu2
  Architecture: amd64
  CasperMD5CheckResult: unknown
  CurrentDesktop: ubuntu:GNOME
  Date: Thu Jun 15 16:34:56 2023
  ProcEnviron:
   LANG=en_US.UTF-8
   PATH=(custom, no user)
   SHELL=/bin/bash
   TERM=xterm-256color
   XDG_RUNTIME_DIR=
  SourcePackage: linux-signed-oracle
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-signed-oracle/+bug/2023986/+subscriptions




[Kernel-packages] [Bug 2023986] Re: Drivers not working using kernel linux-image-6.2.0-1003-oracle

2023-06-15 Thread Francis Ginther
@navroop005,

Hello, would you mind please sharing a copy of your
`/var/log/apt/history.log`? This looks like a possible package
dependency issue.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-signed-oracle in Ubuntu.
https://bugs.launchpad.net/bugs/2023986

Title:
  Drivers not working using kernel linux-image-6.2.0-1003-oracle

Status in linux-signed-oracle package in Ubuntu:
  Confirmed

Bug description:
  My Ubuntu 23.04 installed two kernels linux-image-6.2.0-1003-oracle
  and linux-image-6.2.0-1003-lowlatency using software updater along
  with other updates. After install and restart, most drivers including
  wifi, bluetooth, touchpad and ethernet stopped working. Rebooting
  using 6.2.0-1003-lowlatency or 6.2.0-20-generic solved the driver
  issues.

  ProblemType: Bug
  DistroRelease: Ubuntu 23.04
  Package: linux-image-6.2.0-1003-oracle 6.2.0-1003.3
  ProcVersionSignature: Ubuntu 6.2.0-1003.3-lowlatency 6.2.6
  Uname: Linux 6.2.0-1003-lowlatency x86_64
  ApportVersion: 2.26.1-0ubuntu2
  Architecture: amd64
  CasperMD5CheckResult: unknown
  CurrentDesktop: ubuntu:GNOME
  Date: Thu Jun 15 16:34:56 2023
  ProcEnviron:
   LANG=en_US.UTF-8
   PATH=(custom, no user)
   SHELL=/bin/bash
   TERM=xterm-256color
   XDG_RUNTIME_DIR=
  SourcePackage: linux-signed-oracle
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-signed-oracle/+bug/2023986/+subscriptions




[Kernel-packages] [Bug 2023042] Re: "couldn't communicate with the NVIDIA driver" when installing open dkms and LRM drivers concurrently

2023-06-14 Thread Francis Ginther
I've found a flaw in the test script: it was installing the wrong LRM
modules for the running kernel, namely the generic modules on a gcp
kernel. With the script corrected to install the gcp modules, the test
now passes.
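
One way to avoid this class of test error is to derive the LRM package name from the running kernel rather than hard-coding the flavour. The sketch below is hypothetical and is not the actual test script; it assumes kernel release strings of the form `<version>-<abi>-<flavour>` and package names following the `linux-modules-nvidia-<branch>-<flavour>` pattern seen in this bug.

```python
import platform


def kernel_flavour(release):
    """'6.2.0-1004-nvidia' -> 'nvidia'; '6.2.0-25-generic' -> 'generic'."""
    parts = release.split("-")
    if len(parts) < 3:
        raise ValueError(f"unexpected kernel release: {release!r}")
    # Rejoin the tail so that multi-part flavours stay intact.
    return "-".join(parts[2:])


def lrm_package(driver_branch, release=None):
    """Build the LRM package name matching the given (or running) kernel."""
    if release is None:
        release = platform.release()  # same string as `uname -r`
    return f"linux-modules-nvidia-{driver_branch}-{kernel_flavour(release)}"


if __name__ == "__main__":
    # "5.19.0-1024-gcp" is an illustrative release string, not taken from the logs.
    print(lrm_package("525", "5.19.0-1024-gcp"))
```

With this approach a gcp kernel selects `linux-modules-nvidia-525-gcp`, so the generic modules are never installed by accident.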

Attached are the logs with the addition of `lsmod` and `modinfo nvidia`.

I think this can now be closed as a test error.

** Attachment added: "lunar-525-open-to-lrm-PASSED.txt"
   
https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-525/+bug/2023042/+attachment/5679750/+files/lunar-525-open-to-lrm-PASSED.txt

** Changed in: nvidia-graphics-drivers-525 (Ubuntu)
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-525 in Ubuntu.
https://bugs.launchpad.net/bugs/2023042

Title:
  "couldn't communicate with the NVIDIA driver" when installing open
  dkms and LRM drivers concurrently

Status in nvidia-graphics-drivers-525 package in Ubuntu:
  Invalid

Bug description:
  Installing "nvidia-driver-525-open" followed by
  "nvidia-headless-no-dkms-525 linux-modules-nvidia-525-gcp
  nvidia-utils-525" led to a system which complained about a
  "Driver/library version mismatch". Specifically what was done is:

  Deploy a clean google VM with:

  gcloud compute instances create fginther-kinetic-gpgpu-525 \
    --image-project ubuntu-os-cloud --image-family ubuntu-2210-amd64 \
    --machine-type n1-standard-4 --boot-disk-size=32GB \
    --accelerator type=nvidia-tesla-t4,count=1 \
    --maintenance-policy TERMINATE --restart-on-failure

  Enable kinetic-proposed (this was done with the
  525.116.04-0ubuntu0.22.10.1 driver package).

  Install the 525-open driver first:

  apt-get install -y nvidia-driver-525-open

  Then install the proprietary driver:

  apt-get install nvidia-headless-no-dkms-525 linux-modules-nvidia-525-gcp nvidia-utils-525

  After rebooting, "nvidia-smi" complained of the driver/library
  mismatch:

  ubuntu@fginther-kinetic-gpgpu-525:~$ nvidia-smi
  NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

  The /var/log/apt/history.log is attached which details the packages
  installed and removed.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-525/+bug/2023042/+subscriptions




[Kernel-packages] [Bug 2023611] Re: Unable to remove efi variable with 6.2.0-21.21 or newer lunar kernel

2023-06-13 Thread Francis Ginther
I've reproduced this with the 6.3.0-7-generic kernel from
mantic-proposed.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2023611

Title:
  Unable to remove efi variable with 6.2.0-21.21 or newer lunar kernel

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  I'm seeing an issue on an isolated host, howzit, in which it fails to
  remove boot entries. In my limited testing this worked with the
  6.2.0-20.20 kernel, but not the 21.21 or 23.23 kernel. I have not yet
  tried any of the 6.3 kernels.

  I've only seen this on one host so far, howzit, which is an arm64
  server. I have tested on three other arm64 servers and they don't
  appear to be impacted, so this could be some firmware issue. It
  adversely impacts maas installs and will cause a mantic install (which
  is using 6.2.0-21.21) to fail because it can't modify the boot paths.

  Here's what I see trying to remove a boot entry:
  ubuntu@howzit-kernel:~$ efibootmgr -v
  BootCurrent: 0005
  Timeout: 5 seconds
  BootOrder: 0005,0007,0004,0006
  Boot0004  UEFI: Built-in EFI Shell  VenMedia(5023b95c-db26-429b-a648-bd47664c8012)..BO
  Boot0005* UEFI: PXE IPv4 Mellanox Network Adapter - 0C:42:A1:52:3D:4C   PcieRoot(0x3)/Pci(0x1,0x0)/Pci(0x0,0x0)/MAC(0c42a1523d4c,1)/IPv4(0.0.0.00.0.0.0,0,0)..BO
  Boot0006  UEFI: PXE IPv4 Mellanox Network Adapter - 0C:42:A1:52:3D:4D   PcieRoot(0x3)/Pci(0x1,0x0)/Pci(0x0,0x1)/MAC(0c42a1523d4d,1)/IPv4(0.0.0.00.0.0.0,0,0)..BO
  Boot0007* ubuntu  HD(1,GPT,6d8df92f-72ad-4c24-bc8d-8236a4c5e222,0x800,0x10)/File(\EFI\UBUNTU\GRUBAA64.EFI)..BO
  ubuntu@howzit-kernel:~$ sudo efibootmgr -B -b 0007
  Could not delete variable: Invalid argument

  The same command will work with the 6.2.0-20.20 kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2023611/+subscriptions




[Kernel-packages] [Bug 2023611] [NEW] Unable to remove efi variable with 6.2.0-21.21 or newer lunar kernel

2023-06-12 Thread Francis Ginther
Public bug reported:

I'm seeing an issue on an isolated host, howzit, in which it fails to
remove boot entries. In my limited testing this worked with the
6.2.0-20.20 kernel, but not the 21.21 or 23.23 kernel. I have not yet
tried any of the 6.3 kernels.

I've only seen this on one host so far, howzit, which is an arm64
server. I have tested on three other arm64 servers and they don't appear
to be impacted, so this could be some firmware issue. It adversely
impacts maas installs and will cause a mantic install (which is using
6.2.0-21.21) to fail because it can't modify the boot paths.

Here's what I see trying to remove a boot entry:
ubuntu@howzit-kernel:~$ efibootmgr -v
BootCurrent: 0005
Timeout: 5 seconds
BootOrder: 0005,0007,0004,0006
Boot0004  UEFI: Built-in EFI Shell  VenMedia(5023b95c-db26-429b-a648-bd47664c8012)..BO
Boot0005* UEFI: PXE IPv4 Mellanox Network Adapter - 0C:42:A1:52:3D:4C   PcieRoot(0x3)/Pci(0x1,0x0)/Pci(0x0,0x0)/MAC(0c42a1523d4c,1)/IPv4(0.0.0.00.0.0.0,0,0)..BO
Boot0006  UEFI: PXE IPv4 Mellanox Network Adapter - 0C:42:A1:52:3D:4D   PcieRoot(0x3)/Pci(0x1,0x0)/Pci(0x0,0x1)/MAC(0c42a1523d4d,1)/IPv4(0.0.0.00.0.0.0,0,0)..BO
Boot0007* ubuntu  HD(1,GPT,6d8df92f-72ad-4c24-bc8d-8236a4c5e222,0x800,0x10)/File(\EFI\UBUNTU\GRUBAA64.EFI)..BO
ubuntu@howzit-kernel:~$ sudo efibootmgr -B -b 0007
Could not delete variable: Invalid argument

The same command will work with the 6.2.0-20.20 kernel.
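
To check from a script whether a boot entry really was removed, the `efibootmgr` listing can be parsed before and after the delete attempt. The following is a rough sketch only; the function name is hypothetical, the sample text is abbreviated from the output above, and it assumes each entry occupies one line.

```python
import re

# "Boot0007* ubuntu ..." -> number "0007", active flag "*", rest of the line.
ENTRY_RE = re.compile(r"^Boot([0-9A-Fa-f]{4})(\*?)\s+(.*)$")


def parse_efibootmgr(text):
    """Return (boot_order, entries) parsed from `efibootmgr` output."""
    order, entries = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("BootOrder:"):
            value = line.split(":", 1)[1]
            order = [n.strip() for n in value.split(",") if n.strip()]
            continue
        match = ENTRY_RE.match(line)
        if match:
            number, star, rest = match.groups()
            entries[number] = {"active": star == "*", "description": rest}
    return order, entries


SAMPLE = """\
BootCurrent: 0005
Timeout: 5 seconds
BootOrder: 0005,0007,0004,0006
Boot0004  UEFI: Built-in EFI Shell
Boot0005* UEFI: PXE IPv4 Mellanox Network Adapter - 0C:42:A1:52:3D:4C
Boot0006  UEFI: PXE IPv4 Mellanox Network Adapter - 0C:42:A1:52:3D:4D
Boot0007* ubuntu
"""

if __name__ == "__main__":
    order, entries = parse_efibootmgr(SAMPLE)
    # After a failed `efibootmgr -B -b 0007`, entry 0007 would still be listed.
    print("0007" in entries, order)
```

Running the parser on listings captured under the 6.2.0-20.20 and 6.2.0-21.21 kernels would show whether entry 0007 disappears in one case and persists in the other.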

** Affects: linux (Ubuntu)
 Importance: Undecided
 Status: New


** Tags: lunar

** Tags added: lunar

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2023611

Title:
  Unable to remove efi variable with 6.2.0-21.21 or newer lunar kernel

Status in linux package in Ubuntu:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2023611/+subscriptions




[Kernel-packages] [Bug 2023042] [NEW] "Driver/library version mismatch" when installing open and proprietary drivers concurrently

2023-06-06 Thread Francis Ginther
Public bug reported:

Installing "nvidia-driver-525-open" followed by
"nvidia-headless-no-dkms-525 linux-modules-nvidia-525-gcp
nvidia-utils-525" led to a system which complained about a
"Driver/library version mismatch". Specifically what was done is:

Deploy a clean google VM with:

gcloud compute instances create fginther-kinetic-gpgpu-525 \
  --image-project ubuntu-os-cloud --image-family ubuntu-2210-amd64 \
  --machine-type n1-standard-4 --boot-disk-size=32GB \
  --accelerator type=nvidia-tesla-t4,count=1 \
  --maintenance-policy TERMINATE --restart-on-failure

Enable kinetic-proposed (this was done with the
525.116.04-0ubuntu0.22.10.1 driver package).

Install the 525-open driver first:

apt-get install -y nvidia-driver-525-open

Then install the proprietary driver:

apt-get install nvidia-headless-no-dkms-525 linux-modules-nvidia-525-gcp
nvidia-utils-525

After rebooting, "nvidia-smi" complained of the driver/library mismatch:

ubuntu@fginther-kinetic-gpgpu-525:~$ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch

The /var/log/apt/history.log is attached which details the packages
installed and removed.

** Affects: nvidia-graphics-drivers-525 (Ubuntu)
 Importance: Undecided
 Status: New

** Attachment added: "history.log"
   
https://bugs.launchpad.net/bugs/2023042/+attachment/5678203/+files/history.log

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-525 in Ubuntu.
https://bugs.launchpad.net/bugs/2023042

Title:
  "Driver/library version mismatch" when installing open and proprietary
  drivers concurrently

Status in nvidia-graphics-drivers-525 package in Ubuntu:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-525/+bug/2023042/+subscriptions




[Kernel-packages] [Bug 2016908] Re: udev fails to make prctl() syscall with apparmor=0 (as used by maas by default)

2023-04-20 Thread Francis Ginther
I can confirm @xnox's findings with my maas server deploying lunar.
Adding `apparmor=1` to the settings/configuration/kernel-parameters
allows for a successful deployment with the lunar 6.2.0-20.20 kernel.
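
To confirm which setting a deployed or ephemeral system actually booted with, the kernel command line can be inspected. This is a small sketch under the assumption that parameters are whitespace-separated `key=value` tokens without quoting; the helper names are hypothetical.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def apparmor_disabled(cmdline):
    """True when the command line carries an explicit apparmor=0."""
    return parse_cmdline(cmdline).get("apparmor") == "0"


if __name__ == "__main__":
    # On a live system the string would come from /proc/cmdline.
    with open("/proc/cmdline") as handle:
        print(apparmor_disabled(handle.read()))
```

For example, `apparmor_disabled("console=ttyS0 break=modules apparmor=0")` is true, while the same line with `apparmor=1` is not.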

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2016908

Title:
  udev fails to make prctl() syscall with apparmor=0 (as used by maas by
  default)

Status in MAAS:
  Triaged
Status in maas-images:
  Invalid
Status in linux package in Ubuntu:
  Triaged
Status in systemd package in Ubuntu:
  Invalid

Bug description:
  I'm assuming the image being used for these deploys is 20230417 or
  20230417.1 based on the fact that I saw a 6.2 kernel being used which
  I don't believe was part of the 20230319 serial. I don't have access
  to the maas server, so I can't directly check any log files.

  MAAS Version: 3.3.2

  Here's where the serial log indicates it can't download the squashfs. The
  full log is attached as scobee-lunar-no-squashfs.log (there are some other
  console messages intermixed):
  no search or nameservers found in /run/net-BOOTIF.conf /run/net-*.conf /run/net6
  -*.conf
  :: root=squash:http://10.229.32.21:5248/images/ubuntu/arm64/ga-23.04/lunar/candi
  date/squa[  206.804704] Btrfs loaded, crc32c=crc32c-generic, zoned=yes, fsverity
  =yes
  shfs
  :: mount_squash downloading http://10.229.32.21:5248/images/ubuntu/arm64/ga-23.0
  4/lunar/candidate/squashfs to /root.tmp.img
  Connecting to 10.229.32.21:5248 (10.229.32.21:5248)
  wget: can't connect to remote host (10.229.32.21): Network is unreachable
  :: mount -t squashfs -o loop  '/root.tmp.img' '/root.tmp'
  mount: mounting /root.tmp.img on /root.tmp failed: No such file or directory
  done.

  Still gathering logs and info and will update as I go.

  
  Kernel Bug / Apparmor
  reproducer

  $ wget https://images.maas.io/ephemeral-v3/candidate/lunar/amd64/20230419/ga-23.04/generic/boot-kernel
  $ wget https://images.maas.io/ephemeral-v3/candidate/lunar/amd64/20230419/ga-23.04/generic/boot-initrd
  $ qemu-system-x86_64 -nographic -m 2G -kernel ./boot-kernel -initrd ./boot-initrd -append 'console=ttyS0 break=modules apparmor=0'

  #start the VM
  
  Starting systemd-udevd version 252.5-2ubuntu3
  Spawning shell within the initramfs

  BusyBox v1.35.0 (Ubuntu 1:1.35.0-4ubuntu1) built-in shell (ash)
  Enter 'help' for a list of built-in commands.

  (initramfs) udevadm info --export-db
  Failed to set death signal: Invalid argument

  Observe that udevadm fails to set up the death signal. The relevant
  systemd code is this:

  
  https://github.com/systemd/systemd/blob/08c2f9c626e0f0052d505b1b7e52f335c0fbfa1d/src/basic/process-util.c#L1252

  if (flags & (FORK_DEATHSIG|FORK_DEATHSIG_SIGINT))
          if (prctl(PR_SET_PDEATHSIG, (flags & FORK_DEATHSIG_SIGINT) ? SIGINT : SIGTERM) < 0) {
                  log_full_errno(prio, errno, "Failed to set death signal: %m");
                  _exit(EXIT_FAILURE);
          }
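
  The same syscall can be exercised outside of systemd. The sketch below is an assumption-laden illustration, not part of the original reproducer: it needs a Linux userspace with Python (so it is not usable inside the initramfs shell), the function name is hypothetical, and it calls `prctl` through `ctypes` with `PR_SET_PDEATHSIG` set to 1 as defined in `<linux/prctl.h>`.

```python
import ctypes
import errno
import signal

PR_SET_PDEATHSIG = 1  # value from <linux/prctl.h>

# CDLL(None) exposes the symbols of the running process, including libc's prctl.
_libc = ctypes.CDLL(None, use_errno=True)


def set_parent_death_signal(sig):
    """Call prctl(PR_SET_PDEATHSIG, sig); return 0 or the errno on failure."""
    ctypes.set_errno(0)
    rc = _libc.prctl(
        ctypes.c_int(PR_SET_PDEATHSIG),
        ctypes.c_ulong(int(sig)),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
    )
    return 0 if rc == 0 else ctypes.get_errno()


if __name__ == "__main__":
    err = set_parent_death_signal(signal.SIGTERM)
    if err == 0:
        print("prctl(PR_SET_PDEATHSIG) succeeded")
    else:
        print("prctl(PR_SET_PDEATHSIG) failed:", errno.errorcode.get(err, err))
```

  On a healthy kernel the call with SIGTERM returns 0. A result of EINVAL for a valid signal would match the "Invalid argument" failure reported by udevadm above.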

  
  Workaround: set the kernel command line to `apparmor=1`.
  

  MAAS bug
  Why is maas setting `apparmor=0`? Ubuntu shouldn't be used without apparmor,
  even for deployment and commissioning.

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/2016908/+subscriptions




[Kernel-packages] [Bug 2012529] Re: NVIDIA CVE-2023-{0180 to 0195}

2023-04-05 Thread Francis Ginther
Cuda testing passed for all drivers (470, 515, 525, 450-server,
470-server, 515-server, 525-server) on bionic, focal, jammy and kinetic
using both DKMS and LRM (when using the appropriate stream 2 ppa for the
LRM packages). DKMS testing also passed on lunar.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/2012529

Title:
  NVIDIA CVE-2023-{0180 to 0195}

Status in fabric-manager-450 package in Ubuntu:
  New
Status in fabric-manager-470 package in Ubuntu:
  New
Status in fabric-manager-515 package in Ubuntu:
  New
Status in fabric-manager-525 package in Ubuntu:
  New
Status in libnvidia-nscq-450 package in Ubuntu:
  New
Status in libnvidia-nscq-470 package in Ubuntu:
  New
Status in libnvidia-nscq-515 package in Ubuntu:
  New
Status in libnvidia-nscq-525 package in Ubuntu:
  New
Status in linux-restricted-modules package in Ubuntu:
  Triaged
Status in nvidia-graphics-drivers-450-server package in Ubuntu:
  Fix Released
Status in nvidia-graphics-drivers-470 package in Ubuntu:
  Fix Released
Status in nvidia-graphics-drivers-470-server package in Ubuntu:
  Fix Released
Status in nvidia-graphics-drivers-515 package in Ubuntu:
  Fix Released
Status in nvidia-graphics-drivers-515-server package in Ubuntu:
  Fix Released
Status in nvidia-graphics-drivers-525 package in Ubuntu:
  Triaged
Status in nvidia-graphics-drivers-525-server package in Ubuntu:
  Fix Released
Status in fabric-manager-450 source package in Bionic:
  New
Status in fabric-manager-470 source package in Bionic:
  New
Status in fabric-manager-515 source package in Bionic:
  New
Status in fabric-manager-525 source package in Bionic:
  New
Status in libnvidia-nscq-450 source package in Bionic:
  New
Status in libnvidia-nscq-470 source package in Bionic:
  New
Status in libnvidia-nscq-515 source package in Bionic:
  New
Status in libnvidia-nscq-525 source package in Bionic:
  New
Status in linux-restricted-modules source package in Bionic:
  New
Status in nvidia-graphics-drivers-450-server source package in Bionic:
  In Progress
Status in nvidia-graphics-drivers-470 source package in Bionic:
  New
Status in nvidia-graphics-drivers-470-server source package in Bionic:
  New
Status in nvidia-graphics-drivers-515 source package in Bionic:
  New
Status in nvidia-graphics-drivers-515-server source package in Bionic:
  New
Status in nvidia-graphics-drivers-525 source package in Bionic:
  New
Status in nvidia-graphics-drivers-525-server source package in Bionic:
  New
Status in fabric-manager-450 source package in Focal:
  New
Status in fabric-manager-470 source package in Focal:
  New
Status in fabric-manager-515 source package in Focal:
  New
Status in fabric-manager-525 source package in Focal:
  New
Status in libnvidia-nscq-450 source package in Focal:
  New
Status in libnvidia-nscq-470 source package in Focal:
  New
Status in libnvidia-nscq-515 source package in Focal:
  New
Status in libnvidia-nscq-525 source package in Focal:
  New
Status in linux-restricted-modules source package in Focal:
  New
Status in nvidia-graphics-drivers-450-server source package in Focal:
  In Progress
Status in nvidia-graphics-drivers-470 source package in Focal:
  New
Status in nvidia-graphics-drivers-470-server source package in Focal:
  New
Status in nvidia-graphics-drivers-515 source package in Focal:
  New
Status in nvidia-graphics-drivers-515-server source package in Focal:
  New
Status in nvidia-graphics-drivers-525 source package in Focal:
  New
Status in nvidia-graphics-drivers-525-server source package in Focal:
  New
Status in fabric-manager-450 source package in Jammy:
  New
Status in fabric-manager-470 source package in Jammy:
  New
Status in fabric-manager-515 source package in Jammy:
  New
Status in fabric-manager-525 source package in Jammy:
  New
Status in libnvidia-nscq-450 source package in Jammy:
  New
Status in libnvidia-nscq-470 source package in Jammy:
  New
Status in libnvidia-nscq-515 source package in Jammy:
  New
Status in libnvidia-nscq-525 source package in Jammy:
  New
Status in linux-restricted-modules source package in Jammy:
  New
Status in nvidia-graphics-drivers-450-server source package in Jammy:
  New
Status in nvidia-graphics-drivers-470 source package in Jammy:
  New
Status in nvidia-graphics-drivers-470-server source package in Jammy:
  New
Status in nvidia-graphics-drivers-515 source package in Jammy:
  New
Status in nvidia-graphics-drivers-515-server source package in Jammy:
  New
Status in nvidia-graphics-drivers-525 source package in Jammy:
  New
Status in nvidia-graphics-drivers-525-server source package in Jammy:
  New
Status in fabric-manager-450 source package in Kinetic:
  New
Status in fabric-manager-470 source package in Kinetic:
  New
Status in fabric-manager-515 source package in Kinetic:
  New
Status in fabric-manager-525 source package in Kinetic:
  New
Status in libnvidia-nscq-450 source pack

[Kernel-packages] [Bug 2000778] Re: pmtu.sh in net from ubuntu_kernel_selftests crash SUT with K-5.19

2023-03-10 Thread Francis Ginther
Still failing on baltar.ppc64el.9 during the 2023.02.27 SRU cycle. Both
kuzzle and scobee (another arm64 server) passed.

** Tags added: sru-20230227

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2000778

Title:
  pmtu.sh in net from ubuntu_kernel_selftests crash SUT with K-5.19

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Kinetic:
  Incomplete

Bug description:
  Issue found with Kinetic 5.19.0-27.28 and 5.19.0-28.29 in this cycle
  (20221114) on these SUTs
   * P9 baltar
   * ARM64 kuzzle
   * ARM64 howzit-kernel

  This should not be considered as a regression as the net test cannot
  be built in 5.19.0-24.25

  
  Test log:
  
  ubuntu@baltar:~/autotest/client/tmp/ubuntu_kernel_selftests/src/linux/tools/testing/selftests/net$ sudo ./pmtu.sh
  TEST: ipv4: PMTU exceptions  [ OK ]
  TEST: ipv4: PMTU exceptions - nexthop objects  [ OK ]
  TEST: ipv6: PMTU exceptions  [ OK ]
  TEST: ipv6: PMTU exceptions - nexthop objects  [ OK ]
  TEST: ICMPv4 with DSCP and ECN: PMTU exceptions  [ OK ]
  TEST: ICMPv4 with DSCP and ECN: PMTU exceptions - nexthop objects  [ OK ]
  'socat' command not found; skipping tests
  TEST: UDPv4 with DSCP and ECN: PMTU exceptions  [SKIP]
  TEST: IPv4 over vxlan4: PMTU exceptions  [ OK ]
  TEST: IPv4 over vxlan4: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv6 over vxlan4: PMTU exceptions  [ OK ]
  TEST: IPv6 over vxlan4: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv4 over vxlan6: PMTU exceptions  [ OK ]
  TEST: IPv4 over vxlan6: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv6 over vxlan6: PMTU exceptions  [ OK ]
  TEST: IPv6 over vxlan6: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv4 over geneve4: PMTU exceptions  [ OK ]
  TEST: IPv4 over geneve4: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv6 over geneve4: PMTU exceptions  [ OK ]
  TEST: IPv6 over geneve4: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv4 over geneve6: PMTU exceptions  [ OK ]
  TEST: IPv4 over geneve6: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv6 over geneve6: PMTU exceptions  [ OK ]
  TEST: IPv6 over geneve6: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv4, bridged vxlan4: PMTU exceptions  [ OK ]
  TEST: IPv4, bridged vxlan4: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv6, bridged vxlan4: PMTU exceptions  [ OK ]
  TEST: IPv6, bridged vxlan4: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv4, bridged vxlan6: PMTU exceptions  [ OK ]
  TEST: IPv4, bridged vxlan6: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv6, bridged vxlan6: PMTU exceptions  [ OK ]
  TEST: IPv6, bridged vxlan6: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv4, bridged geneve4: PMTU exceptions  [ OK ]
  TEST: IPv4, bridged geneve4: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv6, bridged geneve4: PMTU exceptions  [ OK ]
  TEST: IPv6, bridged geneve4: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv4, bridged geneve6: PMTU exceptions  [ OK ]
  TEST: IPv4, bridged geneve6: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv6, bridged geneve6: PMTU exceptions  [ OK ]
  TEST: IPv6, bridged geneve6: PMTU exceptions - nexthop objects  [ OK ]
  ovs_bridge not supported
  TEST: IPv4, OVS vxlan4: PMTU exceptions  [SKIP]
  ovs_bridge not supported
  TEST: IPv6, OVS vxlan4: PMTU exceptions  [SKIP]
  ovs_bridge not supported
  TEST: IPv4, OVS vxlan6: PMTU exceptions  [SKIP]
  ovs_bridge not supported
  TEST: IPv6, OVS vxlan6: PMTU exceptions  [SKIP]
  ovs_bridge not supported
  TEST: IPv4, OVS geneve4: PMTU exceptions  [SKIP]
  ovs_bridge not supported
  TEST: IPv6, OVS geneve4: PMTU exceptions  [SKIP]
  ovs_bridge not supported
  TEST: IPv4, OVS geneve6: PMTU exceptions  [SKIP]
  ovs_bridge not supported
  TEST: IPv6, OVS geneve6: PMTU exceptions  [SKIP]
  TEST: IPv4 over fou4: PMTU exceptions  [ OK ]
  TEST: IPv4 over fou4: PMTU exceptions - nexthop objects  [ OK ]
  TEST: IPv6 ove

[Kernel-packages] [Bug 2003995] Re: Update the 525 and 525-server NVIDIA driver series in Bionic, Focal, Jammy, and Kinetic

2023-02-22 Thread Francis Ginther
No regressions found for either 515-server or 525-server. Both were
tested as DKMS and as LRMs using the generic kernel in all releases
(lunar could not be installed with LRM). Jammy was also tested with the
linux-nvidia kernel and LRMs.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/2003995

Title:
  Update the 525 and 525-server NVIDIA driver series in Bionic, Focal,
  Jammy, and Kinetic

Status in fabric-manager-515 package in Ubuntu:
  Fix Released
Status in fabric-manager-525 package in Ubuntu:
  Fix Released
Status in libnvidia-nscq-515 package in Ubuntu:
  Fix Released
Status in libnvidia-nscq-525 package in Ubuntu:
  Fix Released
Status in linux-restricted-modules package in Ubuntu:
  New
Status in linux-restricted-modules-hwe package in Ubuntu:
  New
Status in nvidia-graphics-drivers-515-server package in Ubuntu:
  Triaged
Status in nvidia-graphics-drivers-525 package in Ubuntu:
  Triaged
Status in nvidia-graphics-drivers-525-server package in Ubuntu:
  Fix Released
Status in fabric-manager-515 source package in Bionic:
  Triaged
Status in fabric-manager-525 source package in Bionic:
  Triaged
Status in libnvidia-nscq-515 source package in Bionic:
  Triaged
Status in libnvidia-nscq-525 source package in Bionic:
  Triaged
Status in linux-restricted-modules source package in Bionic:
  New
Status in linux-restricted-modules-hwe source package in Bionic:
  New
Status in nvidia-graphics-drivers-515-server source package in Bionic:
  Triaged
Status in nvidia-graphics-drivers-525 source package in Bionic:
  Triaged
Status in nvidia-graphics-drivers-525-server source package in Bionic:
  Triaged
Status in fabric-manager-515 source package in Focal:
  Triaged
Status in fabric-manager-525 source package in Focal:
  Triaged
Status in libnvidia-nscq-515 source package in Focal:
  Triaged
Status in libnvidia-nscq-525 source package in Focal:
  Triaged
Status in linux-restricted-modules source package in Focal:
  New
Status in linux-restricted-modules-hwe source package in Focal:
  New
Status in nvidia-graphics-drivers-515-server source package in Focal:
  Triaged
Status in nvidia-graphics-drivers-525 source package in Focal:
  Triaged
Status in nvidia-graphics-drivers-525-server source package in Focal:
  Triaged
Status in fabric-manager-515 source package in Jammy:
  Triaged
Status in fabric-manager-525 source package in Jammy:
  Triaged
Status in libnvidia-nscq-515 source package in Jammy:
  Triaged
Status in libnvidia-nscq-525 source package in Jammy:
  Triaged
Status in linux-restricted-modules source package in Jammy:
  New
Status in linux-restricted-modules-hwe source package in Jammy:
  New
Status in nvidia-graphics-drivers-515-server source package in Jammy:
  Triaged
Status in nvidia-graphics-drivers-525 source package in Jammy:
  Triaged
Status in nvidia-graphics-drivers-525-server source package in Jammy:
  Triaged
Status in fabric-manager-515 source package in Kinetic:
  Triaged
Status in fabric-manager-525 source package in Kinetic:
  Triaged
Status in libnvidia-nscq-515 source package in Kinetic:
  Triaged
Status in libnvidia-nscq-525 source package in Kinetic:
  Triaged
Status in linux-restricted-modules source package in Kinetic:
  New
Status in linux-restricted-modules-hwe source package in Kinetic:
  New
Status in nvidia-graphics-drivers-515-server source package in Kinetic:
  Triaged
Status in nvidia-graphics-drivers-525 source package in Kinetic:
  Triaged
Status in nvidia-graphics-drivers-525-server source package in Kinetic:
  Triaged

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

  525

  * New upstream release (LP: #2003995):
    - Improved the reliability of suspend and resume on UEFI systems
      when using certain display panels.
    - Fixed a bug that could cause VK_ERROR_DEVICE_LOST when using
      VK_MEMORY_ALLOCATE_DEVICE_ADDRESS_CAPTURE_REPLAY_BIT to
      allocate memory.
    - Disabled Fixed Rate Link (FRL) when using passive DisplayPort
      to HDMI dongles, which are incompatible wit

[Kernel-packages] [Bug 2006620] Re: linux-aws-5.19 hibernation tasks sometimes fail to freeze

2023-02-08 Thread Francis Ginther
Here is the full syslog from which the portion in the bug description
was extracted.

** Attachment added: "c5.12xlarge-3-syslog.log"
   https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/2006620/+attachment/5645600/+files/c5.12xlarge-3-syslog.log

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/2006620

Title:
  linux-aws-5.19 hibernation tasks sometimes fail to freeze

Status in linux-aws package in Ubuntu:
  New

Bug description:
  Hibernation on AWS instances with jammy/5.19.0-1019-aws sometimes
  fails due to the following failure to freeze:

  Feb  1 01:09:05 ip-172-31-54-178 kernel: [  443.247854] PM: hibernation: hibernation entry
  Feb  1 01:09:05 ip-172-31-54-178 kernel: [  443.347353] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
  Feb  1 01:09:05 ip-172-31-54-178 kernel: [  443.347355] sched_clock: Marking unstable (442909362062, 1007864825)<-(443748056670, -400707172)
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.940489] Filesystems sync: 0.022 seconds
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.940492] Freezing user space processes ... (elapsed 0.001 seconds) done.
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.941611] OOM killer disabled.
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.943036] PM: hibernation: Marking nosave pages: [mem 0x-0x0fff]
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.943039] PM: hibernation: Marking nosave pages: [mem 0x0009f000-0x000f]
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.943041] PM: hibernation: Marking nosave pages: [mem 0xbffe8000-0x]
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.943950] PM: hibernation: Basic memory bitmaps created
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.943961] PM: hibernation: Preallocating image memory
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  630.782421] PM: hibernation: Allocated 9655951 pages for snapshot
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  630.782424] PM: hibernation: Allocated 38623804 kbytes in 186.83 seconds (206.73 MB/s)
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  630.782426] Freezing remaining freezable tasks ...
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.789826] Freezing of tasks failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0):
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792830] task:kswapd0 state:D stack:0 pid:  328 ppid: 2 flags:0x4000
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792833] Call Trace:
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792835]
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792837]  __schedule+0x248/0x5d0
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792842]  schedule+0x58/0x100
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792844]  io_schedule+0x46/0x80
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792846]  blk_mq_get_tag+0x117/0x2e0
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792852]  ? destroy_sched_domains_rcu+0x40/0x40
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792857]  __blk_mq_alloc_requests+0xc4/0x1e0
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792859]  blk_mq_get_new_requests+0xce/0x190
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792861]  blk_mq_submit_bio+0x1e6/0x430
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792864]  __submit_bio+0xf6/0x190
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792866]  submit_bio_noacct_nocheck+0xc2/0x120
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792869]  submit_bio_noacct+0x1c5/0x540
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792871]  ? sio_write_complete+0x1f0/0x1f0
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792875]  submit_bio+0x47/0xf0
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792877]  __swap_writepage+0x157/0x570
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792879]  swap_writepage+0x2f/0x80
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792880]  pageout+0xe2/0x2f0
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792883]  shrink_page_list+0x60b/0xc80
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792885]  shrink_inactive_list+0x1bc/0x4d0
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792886]  shrink_lruvec+0x2f5/0x450
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792888]  shrink_node_memcgs+0x166/0x1d0
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792890]  shrink_node+0x156/0x5a0
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792891]  ? __schedule+0x250/0x5d0
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792893]  balance_pgdat+0x37b/0x880
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792894]  ? zone_watermark_ok_safe+0x4f/0x100
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792899]  ? balance_pgdat+0x880/0x880
  Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792900]  kswapd+0x10c/0x1c0
  Feb  1 01:12:33 ip-

[Kernel-packages] [Bug 2006620] [NEW] linux-aws-5.19 hibernation tasks sometimes fail to freeze

2023-02-08 Thread Francis Ginther
Public bug reported:

Hibernation on AWS instances with jammy/5.19.0-1019-aws sometimes fails
due to the following failure to freeze:

Feb  1 01:09:05 ip-172-31-54-178 kernel: [  443.247854] PM: hibernation: hibernation entry
Feb  1 01:09:05 ip-172-31-54-178 kernel: [  443.347353] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
Feb  1 01:09:05 ip-172-31-54-178 kernel: [  443.347355] sched_clock: Marking unstable (442909362062, 1007864825)<-(443748056670, -400707172)
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.940489] Filesystems sync: 0.022 seconds
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.940492] Freezing user space processes ... (elapsed 0.001 seconds) done.
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.941611] OOM killer disabled.
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.943036] PM: hibernation: Marking nosave pages: [mem 0x-0x0fff]
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.943039] PM: hibernation: Marking nosave pages: [mem 0x0009f000-0x000f]
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.943041] PM: hibernation: Marking nosave pages: [mem 0xbffe8000-0x]
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.943950] PM: hibernation: Basic memory bitmaps created
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  443.943961] PM: hibernation: Preallocating image memory
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  630.782421] PM: hibernation: Allocated 9655951 pages for snapshot
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  630.782424] PM: hibernation: Allocated 38623804 kbytes in 186.83 seconds (206.73 MB/s)
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  630.782426] Freezing remaining freezable tasks ...
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.789826] Freezing of tasks failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0):
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792830] task:kswapd0 state:D stack:0 pid:  328 ppid: 2 flags:0x4000
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792833] Call Trace:
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792835]
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792837]  __schedule+0x248/0x5d0
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792842]  schedule+0x58/0x100
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792844]  io_schedule+0x46/0x80
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792846]  blk_mq_get_tag+0x117/0x2e0
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792852]  ? destroy_sched_domains_rcu+0x40/0x40
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792857]  __blk_mq_alloc_requests+0xc4/0x1e0
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792859]  blk_mq_get_new_requests+0xce/0x190
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792861]  blk_mq_submit_bio+0x1e6/0x430
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792864]  __submit_bio+0xf6/0x190
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792866]  submit_bio_noacct_nocheck+0xc2/0x120
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792869]  submit_bio_noacct+0x1c5/0x540
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792871]  ? sio_write_complete+0x1f0/0x1f0
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792875]  submit_bio+0x47/0xf0
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792877]  __swap_writepage+0x157/0x570
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792879]  swap_writepage+0x2f/0x80
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792880]  pageout+0xe2/0x2f0
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792883]  shrink_page_list+0x60b/0xc80
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792885]  shrink_inactive_list+0x1bc/0x4d0
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792886]  shrink_lruvec+0x2f5/0x450
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792888]  shrink_node_memcgs+0x166/0x1d0
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792890]  shrink_node+0x156/0x5a0
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792891]  ? __schedule+0x250/0x5d0
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792893]  balance_pgdat+0x37b/0x880
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792894]  ? zone_watermark_ok_safe+0x4f/0x100
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792899]  ? balance_pgdat+0x880/0x880
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792900]  kswapd+0x10c/0x1c0
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792901]  ? balance_pgdat+0x880/0x880
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792903]  kthread+0xd1/0xf0
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792906]  ? kthread_complete_and_exit+0x20/0x20
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792909]  ret_from_fork+0x22/0x30
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792913]
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792921]
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  650.792922] Restarting kernel threads ... done.
Feb  1 01:12:33 ip-172-31-54-178 kernel: [  651.516499] PM: hibernation: Basic memory bitmaps freed
Feb  1 0
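
The figures in the log above can be cross-checked with a little arithmetic. The sketch below is not part of the original report. It assumes 4 KiB pages (an assumption, typical for x86_64) and that the log's "MB/s" value is kbytes divided by seconds and then by 1000. The function names are made up for illustration, and all numbers are copied from the log lines.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages


def pages_to_kib(pages: int) -> int:
    """Convert a page count to kbytes under the 4 KiB page assumption."""
    return pages * PAGE_KIB


def rate_mb_per_s(kib: int, seconds: float) -> float:
    """Throughput as kbytes / seconds / 1000 (assumed meaning of "MB/s")."""
    return kib / seconds / 1000


# Values copied from the log.
snapshot_pages = 9655951
reported_kib = 38623804
reported_seconds = 186.83

# Kernel timestamps (seconds since boot) copied from the log.
prealloc_start = 443.943961   # "Preallocating image memory"
prealloc_done = 630.782421    # "Allocated 9655951 pages for snapshot"
freeze_start = 630.782426     # "Freezing remaining freezable tasks ..."
freeze_failed = 650.789826    # "Freezing of tasks failed after 20.007 seconds"

kib = pages_to_kib(snapshot_pages)
prealloc_seconds = prealloc_done - prealloc_start
freeze_wait_seconds = freeze_failed - freeze_start
rate = rate_mb_per_s(kib, reported_seconds)

print("snapshot size (kbytes):", kib, "matches log:", kib == reported_kib)
print("preallocation time (s):", prealloc_seconds)
print("throughput (MB/s):", rate)
print("time spent waiting for tasks to freeze (s):", freeze_wait_seconds)
```

Under these assumptions the page count, the kbytes figure, the elapsed time and the throughput in the log agree with each other. The roughly 20-second gap between the last two timestamps is the time the kernel waited before reporting that kswapd0 refused to freeze.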

[Kernel-packages] [Bug 1993665] Re: Update the 470-server NVIDIA driver

2022-11-29 Thread Francis Ginther
The A100 is down with some hardware issues and there is no ETA when it
will be up again. Given that the testing passed on the DGX2 and the A100
is having hardware issues which quite likely impacted the testing, I'm
going to consider the kinetic testing as verified.

** Tags removed: verification-failed-kinetic
** Tags added: verification-done-kinetic

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-470-server in
Ubuntu.
https://bugs.launchpad.net/bugs/1993665

Title:
  Update the 470-server NVIDIA driver

Status in fabric-manager-470 package in Ubuntu:
  In Progress
Status in libnvidia-nscq-470 package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-470-server package in Ubuntu:
  In Progress
Status in fabric-manager-470 source package in Bionic:
  Fix Released
Status in libnvidia-nscq-470 source package in Bionic:
  Fix Released
Status in nvidia-graphics-drivers-470-server source package in Bionic:
  Fix Released
Status in fabric-manager-470 source package in Focal:
  Fix Released
Status in libnvidia-nscq-470 source package in Focal:
  Fix Released
Status in nvidia-graphics-drivers-470-server source package in Focal:
  Fix Released
Status in fabric-manager-470 source package in Jammy:
  Fix Released
Status in libnvidia-nscq-470 source package in Jammy:
  Fix Released
Status in nvidia-graphics-drivers-470-server source package in Jammy:
  Fix Released
Status in fabric-manager-470 source package in Kinetic:
  In Progress
Status in libnvidia-nscq-470 source package in Kinetic:
  In Progress
Status in nvidia-graphics-drivers-470-server source package in Kinetic:
  In Progress

Bug description:
  
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/fabric-manager-470/+bug/1993665/+subscriptions



[Kernel-packages] [Bug 1993665] Re: Update the 470-server NVIDIA driver

2022-11-22 Thread Francis Ginther
Re-running through the testing on our DGX2 now passes for both DKMS and
LRM. I will need to retry the testing on A100 again and see if I missed
something like the fabricmanager not being ready yet.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-470-server in
Ubuntu.
https://bugs.launchpad.net/bugs/1993665

Title:
  Update the 470-server NVIDIA driver

Status in fabric-manager-470 package in Ubuntu:
  In Progress
Status in libnvidia-nscq-470 package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-470-server package in Ubuntu:
  In Progress
Status in fabric-manager-470 source package in Bionic:
  Fix Released
Status in libnvidia-nscq-470 source package in Bionic:
  Fix Released
Status in nvidia-graphics-drivers-470-server source package in Bionic:
  Fix Released
Status in fabric-manager-470 source package in Focal:
  Fix Released
Status in libnvidia-nscq-470 source package in Focal:
  Fix Released
Status in nvidia-graphics-drivers-470-server source package in Focal:
  Fix Released
Status in fabric-manager-470 source package in Jammy:
  Fix Released
Status in libnvidia-nscq-470 source package in Jammy:
  Fix Released
Status in nvidia-graphics-drivers-470-server source package in Jammy:
  Fix Released
Status in fabric-manager-470 source package in Kinetic:
  In Progress
Status in libnvidia-nscq-470 source package in Kinetic:
  In Progress
Status in nvidia-graphics-drivers-470-server source package in Kinetic:
  In Progress


[Kernel-packages] [Bug 1993665] Re: Update the 470-server NVIDIA driver

2022-11-21 Thread Francis Ginther
Verification on kinetic is incomplete. Things do work on a cloud
instance with a single GPGPU. In these cases, both the DKMS and LRM
versions of the driver work with the CUDA samples test.

Problems are encountered when running on either the DGX2 or A100
systems. For the A100, I have not been able to get either the 470.141.03
(in -release) or 470.141.10 (in -proposed) drivers to work. The
470.141.03 driver did pass the CUDA tests on the DGX2; testing with
470.141.10 is still in progress.

Both the DGX2 and A100 require the fabric-manager package. This could be
where the problem lies or it could be something in the driver that is
only exposed by these systems.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-470-server in
Ubuntu.
https://bugs.launchpad.net/bugs/1993665

Title:
  Update the 470-server NVIDIA driver

Status in fabric-manager-470 package in Ubuntu:
  In Progress
Status in libnvidia-nscq-470 package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-470-server package in Ubuntu:
  In Progress
Status in fabric-manager-470 source package in Bionic:
  Fix Released
Status in libnvidia-nscq-470 source package in Bionic:
  Fix Released
Status in nvidia-graphics-drivers-470-server source package in Bionic:
  Fix Released
Status in fabric-manager-470 source package in Focal:
  Fix Released
Status in libnvidia-nscq-470 source package in Focal:
  Fix Released
Status in nvidia-graphics-drivers-470-server source package in Focal:
  Fix Released
Status in fabric-manager-470 source package in Jammy:
  Fix Released
Status in libnvidia-nscq-470 source package in Jammy:
  Fix Released
Status in nvidia-graphics-drivers-470-server source package in Jammy:
  Fix Released
Status in fabric-manager-470 source package in Kinetic:
  In Progress
Status in libnvidia-nscq-470 source package in Kinetic:
  In Progress
Status in nvidia-graphics-drivers-470-server source package in Kinetic:
  In Progress

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1993665] Re: Update the 470-server NVIDIA driver

2022-11-10 Thread Francis Ginther
Tested bionic, focal and jammy on VMs and a DGX2. All cuda tests passed.

There is no updated kinetic driver, so I was unable to test there.

** Tags added: verification-done-bionic verification-done-focal
verification-done-jammy verification-failed-kinetic

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-470-server in
Ubuntu.
https://bugs.launchpad.net/bugs/1993665

Title:
  Update the 470-server NVIDIA driver

Status in fabric-manager-470 package in Ubuntu:
  In Progress
Status in libnvidia-nscq-470 package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-470-server package in Ubuntu:
  In Progress
Status in fabric-manager-470 source package in Bionic:
  In Progress
Status in libnvidia-nscq-470 source package in Bionic:
  In Progress
Status in nvidia-graphics-drivers-470-server source package in Bionic:
  In Progress
Status in fabric-manager-470 source package in Focal:
  In Progress
Status in libnvidia-nscq-470 source package in Focal:
  In Progress
Status in nvidia-graphics-drivers-470-server source package in Focal:
  In Progress
Status in fabric-manager-470 source package in Jammy:
  In Progress
Status in libnvidia-nscq-470 source package in Jammy:
  In Progress
Status in nvidia-graphics-drivers-470-server source package in Jammy:
  In Progress
Status in fabric-manager-470 source package in Kinetic:
  In Progress
Status in libnvidia-nscq-470 source package in Kinetic:
  In Progress
Status in nvidia-graphics-drivers-470-server source package in Kinetic:
  In Progress

Bug description:
  
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/fabric-manager-470/+bug/1993665/+subscriptions




[Kernel-packages] [Bug 1991676] Re: Package grub-efi-arm64-signed 1.173.2~18.04.1+2.04-1ubuntu47.4 from bionic-proposed fails to install/upgrade (grub-install: error: efibootmgr: not found.)

2022-11-10 Thread Francis Ginther
@juliank, ah, I found another detail. This appears to only break when
the package is updated in the ADT testbed. My assumption is if the
latest package version is already in the base image, there is no package
update and therefore no breakage. For example:

[1] older image, fails: 
https://autopkgtest.ubuntu.com/results/autopkgtest-bionic/bionic/arm64/d/dpdk/20221027_190645_aae03@/log.gz
[2] newer image, passes: 
https://autopkgtest.ubuntu.com/results/autopkgtest-bionic/bionic/arm64/d/dpdk/20221028_185258_2a00f@/log.gz

The second run occurred about a day later. I can't tell whether it used
a new image, but when I inspected the artifacts from the run, I did see
`grub-efi-arm64-bin 2.04-1ubuntu47.4`, which is the latest version and
the one that [1] tried to upgrade to.

So I guess we generally avoid this by refreshing the ADT base images.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-hwe-5.4 in Ubuntu.
https://bugs.launchpad.net/bugs/1991676

Title:
  Package grub-efi-arm64-signed 1.173.2~18.04.1+2.04-1ubuntu47.4 from
  bionic-proposed fails to install/upgrade (grub-install: error:
  efibootmgr: not found.)

Status in ubuntu-kernel-tests:
  New
Status in grub2-signed package in Ubuntu:
  Invalid
Status in linux-hwe-5.4 package in Ubuntu:
  Confirmed
Status in grub2-signed source package in Bionic:
  Triaged
Status in linux-hwe-5.4 source package in Bionic:
  Confirmed

Bug description:
  The ADT tests for arm64 kernels in Bionic are failing during the setup
  phase with the following errors:

  Setting up grub-efi-arm64-signed (1.173.2~18.04.1+2.04-1ubuntu47.4) ...
  Installing for arm64-efi platform.
  grub-install: error: efibootmgr: not found.
  dpkg: error processing package grub-efi-arm64-signed (--configure):
   installed grub-efi-arm64-signed package post-installation script subprocess returned error exit status 1
  Setting up libx11-6:arm64 (2:1.6.4-3ubuntu0.5) ...
  Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
  Processing triggers for libc-bin (2.27-3ubuntu1.6) ...
  Errors were encountered while processing:
   grub-efi-arm64-signed
  E: Sub-process /usr/bin/dpkg returned an error code (1)
  blame: 
  badpkg: testbed setup commands failed with status 100
  autopkgtest [15:12:03]: ERROR: erroneous package: testbed setup commands failed with status 100

  ADT test log:
  https://autopkgtest.ubuntu.com/results/autopkgtest-bionic/bionic/arm64/l/linux-hwe-5.4/20220930_151219_13ac3@/log.gz
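
The failure pattern above can be picked out of a log mechanically. What
follows is only a minimal sketch, assuming the log contains dpkg's summary
in the same shape as the excerpt in this description; the function name
failed_dpkg_packages is made up for illustration and is not part of
autopkgtest or dpkg.

```python
import re


def failed_dpkg_packages(log_text):
    """Return the package names dpkg lists after its
    'Errors were encountered while processing:' summary line."""
    failed = []
    collecting = False
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("Errors were encountered while processing:"):
            collecting = True
            continue
        if collecting:
            # dpkg prints one bare package name per line; anything else
            # (for example the "E: Sub-process ..." line) ends the list.
            if re.fullmatch(r"[a-z0-9][a-z0-9+.:-]*", line):
                failed.append(line)
            else:
                collecting = False
    return failed


# Excerpt copied from the bug description above.
sample = """\
Errors were encountered while processing:
 grub-efi-arm64-signed
E: Sub-process /usr/bin/dpkg returned an error code (1)
"""
print(failed_dpkg_packages(sample))  # ['grub-efi-arm64-signed']
```

Run against a full log.gz after decompression, this would show whether a
testbed setup failure was caused by the same package.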

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1991676/+subscriptions




[Kernel-packages] [Bug 1991676] Re: Package grub-efi-arm64-signed 1.173.2~18.04.1+2.04-1ubuntu47.4 from bionic-proposed fails to install/upgrade (grub-install: error: efibootmgr: not found.)

2022-11-10 Thread Francis Ginther
@juliank Hello, I see that you picked up
https://bugs.launchpad.net/ubuntu/+source/linux-hwe-5.4/+bug/1991676.
Just so you are aware, this is blocking most, if not all, kernel ADT
testing on bionic arm64.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-hwe-5.4 in Ubuntu.
https://bugs.launchpad.net/bugs/1991676

Title:
  Package grub-efi-arm64-signed 1.173.2~18.04.1+2.04-1ubuntu47.4 from
  bionic-proposed fails to install/upgrade (grub-install: error:
  efibootmgr: not found.)

Status in ubuntu-kernel-tests:
  New
Status in grub2-signed package in Ubuntu:
  Invalid
Status in linux-hwe-5.4 package in Ubuntu:
  Confirmed
Status in grub2-signed source package in Bionic:
  Triaged
Status in linux-hwe-5.4 source package in Bionic:
  Confirmed

Bug description:
  The ADT tests for arm64 kernels in Bionic are failing during the setup
  phase with the following errors:

  Setting up grub-efi-arm64-signed (1.173.2~18.04.1+2.04-1ubuntu47.4) ...
  Installing for arm64-efi platform.
  grub-install: error: efibootmgr: not found.
  dpkg: error processing package grub-efi-arm64-signed (--configure):
   installed grub-efi-arm64-signed package post-installation script subprocess returned error exit status 1
  Setting up libx11-6:arm64 (2:1.6.4-3ubuntu0.5) ...
  Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
  Processing triggers for libc-bin (2.27-3ubuntu1.6) ...
  Errors were encountered while processing:
   grub-efi-arm64-signed
  E: Sub-process /usr/bin/dpkg returned an error code (1)
  blame: 
  badpkg: testbed setup commands failed with status 100
  autopkgtest [15:12:03]: ERROR: erroneous package: testbed setup commands failed with status 100

  ADT test log:
  https://autopkgtest.ubuntu.com/results/autopkgtest-bionic/bionic/arm64/l/linux-hwe-5.4/20220930_151219_13ac3@/log.gz

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1991676/+subscriptions




[Kernel-packages] [Bug 1923114] Re: ubuntu_kernel_selftests: ./cpu-on-off-test.sh: line 94: echo: write error: Device or resource busy

2022-08-08 Thread Francis Ginther
** Tags added: 5.4

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure-4.15 in Ubuntu.
https://bugs.launchpad.net/bugs/1923114

Title:
  ubuntu_kernel_selftests: ./cpu-on-off-test.sh: line 94: echo: write
  error: Device or resource busy

Status in ubuntu-kernel-tests:
  In Progress
Status in linux-azure package in Ubuntu:
  New
Status in linux-azure-4.15 package in Ubuntu:
  New
Status in linux-azure source package in Trusty:
  New
Status in linux-azure-4.15 source package in Trusty:
  New
Status in linux-azure source package in Xenial:
  New
Status in linux-azure-4.15 source package in Xenial:
  New
Status in linux-azure source package in Bionic:
  New
Status in linux-azure-4.15 source package in Bionic:
  New
Status in linux-azure source package in Groovy:
  New
Status in linux-azure-4.15 source package in Groovy:
  New

Bug description:
  Test cpu-hotplug from ubuntu_kernel_selftests failed with
  bionic:linux-azure-4.15 running on a Basic A2 with 2 cores (besides
  other instance types):

  selftests: cpu-on-off-test.sh
  
  pid 28041's current affinity mask: 3
  pid 28041's new affinity mask: 1
  CPU online/offline summary:
  present_cpus = 0-1 present_max = 1
  Cpus in online state: 0-1
  Cpus in offline state: 0
  Limited scope test: one hotplug cpu
  (leaves cpu in the original state):
  online to offline to online: cpu 1
  not ok 1..1 selftests: cpu-on-off-test.sh [FAIL]
  ./cpu-on-off-test.sh: line 94: echo: write error: Device or resource busy
  offline_cpu_expect_success 1: unexpected fail

  http://10.246.72.46/4.15.0-1112.124~16.04.1-azure/xenial-linux-azure-azure-amd64-4.15.0-Basic_A2-ubuntu_kernel_selftests/ubuntu_kernel_selftests/results/ubuntu_kernel_selftests.cpu-hotplug/debug/ubuntu_kernel_selftests.cpu-hotplug.DEBUG.html

  The problem happens at
  "autotest-client-tests/ubuntu_kernel_selftests/cpu-on-off-test.sh" when
  executing:

  echo 0 > $SYSFS/devices/system/cpu/cpu$1/online
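
For anyone reproducing this outside the selftest, the same write can be
sketched in Python so that the errno is reported rather than only a shell
error. This is an illustration under assumptions, not the test's code: the
helper name set_cpu_online is hypothetical, and the default sysfs root of
/sys is assumed (the script itself uses $SYSFS). The demo runs against a
throwaway directory, so no real CPU is touched.

```python
import errno
import os
import tempfile


def set_cpu_online(cpu, online, sysfs_root="/sys"):
    """Write 1 or 0 to <sysfs_root>/devices/system/cpu/cpu<N>/online.

    Returns (True, None) on success or (False, errno name) on failure,
    e.g. (False, "EBUSY") for "Device or resource busy".
    """
    path = os.path.join(sysfs_root, "devices", "system", "cpu",
                        f"cpu{cpu}", "online")
    try:
        # The buffered write is flushed when the file is closed, still
        # inside this try block, so a rejected write is caught here.
        with open(path, "w") as f:
            f.write("1" if online else "0")
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    return True, None


# Dry run against a temporary directory that mimics the sysfs layout.
with tempfile.TemporaryDirectory() as fake_sysfs:
    cpu_dir = os.path.join(fake_sysfs, "devices", "system", "cpu", "cpu1")
    os.makedirs(cpu_dir)
    with open(os.path.join(cpu_dir, "online"), "w") as f:
        f.write("1")
    print(set_cpu_online(1, False, sysfs_root=fake_sysfs))
```

On an affected instance, calling it as root with the real sysfs root would
be expected to report the same failure the shell write does.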

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1923114/+subscriptions




[Kernel-packages] [Bug 1975509] Re: Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal, Impish, Jammy, and Kinetic

2022-06-14 Thread Francis Ginther
Testing of nvidia-fabricmanager-510 and libnvidia-nscq-510 has been
successfully performed again against the packages in -proposed. These
are good to release from a testing perspective.

** Tags removed: verification-needed verification-needed-bionic 
verification-needed-focal verification-needed-impish verification-needed-jammy
** Tags added: verification-done verification-done-bionic 
verification-done-focal verification-done-impish verification-done-jammy

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/1975509

Title:
  Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal,
  Impish, Jammy, and Kinetic

Status in fabric-manager-510 package in Ubuntu:
  Fix Committed
Status in libnvidia-nscq-510 package in Ubuntu:
  Fix Committed
Status in linux-restricted-modules package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-510-server package in Ubuntu:
  Fix Committed
Status in fabric-manager-510 source package in Bionic:
  Fix Committed
Status in libnvidia-nscq-510 source package in Bionic:
  Fix Committed
Status in linux-restricted-modules source package in Bionic:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Bionic:
  Fix Released
Status in fabric-manager-510 source package in Focal:
  Fix Committed
Status in libnvidia-nscq-510 source package in Focal:
  Fix Committed
Status in linux-restricted-modules source package in Focal:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Focal:
  Fix Released
Status in fabric-manager-510 source package in Impish:
  Fix Committed
Status in libnvidia-nscq-510 source package in Impish:
  Fix Committed
Status in linux-restricted-modules source package in Impish:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Impish:
  Fix Released
Status in fabric-manager-510 source package in Jammy:
  Fix Committed
Status in libnvidia-nscq-510 source package in Jammy:
  Fix Committed
Status in linux-restricted-modules source package in Jammy:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Jammy:
  Fix Released
Status in fabric-manager-510 source package in Kinetic:
  Fix Committed
Status in libnvidia-nscq-510 source package in Kinetic:
  Fix Committed
Status in linux-restricted-modules source package in Kinetic:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Kinetic:
  Fix Committed

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

  === 510 kinetic/jammy/impish/focal/bionic ===

* New upstream release (LP: #1975509):
  - When calculating the address of grid barrier allocated for a CUDA
    stream, there was an off-by-one error. The address calculation is
    corrected in this release.
  - An issue that caused an AC cycle test to fail with "AssertionError:
    NVLink links with inappropriate status found" is resolved.
  - An issue that caused NX 11 to become nonresponsive during a graphics
    operation is resolved.
  - Linking issues were observed when using libnvfm.so. Now and other
    depend tools use dynamic linking with libstdc++ and libgcc.
  - An intermittent error CUDA_ERROR_NVLINK_UNCORRECTABLE caused by some
    non-fatal nvlink interrupts is resolved.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/fabric-manager-510/+bug/1975509/+subscriptions




[Kernel-packages] [Bug 1978475] Re: Docker container ports cannot be allocated

2022-06-13 Thread Francis Ginther
Hello Sebastian,

I've been unable to reproduce this issue with the 5.13.0-1029-aws kernel
and the docker-compose example available from [1]. Are you able to
provide complete steps to reproduce?

[1] - https://docs.docker.com/compose/gettingstarted/

Thanks

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws-5.13 in Ubuntu.
https://bugs.launchpad.net/bugs/1978475

Title:
  Docker container ports cannot be allocated

Status in linux-aws-5.13 package in Ubuntu:
  Confirmed

Bug description:
  This is a follow-up bug to
  https://bugs.launchpad.net/ubuntu/+source/linux-aws-5.13/+bug/1977919

  I can confirm that the problem is indeed not fully fixed.
  @electricdaemon said:

  > Test kernel posted fixes crash but has another bug with unkillable
  stuck defunct docker-proxy service causing more issues. Bug is not
  solved. Tested on Linux AWS Lightsail instance.

  
  What I'm seeing is that docker-compose stacks either don't start at all
  or only start partially. In both cases the affected containers cannot
  start due to their host port being already allocated. I can say with
  absolute certainty that the ports on the host are dedicated to container
  applications and no other service is actually bound to the affected port
  numbers.

  # uname -a
  Linux ip-10-0-69-193 5.13.0-1029-aws #32~20.04.1-Ubuntu SMP Thu Jun 9 13:03:13 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

  # apt-cache policy docker containerd
  docker:
    Installed: (none)
    Candidate: 1.5-2
    Version table:
       1.5-2 500
          500 http://eu-central-1.ec2.archive.ubuntu.com/ubuntu focal/universe amd64 Packages
  containerd:
    Installed: (none)
    Candidate: 1.5.9-0ubuntu1~20.04.4
    Version table:
       1.5.9-0ubuntu1~20.04.4 500
          500 http://eu-central-1.ec2.archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
          500 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages
       1.3.3-0ubuntu2 500
          500 http://eu-central-1.ec2.archive.ubuntu.com/ubuntu focal/main amd64 Packages

  # docker-compose --version
  docker-compose version 1.29.2, build 5becea4c

  root@ip-10-0-69-193:/opt/myapp8/myappserv/int# docker-compose up -d
  Creating network "myappserv-int_default" with the default driver
  Creating myapp-migrator-int ... done
  Creating myapp-dealer-int   ...
  Creating myapp-offer-int...
  Creating myapp-customer-int ...
  Creating myapp-customer-int ... error
  Creating myapp-dealer-int   ... done
  Creating myapp-offer-int... done
  : port is already allocated

  ERROR: for customer  Cannot start service customer: driver failed programming external connectivity on endpoint myapp8-customer-int (fe4112364528b0e7d192c793929c579e8a81af715118c8f83ad7e65e7397f3be): Bind for 0.0.0.0:9001 failed: port is already allocated
  ERROR: Encountered errors while bringing up the project.

  root@ip-10-0-69-193:/opt/myapp8/myappserv/int# docker-compose down
  Stopping myapp8-offer-int  ... done
  Stopping myapp8-dealer-int ... done
  Removing myapp8-customer-int ... done
  Removing myapp8-offer-int... done
  Removing myapp8-dealer-int   ... done
  Removing myapp8-migrator-int ... done
  Removing network myappserv-int_default

  root@ip-10-0-69-193:/opt/myapp8/myappserv/int# docker-compose up -d
  Creating network "myappserv-int_default" with the default driver
  Creating myapp8-migrator-int ... done
  Creating myapp8-offer-int...
  Creating myapp8-customer-int ...
  Creating myapp8-customer-int ... error
  WARNING: Host is already in use by another container
  Creating myapp8-offer-int... done
  ERROR: for myapp8-customer-int  Cannot start service customer: driver failed programming external connectivity on endpoint myapp8-customer-int (72fc08854cd278e63cd3234e7fb03c08cb045efdcfb9e42075a1250d893645d5): Bind for 0.0.0.0:9001 failed
  Creating myapp8-dealer-int   ... done

  ERROR: for customer  Cannot start service customer: driver failed programming external connectivity on endpoint myapp8-customer-int (72fc08854cd278e63cd3234e7fb03c08cb045efdcfb9e42075a1250d893645d5): Bind for 0.0.0.0:9001 failed: port is already allocated
  ERROR: Encountered errors while bringing up the project.

  # docker-compose config

  services:
    customer:
      container_name: myapp8-customer-int
      depends_on:
        migrator:
          condition: service_completed_successfully
      image: reg.mydomain.tld/myapp8/customer:430d4ca
      ports:
      - published: 9001
        target: 9001
      restart: always
    dealer:
      container_name: myapp8-dealer-int
      depends_on:
        migrator:
          condition: service_completed_successfully
      image: reg.mydomain.tld/myapp8/dealer:430d4ca
      ports:
      - published: 9002
        target: 9002
      restart: always
    migrator:
      container_name: myapp8-migrator-int
      image: reg.mydomain.tld/myapp8/migrator:430d4ca
    offer:
      container_name: my

[Kernel-packages] [Bug 1977919] Re: Docker container creation causes kernel oops on linux-aws 5.13.0.1028.31~20.04.22

2022-06-10 Thread Francis Ginther
All of the updated 5.13 kernels have now made it to the archive and into
both the focal-updates and focal-security pockets. That list of kernels
is:

linux-aws-5.13 - 5.13.0-1029.32~20.04.1
linux-azure-5.13 - 5.13.0-1029.34~20.04.1
linux-gcp-5.13 - 5.13.0-1031.37~20.04.1
linux-oracle-5.13 - 5.13.0-1034.40~20.04.1
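
To check whether a running instance already has one of these kernels, the
ABI number in the `uname -r` output can be compared against the list above.
This is a rough sketch under assumptions: the helper name has_fix is
hypothetical, and the release string is assumed to look like
5.13.0-<abi>-<flavour>, which this bug only shows for the aws kernel
(5.13.0-1028-aws and 5.13.0-1029-aws).

```python
import re

# ABI numbers taken from the versions listed above
# (5.13.0-<abi>.<upload>~20.04.1).
FIXED_ABI = {"aws": 1029, "azure": 1029, "gcp": 1031, "oracle": 1034}


def has_fix(release):
    """release is `uname -r` output such as '5.13.0-1029-aws'.

    Returns True or False for the four 5.13 cloud kernels listed above,
    and None for any release string this sketch does not recognise.
    """
    m = re.fullmatch(r"5\.13\.0-(\d+)-([a-z]+)", release)
    if not m or m.group(2) not in FIXED_ABI:
        return None
    return int(m.group(1)) >= FIXED_ABI[m.group(2)]


print(has_fix("5.13.0-1028-aws"))  # False
print(has_fix("5.13.0-1029-aws"))  # True
```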

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws-5.13 in Ubuntu.
https://bugs.launchpad.net/bugs/1977919

Title:
  Docker container creation causes kernel oops on linux-aws
  5.13.0.1028.31~20.04.22

Status in linux-aws-5.13 package in Ubuntu:
  Confirmed
Status in linux-azure-5.13 package in Ubuntu:
  Confirmed
Status in linux-gcp-5.13 package in Ubuntu:
  Confirmed
Status in linux-intel-iotg-5.15 package in Ubuntu:
  Confirmed
Status in linux-oracle-5.13 package in Ubuntu:
  Confirmed
Status in linux-aws-5.13 source package in Focal:
  Fix Committed
Status in linux-azure-5.13 source package in Focal:
  Fix Committed
Status in linux-gcp-5.13 source package in Focal:
  Fix Committed
Status in linux-intel-iotg-5.15 source package in Focal:
  Won't Fix
Status in linux-oracle-5.13 source package in Focal:
  Fix Committed

Bug description:
  Running the attached script on the latest AWS AMI for Ubuntu 20.04, I
  get a kernel panic and hard reset of the node.

  [   12.314552] VFS: Close: file count is 0
  [   12.351090] [ cut here ]
  [   12.351093] kernel BUG at include/linux/fs.h:3104!
  [   12.355272] invalid opcode:  [#1] SMP PTI
  [   12.358963] CPU: 1 PID: 863 Comm: sed Not tainted 5.13.0-1028-aws #31~20.04.1-Ubuntu
  [   12.366241] Hardware name: Amazon EC2 m5.large/, BIOS 1.0 10/16/2017
  [   12.371130] RIP: 0010:__fput+0x247/0x250
  [   12.374897] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 88 02 00 e9 b5 fe ff ff <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 31 db 48
  [   12.389075] RSP: 0018:b50280d9fd88 EFLAGS: 00010246
  [   12.393425] RAX:  RBX: 000a801d RCX: 9152e0716000
  [   12.398679] RDX: 9152cf075280 RSI: 0001 RDI:
  [   12.403879] RBP: b50280d9fdb0 R08: 0001 R09: 9152dfcba2c8
  [   12.409102] R10: b50280d9fd88 R11: 9152d04e9d10 R12: 9152d04e9d00
  [   12.414333] R13: 9152dfcba2c8 R14: 9152cf0752a0 R15: 9152dfc2e180
  [   12.419533] FS:  () GS:9153ea90() knlGS:
  [   12.426937] CS:  0010 DS:  ES:  CR0: 80050033
  [   12.431506] CR2: 556cf30250a8 CR3: bce10006 CR4: 007706e0
  [   12.436716] DR0:  DR1:  DR2:
  [   12.441941] DR3:  DR6: fffe0ff0 DR7: 0400
  [   12.450355] Call Trace:
  [   12.453408]  
  [   12.456296]  fput+0xe/0x10
  [   12.459633]  task_work_run+0x70/0xb0
  [   12.463157]  do_exit+0x37b/0xaf0
  [   12.466570]  do_group_exit+0x43/0xb0
  [   12.470142]  __x64_sys_exit_group+0x18/0x20
  [   12.473989]  do_syscall_64+0x61/0xb0
  [   12.477565]  ? exit_to_user_mode_prepare+0x9b/0x1c0
  [   12.481734]  ? do_user_addr_fault+0x1d0/0x650
  [   12.485665]  ? irqentry_exit_to_user_mode+0x9/0x20
  [   12.489790]  ? irqentry_exit+0x19/0x30
  [   12.493443]  ? exc_page_fault+0x8f/0x170
  [   12.497199]  ? asm_exc_page_fault+0x8/0x30
  [   12.501013]  entry_SYSCALL_64_after_hwframe+0x44/0xae
  [   12.505289] RIP: 0033:0x7f80d42a1bd6
  [   12.508868] Code: Unable to access opcode bytes at RIP 0x7f80d42a1bac.
  [   12.513783] RSP: 002b:7ffe924f9ed8 EFLAGS: 0246 ORIG_RAX: 00e7
  [   12.520897] RAX: ffda RBX: 7f80d45a4740 RCX: 7f80d42a1bd6
  [   12.526115] RDX:  RSI: 003c RDI:
  [   12.531328] RBP:  R08: 00e7 R09: fe98
  [   12.536484] R10: 7f80d3d422a0 R11: 0246 R12: 7f80d45a4740
  [   12.541687] R13: 0002 R14: 7f80d45ad708 R15:
  [   12.546916]
  [   12.549829] Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc aufs overlay nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua crct10dif_pclmul ppdev crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd psmouse cryptd parport_pc input_leds parport ena serio_raw sch_fq_codel ipmi_devintf ipmi_msghandler msr drm ip_tables x_tables autofs4
  [   12.583913] ---[ end trace 77367fed4d782aa4 ]---
  [   12.587963] RIP: 0010:__fput+0x247/0x250
  [   12.591729] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 8

[Kernel-packages] [Bug 1977919] Re: Docker container creation causes kernel oops on linux-aws 5.13.0.1028.31~20.04.22

2022-06-09 Thread Francis Ginther
Updated kernels are in flight. The updated kernel packages and versions
are:

linux-aws-5.13    - 5.13.0-1029.32~20.04.1
linux-azure-5.13  - 5.13.0-1029.34~20.04.1
linux-gcp-5.13    - 5.13.0-1031.37~20.04.1
linux-oracle-5.13 - 5.13.0-1034.40~20.04.1

The azure and gcp kernels are already in focal-updates. The aws kernel
is in focal-proposed and the oracle kernel should be there very soon.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws-5.13 in Ubuntu.
https://bugs.launchpad.net/bugs/1977919

Title:
  Docker container creation causes kernel oops on linux-aws
  5.13.0.1028.31~20.04.22

Status in linux-aws-5.13 package in Ubuntu:
  Confirmed
Status in linux-azure-5.13 package in Ubuntu:
  Confirmed
Status in linux-gcp-5.13 package in Ubuntu:
  Confirmed
Status in linux-intel-iotg-5.15 package in Ubuntu:
  Confirmed
Status in linux-oracle-5.13 package in Ubuntu:
  Confirmed
Status in linux-aws-5.13 source package in Focal:
  Fix Committed
Status in linux-azure-5.13 source package in Focal:
  Fix Committed
Status in linux-gcp-5.13 source package in Focal:
  Fix Committed
Status in linux-intel-iotg-5.15 source package in Focal:
  Won't Fix
Status in linux-oracle-5.13 source package in Focal:
  Fix Committed

Bug description:
  Running the attached script on the latest AWS AMI for Ubuntu 20.04, I
  get a kernel panic and hard reset of the node.

  [   12.314552] VFS: Close: file count is 0
  [   12.351090] [ cut here ]
  [   12.351093] kernel BUG at include/linux/fs.h:3104!
  [   12.355272] invalid opcode:  [#1] SMP PTI
  [   12.358963] CPU: 1 PID: 863 Comm: sed Not tainted 5.13.0-1028-aws #31~20.04.1-Ubuntu
  [   12.366241] Hardware name: Amazon EC2 m5.large/, BIOS 1.0 10/16/2017
  [   12.371130] RIP: 0010:__fput+0x247/0x250
  [   12.374897] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 88 02 00 e9 b5 fe ff ff <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 31 db 48
  [   12.389075] RSP: 0018:b50280d9fd88 EFLAGS: 00010246
  [   12.393425] RAX:  RBX: 000a801d RCX: 9152e0716000
  [   12.398679] RDX: 9152cf075280 RSI: 0001 RDI:
  [   12.403879] RBP: b50280d9fdb0 R08: 0001 R09: 9152dfcba2c8
  [   12.409102] R10: b50280d9fd88 R11: 9152d04e9d10 R12: 9152d04e9d00
  [   12.414333] R13: 9152dfcba2c8 R14: 9152cf0752a0 R15: 9152dfc2e180
  [   12.419533] FS:  () GS:9153ea90() knlGS:
  [   12.426937] CS:  0010 DS:  ES:  CR0: 80050033
  [   12.431506] CR2: 556cf30250a8 CR3: bce10006 CR4: 007706e0
  [   12.436716] DR0:  DR1:  DR2:
  [   12.441941] DR3:  DR6: fffe0ff0 DR7: 0400
  [   12.450355] Call Trace:
  [   12.453408]  
  [   12.456296]  fput+0xe/0x10
  [   12.459633]  task_work_run+0x70/0xb0
  [   12.463157]  do_exit+0x37b/0xaf0
  [   12.466570]  do_group_exit+0x43/0xb0
  [   12.470142]  __x64_sys_exit_group+0x18/0x20
  [   12.473989]  do_syscall_64+0x61/0xb0
  [   12.477565]  ? exit_to_user_mode_prepare+0x9b/0x1c0
  [   12.481734]  ? do_user_addr_fault+0x1d0/0x650
  [   12.485665]  ? irqentry_exit_to_user_mode+0x9/0x20
  [   12.489790]  ? irqentry_exit+0x19/0x30
  [   12.493443]  ? exc_page_fault+0x8f/0x170
  [   12.497199]  ? asm_exc_page_fault+0x8/0x30
  [   12.501013]  entry_SYSCALL_64_after_hwframe+0x44/0xae
  [   12.505289] RIP: 0033:0x7f80d42a1bd6
  [   12.508868] Code: Unable to access opcode bytes at RIP 0x7f80d42a1bac.
  [   12.513783] RSP: 002b:7ffe924f9ed8 EFLAGS: 0246 ORIG_RAX: 00e7
  [   12.520897] RAX: ffda RBX: 7f80d45a4740 RCX: 7f80d42a1bd6
  [   12.526115] RDX:  RSI: 003c RDI:
  [   12.531328] RBP:  R08: 00e7 R09: fe98
  [   12.536484] R10: 7f80d3d422a0 R11: 0246 R12: 7f80d45a4740
  [   12.541687] R13: 0002 R14: 7f80d45ad708 R15:
  [   12.546916]
  [   12.549829] Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc aufs overlay nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua crct10dif_pclmul ppdev crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd psmouse cryptd parport_pc input_leds parport ena serio_raw sch_fq_codel ipmi_devintf ipmi_msghandler msr drm ip_tables x_tables autofs4
  [   12.583913] ---[ end trace 77367fed4d782aa4 ]---
  [   12.587963] RIP: 0010:__fput+0x247/0x250
  [   12.591729] Code: 00 48 85 ff 0f 84 8b fe ff 

[Kernel-packages] [Bug 1975509] Re: Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal, Impish, Jammy, and Kinetic

2022-06-09 Thread Francis Ginther
The fabric-manager-510 and libnvidia-nscq-510 packages were tested across
all series on an A100 system. All series passed the standard CUDA tests.
The packages tested were from
https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+packages?field.name_filter=-510&field.status_filter=published&field.series_filter=

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/1975509

Title:
  Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal,
  Impish, Jammy, and Kinetic

Status in fabric-manager-510 package in Ubuntu:
  New
Status in libnvidia-nscq-510 package in Ubuntu:
  New
Status in linux-restricted-modules package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-510-server package in Ubuntu:
  Fix Committed
Status in fabric-manager-510 source package in Bionic:
  New
Status in libnvidia-nscq-510 source package in Bionic:
  New
Status in linux-restricted-modules source package in Bionic:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Bionic:
  Fix Released
Status in fabric-manager-510 source package in Focal:
  New
Status in libnvidia-nscq-510 source package in Focal:
  New
Status in linux-restricted-modules source package in Focal:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Focal:
  Fix Released
Status in fabric-manager-510 source package in Impish:
  New
Status in libnvidia-nscq-510 source package in Impish:
  New
Status in linux-restricted-modules source package in Impish:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Impish:
  Fix Released
Status in fabric-manager-510 source package in Jammy:
  New
Status in libnvidia-nscq-510 source package in Jammy:
  New
Status in linux-restricted-modules source package in Jammy:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Jammy:
  Fix Released
Status in fabric-manager-510 source package in Kinetic:
  New
Status in libnvidia-nscq-510 source package in Kinetic:
  New
Status in linux-restricted-modules source package in Kinetic:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Kinetic:
  Fix Committed

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

  === 510 kinetic/jammy/impish/focal/bionic ===

  * New upstream release (LP: #1975509):
    - When calculating the address of grid barrier allocated for a CUDA
      stream, there was an off-by-one error. The address calculation is
      corrected in this release.
    - An issue that caused an AC cycle test to fail with "AssertionError:
      NVLink links with inappropriate status found" is resolved.
    - An issue that caused NX 11 to become nonresponsive during a graphics
      operation is resolved.
    - Linking issues were observed when using libnvfm.so. Now and other
      depend tools use dynamic linking with libstdc++ and libgcc.
    - An intermittent error CUDA_ERROR_NVLINK_UNCORRECTABLE caused by some
      non-fatal nvlink interrupts is resolved.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/fabric-manager-510/+bug/1975509/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1977919] Re: Docker container creation causes kernel oops on linux-aws 5.13.0.1028.31~20.04.22

2022-06-08 Thread Francis Ginther
Work on this issue continues. We have identified the following impacted
kernels and versions:

 focal linux-aws-5.13 5.13.0-1028.31~20.04.1
 focal linux-azure-5.13 5.13.0-1028.33~20.04.1
 focal linux-gcp-5.13 5.13.0-1030.36~20.04.1
 focal linux-oracle-5.13 5.13.0-1033.39~20.04.1
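
For anyone triaging reports, here is a minimal sketch of checking a
machine against the list above. It assumes the /proc/version_signature
format seen in apport output (for example
"Ubuntu 5.13.0-1023.27~20.04.1-azure 5.13.19"). The IMPACTED table and
the is_impacted() helper are hypothetical names used only for
illustration, and only the exact versions listed above are treated as
impacted; nothing is implied about older or newer builds.

```python
# Impacted builds, copied from the list above (focal only).
# Key: source package name, value: package version.
IMPACTED = {
    "linux-aws-5.13": "5.13.0-1028.31~20.04.1",
    "linux-azure-5.13": "5.13.0-1028.33~20.04.1",
    "linux-gcp-5.13": "5.13.0-1030.36~20.04.1",
    "linux-oracle-5.13": "5.13.0-1033.39~20.04.1",
}


def is_impacted(version_signature: str) -> bool:
    """Return True if the signature line names one of the listed builds.

    Assumed input shape: "Ubuntu <version>-<flavour> <upstream version>".
    """
    fields = version_signature.split()
    if len(fields) < 2:
        return False
    # Split "5.13.0-1028.31~20.04.1-aws" into version and flavour.
    version, _, flavour = fields[1].rpartition("-")
    return IMPACTED.get(f"linux-{flavour}-5.13") == version
```

On a running system the input would typically come from reading
/proc/version_signature.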

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1977919

Title:
  Docker container creation causes kernel oops on linux-aws
  5.13.0.1028.31~20.04.22

Status in linux-aws package in Ubuntu:
  Confirmed
Status in linux-gcp package in Ubuntu:
  Confirmed

Bug description:
  Running the attached script on the latest AWS AMI for Ubuntu 20.04, I
  get a kernel panic and hard reset of the node.

  [   12.314552] VFS: Close: file count is 0
  [   12.351090] [ cut here ]
  [   12.351093] kernel BUG at include/linux/fs.h:3104!
  [   12.355272] invalid opcode:  [#1] SMP PTI
  [   12.358963] CPU: 1 PID: 863 Comm: sed Not tainted 5.13.0-1028-aws 
#31~20.04.1-Ubuntu
  [   12.366241] Hardware name: Amazon EC2 m5.large/, BIOS 1.0 10/16/2017
  [   12.371130] RIP: 0010:__fput+0x247/0x250
  [   12.374897] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff 
e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 88 02 00 e9 b5 fe ff ff <0f> 0b 0f 
1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 31 db 48
  [   12.389075] RSP: 0018:b50280d9fd88 EFLAGS: 00010246
  [   12.393425] RAX:  RBX: 000a801d RCX: 
9152e0716000
  [   12.398679] RDX: 9152cf075280 RSI: 0001 RDI: 

  [   12.403879] RBP: b50280d9fdb0 R08: 0001 R09: 
9152dfcba2c8
  [   12.409102] R10: b50280d9fd88 R11: 9152d04e9d10 R12: 
9152d04e9d00
  [   12.414333] R13: 9152dfcba2c8 R14: 9152cf0752a0 R15: 
9152dfc2e180
  [   12.419533] FS:  () GS:9153ea90() 
knlGS:
  [   12.426937] CS:  0010 DS:  ES:  CR0: 80050033
  [   12.431506] CR2: 556cf30250a8 CR3: bce10006 CR4: 
007706e0
  [   12.436716] DR0:  DR1:  DR2: 

  [   12.441941] DR3:  DR6: fffe0ff0 DR7: 
0400
  [   12.447170] PKRU: 5554
  [   12.450355] Call Trace:
  [   12.453408]  
  [   12.456296]  fput+0xe/0x10
  [   12.459633]  task_work_run+0x70/0xb0
  [   12.463157]  do_exit+0x37b/0xaf0
  [   12.466570]  do_group_exit+0x43/0xb0
  [   12.470142]  __x64_sys_exit_group+0x18/0x20
  [   12.473989]  do_syscall_64+0x61/0xb0
  [   12.477565]  ? exit_to_user_mode_prepare+0x9b/0x1c0
  [   12.481734]  ? do_user_addr_fault+0x1d0/0x650
  [   12.485665]  ? irqentry_exit_to_user_mode+0x9/0x20
  [   12.489790]  ? irqentry_exit+0x19/0x30
  [   12.493443]  ? exc_page_fault+0x8f/0x170
  [   12.497199]  ? asm_exc_page_fault+0x8/0x30
  [   12.501013]  entry_SYSCALL_64_after_hwframe+0x44/0xae
  [   12.505289] RIP: 0033:0x7f80d42a1bd6
  [   12.508868] Code: Unable to access opcode bytes at RIP 0x7f80d42a1bac.
  [   12.513783] RSP: 002b:7ffe924f9ed8 EFLAGS: 0246 ORIG_RAX: 
00e7
  [   12.520897] RAX: ffda RBX: 7f80d45a4740 RCX: 
7f80d42a1bd6
  [   12.526115] RDX:  RSI: 003c RDI: 

  [   12.531328] RBP:  R08: 00e7 R09: 
fe98
  [   12.536484] R10: 7f80d3d422a0 R11: 0246 R12: 
7f80d45a4740
  [   12.541687] R13: 0002 R14: 7f80d45ad708 R15: 

  [   12.546916]  
  [   12.549829] Modules linked in: xt_conntrack xt_MASQUERADE 
nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter 
iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c 
bpfilter br_netfilter bridge stp llc aufs overlay nls_iso8859_1 dm_multipath 
scsi_dh_rdac scsi_dh_emc scsi_dh_alua crct10dif_pclmul ppdev crc32_pclmul 
ghash_clmulni_intel aesni_intel crypto_simd psmouse cryptd parport_pc 
input_leds parport ena serio_raw sch_fq_codel ipmi_devintf ipmi_msghandler msr 
drm ip_tables x_tables autofs4
  [   12.583913] ---[ end trace 77367fed4d782aa4 ]---
  [   12.587963] RIP: 0010:__fput+0x247/0x250
  [   12.591729] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff 
e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 88 02 00 e9 b5 fe ff ff <0f> 0b 0f 
1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 31 db 48
  [   12.605796] RSP: 0018:b50280d9fd88 EFLAGS: 00010246
  [   12.610166] RAX:  RBX: 000a801d RCX: 
9152e0716000
  [   12.615417] RDX: 9152cf075280 RSI: 0001 RDI: 

  [   12.620635] RBP: b50280d9fdb0 R08: 0001 R09: 
9152dfcba2c8
  [   12.625878] R10: b50280d9fd88 R11: 9152d04e9d10 R12: 
9152d04e9d00
  [   12.631121] R13: 9152dfcba2c8 R14: 9152cf0752a0 R15: 
9152dfc2
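
The call trace in an oops such as the one above can be pulled out of a
console log mechanically, which helps when comparing reports from the
different cloud kernels. The following is a rough sketch only: the
call_trace() helper and its regular expression are assumptions based on
the line format shown in this report ("[ timestamp ] symbol+0xoff/0xlen",
with a leading "?" marking frames the kernel considers unreliable), not
part of any existing tool.

```python
import re

# Matches lines such as "[   12.456296]  fput+0xe/0x10" and
# "[   12.477565]  ? exit_to_user_mode_prepare+0x9b/0x1c0".
FRAME = re.compile(
    r"\[\s*\d+\.\d+\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def call_trace(log: str):
    """Return (symbol, reliable) pairs from the first call trace in log."""
    frames = []
    in_trace = False
    for line in log.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME.search(line)
        if match:
            # group(1) is the optional "? " marker for unreliable frames.
            frames.append((match.group(2), match.group(1) is None))
        elif re.search(r"\bRIP:", line):
            # The user-space register dump follows the trace; stop there.
            break
    return frames
```

Applied to the log in this report, the first reliable frame returned
would be fput, followed by task_work_run and do_exit.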

[Kernel-packages] [Bug 1973034] Re: linux generic fails to boot on azure arm64 instance types

2022-05-12 Thread Francis Ginther
** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1973034

Title:
  linux generic fails to boot on azure arm64 instance types

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Azure now has arm64 instances in a preview, for example
  Standard_D2pds_v5. These work with the b/linux-azure and f/linux-azure
  kernels, but fail to boot with linux-generic.

  Looks like a storage device issue (from serial console):
  Begin: Running /scripts/init-premount ... done.
  Begin: Mounting root file system ... Begin: Running /scripts/local-top ... 
done.
  Begin: Running /scripts/local-premount ... [4.651830] Btrfs loaded, 
crc32c=crc32c-generic
  [4.651830] Btrfs loaded, crc32c=crc32c-generic
  Scanning for Btrfs filesystems
  done.
  Begin: Waiting for root file system ... Begin: Running /scripts/local-block 
... mdadm: No devices listed in conf file were found.
  done.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: error opening /dev/md?*: No such file or directory
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  done.
  Gave up waiting for root file system device.  Common problems:
   - Boot args (cat /proc/cmdline)
     - Check rootdelay= (did the system wait long enough?)
   - Missing modules (cat /proc/modules; ls /dev)
  ALERT!  UUID=b9c04583-2c65-4891-aa8e-a494eb94fd14 does not exist.  Dropping
  to a shell!
  --- 
  ProblemType: Bug
  AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 
2: ls: cannot access '/dev/snd/': No such file or directory
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.11-0ubuntu27.23
  Architecture: arm64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 
not found.
  CasperMD5CheckResult: skip
  DistroRelease: Ubuntu 20.04
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lspci-vt:
   -+-[3c75:00]---02.0  Mellanox Technologies MT27800 Family [ConnectX-5 
Virtual Function]
\-[:00]-
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  Lsusb-t:
   
  Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
  MachineType: Microsoft Corporation Virtual Machine
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=screen
   PATH=(custom, no user)
   LANG=C.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 hyperv_fb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.13.0-1023-azure 
root=PARTUUID=379f8683-bd7e-48a0-8460-5c7f68b4b091 ro console=tty1 
console=ttyAMA0 earlycon=pl011,0xeffec000 initcall_blacklist=arm_pmu_acpi_init 
panic=-1
  ProcVersionSignature: Ubuntu 5.13.0-1023.27~20.04.1-azure 5.13.19
  RelatedPackageVersions:
   linux-restricted-modules-5.13.0-1023-azure N/A
   linux-backports-modules-5.13.0-1023-azure  N/A
   linux-firmware 1.187.30
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  Tags:  focal uec-images
  Uname: Linux 5.13.0-1023-azure aarch64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: N/A
  _MarkForUpload: True
  dmi.bios.date: 02/07/2022
  dmi.bios.release: 4.1
  dmi.bios.vendor: Microsoft Corporation
  dmi.bios.version: Hyper-V UEFI Release v4.1
  dmi.board.asset.tag: None
  dmi.board.name: Virtual Machine
  dmi.board.vendor: Microsof

[Kernel-packages] [Bug 1973034] Re: linux generic fails to boot on azure arm64 instance types

2022-05-11 Thread Francis Ginther
Artifacts were collected from a new VM running focal/linux-azure just
prior to rebooting to linux-generic (which gets stuck at initramfs).
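
When comparing the collected artifacts against the serial console
output, it can help to split the ProcKernelCmdLine field into its
parameters, for instance to see which root= specifier the working
linux-azure boot used. This is only a sketch: parse_cmdline() is a
hypothetical helper, and it splits on whitespace, so it does not handle
quoted parameter values that contain spaces.

```python
def parse_cmdline(cmdline: str) -> dict:
    """Map each kernel parameter to the list of values it was given.

    Flags without "=" (such as "ro") map to [None]. Parameters that
    appear more than once (such as console=) keep every value in order.
    """
    params = {}
    for token in cmdline.split():
        # Split on the first "=" only, so root=PARTUUID=... stays intact.
        key, sep, value = token.partition("=")
        params.setdefault(key, []).append(value if sep else None)
    return params


# Command line taken from the ProcKernelCmdLine field in this bug.
cmdline = (
    "BOOT_IMAGE=/boot/vmlinuz-5.13.0-1023-azure "
    "root=PARTUUID=379f8683-bd7e-48a0-8460-5c7f68b4b091 ro console=tty1 "
    "console=ttyAMA0 earlycon=pl011,0xeffec000 "
    "initcall_blacklist=arm_pmu_acpi_init panic=-1"
)
params = parse_cmdline(cmdline)
```

Here params["root"] holds the PARTUUID specifier from the linux-azure
boot, whereas the ALERT line on the failing linux-generic boot refers to
a filesystem UUID.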

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1973034

Title:
  linux generic fails to boot on azure arm64 instance types

Status in linux package in Ubuntu:
  Incomplete


[Kernel-packages] [Bug 1973034] UdevDb.txt

2022-05-11 Thread Francis Ginther
apport information

** Attachment added: "UdevDb.txt"
   https://bugs.launchpad.net/bugs/1973034/+attachment/5588652/+files/UdevDb.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1973034

Title:
  linux generic fails to boot on azure arm64 instance types

Status in linux package in Ubuntu:
  Incomplete


[Kernel-packages] [Bug 1973034] ProcInterrupts.txt

2022-05-11 Thread Francis Ginther
apport information

** Attachment added: "ProcInterrupts.txt"
   https://bugs.launchpad.net/bugs/1973034/+attachment/5588650/+files/ProcInterrupts.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1973034

Title:
  linux generic fails to boot on azure arm64 instance types

Status in linux package in Ubuntu:
  Incomplete


[Kernel-packages] [Bug 1973034] acpidump.txt

2022-05-11 Thread Francis Ginther
apport information

** Attachment added: "acpidump.txt"
   https://bugs.launchpad.net/bugs/1973034/+attachment/5588654/+files/acpidump.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1973034

Title:
  linux generic fails to boot on azure arm64 instance types

Status in linux package in Ubuntu:
  Incomplete


[Kernel-packages] [Bug 1973034] WifiSyslog.txt

2022-05-11 Thread Francis Ginther
apport information

** Attachment added: "WifiSyslog.txt"
   
https://bugs.launchpad.net/bugs/1973034/+attachment/5588653/+files/WifiSyslog.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1973034

Title:
  linux generic fails to boot on azure arm64 instance types

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Azure now has arm64 instances in a preview, for example
  Standard_D2pds_v5. These work with the b/linux-azure and f/linux-azure
  kernels, but fail to boot with linux-generic.

  Looks like a storage device issue (from serial console):
  Begin: Running /scripts/init-premount ... done.
  Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
  Begin: Running /scripts/local-premount ... [4.651830] Btrfs loaded, crc32c=crc32c-generic
  [4.651830] Btrfs loaded, crc32c=crc32c-generic
  Scanning for Btrfs filesystems
  done.
  Begin: Waiting for root file system ... Begin: Running /scripts/local-block ... mdadm: No devices listed in conf file were found.
  done.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: error opening /dev/md?*: No such file or directory
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  done.
  Gave up waiting for root file system device.  Common problems:
   - Boot args (cat /proc/cmdline)
     - Check rootdelay= (did the system wait long enough?)
   - Missing modules (cat /proc/modules; ls /dev)
  ALERT!  UUID=b9c04583-2c65-4891-aa8e-a494eb94fd14 does not exist.  Dropping to a shell!
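
The first hint in the list above points at the boot arguments. As a rough
illustration only, the root= specifier can be pulled out of a kernel command
line and compared with the identifier named in the ALERT line. The helper
names parse_cmdline and root_spec are hypothetical and not part of any Ubuntu
tooling; the sample string is the ProcKernelCmdLine value recorded in this
report's apport data.

```python
from typing import Dict, Optional, Tuple


def parse_cmdline(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into key/value pairs.

    Flags without '=' (such as 'ro') map to None. If a key repeats
    (such as 'console='), the last value wins in this simple sketch.
    """
    params: Dict[str, Optional[str]] = {}
    for token in cmdline.split():
        # Split on the first '=' only, so 'root=PARTUUID=...' keeps
        # 'PARTUUID=...' intact as the value.
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def root_spec(cmdline: str) -> Optional[Tuple[str, str]]:
    """Return (kind, identifier) for root=, or None if root= is absent."""
    root = parse_cmdline(cmdline).get("root")
    if not root:
        return None
    kind, sep, ident = root.partition("=")
    # A plain device path such as /dev/sda1 has no KIND= prefix.
    return (kind, ident) if sep else ("PATH", root)


# ProcKernelCmdLine as recorded in the apport data of this report.
sample = (
    "BOOT_IMAGE=/boot/vmlinuz-5.13.0-1023-azure "
    "root=PARTUUID=379f8683-bd7e-48a0-8460-5c7f68b4b091 ro console=tty1 "
    "console=ttyAMA0 earlycon=pl011,0xeffec000 "
    "initcall_blacklist=arm_pmu_acpi_init panic=-1"
)

print(root_spec(sample))
```

Comparing the returned pair with the identifier in the ALERT line is one quick
way to see whether both boots were given the same root specifier.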
  --- 
  ProblemType: Bug
  AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access '/dev/snd/': No such file or directory
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.11-0ubuntu27.23
  Architecture: arm64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
  CasperMD5CheckResult: skip
  DistroRelease: Ubuntu 20.04
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lspci-vt:
   -+-[3c75:00]---02.0  Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
    \-[:00]-
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  Lsusb-t:
   
  Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
  MachineType: Microsoft Corporation Virtual Machine
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=screen
   PATH=(custom, no user)
   LANG=C.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 hyperv_fb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.13.0-1023-azure root=PARTUUID=379f8683-bd7e-48a0-8460-5c7f68b4b091 ro console=tty1 console=ttyAMA0 earlycon=pl011,0xeffec000 initcall_blacklist=arm_pmu_acpi_init panic=-1
  ProcVersionSignature: Ubuntu 5.13.0-1023.27~20.04.1-azure 5.13.19
  RelatedPackageVersions:
   linux-restricted-modules-5.13.0-1023-azure N/A
   linux-backports-modules-5.13.0-1023-azure  N/A
   linux-firmware 1.187.30
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  Tags:  focal uec-images
  Uname: Linux 5.13.0-1023-azure aarch64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: N/A
  _MarkForUpload: True
  dmi.bios.date: 02/07/2022
  dmi.bios.release: 4.1
  dmi.bios.vendor: Microsoft Corporation
  dmi.bios.version: Hyper-V UEFI Release v4.1
  dmi.board.as

[Kernel-packages] [Bug 1973034] ProcCpuinfoMinimal.txt

2022-05-11 Thread Francis Ginther
apport information

** Attachment added: "ProcCpuinfoMinimal.txt"
   
https://bugs.launchpad.net/bugs/1973034/+attachment/5588649/+files/ProcCpuinfoMinimal.txt

[Kernel-packages] [Bug 1973034] ProcModules.txt

2022-05-11 Thread Francis Ginther
apport information

** Attachment added: "ProcModules.txt"
   
https://bugs.launchpad.net/bugs/1973034/+attachment/5588651/+files/ProcModules.txt

[Kernel-packages] [Bug 1973034] CurrentDmesg.txt

2022-05-11 Thread Francis Ginther
apport information

** Attachment added: "CurrentDmesg.txt"
   
https://bugs.launchpad.net/bugs/1973034/+attachment/5588646/+files/CurrentDmesg.txt

[Kernel-packages] [Bug 1973034] ProcCpuinfo.txt

2022-05-11 Thread Francis Ginther
apport information

** Attachment added: "ProcCpuinfo.txt"
   
https://bugs.launchpad.net/bugs/1973034/+attachment/5588648/+files/ProcCpuinfo.txt

[Kernel-packages] [Bug 1973034] Lspci.txt

2022-05-11 Thread Francis Ginther
apport information

** Attachment added: "Lspci.txt"
   https://bugs.launchpad.net/bugs/1973034/+attachment/5588647/+files/Lspci.txt

[Kernel-packages] [Bug 1973034] Re: linux generic fails to boot on azure arm64 instance types

2022-05-11 Thread Francis Ginther
apport information

** Tags added: apport-collected focal uec-images

** Description changed:

  Azure now has arm64 instances in a preview, for example
  Standard_D2pds_v5. These work with the b/linux-azure and f/linux-azure
  kernels, but fail to boot with linux-generic.
  
  Looks like a storage device issue (from serial console):
  Begin: Running /scripts/init-premount ... done.
  Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
  Begin: Running /scripts/local-premount ... [4.651830] Btrfs loaded, crc32c=crc32c-generic
  [4.651830] Btrfs loaded, crc32c=crc32c-generic
  Scanning for Btrfs filesystems
  done.
  Begin: Waiting for root file system ... Begin: Running /scripts/local-block ... mdadm: No devices listed in conf file were found.
  done.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: error opening /dev/md?*: No such file or directory
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  done.
  Gave up waiting for root file system device.  Common problems:
   - Boot args (cat /proc/cmdline)
     - Check rootdelay= (did the system wait long enough?)
   - Missing modules (cat /proc/modules; ls /dev)
  ALERT!  UUID=b9c04583-2c65-4891-aa8e-a494eb94fd14 does not exist.  Dropping 
to a shell!
+ --- 
+ ProblemType: Bug
+ AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 
2: ls: cannot access '/dev/snd/': No such file or directory
+ AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
+ ApportVersion: 2.20.11-0ubuntu27.23
+ Architecture: arm64
+ ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
+ CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 
not found.
+ CasperMD5CheckResult: skip
+ DistroRelease: Ubuntu 20.04
+ IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
+ Lspci-vt:
+  -+-[3c75:00]---02.0  Mellanox Technologies MT27800 Family [ConnectX-5 
Virtual Function]
+   \-[:00]-
+ Lsusb: Error: command ['lsusb'] failed with exit code 1:
+ Lsusb-t:
+  
+ Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
+ MachineType: Microsoft Corporation Virtual Machine
+ Package: linux (not installed)
+ PciMultimedia:
+  
+ ProcEnviron:
+  TERM=screen
+  PATH=(custom, no user)
+  LANG=C.UTF-8
+  SHELL=/bin/bash
+ ProcFB: 0 hyperv_fb
+ ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.13.0-1023-azure 
root=PARTUUID=379f8683-bd7e-48a0-8460-5c7f68b4b091 ro console=tty1 
console=ttyAMA0 earlycon=pl011,0xeffec000 initcall_blacklist=arm_pmu_acpi_init 
panic=-1
+ ProcVersionSignature: Ubuntu 5.13.0-1023.27~20.04.1-azure 5.13.19
+ RelatedPackageVersions:
+  linux-restricted-modules-5.13.0-1023-azure N/A
+  linux-backports-modules-5.13.0-1023-azure  N/A
+  linux-firmware 1.187.30
+ RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
+ Tags:  focal uec-images
+ Uname: Linux 5.13.0-1023-azure aarch64
+ UpgradeStatus: No upgrade log present (probably fresh install)
+ UserGroups: N/A
+ _MarkForUpload: True
+ dmi.bios.date: 02/07/2022
+ dmi.bios.release: 4.1
+ dmi.bios.vendor: Microsoft Corporation
+ dmi.bios.version: Hyper-V UEFI Release v4.1
+ dmi.board.asset.tag: None
+ dmi.board.name: Virtual Machine
+ dmi.board.vendor: Microsoft Corporation
+ dmi.board.version: Hyper-V UEFI Release v4.1
+ dmi.chassis.asset.tag: 7783-7084-3265-9085-8269-3286-77
+ dmi.chassis.type: 3
+ dmi.chassis.vendor: Microsoft Corporation
+ dmi.chassis.version: Hyper-V UEFI Release v4.1
+ dmi.modalias: 
dmi:bvnMicrosoftCorp

[Kernel-packages] [Bug 1973034] [NEW] linux generic fails to boot on azure arm64 instance types

2022-05-11 Thread Francis Ginther
Public bug reported:

Azure now has arm64 instances in a preview, for example
Standard_D2pds_v5. These work with the b/linux-azure and f/linux-azure
kernels, but fail to boot with linux-generic.

Looks like a storage device issue (from serial console):
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... [4.651830] Btrfs loaded, 
crc32c=crc32c-generic
[4.651830] Btrfs loaded, crc32c=crc32c-generic
Scanning for Btrfs filesystems
done.
Begin: Waiting for root file system ... Begin: Running /scripts/local-block ... 
mdadm: No devices listed in conf file were found.
done.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: error opening /dev/md?*: No such file or directory
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
mdadm: No devices listed in conf file were found.
done.
Gave up waiting for root file system device.  Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT!  UUID=b9c04583-2c65-4891-aa8e-a494eb94fd14 does not exist.  Dropping to 
a shell!
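
When triaging a serial console capture like the one above, a small script can pull out the key facts. The following is only a sketch, not something used in the original report: the function name and the returned keys are hypothetical, and the matched strings are copied from the log shown above.

```python
import re

# Message printed by the initramfs when the root device never appears.
GAVE_UP = "Gave up waiting for root file system device"
# Matches e.g. "ALERT!  UUID=... does not exist." and captures the device spec.
ALERT_RE = re.compile(r"ALERT!\s+(\S+)\s+does not exist")
MDADM_MISS = "mdadm: No devices listed in conf file were found."


def summarize_boot_failure(console_text):
    """Return a small summary dict for a serial console capture."""
    lines = console_text.splitlines()
    # Count how many times mdadm reported that it found no devices.
    mdadm_misses = sum(1 for line in lines if MDADM_MISS in line)
    match = ALERT_RE.search(console_text)
    return {
        "gave_up": GAVE_UP in console_text,
        "missing_root": match.group(1) if match else None,
        "mdadm_misses": mdadm_misses,
    }
```

Feeding it the capture from this report would report the UUID from the ALERT line as the missing root device.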

** Affects: linux (Ubuntu)
 Importance: Undecided
 Status: Incomplete

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1973034

Title:
  linux generic fails to boot on azure arm64 instance types

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Azure now has arm64 instances in a preview, for example
  Standard_D2pds_v5. These work with the b/linux-azure and f/linux-azure
  kernels, but fail to boot with linux-generic.

  Looks like a storage device issue (from serial console):
  Begin: Running /scripts/init-premount ... done.
  Begin: Mounting root file system ... Begin: Running /scripts/local-top ... 
done.
  Begin: Running /scripts/local-premount ... [4.651830] Btrfs loaded, 
crc32c=crc32c-generic
  [4.651830] Btrfs loaded, crc32c=crc32c-generic
  Scanning for Btrfs filesystems
  done.
  Begin: Waiting for root file system ... Begin: Running /scripts/local-block 
... mdadm: No devices listed in conf file were found.
  done.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: error opening /dev/md?*: No such file or directory
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were found.
  mdadm: No devices listed in conf file were fo

[Kernel-packages] [Bug 1968062] Re: jammy/linux-aws hibernation timeout on xen instances

2022-04-06 Thread Francis Ginther
In this screenshot, it appears the system has resumed as the login
screen is shown along with the messages from the hibernation memory
consumption utility. The first memory message was generated prior to the
hibernation (matches the message from the pre-hibernation image). The
second message could have been generated before the hibernation or after
the resume (there isn't enough data to know for sure).

** Attachment added: "Second screenshot after resume initiated"
   
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+attachment/5577701/+files/post-hibernate.12.jpg

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1968062

Title:
  jammy/linux-aws hibernation timeout on xen instances

Status in linux-aws package in Ubuntu:
  New

Bug description:
  Hibernation testing of jammy/linux-aws 5.15.0-1003-aws is failing on
  all xen instance types (c3/c4/i3/m3/m4/r3/r4/t2). The failure happens
  while attempting to resume from the first attempt to hibernate.
  Testing on nitro instance types (c5/m5/r5/t3) all pass.

  After the resume, the system is inaccessible via ssh. The console
  screenshot does change, but the console log obtained from `aws ec2
  get-console-output` does not.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1968062] Re: jammy/linux-aws hibernation timeout on xen instances

2022-04-06 Thread Francis Ginther
This screenshot was taken a few minutes after the resume attempt. The
amazon-ssm-agent messages repeat every 120 seconds with a new set, but
that is all the progress we see from either the screenshot or the serial
console. There are no new memory consumption messages indicating that
the resume completed.

** Attachment added: "Third screenshot after resume initiated"
   
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+attachment/5577703/+files/post-hibernate.16.jpg

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1968062

Title:
  jammy/linux-aws hibernation timeout on xen instances

Status in linux-aws package in Ubuntu:
  New

Bug description:
  Hibernation testing of jammy/linux-aws 5.15.0-1003-aws is failing on
  all xen instance types (c3/c4/i3/m3/m4/r3/r4/t2). The failure happens
  while attempting to resume from the first attempt to hibernate.
  Testing on nitro instance types (c5/m5/r5/t3) all pass.

  After the resume, the system is inaccessible via ssh. The console
  screenshot does change, but the console log obtained from `aws ec2
  get-console-output` does not.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+subscriptions




[Kernel-packages] [Bug 1968062] Re: jammy/linux-aws hibernation timeout on xen instances

2022-04-06 Thread Francis Ginther
** Attachment added: "Last screenshot before hibernation"
   
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+attachment/5577676/+files/pre-hibernation.04.jpg

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1968062

Title:
  jammy/linux-aws hibernation timeout on xen instances

Status in linux-aws package in Ubuntu:
  New

Bug description:
  Hibernation testing of jammy/linux-aws 5.15.0-1003-aws is failing on
  all xen instance types (c3/c4/i3/m3/m4/r3/r4/t2). The failure happens
  while attempting to resume from the first attempt to hibernate.
  Testing on nitro instance types (c5/m5/r5/t3) all pass.

  After the resume, the system is inaccessible via ssh. The console
  screenshot does change, but the console log obtained from `aws ec2
  get-console-output` does not.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+subscriptions




[Kernel-packages] [Bug 1968062] Re: jammy/linux-aws hibernation timeout on xen instances

2022-04-06 Thread Francis Ginther
** Attachment added: "First screenshot after resume initiated"
   
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+attachment/5577677/+files/post-hibernate.01.jpg

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1968062

Title:
  jammy/linux-aws hibernation timeout on xen instances

Status in linux-aws package in Ubuntu:
  New

Bug description:
  Hibernation testing of jammy/linux-aws 5.15.0-1003-aws is failing on
  all xen instance types (c3/c4/i3/m3/m4/r3/r4/t2). The failure happens
  while attempting to resume from the first attempt to hibernate.
  Testing on nitro instance types (c5/m5/r5/t3) all pass.

  After the resume, the system is inaccessible via ssh. The console
  screenshot does change, but the console log obtained from `aws ec2
  get-console-output` does not.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+subscriptions




[Kernel-packages] [Bug 1968062] [NEW] jammy/linux-aws hibernation timeout on xen instances

2022-04-06 Thread Francis Ginther
Public bug reported:

Hibernation testing of jammy/linux-aws 5.15.0-1003-aws is failing on all
xen instance types (c3/c4/i3/m3/m4/r3/r4/t2). The failure happens while
attempting to resume from the first attempt to hibernate. Testing on
nitro instance types (c5/m5/r5/t3) all pass.

After the resume, the system is inaccessible via ssh. The console
screenshot does change, but the console log obtained from
`aws ec2 get-console-output` does not.
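
The "console log does not change" check can be made mechanical by comparing two captures of the console text, one taken before hibernation and one after the resume attempt. The sketch below is an illustration only: the function names are hypothetical, fetching the text (for example with `aws ec2 get-console-output`) is not shown, and it assumes the console log only grows by appending.

```python
def new_console_output(before, after):
    """Return the text that appears in `after` but not in `before`.

    Assumes the console log only grows by appending. If `after` does not
    start with `before` (for example because the buffer was rotated), the
    whole of `after` is returned so that nothing is silently dropped.
    """
    if after.startswith(before):
        return after[len(before):]
    return after


def resume_made_progress(before, after):
    """True if any non-whitespace console output appeared after resume."""
    return bool(new_console_output(before, after).strip())
```

With identical captures, as observed on the xen instance types here, `resume_made_progress` returns False.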

** Affects: linux-aws (Ubuntu)
 Importance: Undecided
 Status: New

** Attachment added: "serial console log"
   
https://bugs.launchpad.net/bugs/1968062/+attachment/5577675/+files/aws-jammy-all-c3.8xlarge-9-1.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1968062

Title:
  jammy/linux-aws hibernation timeout on xen instances

Status in linux-aws package in Ubuntu:
  New

Bug description:
  Hibernation testing of jammy/linux-aws 5.15.0-1003-aws is failing on
  all xen instance types (c3/c4/i3/m3/m4/r3/r4/t2). The failure happens
  while attempting to resume from the first attempt to hibernate.
  Testing on nitro instance types (c5/m5/r5/t3) all pass.

  After the resume, the system is inaccessible via ssh. The console
  screenshot does change, but the console log obtained from `aws ec2
  get-console-output` does not.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1968062/+subscriptions




[Kernel-packages] [Bug 1960871] [NEW] linux-modules-extra-* fails to install due to dependency on unsigned package

2022-02-14 Thread Francis Ginther
Public bug reported:

Several SRU tests are failing the test setup due to failure to install
the modules-extra package:

* Command: 
yes "" | DEBIAN_FRONTEND=noninteractive apt-get install --yes --force-yes
automake bison build-essential byacc flex git keyutils libacl1-dev
libaio-dev libcap-dev libmm-dev libnuma-dev libsctp-dev libselinux1-dev
libssl-dev libtirpc-dev pkg-config quota xfslibs-dev xfsprogs gcc
linux-modules-extra-4.15.0-1120-aws
Exit status: 100
Duration: 0.908210039139

stdout:
Reading package lists...
Building dependency tree...
Reading state information...
xfsprogs is already the newest version (4.9.0+nmu1ubuntu2).
xfsprogs set to manually installed.
git is already the newest version (1:2.17.1-1ubuntu0.9).
git set to manually installed.
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 linux-modules-extra-4.15.0-1120-aws : Depends: linux-image-unsigned-4.15.0-1120-aws but it is not going to be installed
stderr:
W: --force-yes is deprecated, use one of the options starting with --allow instead.
E: Unable to correct problems, you have held broken packages.
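
To spot this failure automatically in test setup logs, the "unmet dependencies" section of the apt output can be parsed. This is a rough sketch under the assumption that the lines look like the ones above; the function name is hypothetical, and the pattern tolerates the dependency name being wrapped onto the next line.

```python
import re

# Matches " <package> : <Relation>: <dependency>" and allows a line break
# between the relation and the dependency name.
UNMET_RE = re.compile(r"^\s*(\S+) : (\w+):\s+(\S+)", re.MULTILINE)


def unmet_dependencies(apt_output):
    """Return (package, relation, dependency) tuples found in apt output."""
    return UNMET_RE.findall(apt_output)
```

For the output in this report it would yield a single tuple naming linux-modules-extra-4.15.0-1120-aws, the relation Depends, and linux-image-unsigned-4.15.0-1120-aws.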

** Affects: linux-aws (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1960871

Title:
  linux-modules-extra-* fails to install due to dependency on unsigned
  package

Status in linux-aws package in Ubuntu:
  New

Bug description:
  Several SRU tests are failing the test setup due to failure to install
  the modules-extra package:

  * Command: 
  yes "" | DEBIAN_FRONTEND=noninteractive apt-get install --yes --force-yes
  automake bison build-essential byacc flex git keyutils libacl1-dev
  libaio-dev libcap-dev libmm-dev libnuma-dev libsctp-dev libselinux1-dev
  libssl-dev libtirpc-dev pkg-config quota xfslibs-dev xfsprogs gcc
  linux-modules-extra-4.15.0-1120-aws
  Exit status: 100
  Duration: 0.908210039139

  stdout:
  Reading package lists...
  Building dependency tree...
  Reading state information...
  xfsprogs is already the newest version (4.9.0+nmu1ubuntu2).
  xfsprogs set to manually installed.
  git is already the newest version (1:2.17.1-1ubuntu0.9).
  git set to manually installed.
  Some packages could not be installed. This may mean that you have
  requested an impossible situation or if you are using the unstable
  distribution that some required packages have not yet been created
  or been moved out of Incoming.
  The following information may help to resolve the situation:

  The following packages have unmet dependencies:
   linux-modules-extra-4.15.0-1120-aws : Depends: linux-image-unsigned-4.15.0-1120-aws but it is not going to be installed
  stderr:
  W: --force-yes is deprecated, use one of the options starting with --allow instead.
  E: Unable to correct problems, you have held broken packages.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1960871/+subscriptions




[Kernel-packages] [Bug 1960094] Re: lxc/1:4.0.6-0ubuntu1~20.04.1 undefined symbol: strlcat in Focal

2022-02-09 Thread Francis Ginther
This is the result of pulling the lxc test sources from the git repo,
but using the lxc from the archive. Currently, the archive has version
4.0.6 and the git repo has been updated to 4.0.12 as an upload is in
progress (it's in the unapproved queue as this comment is being
written).

The result is a mismatch between the tests and the package, which causes
the test failures. Switching to a different version of the test sources
results in a passing test.
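
A guard against this kind of mismatch could compare the upstream part of the installed package version with the version of the test sources before running them. The sketch below is illustrative only: both function names are hypothetical, and it relies on the Debian version layout `[epoch:]upstream[-revision]`.

```python
def upstream_version(debian_version):
    """Strip the epoch and the Debian revision from a version string."""
    version = debian_version
    # Drop an optional "epoch:" prefix.
    if ":" in version:
        version = version.split(":", 1)[1]
    # Drop the revision, which is everything after the last hyphen.
    if "-" in version:
        version = version.rsplit("-", 1)[0]
    return version


def sources_match_package(package_version, source_version):
    """True if the test sources and the package share an upstream version."""
    return upstream_version(package_version) == upstream_version(source_version)
```

With the versions from this bug, the archive package 1:4.0.6-0ubuntu1~20.04.1 has upstream version 4.0.6, which does not match test sources at 4.0.12.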

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1960094

Title:
  lxc/1:4.0.6-0ubuntu1~20.04.1 undefined symbol: strlcat in Focal

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Focal:
  New

Bug description:
  There are failures in ubuntu_lxc regression tests on Focal/linux/5.4.0-99.112 
sru cycle 2022.01.03 with the error
  lxc-create: symbol lookup error: lxc-create: undefined symbol: strlcat

  These errors did not appear on previous kernels in the same cycle. As
  of Feb 4th 2022, a few tests appear to fail on all architectures and
  systems. A log with details is attached in the comments.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1960094/+subscriptions




[Kernel-packages] [Bug 1960094] Re: lxc/1:4.0.6-0ubuntu1~20.04.1 undefined symbol: strlcat in Focal

2022-02-08 Thread Francis Ginther
I've retested two released kernels that passed the lxc test last cycle:

* focal/azure 5.4.0-1068.71
* focal/azure-5-11 5.11.0-1028.31~20.04.1

Both tests now show the same testcase failures where they were passing
before. Will start digging into any other changes in the environment.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1960094

Title:
  lxc/1:4.0.6-0ubuntu1~20.04.1 undefined symbol: strlcat in Focal

Status in linux package in Ubuntu:
  New
Status in lxc package in Ubuntu:
  Incomplete
Status in linux source package in Focal:
  New
Status in lxc source package in Focal:
  Incomplete

Bug description:
  There are failures in ubuntu_lxc regression tests on Focal/linux/5.4.0-99.112 
sru cycle 2022.01.03 with the error
  lxc-create: symbol lookup error: lxc-create: undefined symbol: strlcat

  These errors did not appear on previous kernels in the same cycle. As
  of Feb 4th 2022, a few tests appear to fail on all architectures and
  systems. A log with details is attached in the comments.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1960094/+subscriptions




[Kernel-packages] [Bug 1928888] Re: test_utils_testsuite from ubuntu_qrt_apparmor linux ADT test failure with linux/5.11.0-18.19

2021-11-09 Thread Francis Ginther
Tests are now passing.

** Changed in: ubuntu-kernel-tests
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/192

Title:
  test_utils_testsuite from ubuntu_qrt_apparmor linux ADT test failure
  with linux/5.11.0-18.19

Status in QA Regression Testing:
  Fix Released
Status in ubuntu-kernel-tests:
  Invalid
Status in linux package in Ubuntu:
  Invalid

Bug description:
  This is a scripted bug report about ADT failures while running linux
  tests for linux/5.11.0-18.19 on hirsute. Whether this is caused by the
  dep8 tests of the tested source or the kernel has yet to be
  determined.

  Not a regression. Found to occur previously on hirsute/linux
  5.11.0-14.15

  Testing failed on:
  amd64: 
https://autopkgtest.ubuntu.com/results/autopkgtest-hirsute/hirsute/amd64/l/linux/20210515_005957_75e5a@/log.gz
  arm64: 
https://autopkgtest.ubuntu.com/results/autopkgtest-hirsute/hirsute/arm64/l/linux/20210513_203508_96fd3@/log.gz
  ppc64el: 
https://autopkgtest.ubuntu.com/results/autopkgtest-hirsute/hirsute/ppc64el/l/linux/20210513_163708_c0203@/log.gz
  s390x: 
https://autopkgtest.ubuntu.com/results/autopkgtest-hirsute/hirsute/s390x/l/linux/20210513_144454_54b04@/log.gz


test_zz_cleanup_source_tree (__main__.ApparmorTestsuites)
Cleanup downloaded source ... ok

==
FAIL: test_utils_testsuite (__main__.ApparmorTestsuites)
Run utils (make check)
--
Traceback (most recent call last):
  File 
"/tmp/autopkgtest.gBRfIs/build.V37/src/autotest/client/tmp/ubuntu_qrt_apparmor/src/qa-regression-testing/scripts/./test-apparmor.py",
 line 1841, in test_utils_testsuite
self.assertEqual(expected, rc, result + report)
AssertionError: 0 != 2 : Got exit code 2, expected 0
ERROR: capability CAP_CHECKPOINT_RESTORE not found in severity.db
make: *** [Makefile:81: check_severity_db] Error 1


==
FAIL: test_utils_testsuite3 (__main__.ApparmorTestsuites)
Run utils (make check with python3)
--
Traceback (most recent call last):
  File 
"/tmp/autopkgtest.gBRfIs/build.V37/src/autotest/client/tmp/ubuntu_qrt_apparmor/src/qa-regression-testing/scripts/./test-apparmor.py",
 line 1862, in test_utils_testsuite3
self.assertEqual(expected, rc, result + report)
AssertionError: 0 != 2 : Got exit code 2, expected 0
ERROR: capability CAP_CHECKPOINT_RESTORE not found in severity.db
make: *** [Makefile:81: check_severity_db] Error 1


--
Ran 58 tests in 1448.768s

FAILED (failures=2)
  23:36:54 INFO |   END ERROR   ubuntu_qrt_apparmor.test-apparmor.py ubuntu_qrt_apparmor.test-apparmor.py timestamp=1621035414 localtime=May 14 23:36:54

To manage notifications about this bug go to:
https://bugs.launchpad.net/qa-regression-testing/+bug/192/+subscriptions




[Kernel-packages] [Bug 1939673] Re: Update the 470 and the 470-server NVIDIA drivers

2021-09-07 Thread Francis Ginther
Retested with the latest kernels as of Sept 7, 2021. ubuntu-drivers now
lists the 470-server driver as an available option. For both Focal and
Hirsute, I first had to enable -proposed. For Bionic, it worked with the
4.15.0-156.163 kernel that was released this week.

Tested combinations:
Release   Kernel
Bionic    4.15.0-156.163
Focal     5.4.0-85.95 (from focal-proposed)
Hirsute   5.11.0-35.37 (from hirsute-proposed)

Tested each driver with a CUDA workload and nvidia-smi to verify that
the driver functions as expected after install.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-470 in Ubuntu.
https://bugs.launchpad.net/bugs/1939673

Title:
  Update the 470 and the 470-server NVIDIA drivers

Status in nvidia-graphics-drivers-470 package in Ubuntu:
  Fix Released
Status in nvidia-graphics-drivers-470-server package in Ubuntu:
  Triaged
Status in nvidia-graphics-drivers-470 source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-470-server source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-470 source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-470-server source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-470 source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-470-server source package in Hirsute:
  Fix Committed

Bug description:
  Update the 470 (UDA) and 470-server (ERD) NVIDIA series in Bionic,
  Focal, Hirsute.

  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

  == 470.57.02 (470-server) ==

    * debian/nvidia_supported,
      debian/rules (LP: #1939673):
      - Use the json database file to generate the modaliases.
      - Fix aliases generation.
        We were not matching some of the aliases because of some missing
        zeroes in the subdevice and subvendor ids.
    * debian/pm-aliases-gen:
      - Fix aliases generation for runtimepm.
        We were not matching some of the aliases because of some missing
        zeroes in the subdevice and subvendor ids.

  == 470.63.01 (470) ==

    * New upstream release (LP: #1939673):
      - Added support for the following GPUs:
          NVIDIA RTX A2000
      - Fixed a Vulkan performance regression that affected rFactor2.
    * debian/nvidia_supported,
      debian/rules:
      - Use the json database file to generate the modaliases.
      - Fix aliases generation.
        We were not matching some of the aliases because of some missing
        zeroes in the subdevice and subvendor ids.
    * debian/pm-aliases-gen:
      - Fix aliases generation for runtimepm.
        We were not matching some of the aliases because of some missing
        zeroes in the subdevice and subvendor ids.
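
As an illustration of how missing zeroes can break alias matching, the sketch below formats PCI IDs into a modalias-style pattern. It is not the packaging code: both function names are hypothetical, the device and subsystem IDs in the usage note are illustrative values, and it assumes the kernel's PCI modalias layout, in which each ID is written as 8 upper-case hex digits.

```python
def pci_modalias(vendor, device, subvendor, subdevice):
    """Build a PCI modalias pattern with every ID zero-padded to 8 digits.

    The trailing "bc03sc*i*" (display class, any subclass and interface)
    is an illustrative choice for a GPU alias.
    """
    return "pci:v{:08X}d{:08X}sv{:08X}sd{:08X}bc03sc*i*".format(
        vendor, device, subvendor, subdevice
    )


def unpadded_modalias(vendor, device, subvendor, subdevice):
    """Same pattern, but with the subsystem IDs left unpadded.

    This is one way a generated alias could fail to match the padded
    string that the kernel reports.
    """
    return "pci:v{:08X}d{:08X}sv{:X}sd{:X}bc03sc*i*".format(
        vendor, device, subvendor, subdevice
    )
```

For example, `pci_modalias(0x10DE, 0x1DB1, 0x10DE, 0x1212)` gives a string containing `sv000010DEsd00001212`, while the unpadded variant gives `sv10DEsd1212`, so a literal comparison of the two fails.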

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-470/+bug/1939673/+subscriptions




[Kernel-packages] [Bug 1939673] Re: Update the 470 and the 470-server NVIDIA drivers

2021-08-27 Thread Francis Ginther
While the 470-server drivers installed and passed the cuda testing
across bionic/focal/hirsute, "ubuntu-drivers" does not list 470-server
as an option for any release. The response I see is:

$ sudo ubuntu-drivers list --gpgpu
WARNING:root:_pkg_get_support nvidia-driver-390: package has invalid Support Legacyheader, cannot determine support level
nvidia-driver-450-server, (kernel modules provided by linux-modules-nvidia-450-server-generic)
nvidia-driver-390, (kernel modules provided by linux-modules-nvidia-390-generic)
nvidia-driver-418-server, (kernel modules provided by linux-modules-nvidia-418-server-generic)
nvidia-driver-470, (kernel modules provided by linux-modules-nvidia-470-generic)
nvidia-driver-460-server, (kernel modules provided by linux-modules-nvidia-460-server-generic)
nvidia-driver-460, (kernel modules provided by linux-modules-nvidia-460-generic)

This is from a hirsute host running linux-generic using a "Tesla
V100-SXM2-16GB".
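
A listing like the one above can be checked for a particular driver series with a few lines of Python. This is only a sketch: the helper name `listed_drivers` and the shortened sample text are assumptions made for illustration, and the code parses captured output in the format quoted above rather than invoking ubuntu-drivers itself.

```python
import re


def listed_drivers(output):
    """Return driver package names from captured `ubuntu-drivers list` output.

    Assumes each driver line starts with the package name followed by a
    comma, as in the listing quoted above. WARNING lines are skipped.
    """
    names = []
    for line in output.splitlines():
        match = re.match(r"^(nvidia-driver-[\w-]+),", line.strip())
        if match:
            names.append(match.group(1))
    return names


# Shortened copy of the output quoted above (wrapped lines rejoined).
SAMPLE = """\
WARNING:root:_pkg_get_support nvidia-driver-390: package has invalid Support Legacyheader, cannot determine support level
nvidia-driver-450-server, (kernel modules provided by linux-modules-nvidia-450-server-generic)
nvidia-driver-390, (kernel modules provided by linux-modules-nvidia-390-generic)
nvidia-driver-470, (kernel modules provided by linux-modules-nvidia-470-generic)
"""

drivers = listed_drivers(SAMPLE)
print("nvidia-driver-470-server" in drivers)  # prints False
```

With the sample above the 470 (non-server) driver is found but the 470-server driver is not, which matches the behaviour described in this comment.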

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-470 in Ubuntu.
https://bugs.launchpad.net/bugs/1939673

Title:
  Update the 470 and the 470-server NVIDIA drivers

Status in nvidia-graphics-drivers-470 package in Ubuntu:
  Triaged
Status in nvidia-graphics-drivers-470-server package in Ubuntu:
  Triaged
Status in nvidia-graphics-drivers-470 source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-470-server source package in Bionic:
  Fix Released
Status in nvidia-graphics-drivers-470 source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-470-server source package in Focal:
  Fix Released
Status in nvidia-graphics-drivers-470 source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-470-server source package in Hirsute:
  Fix Released

Bug description:
  Update the 470 (UDA) and 470-server (ERD) NVIDIA series in Bionic,
  Focal, Hirsute.

  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

  == 470.57.02 (470-server) ==

* debian/nvidia_supported,
  debian/rules (LP: #1939673):
  - Use the json database file to generate the modaliases.
  - Fix aliases generation.
    We were not matching some of the aliases because of some missing
    zeroes in the subdevice and subvendor ids.
* debian/pm-aliases-gen:
  - Fix aliases generation for runtimepm.
    We were not matching some of the aliases because of some missing
    zeroes in the subdevice and subvendor ids.

  == 470.63.01 (470) ==

* New upstream release (LP: #1939673):
  - Added support for the following GPUs:
      NVIDIA RTX A2000
  - Fixed a Vulkan performance regression that affected rFactor2.
* debian/nvidia_supported,
  debian/rules:
  - Use the json database file to generate the modaliases.
  - Fix aliases generation.
    We were not matching some of the aliases because of some missing
    zeroes in the subdevice and subvendor ids.
* debian/pm-aliases-gen:
  - Fix aliases generation for runtimepm.
    We were not matching some of the aliases because of some missing
    zeroes in the subdevice and subvendor ids.
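
As background on why missing zeroes matter: Linux PCI modaliases render the vendor, device, subvendor and subdevice ids as eight-digit, zero-padded, uppercase hex fields, so a pattern built from unpadded ids cannot match. The sketch below only illustrates that effect. It is not the code used in debian/rules or debian/pm-aliases-gen; the helper name `pci_alias` and the example ids are hypothetical.

```python
from fnmatch import fnmatchcase


def pci_alias(vendor, device, subvendor=None, subdevice=None, pad=True):
    """Build a PCI modalias glob pattern (hypothetical helper).

    With pad=True each id is rendered as 8 zero-padded uppercase hex
    digits, the layout the kernel uses in a device's modalias string.
    An id of None becomes a wildcard.
    """
    def field(value):
        if value is None:
            return "*"
        return f"{value:08X}" if pad else f"{value:X}"

    return (f"pci:v{field(vendor)}d{field(device)}"
            f"sv{field(subvendor)}sd{field(subdevice)}bc03sc*i*")


# Example modalias in the kernel's layout (ids chosen for illustration).
device_modalias = "pci:v000010DEd00001DB1sv000010DEsd00001212bc03sc02i00"

good = pci_alias(0x10DE, 0x1DB1, 0x10DE, 0x1212)
bad = pci_alias(0x10DE, 0x1DB1, 0x10DE, 0x1212, pad=False)

print(fnmatchcase(device_modalias, good))  # prints True
print(fnmatchcase(device_modalias, bad))   # prints False
```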

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-470/+bug/1939673/+subscriptions




[Kernel-packages] [Bug 1936577] Re: Introduce the 470-server series

2021-07-29 Thread Francis Ginther
Testing has completed for this 470-server driver across bionic, focal
and hirsute. The LRM version of the driver was tested against the linux
generic kernels currently in -proposed:

4.15.0-152-generic
5.4.0-81-generic
5.11.0-25-generic

Testing consisted of running a basic set of the cuda samples (from
11.0).

Additional testing on bionic and focal was done with the Data Center GPU
Manager to exercise nvidia-fabricmanager and libnvidia-nscq-470.

** Tags removed: verification-needed-bionic verification-needed-focal verification-needed-hirsute
** Tags added: verification-done-bionic verification-done-focal verification-done-hirsute

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/1936577

Title:
  Introduce the 470-server series

Status in linux-restricted-modules package in Ubuntu:
  Triaged
Status in nvidia-graphics-drivers-460-server package in Ubuntu:
  Fix Committed
Status in linux-restricted-modules source package in Bionic:
  Triaged
Status in nvidia-graphics-drivers-460-server source package in Bionic:
  Fix Committed
Status in linux-restricted-modules source package in Focal:
  Triaged
Status in nvidia-graphics-drivers-460-server source package in Focal:
  Fix Committed
Status in linux-restricted-modules source package in Hirsute:
  Triaged
Status in nvidia-graphics-drivers-460-server source package in Hirsute:
  Fix Committed

Bug description:
  Introduce the 470 NVIDIA ERD (-server) series in Bionic, Focal,
  Hirsute.

  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

  == 470.57.02 (470-server) ==

* Initial release (LP: #1936577).

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-restricted-modules/+bug/1936577/+subscriptions




[Kernel-packages] [Bug 1876687] Re: function traceon/off triggers in ftace from ubuntu_kernel_selftests failed on B/F

2021-07-16 Thread Francis Ginther
Failed on bionic:linux generic amd64 host spitfire sru-20210621.

** Summary changed:

- function traceon/off triggers in ftace from ubuntu_kernel_selftests failed on Focal
+ function traceon/off triggers in ftace from ubuntu_kernel_selftests failed on B/F

** Tags added: sru-20210621

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1876687

Title:
  function traceon/off triggers in ftace from ubuntu_kernel_selftests
  failed on B/F

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Issue found on Focal 5.4.0-29.33 with node amaura (passed on rizzo,
  rizzo failed with other failures)

  # [27] ftrace - test for function traceon/off triggers [FAIL]

  Need to retest on amaura to check if this is just a glitch.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1876687/+subscriptions




[Kernel-packages] [Bug 1931131] Re: Update the 465 and the 460 NVIDIA driver series

2021-07-08 Thread Francis Ginther
@albertomilone. I've completed the basic CUDA testing of the server
drivers. As with the prior version, the 450-server and 460-server
drivers passed across the generic kernels for bionic, focal, groovy,
hirsute and impish. This applies to both the dkms and lrm installations.
Impish was tested with the 5.11 kernel.

For the 418-server driver, the dkms driver passed for bionic, focal and
groovy, but did not work with hirsute or impish (both with the 5.11 kernel).
The LRM driver worked across all 5 series. This was the same behavior as
the prior version of the driver.

I plan on doing additional testing to get coverage on more platforms,
but this passes the minimal testing criteria.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-460 in Ubuntu.
https://bugs.launchpad.net/bugs/1931131

Title:
  Update the 465 and the 460 NVIDIA driver series

Status in linux package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-390 package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-418-server package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-450-server package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-460 package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-460-server package in Ubuntu:
  New
Status in nvidia-graphics-drivers-465 package in Ubuntu:
  In Progress
Status in nvidia-settings package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  In Progress
Status in nvidia-graphics-drivers-390 source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-418-server source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-460-server source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Bionic:
  Fix Committed
Status in nvidia-settings source package in Bionic:
  Fix Committed
Status in linux source package in Focal:
  In Progress
Status in nvidia-graphics-drivers-390 source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-418-server source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-460-server source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Focal:
  Fix Committed
Status in nvidia-settings source package in Focal:
  Fix Committed
Status in linux source package in Groovy:
  In Progress
Status in nvidia-graphics-drivers-390 source package in Groovy:
  Fix Committed
Status in nvidia-graphics-drivers-418-server source package in Groovy:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Groovy:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Groovy:
  Fix Committed
Status in nvidia-graphics-drivers-460-server source package in Groovy:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Groovy:
  Fix Committed
Status in nvidia-settings source package in Groovy:
  Fix Committed
Status in linux source package in Hirsute:
  In Progress
Status in nvidia-graphics-drivers-390 source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-418-server source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-460-server source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Hirsute:
  Fix Committed
Status in nvidia-settings source package in Hirsute:
  Fix Committed

Bug description:
  Update the 465 and the 460 NVIDIA driver series, and add support for
  Linux 5.13 to all the driver series.

  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

  

[Kernel-packages] [Bug 1934424] [NEW] kernel NULL pointer dereference during xen hibernation

2021-07-01 Thread Francis Ginther
Public bug reported:

Encountered the following panic while doing hibernation/resume testing
with linux-aws 5.8 on Focal on an m3.xlarge (xen) instance type:

[  594.291317] ACPI: Hardware changed while hibernated, success doubtful!
[  594.411609] BUG: kernel NULL pointer dereference, address: 01f4
[  594.424658] #PF: supervisor write access in kernel mode
[  594.424660] #PF: error_code(0x0002) - not-present page
[  594.424661] PGD 0 P4D 0 
[  594.424665] Oops: 0002 [#1] SMP PTI
[  594.424668] CPU: 3 PID: 362 Comm: systemd-timesyn Not tainted 5.8.0-1036-aws #38~20.04.1-Ubuntu
[  594.424669] Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
[  594.424675] RIP: 0010:_raw_spin_lock_irqsave+0x23/0x40
[  594.424678] Code: 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 9c 58 0f 1f 44 00 00 49 89 c4 fa 66 0f 1f 44 00 00 31 c0 ba 01 00 00 00  0f b1 17 75 07 4c 89 e0 41 5c 5d c3 89 c6 e8 e9 d1 56 ff 66 90
[  594.424679] RSP: 0018:c94e3848 EFLAGS: 00010046
[  594.424680] RAX:  RBX: 8883bcc0d000 RCX: 0e02
[  594.424681] RDX: 0001 RSI:  RDI: 01f4
[  594.424682] RBP: c94e3850 R08: 8883b90b5ec0 R09: 005a
[  594.424683] R10: c94e3910 R11:  R12: 0206
[  594.424684] R13: ea000ee42d40 R14:  R15: 0001
[  594.424686] FS:  7f65ba055980() GS:8883c0ac() knlGS:
[  594.424687] CS:  0010 DS:  ES:  CR0: 80050033
[  594.424688] CR2: 01f4 CR3: 0003b99f0001 CR4: 001606e0
[  594.424692] Call Trace:
[  594.424699]  xennet_start_xmit+0x158/0x570
[  594.424704]  dev_hard_start_xmit+0x91/0x1f0
[  594.424706]  ? validate_xmit_skb+0x300/0x340
[  594.424710]  sch_direct_xmit+0x113/0x340
[  594.424712]  __dev_queue_xmit+0x57c/0x8e0
[  594.424714]  ? neigh_add_timer+0x37/0x60
[  594.424716]  dev_queue_xmit+0x10/0x20
[  594.424717]  neigh_resolve_output+0x112/0x1c0
[  594.424721]  ip_finish_output2+0x19b/0x590
[  594.424723]  __ip_finish_output+0xc8/0x1e0
[  594.424725]  ip_finish_output+0x2d/0xb0
[  594.424728]  ip_output+0x7a/0xf0
[  594.424730]  ? __ip_finish_output+0x1e0/0x1e0
[  594.424732]  ip_local_out+0x3d/0x50
[  594.424734]  ip_send_skb+0x19/0x40
[  594.424737]  udp_send_skb.isra.0+0x165/0x390
[  594.424739]  udp_sendmsg+0xb0e/0xd50
[  594.424742]  ? ip_reply_glue_bits+0x50/0x50
[  594.424747]  ? delete_from_swap_cache+0x6a/0x90
[  594.424750]  ? _cond_resched+0x19/0x30
[  594.424754]  ? aa_sk_perm+0x43/0x1b0
[  594.424757]  inet_sendmsg+0x65/0x70
[  594.424759]  ? security_socket_sendmsg+0x35/0x50
[  594.424760]  ? inet_sendmsg+0x65/0x70
[  594.424764]  sock_sendmsg+0x5e/0x70
[  594.424766]  __sys_sendto+0x113/0x190
[  594.424770]  ? __secure_computing+0x42/0xe0
[  594.424774]  ? syscall_trace_enter+0x10d/0x280
[  594.424777]  __x64_sys_sendto+0x29/0x30
[  594.424781]  do_syscall_64+0x49/0xc0
[  594.424783]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  594.424785] RIP: 0033:0x7f65baee4844
[  594.424788] Code: 42 3f f7 ff 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 30 89 ef 48 89 44 24 08 e8 68 3f f7 ff 48 8b
[  594.424789] RSP: 002b:7ffe9b5fd3a0 EFLAGS: 0293 ORIG_RAX: 002c
[  594.424790] RAX: ffda RBX: 7ffe9b5fd4e0 RCX: 7f65baee4844
[  594.424791] RDX: 0030 RSI: 7ffe9b5fd3f0 RDI: 0010
[  594.424792] RBP:  R08: 560426541678 R09: 0010
[  594.424793] R10: 0040 R11: 0293 R12: 
[  594.424794] R13: 7ffe9b5fd3e4 R14: 0068 R15: 
[  594.424796] Modules linked in: btrfs blake2b_generic xor raid6_pq ufs msdos xfs libcrc32c dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd psmouse cryptd input_leds glue_helper serio_raw floppy sch_fq_codel drm ip_tables x_tables autofs4
[  594.424813] CR2: 01f4
[  594.424821] ---[ end trace bb5f35055c1a8060 ]---
[  594.424822] RIP: 0010:_raw_spin_lock_irqsave+0x23/0x40
[  594.424824] Code: 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 9c 58 0f 1f 44 00 00 49 89 c4 fa 66 0f 1f 44 00 00 31 c0 ba 01 00 00 00  0f b1 17 75 07 4c 89 e0 41 5c 5d c3 89 c6 e8 e9 d1 56 ff 66 90
[  594.424824] RSP: 0018:c94e3848 EFLAGS: 00010046
[  594.424825] RAX:  RBX: 8883bcc0d000 RCX: 0e02
[  594.424826] RDX: 0001 RSI:  RDI: 01f4
[  594.424826] RBP: c94e3850 R08: 8883b90b5ec0 R09: 005a
[  594.424827] R10: c94e3910 R11:  R12: 0206
[  594.424827] R13: ea000ee42d40 R14:  R15: 0001
[  594.424828] FS:  7f65ba055980() GS:8883c0ac0

[Kernel-packages] [Bug 1923191] Re: cpuhotplug from ubuntu_ltp failed for cpuhotplug02 cpuhotplug03 cpuhotplug04 cpuhotplug06

2021-05-26 Thread Francis Ginther
** Tags added: 5.4 sru-20210510

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1923191

Title:
  cpuhotplug from ubuntu_ltp failed for  cpuhotplug02 cpuhotplug03
  cpuhotplug04 cpuhotplug06

Status in ubuntu-kernel-tests:
  New
Status in linux-azure package in Ubuntu:
  New
Status in linux-azure-4.15 package in Ubuntu:
  New
Status in linux-azure source package in Trusty:
  New
Status in linux-azure-4.15 source package in Trusty:
  New
Status in linux-azure source package in Xenial:
  New
Status in linux-azure-4.15 source package in Xenial:
  New
Status in linux-azure source package in Bionic:
  New
Status in linux-azure-4.15 source package in Bionic:
  New

Bug description:
  cpuhotplug02:

  Name: cpuhotplug02
  Date: Tue Mar 30 13:32:56 UTC 2021
  Desc: What happens to a process when its CPU is offlined?

  CPU is 1
  sh: echo: I/O error
  cpuhotplug02 1 TFAIL: process did not change from CPU 1
  tag=cpuhotplug02 stime=161776 dur=5 exit=exited stat=1 core=no cu=2 cs=2
  startup='Tue Mar 30 13:33:06 2021'

  
  cpuhotplug03:

  Name: cpuhotplug03
  Date: Tue Mar 30 13:33:06 UTC 2021
  Desc: Do tasks get scheduled to a newly on-lined CPU?

  CPU is 1
  sh: echo: I/O error
  cpuhotplug03 1 TBROK: CPU1 cannot be offlined
  USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
  root 20613 0.0 0.0 4636 864 ? R 13:33 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  root 20614 0.0 0.0 4636 812 ? R 13:33 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  root 20615 0.0 0.0 4636 812 ? R 13:33 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  root 20616 0.0 0.0 4636 880 ? R 13:33 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  root 20620 0.0 0.0 14864 1092 ? S 13:33 0:00 grep cpuhotplug_do_spin_loop
  cpuhotplug03 1 TINFO: Onlining CPU 1
  1 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  1 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  0 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  0 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  cpuhotplug03 1 TPASS: 2 cpuhotplug_do_spin_loop processes found on CPU1
  tag=cpuhotplug03 stime=161786 dur=2 exit=exited stat=2 core=no cu=242 cs=38
  startup='Tue Mar 30 13:33:08 2021'

  
  cpuhotplug04:

  Name: cpuhotplug04
  Date: Tue Mar 30 13:33:08 UTC 2021
  Desc: Does it prevent us from offlining the last CPU?

  sh: echo: I/O error
  cpuhotplug04 1 TFAIL: Could not offline cpu1
  tag=cpuhotplug04 stime=161788 dur=0 exit=exited stat=1 core=no cu=4 cs=4
  startup='Tue Mar 30 13:33:08 2021'

  
  cpuhotplug06:

  Name: cpuhotplug06
  Date: Tue Mar 30 13:33:08 UTC 2021
  Desc: Does top work properly when CPU hotplug events occur?

  CPU is 1
  sh: echo: I/O error
  cpuhotplug06 1 TBROK: CPU1 cannot be offlined
  20913 ? 00:00:00 top
  tag=cpuhotplug06 stime=161788 dur=1 exit=exited stat=2 core=no cu=2 cs=3
  startup='Tue Mar 30 13:33:15 2021'
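
  The per-test summary lines above (`tag=... stat=...`) can be scanned for non-zero exit statuses with a short script. The following is a minimal sketch, assuming each summary is a single line of space-separated key=value pairs as quoted in this report; the function name `failed_tags` is hypothetical and not part of LTP.

```python
def failed_tags(log_text):
    """Return (tag, stat) pairs for summary lines with a non-zero stat.

    Assumes summary lines hold space-separated key=value pairs that
    include `tag=` and `stat=`, as in the excerpt quoted in this report.
    """
    failures = []
    for line in log_text.splitlines():
        fields = dict(
            part.split("=", 1) for part in line.split() if "=" in part
        )
        if "tag" in fields and "stat" in fields and fields["stat"] != "0":
            failures.append((fields["tag"], int(fields["stat"])))
    return failures


# Summary lines copied from the log above (wrapped lines rejoined).
SAMPLE = """\
tag=cpuhotplug02 stime=161776 dur=5 exit=exited stat=1 core=no cu=2 cs=2
tag=cpuhotplug03 stime=161786 dur=2 exit=exited stat=2 core=no cu=242 cs=38
tag=cpuhotplug04 stime=161788 dur=0 exit=exited stat=1 core=no cu=4 cs=4
tag=cpuhotplug06 stime=161788 dur=1 exit=exited stat=2 core=no cu=2 cs=3
"""

for tag, stat in failed_tags(SAMPLE):
    print(tag, stat)
```

  All four tests in the excerpt show up with a non-zero stat: 1 for the two that reported TFAIL and 2 for the two that reported TBROK.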

  
  
http://10.246.72.46/4.15.0-1112.125-azure/bionic-linux-azure-4.15-azure-amd64-4.15.0-Basic_A2-ubuntu_ltp/ubuntu_ltp/results/ubuntu_ltp.cpuhotplug/debug/ubuntu_ltp.cpuhotplug.DEBUG.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1923191/+subscriptions



[Kernel-packages] [Bug 1925522] Re: Introduce the 465 driver series, fabric-manager, and libnvidia-nscq

2021-05-25 Thread Francis Ginther
I've completed additional testing on bionic on a dgx2:

nvidia-graphics-drivers-450-server:
Both the bionic lrm and dkms versions were tested successfully via the
cuda samples test (version 11.0).

fabric-manager-450 and libnvidia-nscq-450:
These passed using the cuda samples test and 'dcgmi discovery -l' on a dgx-2 
host using bionic.

I also retested the focal lrm version nvidia-graphics-drivers-450-server
on a gcp cloud VM. It now installs with the kernel and lrm packages in
focal-proposed.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-settings in Ubuntu.
https://bugs.launchpad.net/bugs/1925522

Title:
  Introduce the 465 driver series, fabric-manager, and libnvidia-nscq

Status in fabric-manager-450 package in Ubuntu:
  Fix Released
Status in fabric-manager-460 package in Ubuntu:
  Fix Released
Status in libnvidia-nscq-450 package in Ubuntu:
  Fix Released
Status in libnvidia-nscq-460 package in Ubuntu:
  Fix Released
Status in linux-restricted-modules package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-450-server package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-460 package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-465 package in Ubuntu:
  Fix Released
Status in nvidia-settings package in Ubuntu:
  In Progress
Status in fabric-manager-450 source package in Bionic:
  Fix Committed
Status in fabric-manager-460 source package in Bionic:
  Fix Committed
Status in libnvidia-nscq-450 source package in Bionic:
  Fix Committed
Status in libnvidia-nscq-460 source package in Bionic:
  Fix Committed
Status in linux-restricted-modules source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Bionic:
  Fix Committed
Status in nvidia-settings source package in Bionic:
  Fix Committed
Status in fabric-manager-450 source package in Focal:
  Fix Committed
Status in fabric-manager-460 source package in Focal:
  Fix Committed
Status in libnvidia-nscq-450 source package in Focal:
  Fix Committed
Status in libnvidia-nscq-460 source package in Focal:
  Fix Committed
Status in linux-restricted-modules source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Focal:
  Fix Committed
Status in nvidia-settings source package in Focal:
  Fix Committed
Status in fabric-manager-450 source package in Groovy:
  Fix Committed
Status in fabric-manager-460 source package in Groovy:
  Fix Committed
Status in libnvidia-nscq-450 source package in Groovy:
  Fix Committed
Status in libnvidia-nscq-460 source package in Groovy:
  Fix Committed
Status in linux-restricted-modules source package in Groovy:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Groovy:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Groovy:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Groovy:
  Fix Committed
Status in nvidia-settings source package in Groovy:
  Fix Committed
Status in fabric-manager-450 source package in Hirsute:
  Fix Committed
Status in fabric-manager-460 source package in Hirsute:
  Fix Committed
Status in libnvidia-nscq-450 source package in Hirsute:
  Fix Committed
Status in libnvidia-nscq-460 source package in Hirsute:
  Fix Committed
Status in linux-restricted-modules source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Hirsute:
  Fix Committed
Status in nvidia-settings source package in Hirsute:
  Fix Committed

Bug description:
  Introduce the new NVIDIA 465 driver series, fabric-manager and
  libnvidia-nscq. Also migrate the UDA 450 series to the 460 series.

  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the r

[Kernel-packages] [Bug 1925522] Re: Introduce the 465 driver series, fabric-manager, and libnvidia-nscq

2021-05-25 Thread Francis Ginther
fabric-manager-460 and libnvidia-nscq-460:
These passed using the cuda samples test and 'dcgmi discovery -l' test on a 
dgx-2 host using bionic and focal.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-settings in Ubuntu.
https://bugs.launchpad.net/bugs/1925522

Title:
  Introduce the 465 driver series, fabric-manager, and libnvidia-nscq

Status in fabric-manager-450 package in Ubuntu:
  Fix Released
Status in fabric-manager-460 package in Ubuntu:
  Fix Released
Status in libnvidia-nscq-450 package in Ubuntu:
  Fix Released
Status in libnvidia-nscq-460 package in Ubuntu:
  Fix Released
Status in linux-restricted-modules package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-450-server package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-460 package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-465 package in Ubuntu:
  Fix Released
Status in nvidia-settings package in Ubuntu:
  In Progress
Status in fabric-manager-450 source package in Bionic:
  Fix Committed
Status in fabric-manager-460 source package in Bionic:
  Fix Committed
Status in libnvidia-nscq-450 source package in Bionic:
  Fix Committed
Status in libnvidia-nscq-460 source package in Bionic:
  Fix Committed
Status in linux-restricted-modules source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Bionic:
  Fix Committed
Status in nvidia-settings source package in Bionic:
  Fix Committed
Status in fabric-manager-450 source package in Focal:
  Fix Committed
Status in fabric-manager-460 source package in Focal:
  Fix Committed
Status in libnvidia-nscq-450 source package in Focal:
  Fix Committed
Status in libnvidia-nscq-460 source package in Focal:
  Fix Committed
Status in linux-restricted-modules source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Focal:
  Fix Committed
Status in nvidia-settings source package in Focal:
  Fix Committed
Status in fabric-manager-450 source package in Groovy:
  Fix Committed
Status in fabric-manager-460 source package in Groovy:
  Fix Committed
Status in libnvidia-nscq-450 source package in Groovy:
  Fix Committed
Status in libnvidia-nscq-460 source package in Groovy:
  Fix Committed
Status in linux-restricted-modules source package in Groovy:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Groovy:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Groovy:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Groovy:
  Fix Committed
Status in nvidia-settings source package in Groovy:
  Fix Committed
Status in fabric-manager-450 source package in Hirsute:
  Fix Committed
Status in fabric-manager-460 source package in Hirsute:
  Fix Committed
Status in libnvidia-nscq-450 source package in Hirsute:
  Fix Committed
Status in libnvidia-nscq-460 source package in Hirsute:
  Fix Committed
Status in linux-restricted-modules source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Hirsute:
  Fix Committed
Status in nvidia-settings source package in Hirsute:
  Fix Committed

Bug description:
  Introduce the new NVIDIA 465 driver series, fabric-manager and
  libnvidia-nscq. Also migrate the UDA 450 series to the 460 series.

  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

  == 450-server ==

    * New upstream release (LP: #1925522).
  - Fixed an issue with the NSCQ library that caused clients such as
    DCGM to fail to load the library. Since the NSCQ library version
    needs to match the driver

[Kernel-packages] [Bug 1925522] Re: Introduce the 465 driver series, fabric-manager, and libnvidia-nscq

2021-05-21 Thread Francis Ginther
nvidia-graphics-drivers-450-server, 450.119.04 results:

The DKMS driver was tested via the cuda samples test. This passed for
bionic, focal and groovy, but failed for hirsute and impish. This
failure was expected as the nvidia_uvm module doesn't currently load
with the 5.11 kernel in hirsute and impish.

The LRM driver was also tested via the cuda samples test. This passed
for groovy, hirsute and impish, but failed on bionic and focal as there
isn't yet a matching LRM package for the 450.119.04 driver. When these
packages become available later in the kernel SRU cycle, the driver will
be retested.

fabric-manager-450 and libnvidia-nscq-450:
These passed using the cuda samples test and 'dcgmi discovery -l' on a
dgx-2 host using focal.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-settings in Ubuntu.
https://bugs.launchpad.net/bugs/1925522

Title:
  Introduce the 465 driver series, fabric-manager, and libnvidia-nscq

Status in fabric-manager-450 package in Ubuntu:
  Fix Released
Status in fabric-manager-460 package in Ubuntu:
  Fix Released
Status in libnvidia-nscq-450 package in Ubuntu:
  Fix Released
Status in libnvidia-nscq-460 package in Ubuntu:
  Fix Released
Status in linux-restricted-modules package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-450-server package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-460 package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-465 package in Ubuntu:
  Fix Released
Status in nvidia-settings package in Ubuntu:
  In Progress
Status in fabric-manager-450 source package in Bionic:
  Fix Committed
Status in fabric-manager-460 source package in Bionic:
  Fix Committed
Status in libnvidia-nscq-450 source package in Bionic:
  Fix Committed
Status in libnvidia-nscq-460 source package in Bionic:
  Fix Committed
Status in linux-restricted-modules source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Bionic:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Bionic:
  Fix Committed
Status in nvidia-settings source package in Bionic:
  Fix Committed
Status in fabric-manager-450 source package in Focal:
  Fix Committed
Status in fabric-manager-460 source package in Focal:
  Fix Committed
Status in libnvidia-nscq-450 source package in Focal:
  Fix Committed
Status in libnvidia-nscq-460 source package in Focal:
  Fix Committed
Status in linux-restricted-modules source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Focal:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Focal:
  Fix Committed
Status in nvidia-settings source package in Focal:
  Fix Committed
Status in fabric-manager-450 source package in Groovy:
  Fix Committed
Status in fabric-manager-460 source package in Groovy:
  Fix Committed
Status in libnvidia-nscq-450 source package in Groovy:
  Fix Committed
Status in libnvidia-nscq-460 source package in Groovy:
  Fix Committed
Status in linux-restricted-modules source package in Groovy:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Groovy:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Groovy:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Groovy:
  Fix Committed
Status in nvidia-settings source package in Groovy:
  Fix Committed
Status in fabric-manager-450 source package in Hirsute:
  Fix Committed
Status in fabric-manager-460 source package in Hirsute:
  Fix Committed
Status in libnvidia-nscq-450 source package in Hirsute:
  Fix Committed
Status in libnvidia-nscq-460 source package in Hirsute:
  Fix Committed
Status in linux-restricted-modules source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-450-server source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-460 source package in Hirsute:
  Fix Committed
Status in nvidia-graphics-drivers-465 source package in Hirsute:
  Fix Committed
Status in nvidia-settings source package in Hirsute:
  Fix Committed

[Kernel-packages] [Bug 1925407] Re: ubuntu_aufs_smoke_test failed on Hirsute RISCV (CONFIG_AUFS_FS is not set)

2021-05-11 Thread Francis Ginther
This is resolved in autotest-client-tests
5b2d46bf401e50a13be5c0641e0157d58c2a7669

** Changed in: ubuntu-kernel-tests
   Status: New => Fix Released

** Changed in: linux-riscv (Ubuntu)
   Status: New => Won't Fix

** Changed in: linux-riscv (Ubuntu Hirsute)
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-riscv in Ubuntu.
https://bugs.launchpad.net/bugs/1925407

Title:
  ubuntu_aufs_smoke_test failed on Hirsute RISCV (CONFIG_AUFS_FS is not
  set)

Status in ubuntu-kernel-tests:
  Fix Released
Status in linux-riscv package in Ubuntu:
  Won't Fix
Status in linux-riscv source package in Hirsute:
  Won't Fix

Bug description:
  Issue found on 5.11.0-1005.5 RISCV kernel

  Test failed with:
   Running '/home/ubuntu/autotest/client/tests/ubuntu_aufs_smoke_test/ubuntu_aufs_smoke_test.sh'
   mount: /tmp/aufs/aufs-root: unknown filesystem type 'aufs'.
   aufs: mount: FAILED: ret=32

  It looks like this is simply because the kernel configs were not enabled:
    ubuntu@riscv64-hirsute:~$  grep AUFS /boot/config-5.11.0-1005-generic
    # CONFIG_AUFS_FS is not set

  I think this comes down to a decision about whether to enable it.
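
The grep check on the kernel config can be sketched in Python. This is a
minimal sketch, assuming the usual kernel config layout in which an enabled
option appears as CONFIG_X=y (or =m) and a disabled one as the comment
"# CONFIG_X is not set". The function name kconfig_state is hypothetical and
is not part of the test suite.

```python
import re

def kconfig_state(config_text, option):
    """Return the option's value ('y', 'm', ...) or None if it is not set."""
    # An enabled option looks like CONFIG_AUFS_FS=m; a disabled one only
    # appears as the comment "# CONFIG_AUFS_FS is not set", which the
    # anchored pattern below deliberately does not match.
    pattern = re.compile(r"^%s=(.*)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None

# Sample text modelled on the grep output in the bug description.
sample = "# CONFIG_AUFS_FS is not set\nCONFIG_EXT4_FS=y\n"
print(kconfig_state(sample, "CONFIG_AUFS_FS"))
print(kconfig_state(sample, "CONFIG_EXT4_FS"))
```

A smoke test could use a check like this to skip itself when the filesystem
is not built for the running kernel, rather than failing at mount time.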

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1925407/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1923062] Re: NVIDIA CVE-2021-1076 CVE-2021-1077

2021-04-21 Thread Francis Ginther
Also tested linux-modules-nvidia-460-server-generic-hwe-18.04 on bionic
and linux-modules-nvidia-460-server-generic-hwe-20.04 on focal. They
look good now.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-390 in Ubuntu.
https://bugs.launchpad.net/bugs/1923062

Title:
  NVIDIA CVE-2021-1076  CVE-2021-1077

Status in nvidia-graphics-drivers-390 package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-418-server package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-450 package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-450-server package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-460 package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-460-server package in Ubuntu:
  In Progress
Status in nvidia-graphics-drivers-390 source package in Bionic:
  In Progress
Status in nvidia-graphics-drivers-418-server source package in Bionic:
  In Progress
Status in nvidia-graphics-drivers-450 source package in Bionic:
  In Progress
Status in nvidia-graphics-drivers-450-server source package in Bionic:
  In Progress
Status in nvidia-graphics-drivers-460 source package in Bionic:
  In Progress
Status in nvidia-graphics-drivers-460-server source package in Bionic:
  In Progress
Status in nvidia-graphics-drivers-390 source package in Focal:
  In Progress
Status in nvidia-graphics-drivers-418-server source package in Focal:
  In Progress
Status in nvidia-graphics-drivers-450 source package in Focal:
  In Progress
Status in nvidia-graphics-drivers-450-server source package in Focal:
  In Progress
Status in nvidia-graphics-drivers-460 source package in Focal:
  In Progress
Status in nvidia-graphics-drivers-460-server source package in Focal:
  In Progress
Status in nvidia-graphics-drivers-390 source package in Groovy:
  In Progress
Status in nvidia-graphics-drivers-418-server source package in Groovy:
  In Progress
Status in nvidia-graphics-drivers-450 source package in Groovy:
  In Progress
Status in nvidia-graphics-drivers-450-server source package in Groovy:
  In Progress
Status in nvidia-graphics-drivers-460 source package in Groovy:
  In Progress
Status in nvidia-graphics-drivers-460-server source package in Groovy:
  In Progress

Bug description:
  Here is the list of the affected drivers:

  418-server - CVE-2021-1076

  460, 450, 460-server, 450-server - CVE-2021-1076  CVE-2021-1077

  390 - CVE-2021-1076

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-390/+bug/1923062/+subscriptions


[Kernel-packages] [Bug 1923062] Re: NVIDIA CVE-2021-1076 CVE-2021-1077

2021-04-21 Thread Francis Ginther
Testing of the server drivers is also complete. All drivers were able to
install as both dkms and l-r-m with no issues. A cuda workload and
nvidia-smi were also tested for each.

Note: no upgrade testing was attempted. Drivers were fully purged before
testing the next driver.


[Kernel-packages] [Bug 1922387] Re: BUG: kernel NULL pointer dereference, address: 0000000000000050

2021-04-02 Thread Francis Ginther
This panic occurred while running the ubuntu_kernel_selftests suite. The
last few log lines were:

13:33:20 DEBUG| [stdout] # selftests: ftrace: ftracetest
13:33:20 DEBUG| [stdout] # === Ftrace unit tests ===
13:33:28 DEBUG| [stdout] # [1] Basic trace file check [PASS]
13:37:04 DEBUG| [stdout] # [2] Basic test for tracers [PASS]
13:39:48 DEBUG| [stdout] # [3] Basic trace clock test [PASS]
13:39:56 DEBUG| [stdout] # [4] Basic event tracing check [PASS]
13:40:04 DEBUG| [stdout] # [5] Change the ringbuffer size [PASS]
13:40:20 DEBUG| [stdout] # [6] Snapshot and tracing setting [PASS]
13:40:35 DEBUG| [stdout] # [7] trace_pipe and trace_marker [PASS]
13:40:51 DEBUG| [stdout] # [8] Generic dynamic event - add/remove kprobe events [PASS]
13:41:07 DEBUG| [stdout] # [9] Generic dynamic event - add/remove synthetic events [PASS]
13:41:14 DEBUG| [stdout] # [10] Generic dynamic event - selective clear (compatibility) [PASS]
13:41:22 DEBUG| [stdout] # [11] Generic dynamic event - generic clear event [PASS]
13:41:46 DEBUG| [stdout] # [12] event tracing - enable/disable with event level files [PASS]
13:42:17 DEBUG| [stdout] # [13] event tracing - restricts events based on pid [PASS]
13:42:41 DEBUG| [stdout] # [14] event tracing - enable/disable with subsystem level files [PASS]
13:43:05 DEBUG| [stdout] # [15] event tracing - enable/disable with top level files [PASS]
13:43:14 DEBUG| [stdout] # [16] Test trace_printk from module [PASS]
13:43:56 DEBUG| [stdout] # [17] ftrace - function graph filters with stack tracer [PASS]
13:44:29 DEBUG| [stdout] # [18] ftrace - function graph filters [PASS]
13:45:49 DEBUG| [stdout] # [19] ftrace - function pid filters [PASS]
13:46:06 DEBUG| [stdout] # [20] ftrace - stacktrace filter command [PASS]
13:46:38 DEBUG| [stdout] # [21] ftrace - function trace with cpumask [PASS]
13:47:13 DEBUG| [stdout] # [22] ftrace - test for function event triggers [PASS]
13:47:21 DEBUG| [stdout] # [23] ftrace - function trace on module [PASS]
13:47:31 DEBUG| [stdout] # [24] ftrace - function profiling [PASS]
13:48:07 DEBUG| [stdout] # [25] ftrace - function profiler with function tracing [PASS]
13:48:25 DEBUG| [stdout] # [26] ftrace - test reading of set_ftrace_filter [PASS]
 END OF MESSAGES 
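
To find where a run like this stopped, the log can be scanned for the last
ftracetest result line. The following is a minimal sketch, assuming the line
layout shown in the log above (a timestamp, then "# [N] name [RESULT]") and
tolerating result markers that mail wrapping pushed onto the next line. The
helper names unwrap and last_result are hypothetical.

```python
import re

TIMESTAMP_RE = re.compile(r"^\d{2}:\d{2}:\d{2} ")
RESULT_RE = re.compile(r"# \[(\d+)\] (.+?)\s*\[([A-Z]+)\]")

def unwrap(lines):
    """Join mail-wrapped continuation lines onto the preceding log line."""
    joined = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP_RE.match(line) or not joined:
            joined.append(line)
        else:
            # No timestamp: treat it as the tail of the previous line.
            joined[-1] = joined[-1] + " " + line
    return joined

def last_result(lines):
    """Return (number, name, result) of the last test result seen, or None."""
    last = None
    for line in unwrap(lines):
        match = RESULT_RE.search(line)
        if match:
            last = (int(match.group(1)), match.group(2), match.group(3))
    return last

# Two entries copied from the log above, in their mail-wrapped form.
log = """\
13:48:07 DEBUG| [stdout] # [25] ftrace - function profiler with function
tracing [PASS]
13:48:25 DEBUG| [stdout] # [26] ftrace - test reading of set_ftrace_filter
[PASS]
""".splitlines()

print(last_result(log))
```

The last reported result only bounds the failure: the panic may have happened
in whichever test started next, which never got to print a result.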

This job was run twice. The prior run also hung before completing, but
we don't have a console log for that time period, so it's unclear whether
it also panicked. Its last messages were:

04:44:27 DEBUG| [stdout] # selftests: timers: nsleep-lat
04:44:48 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME [OK]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC [OK]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC_RAW [UNSUPPORTED]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME_COARSE [UNSUPPORTED]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC_COARSE [UNSUPPORTED]
04:45:30 DEBUG| [stdout] # nsleep latency CLOCK_BOOTTIME [OK]
04:45:52 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME_ALARM [OK]
04:46:13 DEBUG| [stdout] # nsleep latency CLOCK_BOOTTIME_ALARM [OK]
04:46:34 DEBUG| [stdout] # nsleep latency CLOCK_TAI [OK]
04:46:34 DEBUG| [stdout] # # Pass 0 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0
04:46:34 DEBUG| [stdout] ok 3 selftests: timers: nsleep-lat
04:46:34 DEBUG| [stdout] # selftests: timers: set-timer-lat

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1922387

Title:
  BUG: kernel NULL pointer dereference, address: 0000000000000050

Status in linux package in Ubuntu:
  New
Status in linux source package in Focal:
  Confirmed
Status in linux source package in Groovy:
  New
Status in linux source package in Hirsute:
  New

Bug description:
  I observed the following kernel panic with the 5.4.0-71.79-generic
  kernel while running kernel selftests:

  blanka login: [ 1671.958400] mmiotrace: Error taking CPU253 down: -28
  [ 1672.118199] mmiotrace: Error taking CPU254 down: -28
  [ 1672.230306] mmiotrace: Error taking CPU255 down: -28
  [ 2503.359753] BUG: kernel NULL pointer dereference, address: 0000000000000050
  [ 2503.367527] #PF: supervisor read access in kernel mode
  [ 2503.373257] #PF: error_code(0x) - not-present page
  [ 2503.378989] PGD 0 P4D 0 
  [ 2503.381812] Oops:  [#1] SMP NOPTI
  [ 2503.385896] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G   OE 5.4.0-71-generic #79-Ubuntu
  [ 2503.395795] Hardware name: NVIDIA DGXA100 920-23687-2530-000/DGXA100, BIOS 0.33 01/19/2021
  [ 2503.405027] RIP: 0010:trace_event_raw_event_wbt_timer+0x6f/0x100
  [ 2503.411728] Code: 59 80 e5 02 0f 85 8f 00 00 00 4c 89 e6 ba 34 00 00 00 48 8d 7d a0 e8 d0 a4 ca ff 49 89 c4 48 85 c0 74 37 49 8b 87 b8 03 00 00 <48> 8b 70 50 48 85 f6 74 45 49 8d 7c 24 08 ba 20 00 00 00 e8 59 91
  [ 2503.432683] RSP: 0018:a8d6c0003d90 EFLAGS: 00010286
  [ 2503.438513] RAX:  RBX:  RCX: 8100
  [ 2503.446474] RDX: 9968a228f418 RSI: 00

[Kernel-packages] [Bug 1922387] Re: BUG: kernel NULL pointer dereference, address: 0000000000000050

2021-04-02 Thread Francis Ginther
The job can be found here:
http://10.246.72.4:8080/view/nvidia%20a100%20-%20blanka/job/focal-linux-generic-amd64-5.4.0-blanka-ubuntu_kernel_selftests/

[Kernel-packages] [Bug 1918226] Re: testbed auxverb failed with exit code 255 with linux on Groovy ADT failure

2021-03-10 Thread Francis Ginther
I also looked at this and compared with older kernels. This test has
never progressed past the "Kretprobe dynamic event with maxactive" test
on 5.8, but it has on 5.4:


```
18:41:31 DEBUG| [stdout] # [43] Kprobe event parser error log check [PASS]
18:41:32 DEBUG| [stdout] # [44] Kretprobe dynamic event with arguments  [PASS]
18:41:33 DEBUG| [stdout] # [45] Kretprobe dynamic event with maxactive  [PASS]
18:41:49 DEBUG| [stdout] # [46] Register/unregister many kprobe events  [PASS]
18:41:49 DEBUG| [stdout] # [47] Kprobe dynamic event - adding and removing [PASS]
18:41:50 DEBUG| [stdout] # [48] Uprobe event parser error log check [PASS]
18:41:50 DEBUG| [stdout] # [49] test for the preemptirqsoff tracer [UNSUPPORTED]
18:42:32 DEBUG| [stdout] # [50] Meta-selftest   [PASS]
```

I suspect that the next test causes a hang and ssh dies. The autopkgtest
infrastructure sees this and tries to provide some debugging info (the
console log and vm details).
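
The comparison against the 5.4 run can be sketched as a lookup of the test
that follows the last one reached. This is only a sketch: it assumes the 5.8
suite runs its tests in the same order as the 5.4 excerpt above, which the
logs here do not confirm, and the function name next_test is hypothetical.

```python
def next_test(reference_run, last_reached):
    """Return the (number, name) entry that follows `last_reached`, or None."""
    names = [name for _, name in reference_run]
    if last_reached not in names:
        return None
    index = names.index(last_reached) + 1
    return reference_run[index] if index < len(reference_run) else None

# Test order copied from the 5.4 log excerpt above (assumed to match 5.8).
run_5_4 = [
    (43, "Kprobe event parser error log check"),
    (44, "Kretprobe dynamic event with arguments"),
    (45, "Kretprobe dynamic event with maxactive"),
    (46, "Register/unregister many kprobe events"),
    (47, "Kprobe dynamic event - adding and removing"),
]

print(next_test(run_5_4, "Kretprobe dynamic event with maxactive"))
```

Under that ordering assumption the candidate would be the test listed as
[46] in the 5.4 log, but that remains a suspicion until the 5.8 test list is
checked directly.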

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1918226

Title:
  testbed auxverb failed with exit code 255 with linux on Groovy ADT
  failure

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Groovy:
  Confirmed

Bug description:
  Testing failed on:
  arm64: 
https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-groovy/groovy/arm64/l/linux/20210308_234307_e716b@/log.gz

  
  This looks like a flaky test, with this error occurring frequently. This is
  not a regression.

  Found a previously reported (expired) bug with the same error:
  https://launchpad.net/bugs/1549425

  
  [ 6606.751232] audit: backlog limit Creating nova instance adt-groovy-arm64-linux-20210308-143145 from image adt/ubuntu-groovy-arm64-server-20210308.img (UUID 1cfb02a6-bf88-46bb-a8f1-3c6589cd939e)...
  Creating nova instance adt-groovy-arm64-linux-20210308-143145 from image adt/ubuntu-groovy-arm64-server-20210308.img (UUID 1cfb02a6-bf88-46bb-a8f1-3c6589cd939e)...
  autopkgtest [23:42:55]: ERROR: testbed failure: testbed auxverb failed with exit code 255

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1918226/+subscriptions


[Kernel-packages] [Bug 1805806] Re: test_maps in ubuntu_bpf failed with "Failed sockmap unexpected timeout" on D ARM64

2021-02-19 Thread Francis Ginther
Seen with linux-aws 5.4.0-1038.40~18.04.1.

** Tags added: sru-20210125

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1805806

Title:
  test_maps in ubuntu_bpf failed with "Failed sockmap unexpected
  timeout" on D ARM64

Status in ubuntu-kernel-tests:
  Triaged
Status in linux package in Ubuntu:
  Invalid

Bug description:
  This issue can be found on two different ARM64 nodes: the Cavium ThunderX
  node "starmie" and the Moonshot node "ms10-34-mcdivittB0-kernel".

  Running test_maps bpf test..
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    
    Fork 100 tasks to 'test_hashmap_sizes'
    Fork 100 tasks to 'test_hashmap_walk'
    Fork 100 tasks to 'test_arraymap'
    Fork 100 tasks to 'test_arraymap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 100 tasks to 'test_hashmap_sizes'
    Fork 100 tasks to 'test_hashmap_walk'
    Fork 100 tasks to 'test_arraymap'
    Fork 100 tasks to 'test_arraymap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 100 tasks to 'test_hashmap_sizes'
    Fork 100 tasks to 'test_hashmap_walk'
    Fork 100 tasks to 'test_arraymap'
    Fork 100 tasks to 'test_arraymap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 100 tasks to 'test_hashmap_sizes'
    Fork 100 tasks to 'test_hashmap_walk'
    Fork 100 tasks to 'test_arraymap'
    Fork 100 tasks to 'test_arraymap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 100 tasks to 'test_hashmap_sizes'
    Fork 100 tasks to 'test_hashmap_walk'
    Fork 100 tasks to 'test_arraymap'
    Fork 100 tasks to 'test_arraymap_percpu'
    Fork 1024 tasks to 'test_update_delete'
    Fork 1024 tasks to 'test_update_delete'
    Fork 100 tasks to 'test_hashmap'
    Fork 100 tasks to 'test_hashmap_percpu'
    Fork 100 tasks to 'test_hashmap_sizes'
    Fork 100 tasks to 'test_hashmap_walk'
    Fork 100 tasks to 'test_arraymap'
    Fork 100 tasks to 'test_arraymap_percpu'
    Failed sockmap unexpected timeout

  ProblemType: Bug
  DistroRelease: Ubuntu 18.10
  Package: linux-image-4.18.0-11-generic 4.18.0-11.12
  ProcVersionSignature: User Name 4.18.0-11.12-generic 4.18.12
  Uname: Linux 4.18.0-11-generic aarch64
  AlsaDevices:
   total 0
   crw-rw 1 root

[Kernel-packages] [Bug 1844493] Re: ubuntu_sysdig_smoke_test failed on 5.3 / 5.4 / 5.6 /5.8 kernels

2021-02-18 Thread Francis Ginther
** Tags added: sru-20210125

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-gcp in Ubuntu.
https://bugs.launchpad.net/bugs/1844493

Title:
  ubuntu_sysdig_smoke_test failed on 5.3 / 5.4 / 5.6 /5.8 kernels

Status in ubuntu-kernel-tests:
  Triaged
Status in linux package in Ubuntu:
  Incomplete
Status in linux-aws package in Ubuntu:
  New
Status in linux-azure package in Ubuntu:
  New
Status in linux-gcp package in Ubuntu:
  New
Status in linux source package in Eoan:
  Incomplete
Status in linux-aws source package in Eoan:
  New
Status in linux-azure source package in Eoan:
  New
Status in linux-gcp source package in Eoan:
  New
Status in linux source package in Focal:
  New
Status in linux-aws source package in Focal:
  New
Status in linux-azure source package in Focal:
  New
Status in linux-gcp source package in Focal:
  New

Bug description:
  Test failed with:
FAILED (trace at least 25 reads of /dev/zero by dd)
FAILED (trace at least 25 writes to /dev/null by dd)

  Steps:
sudo apt-get install git python-minimal python-yaml gdb -y
git clone --depth=1 git://kernel.ubuntu.com/ubuntu/autotest-client-tests
git clone --depth=1 git://kernel.ubuntu.com/ubuntu/autotest
rm -fr autotest/client/tests
ln -sf ~/autotest-client-tests autotest/client/tests
AUTOTEST_PATH=/home/ubuntu/autotest sudo -E autotest/client/autotest-local --verbose autotest/client/tests/ubuntu_sysdig_smoke_test/control

  Test output:
== sysdig smoke test to trace dd, cat, read and writes ==
Limiting raw capture file to 16384 blocks
Try 1 of 10
Sysdig capture started after 1 seconds wait
Raw capture file is 16 Mbytes
Converted events file is 18 Mbytes
Try 2 of 10
Sysdig capture started after 1 seconds wait
Raw capture file is 16 Mbytes
Converted events file is 22 Mbytes
Try 3 of 10
Sysdig capture started after 1 seconds wait
Raw capture file is 16 Mbytes
Converted events file is 21 Mbytes
Try 4 of 10
Sysdig capture started after 1 seconds wait
Raw capture file is 16 Mbytes
Converted events file is 21 Mbytes
Try 5 of 10
Sysdig capture started after 1 seconds wait
Raw capture file is 16 Mbytes
Converted events file is 21 Mbytes
Try 6 of 10
Sysdig capture started after 1 seconds wait
Raw capture file is 16 Mbytes
Converted events file is 21 Mbytes
Try 7 of 10
Sysdig capture started after 1 seconds wait
Raw capture file is 16 Mbytes
Converted events file is 21 Mbytes
Try 8 of 10
Sysdig capture started after 1 seconds wait
Raw capture file is 16 Mbytes
Converted events file is 21 Mbytes
Try 9 of 10
Sysdig capture started after 1 seconds wait
Raw capture file is 16 Mbytes
Converted events file is 21 Mbytes
Try 10 of 10
Sysdig capture started after 1 seconds wait
Raw capture file is 16 Mbytes
Converted events file is 21 Mbytes
Found:
   279845 sysdig events
   29882 context switches
   0 reads from /dev/zero by dd
   0 writes to /dev/null by dd
PASSED (trace at least 25 context switches)
FAILED (trace at least 25 reads of /dev/zero by dd)
FAILED (trace at least 25 writes to /dev/null by dd)
 
Summary: 1 passed, 2 failed
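
The pass/fail verdicts above follow from the counts listed under "Found:". A minimal sketch of that check, assuming the threshold of 25 quoted in the PASSED/FAILED lines; the function names are hypothetical and this is not the smoke test's own code:

```python
import re

# Threshold taken from the "trace at least 25 ..." lines in the output above.
MINIMUM = 25

def parse_found_counts(output: str) -> dict:
    """Return {description: count} for the lines that follow 'Found:'."""
    counts = {}
    in_found = False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped == "Found:":
            in_found = True
            continue
        if in_found:
            match = re.match(r"(\d+)\s+(.+)", stripped)
            if not match:
                break  # the first non-count line ends the block
            counts[match.group(2)] = int(match.group(1))
    return counts

def failing_checks(counts: dict, minimum: int = MINIMUM) -> list:
    """List the traced categories whose count is below the minimum.
    'sysdig events' is a total rather than a check, so it is skipped."""
    return sorted(name for name, n in counts.items()
                  if name != "sysdig events" and n < minimum)

sample = """Found:
   279845 sysdig events
   29882 context switches
   0 reads from /dev/zero by dd
   0 writes to /dev/null by dd
PASSED (trace at least 25 context switches)
"""
counts = parse_found_counts(sample)
print(failing_checks(counts))
```

With the numbers from this report, only the two dd categories fall below the threshold, which matches the two FAILED lines.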

  ProblemType: Bug
  DistroRelease: Ubuntu 19.10
  Package: linux-image-5.3.0-10-generic 5.3.0-10.11
  ProcVersionSignature: User Name 5.3.0-10.11-generic 5.3.0-rc8
  Uname: Linux 5.3.0-10-generic x86_64
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Sep 18 08:10 seq
   crw-rw 1 root audio 116, 33 Sep 18 08:10 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.11-0ubuntu7
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 
'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  CurrentDmesg:
   
  Date: Wed Sep 18 08:18:54 2019
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  MachineType: Intel Corporation S1200RP
  PciMultimedia:
   
  ProcFB: 0 mgag200drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.3.0-10-generic 
root=UUID=b0d2ae4e-12dd-423e-acea-272ee8b2a893 ro
  RelatedPackageVersions:
   linux-restricted-modules-5.3.0-10-generic N/A
   linux-backports-modules-5.3.0-10-generic  N/A
   linux-firmware 1.182
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 07/01/2015
  dmi.bios.vendor: Intel Corp.
  dmi.bios.version: S1200RP.86B.03.02.0003.070120151022
  dmi.board.asset.tag: 
  dmi.board.name: S1200RP
  dmi.board.vendor: Intel Corporation
  dmi.board.version: G62254-407
  dmi.chassis.asset.tag: 
  dmi.chassis.type: 17
 

[Kernel-packages] [Bug 1905728] Re: Found insecure W+X mapping at address on Groovy RISCV

2021-02-18 Thread Francis Ginther
** Attachment added: "dmesg-5.8.0-17-generic"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1905728/+attachment/5464855/+files/dmesg

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1905728

Title:
  Found insecure W+X mapping at address on Groovy RISCV

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Issue found on 5.8.0-10-generic riscv

  Message reported on boot.

  [   13.483103] ------------[ cut here ]------------
  [   13.483711] riscv/mm: Found insecure W+X mapping at address (ptrval)/0xffdff800
  [   13.484542] WARNING: CPU: 5 PID: 1 at arch/riscv/mm/ptdump.c:200 note_page+0x24c/0x252
  [   13.485175] Modules linked in:
  [   13.485606] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.8.0-10-generic #12-Ubuntu
  [   13.486091] epc: ffe000208f18 ra : ffe000208f18 sp : ffe1f5bfbb30
  [   13.486471]  gp : ffe001728ee0 tp : ffe1f5bf5080 t0 : ffe00173ed88
  [   13.486850]  t1 : ffe00173ed20 t2 : 0001fecbe000 s0 : ffe1f5bfbb80
  [   13.487250]  s1 : ffe1f5bfbe10 a0 : 0053 a1 : 0020
  [   13.487633]  a2 : ffe1f5bfb870 a3 :  a4 : ffe0016200f8
  [   13.488040]  a5 : ffe0016200f8 a6 : 00b5 a7 : ffe0006f2806
  [   13.488421]  s2 : ffdff8001000 s3 :  s4 : 0004
  [   13.488800]  s5 :  s6 :  s7 : ffe1f5bfbd20
  [   13.489322]  s8 : ffdff8001000 s9 : ffe00172a148 s10: ffdff8002000
  [   13.489738]  s11: ffe000c16e20 t3 : 0003cec0 t4 : 0003cec0
  [   13.490119]  t5 :  t6 : ffe001739462
  [   13.490406] status: 0120 badaddr:  cause: 0003
  [   13.490849] ---[ end trace 607c551edff1ef12 ]---

  Please find attachment for the boot dmesg log.
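
For checking other boot logs for the same message, a rough sketch along these lines may help. The function name and regular expression are hypothetical, and the script simply returns every WARNING header found once the W+X message is present, without pairing the two:

```python
import re

# Matches the kernel WARNING header seen in the trace above.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+)\s+(?P<symbol>\S+)"
)

def find_wx_warnings(dmesg_text: str) -> list:
    """Return the WARNING headers from a dmesg dump that also contains
    the 'Found insecure W+X mapping' message."""
    # Mail clients wrap long lines, so collapse all whitespace first.
    flat = " ".join(dmesg_text.split())
    if "Found insecure W+X mapping" not in flat:
        return []
    return [m.groupdict() for m in WARNING_RE.finditer(flat)]

sample = """[   13.483711] riscv/mm: Found insecure W+X mapping at address
(ptrval)/0xffdff800
[   13.484542] WARNING: CPU: 5 PID: 1 at arch/riscv/mm/ptdump.c:200
note_page+0x24c/0x252
[   13.485175] Modules linked in:
"""
print(find_wx_warnings(sample))
```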

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1905728/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1905728] Re: Found insecure W+X mapping at address on Groovy RISCV

2021-02-18 Thread Francis Ginther
Still seeing this with the 5.8.0-17-generic riscv kernel on groovy. See
attached dmesg.

** Changed in: linux (Ubuntu)
   Status: Expired => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1905728

Title:
  Found insecure W+X mapping at address on Groovy RISCV

Status in linux package in Ubuntu:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1905728/+subscriptions



[Kernel-packages] [Bug 1748103] Re: apic test in kvm-unit-test failed with timeout

2021-02-17 Thread Francis Ginther
Seen with linux-oracle 4.15.0-1065.73~16.04.1.

** Tags added: sru-20210125

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1748103

Title:
  apic test in kvm-unit-test failed with timeout

Status in ubuntu-kernel-tests:
  In Progress
Status in linux package in Ubuntu:
  Incomplete
Status in linux-azure package in Ubuntu:
  New
Status in linux-azure-edge package in Ubuntu:
  New
Status in linux source package in Xenial:
  New
Status in linux-azure source package in Xenial:
  New
Status in linux-azure-edge source package in Xenial:
  New
Status in linux source package in Bionic:
  New
Status in linux-azure source package in Bionic:
  New
Status in linux-azure-edge source package in Bionic:
  New

Bug description:
  Per Joshua's comment in bug 1719524: "Nested KVM can only be tried on
  instance sizes with nested Hypervisor support: Ev3 and Dv3." The
  instance name here is E4v3, but I can start a KVM on it.

  The apic test will time out on it.

  Steps:
  1. git clone --depth=1 https://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git
  2. cd kvm-unit-tests; ./configure; make
  3. Run the apic test as root:
   
  # TESTNAME=apic TIMEOUT=30 ACCEL= ./x86/run x86/apic.flat -smp 2 -cpu qemu64,+x2apic,+tsc-deadline
  timeout -k 1s --foreground 30 /usr/bin/qemu-system-x86_64 -nodefaults -device 
pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial 
stdio -device pci-testdev -machine accel=kvm -kernel x86/apic.flat -smp 2 -cpu 
qemu64,+x2apic,+tsc-deadline # -initrd /tmp/tmp.onXtr5JVp7
  enabling apic
  enabling apic
  paging enabled
  cr0 = 80010011
  cr3 = 459000
  cr4 = 20
  apic version: 1050014
  PASS: apic existence
  PASS: xapic id matches cpuid
  PASS: writeable xapic id
  PASS: non-writeable x2apic id
  PASS: sane x2apic id
  FAIL: x2apic id matches cpuid
  PASS: correct xapic id after reset
  PASS: apic_disable: Local apic enabled
  PASS: apic_disable: CPUID.1H:EDX.APIC[bit 9] is set
  PASS: apic_disable: Local apic disabled
  PASS: apic_disable: CPUID.1H:EDX.APIC[bit 9] is clear
  PASS: apic_disable: Local apic enabled
  PASS: apic_disable: CPUID.1H:EDX.APIC[bit 9] is set
  x2apic enabled
  PASS: x2apic enabled to invalid state
  PASS: x2apic enabled to apic enabled
  PASS: disabled to invalid state
  PASS: disabled to x2apic enabled
  PASS: apic enabled to invalid state
  PASS: apicbase: relocate apic
  PASS: apicbase: reserved physaddr bits
  PASS: apicbase: reserved low bits
  PASS: self ipi
  starting broadcast (x2apic)
  PASS: APIC physical broadcast address
  PASS: APIC physical broadcast shorthand
  PASS: nmi-after-sti
  qemu-system-x86_64: terminating on signal 15 from pid 7246
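
A run like the one above has two separate problems: one FAIL line, and a run that never finished because qemu was terminated (presumably by the timeout wrapper shown in the command line). A small sketch for separating the two when triaging such logs; the function name and result keys are hypothetical:

```python
def summarize_kvm_unit_test(output: str) -> dict:
    """Tally PASS/FAIL lines and note whether qemu was killed by SIGTERM."""
    passed, failed = [], []
    timed_out = False
    for line in output.splitlines():
        text = line.strip()
        if text.startswith("PASS: "):
            passed.append(text[len("PASS: "):])
        elif text.startswith("FAIL: "):
            failed.append(text[len("FAIL: "):])
        elif "terminating on signal 15" in text:
            # Signal 15 is SIGTERM; here it is assumed to come from timeout(1).
            timed_out = True
    return {"passed": len(passed), "failed": failed, "timed_out": timed_out}

sample = """  PASS: sane x2apic id
  FAIL: x2apic id matches cpuid
  PASS: nmi-after-sti
  qemu-system-x86_64: terminating on signal 15 from pid 7246
"""
print(summarize_kvm_unit_test(sample))
```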

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.14.0-1004-azure-edge 4.14.0-1004.4
  ProcVersionSignature: User Name 4.14.0-1004.4-username-edge 4.14.14
  Uname: Linux 4.14.0-1004-azure-edge x86_64
  ApportVersion: 2.20.1-0ubuntu2.15
  Architecture: amd64
  Date: Thu Feb  8 06:00:55 2018
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  SourcePackage: linux-azure-edge
  UpgradeStatus: No upgrade log present (probably fresh install)
  --- 
  ApportVersion: 2.20.1-0ubuntu2.15
  Architecture: amd64
  DistroRelease: Ubuntu 16.04
  Package: linux-azure-edge
  PackageArchitecture: amd64
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcVersionSignature: User Name 4.13.0-1009.12-username 4.13.13
  Tags:  xenial uec-images
  Uname: Linux 4.13.0-1009-azure x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: adm audio cdrom dialout dip floppy libvirtd lxd netdev plugdev 
sudo video
  _MarkForUpload: True

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1748103/+subscriptions



[Kernel-packages] [Bug 1831449] Re: memory in ubuntu_kvm_unit_tests fails

2021-02-17 Thread Francis Ginther
Seen with linux-oracle 4.15.0-1065.73~16.04.1.

** Tags added: sru-20210125

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-oracle in Ubuntu.
https://bugs.launchpad.net/bugs/1831449

Title:
  memory in ubuntu_kvm_unit_tests fails

Status in ubuntu-kernel-tests:
  Triaged
Status in linux-kvm package in Ubuntu:
  New
Status in linux-oracle package in Ubuntu:
  New

Bug description:
  Need to run this on oracle manually to get the full output:
   TESTNAME=memory TIMEOUT=90s ACCEL= ./x86/run x86/memory.flat -smp 1 -cpu host
   FAIL memory (8 tests, 2 unexpected failures)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1831449/+subscriptions



[Kernel-packages] [Bug 1748105] Re: port80 test in ubuntu_kvm_unit_tests failed with timeout

2021-02-17 Thread Francis Ginther
Seen with linux-oracle 4.15.0-1065.73~16.04.1.


** Tags added: sru-20210125

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1748105

Title:
  port80 test in ubuntu_kvm_unit_tests failed with timeout

Status in ubuntu-kernel-tests:
  Triaged
Status in linux package in Ubuntu:
  Incomplete
Status in linux-azure package in Ubuntu:
  Confirmed
Status in linux-azure-edge package in Ubuntu:
  Confirmed
Status in linux-kvm package in Ubuntu:
  Confirmed
Status in linux-oracle-5.0 package in Ubuntu:
  Confirmed

Bug description:
  Per Joshua's comment in bug 1719524: "Nested KVM can only be tried on
  instance sizes with nested Hypervisor support: Ev3 and Dv3." The
  instance name here is E4v3, but I can start a KVM on it.

  The port80 test will time out on it.

  Steps:
  1. git clone --depth=1 https://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git
  2. cd kvm-unit-tests; ./configure; make
  3. Run the port80 test as root:

  # TESTNAME=port80 TIMEOUT=90s ACCEL= ./x86/run x86/port80.flat -smp 1
  timeout -k 1s --foreground 90s /usr/bin/qemu-system-x86_64 -nodefaults 
-device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none 
-serial stdio -device pci-testdev -machine accel=kvm -kernel x86/port80.flat 
-smp 1 # -initrd /tmp/tmp.3p9PWc2SRi
  enabling apic
  begining port 0x80 write test
  qemu-system-x86_64: terminating on signal 15 from pid 7790

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.14.0-1004-azure-edge 4.14.0-1004.4
  ProcVersionSignature: User Name 4.14.0-1004.4-username-edge 4.14.14
  Uname: Linux 4.14.0-1004-azure-edge x86_64
  ApportVersion: 2.20.1-0ubuntu2.15
  Architecture: amd64
  Date: Thu Feb  8 06:13:18 2018
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  SourcePackage: linux-azure-edge
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1748105/+subscriptions



[Kernel-packages] [Bug 1827979] Re: pcid in ubuntu_kvm_unit_tests failed on B-KVM / X-4.15-oracle / B-oracle-5.3

2021-02-17 Thread Francis Ginther
Seen with linux-oracle 4.15.0-1065.73~16.04.1.

** Tags added: sru-20210125

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1827979

Title:
  pcid in ubuntu_kvm_unit_tests failed on B-KVM / X-4.15-oracle /
  B-oracle-5.3

Status in ubuntu-kernel-tests:
  New
Status in linux-kvm package in Ubuntu:
  New

Bug description:
  FAIL pcid (3 tests, 1 unexpected failures)

  # TESTNAME=pcid TIMEOUT=90s ACCEL= ./x86/run x86/pcid.flat -smp 1 -cpu qemu64,+pcid
  timeout -k 1s --foreground 90s /usr/bin/qemu-system-x86_64 -nodefaults 
-device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none 
-serial stdio -device pci-testdev -machine accel=kvm -kernel x86/pcid.flat -smp 
1 -cpu qemu64,+pcid # -initrd /tmp/tmp.a4xQyF7juj
  qemu-system-x86_64: warning: host doesn't support requested feature: 
CPUID.01H:ECX.pcid [bit 17]
  qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:ECX.svm [bit 2]
  enabling apic
  PASS: CPUID consistency
  PASS: Test on PCID when disabled
  FAIL: Test on INVPCID when disabled
  SUMMARY: 3 tests, 1 unexpected failures
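
The qemu warnings above say the host does not expose the requested features to the guest. One way to confirm this from inside the instance is to look at the flags line of /proc/cpuinfo. This is only a sketch: the function names are hypothetical, and the sample text stands in for the real file contents:

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()

def missing_features(cpuinfo_text: str, wanted=("pcid", "invpcid")) -> list:
    """Return the wanted feature flags that the CPU does not advertise."""
    flags = cpu_flags(cpuinfo_text)
    return [name for name in wanted if name not in flags]

# Stand-in for open("/proc/cpuinfo").read() on the machine under test.
sample = "processor\t: 0\nflags\t\t: fpu vme de pse tsc msr\n"
print(missing_features(sample))
```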

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-1032-kvm 4.15.0-1032.32
  ProcVersionSignature: User Name 4.15.0-1032.32-kvm 4.15.18
  Uname: Linux 4.15.0-1032-kvm x86_64
  ApportVersion: 2.20.9-0ubuntu7.6
  Architecture: amd64
  Date: Tue May  7 03:31:38 2019
  SourcePackage: linux-kvm
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1827979/+subscriptions



[Kernel-packages] [Bug 1837035] Re: memcg_stat_rss from controllers in ubuntu_ltp failed

2021-02-16 Thread Francis Ginther
Seen with linux-oracle 4.15.0-1065.73.

** Tags added: sru-20210125

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1837035

Title:
  memcg_stat_rss from controllers in ubuntu_ltp failed

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Confirmed
Status in linux-aws package in Ubuntu:
  New

Bug description:
  This issue was spotted on an i386 node "pepe" with the Disco kernel.
  It failed with:

  memcg_process: shmget() failed: Invalid argument
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 168: kill: No such process

  memcg_stat_rss 4 TFAIL: Process 1845 exited with 1 after warm up

  <<<test_start>>>
  tag=memcg_stat_rss stime=1563448062
  cmdline="memcg_stat_rss.sh"
  contacts=""
  analysis=exit
  <<<test_output>>>
  memcg_stat_rss 1 TINFO: Starting test 1
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 1 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 1 TINFO: Running memcg_process --mmap-anon -s 135168
  memcg_stat_rss 1 TINFO: Warming up pid: 1784
  memcg_stat_rss 1 TINFO: Process is still here after warm up: 1784
  memcg_stat_rss 1 TPASS: rss is 135168 as expected
  memcg_stat_rss 2 TINFO: Starting test 2
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 2 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 2 TINFO: Running memcg_process --mmap-file -s 4096
  memcg_stat_rss 2 TINFO: Warming up pid: 1804
  memcg_stat_rss 2 TINFO: Process is still here after warm up: 1804
  memcg_stat_rss 2 TPASS: rss is 0 as expected
  memcg_stat_rss 3 TINFO: Starting test 3
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 3 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 3 TINFO: Running memcg_process --shm -k 3 -s 4096
  memcg_stat_rss 3 TINFO: Warming up pid: 1825
  memcg_stat_rss 3 TINFO: Process is still here after warm up: 1825
  memcg_stat_rss 3 TPASS: rss is 0 as expected
  memcg_stat_rss 4 TINFO: Starting test 4
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 4 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 4 TINFO: Running memcg_process --mmap-anon --mmap-file --shm 
-s 135168
  memcg_stat_rss 4 TINFO: Warming up pid: 1845
  memcg_process: shmget() failed: Invalid argument
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 168: kill: No such process

  memcg_stat_rss 4 TFAIL: Process 1845 exited with 1 after warm up
  memcg_stat_rss 5 TINFO: Starting test 5
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 5 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 5 TINFO: Running memcg_process --mmap-lock1 -s 135168
  memcg_stat_rss 5 TINFO: Warming up pid: 1858
  memcg_stat_rss 5 TINFO: Process is still here after warm up: 1858
  memcg_stat_rss 5 TPASS: rss is 135168 as expected
  memcg_stat_rss 6 TINFO: Starting test 6
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 6 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 6 TINFO: Running memcg_process --mmap-anon -s 135168
  memcg_stat_rss 6 TINFO: Warming up pid: 1878
  memcg_stat_rss 6 TINFO: Process is still here after warm up: 1878
  memcg_stat_rss 6 TPASS: rss is 135168 as expected
  memcg_stat_rss 7 TPASS: rss is 0 as expected
  memcg_stat_rss 8 TINFO: Starting test 7
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 8 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 8 TINFO: Running memcg_process --mmap-file -s 4096
  memcg_stat_rss 8 TINFO: Warming up pid: 1901
  memcg_stat_rss 8 TINFO: Process is still here after warm up: 1901
  memcg_stat_rss 8 TPASS: rss is 0 as expected
  memcg_stat_rss 9 TPASS: rss is 0 as expected
  memcg_stat_rss 10 TINFO: Starting test 8
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 10 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 10 TINFO: Running memcg_process --shm -k 8 -s 4096
  memcg_stat_rss 10 TINFO: Warming up pid: 1925
  memcg_stat_rss 10 TINFO: Process is still here after warm up: 1925
  memcg_stat_rss 10 TPASS: rss is 0 as expected
  memcg_stat_rss 11 TPASS: rss is 0 as expected
  memcg_stat_rss 12 TINFO: Starting test 9
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 522: echo: echo: I/O error
  memcg_stat_rss 12 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
  memcg_stat_rss 12 TINFO: Running memcg_process --mmap-anon --mmap-file --shm 
-s 135168
  memcg_stat_rss 12 TINFO: Warming up pid: 1948
  memcg_process: shmget() failed: Invalid argument
  /opt/ltp/testcases/bin/memcg_stat_rss.sh: 168: kill: No such process

  memcg_stat_rss 12 TFAIL: Process 1948 exited with 1 after warm up
  memcg_stat_rss 13 TINFO: Starting test 

[Kernel-packages] [Bug 1829995] Re: getaddrinfo_01 from ipv6_lib test suite in LTP failed

2021-02-16 Thread Francis Ginther
Seen with linux-oracle 4.15.0-1065.73.

** Tags added: oracle sru-20210125

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1829995

Title:
  getaddrinfo_01 from ipv6_lib test suite in LTP failed

Status in ubuntu-kernel-tests:
  Triaged
Status in linux package in Ubuntu:
  Incomplete
Status in linux-aws package in Ubuntu:
  New
Status in linux source package in Bionic:
  Incomplete
Status in linux-aws source package in Bionic:
  New
Status in linux source package in Eoan:
  New
Status in linux-aws source package in Eoan:
  New

Bug description:
  startup='Wed May 22 08:02:52 2019'
  getaddrinfo_01    1  TPASS  :  getaddrinfo IPv4 basic lookup
  getaddrinfo_01    2  TFAIL  :  getaddrinfo_01.c:140: getaddrinfo IPv4 canonical name ("curly.maas") doesn't match hostname ("curly")
  getaddrinfo_01    3  TFAIL  :  getaddrinfo_01.c:578: getaddrinfo IPv6 basic lookup ("curly") returns -5 ("No address associated with hostname")
  tag=getaddrinfo_01 stime=1558512172 dur=1 exit=exited stat=1 core=no cu=0 cs=0
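
The second failure above is a strict comparison between the short hostname and the canonical name returned by the resolver. The sketch below shows how such a mismatch can be reproduced with the standard resolver call. The helper names are hypothetical, the short-name comparison is an illustration rather than what the LTP test does, and canonical_name() needs a working resolver, so it is defined here but not called:

```python
import socket

def canonical_name(hostname: str) -> str:
    """Ask the resolver for the canonical name of hostname over IPv4."""
    results = socket.getaddrinfo(hostname, None, family=socket.AF_INET,
                                 type=socket.SOCK_STREAM,
                                 flags=socket.AI_CANONNAME)
    # The canonical name is only filled in on the first result tuple.
    return results[0][3]

def names_match(hostname: str, canonname: str) -> bool:
    """Strict comparison, as the failure message above implies."""
    return hostname == canonname

def short_names_match(hostname: str, canonname: str) -> bool:
    """Looser comparison that ignores any domain suffix."""
    return hostname.split(".")[0] == canonname.split(".")[0]

print(names_match("curly", "curly.maas"))
print(short_names_match("curly", "curly.maas"))
```

On a node whose resolver appends a search domain, as "curly.maas" suggests here, the strict form fails while the short-name form passes.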

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-50-generic 4.15.0-50.54
  ProcVersionSignature: User Name 4.15.0-50.54-generic 4.15.18
  Uname: Linux 4.15.0-50-generic x86_64
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 May 22 02:57 seq
   crw-rw 1 root audio 116, 33 May 22 02:57 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.6
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 
'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  CurrentDmesg:
   
  Date: Wed May 22 08:04:30 2019
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
  MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
  PciMultimedia:
   
  ProcFB: 0 cirrusdrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-50-generic 
root=UUID=57e8-9e7f-40ee-934e-f1dce18323e5 ro
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-50-generic N/A
   linux-backports-modules-4.15.0-50-generic  N/A
   linux-firmware 1.173.6
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 04/01/2014
  dmi.bios.vendor: SeaBIOS
  dmi.bios.version: Ubuntu-1.8.2-1ubuntu1
  dmi.chassis.type: 1
  dmi.chassis.vendor: QEMU
  dmi.chassis.version: pc-i440fx-xenial
  dmi.modalias: 
dmi:bvnSeaBIOS:bvrUbuntu-1.8.2-1ubuntu1:bd04/01/2014:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-xenial:cvnQEMU:ct1:cvrpc-i440fx-xenial:
  dmi.product.name: Standard PC (i440FX + PIIX, 1996)
  dmi.product.version: pc-i440fx-xenial
  dmi.sys.vendor: QEMU

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1829995/+subscriptions



[Kernel-packages] [Bug 1837543] Re: crypto_user02 in crypto from ubuntu_ltp failed

2021-02-16 Thread Francis Ginther
Seen with linux-oracle 4.15.0-1065.73.

** Tags added: sru-20210125

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1837543

Title:
  crypto_user02 in crypto from ubuntu_ltp failed

Status in ubuntu-kernel-tests:
  Triaged
Status in linux package in Ubuntu:
  Confirmed
Status in linux-aws package in Ubuntu:
  New
Status in linux source package in Bionic:
  New
Status in linux-aws source package in Bionic:
  Confirmed

Bug description:
  This is a new test; it fails with:

  <<<test_start>>>
  tag=crypto_user02 stime=1563881396
  cmdline="crypto_user02"
  contacts=""
  analysis=exit
  <<<test_output>>>
  incrementing stop
  tst_test.c:1100: INFO: Timeout per run is 0h 05m 00s
  crypto_user02.c:59: INFO: Starting crypto_user larval deletion test.  May crash buggy kernels.
  crypto_user02.c:91: BROK: unexpected error from tst_crypto_del_alg(): EBUSY

  Summary:
  passed   0
  failed   0
  skipped  0
  warnings 0
  <<<execution_status>>>
  initiation_status="ok"
  duration=0 termination_type=exited termination_id=2 corefile=no
  cutime=0 cstime=0
  <<<test_end>>>
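
The Summary block shows zero failures, yet the run is not a pass: the termination_id=2 field carries the result. A small sketch for decoding that field, assuming the LTP result flag values TFAIL=1, TBROK=2, TWARN=4 and TCONF=32 and a termination_type of "exited"; the function name is hypothetical:

```python
import re

# LTP result flags as bit values; an exit code of 0 means TPASS.
LTP_FLAGS = {1: "TFAIL", 2: "TBROK", 4: "TWARN", 32: "TCONF"}

def decode_termination_id(status_line: str) -> list:
    """Translate the termination_id field of an LTP status line into flag names."""
    match = re.search(r"termination_id=(\d+)", status_line)
    if match is None:
        raise ValueError("no termination_id field found")
    code = int(match.group(1))
    if code == 0:
        return ["TPASS"]
    return [name for bit, name in sorted(LTP_FLAGS.items()) if code & bit]

line = "duration=0 termination_type=exited termination_id=2 corefile=no"
print(decode_termination_id(line))
```

For this report the result is TBROK, which agrees with the "BROK: unexpected error" line in the test output.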

  
  Nothing interesting in syslog:
  Jul 23 11:29:20 amaura systemd[1]: Started Session 1 of user ubuntu.
  Jul 23 11:29:56 amaura kernel: [  619.646330] LTP: starting crypto_user02
  Jul 23 11:30:23 amaura kernel: [  646.554403] cfg80211: Loading compiled-in 
X.509 certificates for regulatory database

  
  Steps to run this test:
git clone --depth=1 https://github.com/linux-test-project/ltp.git
cd ltp; make autotools; ./configure; make; sudo make install
echo "crypto_user02 crypto_user02" > /tmp/jobs
sudo /opt/ltp/runltp -f /tmp/jobs

  ProblemType: Bug
  DistroRelease: Ubuntu 19.04
  Package: linux-image-5.0.0-21-generic 5.0.0-21.22
  ProcVersionSignature: User Name 5.0.0-21.22-generic 5.0.15
  Uname: Linux 5.0.0-21-generic x86_64
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Jul 23 11:19 seq
   crw-rw 1 root audio 116, 33 Jul 23 11:19 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.10-0ubuntu27.1
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 
'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  Date: Tue Jul 23 11:30:15 2019
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  MachineType: Intel Corporation S1200RP
  PciMultimedia:
   
  ProcFB: 0 mgadrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.0.0-21-generic 
root=UUID=b0d2ae4e-12dd-423e-acea-272ee8b2a893 ro
  RelatedPackageVersions:
   linux-restricted-modules-5.0.0-21-generic N/A
   linux-backports-modules-5.0.0-21-generic  N/A
   linux-firmware 1.178.3
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 07/01/2015
  dmi.bios.vendor: Intel Corp.
  dmi.bios.version: S1200RP.86B.03.02.0003.070120151022
  dmi.board.asset.tag: 
  dmi.board.name: S1200RP
  dmi.board.vendor: Intel Corporation
  dmi.board.version: G62254-407
  dmi.chassis.asset.tag: 
  dmi.chassis.type: 17
  dmi.chassis.vendor: ..
  dmi.chassis.version: ..
  dmi.modalias: 
dmi:bvnIntelCorp.:bvrS1200RP.86B.03.02.0003.070120151022:bd07/01/2015:svnIntelCorporation:pnS1200RP:pvr:rvnIntelCorporation:rnS1200RP:rvrG62254-407:cvn..:ct17:cvr..:
  dmi.product.family: To be filled by O.E.M.
  dmi.product.name: S1200RP
  dmi.product.sku: To be filled by O.E.M.
  dmi.product.version: 
  dmi.sys.vendor: Intel Corporation

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1837543/+subscriptions



[Kernel-packages] [Bug 1829849] Re: proc01 in fs from ubuntu_ltp failed

2021-02-16 Thread Francis Ginther
Seen with linux-oracle 4.15.0-1065.73.

** Tags added: sru-20210125

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1829849

Title:
  proc01 in fs from ubuntu_ltp failed

Status in ubuntu-kernel-tests:
  Triaged
Status in linux package in Ubuntu:
  Incomplete
Status in linux-aws package in Ubuntu:
  Triaged
Status in linux-azure package in Ubuntu:
  Triaged
Status in linux-oracle-5.0 package in Ubuntu:
  Confirmed

Bug description:
   proc01  0  TINFO  :  /proc/sys/fs/binfmt_misc/register: is write-only.
   proc01  0  TINFO  :  /proc/sys/net/ipv6/conf/all/stable_secret: known issue: errno=EIO(5): Input/output error
   proc01  0  TINFO  :  /proc/sys/net/ipv6/conf/default/stable_secret: known issue: errno=EIO(5): Input/output error
   proc01  0  TINFO  :  /proc/sys/net/ipv6/conf/ens6/stable_secret: known issue: errno=EIO(5): Input/output error
   proc01  0  TINFO  :  /proc/sys/net/ipv6/conf/lo/stable_secret: known issue: errno=EIO(5): Input/output error
   proc01  0  TINFO  :  /proc/kmsg: known issue: errno=EAGAIN/EWOULDBLOCK(11): Resource temporarily unavailable
   proc01  0  TINFO  :  /proc/sysrq-trigger: is write-only.
   proc01  0  TINFO  :  /proc/self/task/8782/mem: known issue: errno=EIO(5): Input/output error
   proc01  0  TINFO  :  /proc/self/task/8782/clear_refs: is write-only.
   proc01  0  TINFO  :  /proc/self/task/8782/pagemap: reached maxmbytes (-m)
   proc01  0  TINFO  :  /proc/self/task/8782/attr/prev: known issue: errno=EINVAL(22): Invalid argument
   proc01  0  TINFO  :  /proc/self/task/8782/attr/exec: known issue: errno=EINVAL(22): Invalid argument
   proc01  0  TINFO  :  /proc/self/task/8782/attr/fscreate: known issue: errno=EINVAL(22): Invalid argument
   proc01  0  TINFO  :  /proc/self/task/8782/attr/keycreate: known issue: errno=EINVAL(22): Invalid argument
   proc01  0  TINFO  :  /proc/self/task/8782/attr/sockcreate: known issue: errno=EINVAL(22): Invalid argument
   proc01  1  TFAIL  :  proc01.c:397: read failed: /proc/self/task/8782/attr/selinux/current: errno=EINVAL(22): Invalid argument
   proc01  2  TFAIL  :  proc01.c:397: read failed: /proc/self/task/8782/attr/selinux/prev: errno=EINVAL(22): Invalid argument
   proc01  3  TFAIL  :  proc01.c:397: read failed: /proc/self/task/8782/attr/selinux/exec: errno=EINVAL(22): Invalid argument
   proc01  4  TFAIL  :  proc01.c:397: read failed: /proc/self/task/8782/attr/selinux/fscreate: errno=EINVAL(22): Invalid argument
   proc01  5  TFAIL  :  proc01.c:397: read failed: /proc/self/task/8782/attr/selinux/keycreate: errno=EINVAL(22): Invalid argument
   proc01  6  TFAIL  :  proc01.c:397: read failed: /proc/self/task/8782/attr/selinux/sockcreate: errno=EINVAL(22): Invalid argument
   proc01  7  TFAIL  :  proc01.c:397: read failed: /proc/self/task/8782/attr/smack/current: errno=EINVAL(22): Invalid argument
   proc01  8  TFAIL  :  proc01.c:397: read failed: /proc/self/task/8782/attr/apparmor/prev: errno=EINVAL(22): Invalid argument
   proc01  9  TFAIL  :  proc01.c:397: read failed: /proc/self/task/8782/attr/apparmor/exec: errno=EINVAL(22): Invalid argument
   proc01  0  TINFO  :  /proc/self/mem: known issue: errno=EIO(5): Input/output error
   proc01  0  TINFO  :  /proc/self/clear_refs: is write-only.
   proc01  0  TINFO  :  /proc/self/pagemap: reached maxmbytes (-m)
   proc01  0  TINFO  :  /proc/self/attr/prev: known issue: errno=EINVAL(22): Invalid argument
   proc01  0  TINFO  :  /proc/self/attr/exec: known issue: errno=EINVAL(22): Invalid argument
   proc01  0  TINFO  :  /proc/self/attr/fscreate: known issue: errno=EINVAL(22): Invalid argument
   proc01  0  TINFO  :  /proc/self/attr/keycreate: known issue: errno=EINVAL(22): Invalid argument
   proc01  0  TINFO  :  /proc/self/attr/sockcreate: known issue: errno=EINVAL(22): Invalid argument
   proc01 10  TFAIL  :  proc01.c:397: read failed: /proc/self/attr/selinux/current: errno=EINVAL(22): Invalid argument
   proc01 11  TFAIL  :  proc01.c:397: read failed: /proc/self/attr/selinux/prev: errno=EINVAL(22): Invalid argument
   proc01 12  TFAIL  :  proc01.c:397: read failed: /proc/self/attr/selinux/exec: errno=EINVAL(22): Invalid argument
   proc01 13  TFAIL  :  proc01.c:397: read failed: /proc/self/attr/selinux/fscreate: errno=EINVAL(22): Invalid argument
   proc01 14  TFAIL  :  proc01.c:397: read failed: /proc/self/attr/selinux/keycreate: errno=EINVAL(22): Invalid argument
   proc01 15  TFAIL  :  proc01.c:397: read failed: /proc/self/attr/selinux/sockcreate: errno=EINVAL(22): Invalid argument
   proc01 16  TFAIL  :  proc01.c:397: read failed: /proc/self/attr/smack/current: errno=EINVAL(22): Invalid argument
   proc01 17  TFAIL  :  proc01.c:397: read failed: /proc/self/attr/apparmor/prev: errn
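
For anyone triaging output like the above, here is a minimal Python sketch that pulls the failing paths out of a proc01 log and groups them by security module. It is not part of the original report or of LTP: the names unwrap, failed_paths and by_lsm are hypothetical, and the line format is assumed from the excerpt above only. It tolerates records that the mail software wrapped onto a second line.

```python
import re
from collections import Counter

# Result line as it appears in the excerpt: "<test> <n> <TYPE> : <message>".
# The set of result types listed here is an assumption based on common LTP output.
RESULT_RE = re.compile(
    r"^(\S+)\s+(\d+)\s+(TINFO|TFAIL|TPASS|TBROK|TCONF|TWARN)\s+:\s+(.*)$"
)
# A failed read as shown above: "read failed: <path>: errno=..."
READ_FAILED_RE = re.compile(r"read failed: (/\S+?):")


def unwrap(lines, test="proc01"):
    """Rejoin records whose tail was wrapped onto the following line."""
    record = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(test + " "):
            if record is not None:
                yield record
            record = line
        elif record is not None:
            # Continuation of the previous record.
            record += " " + line
    if record is not None:
        yield record


def failed_paths(lines, test="proc01"):
    """Return the paths named in TFAIL 'read failed' records, in order."""
    paths = []
    for record in unwrap(lines, test):
        match = RESULT_RE.match(record)
        if match and match.group(3) == "TFAIL":
            path = READ_FAILED_RE.search(match.group(4))
            if path:
                paths.append(path.group(1))
    return paths


def by_lsm(paths):
    """Count failures per directory directly under .../attr/ (e.g. selinux)."""
    return Counter(
        p.split("/attr/", 1)[1].split("/", 1)[0] for p in paths if "/attr/" in p
    )


if __name__ == "__main__":
    import sys

    # Usage: python3 this_script.py < proc01.log
    for lsm, count in sorted(by_lsm(failed_paths(sys.stdin)).items()):
        print(lsm, count)
```

Run over the excerpt above, every TFAIL path it reports sits under an attr/selinux, attr/smack or attr/apparmor directory, which makes the pattern easier to see than in the raw log.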

[Kernel-packages] [Bug 1829978] Re: cpuacct_100_100 from controllers test suite in LTP failed

2021-02-16 Thread Francis Ginther
Seen with linux-oracle 4.15.0-1065.73.

** Tags added: oracle sru-20210125

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1829978

Title:
  cpuacct_100_100 from controllers test suite in LTP failed

Status in ubuntu-kernel-tests:
  Confirmed
Status in linux package in Ubuntu:
  Incomplete
Status in linux-aws package in Ubuntu:
  New
Status in linux source package in Bionic:
  Incomplete
Status in linux-aws source package in Bionic:
  New

Bug description:
   startup='Wed May 22 06:50:45 2019'
   cpuacct 1 TINFO: timeout per run is 0h 5m 0s
   cpuacct 1 TINFO: cpuacct: /sys/fs/cgroup/cpu,cpuacct
   cpuacct 1 TINFO: Creating 100 subgroups each with 100 processes
   /opt/ltp/testcases/bin/cpuacct.sh: -2110094999: /opt/ltp/testcases/bin/cpuacct.sh: Cannot fork
   tag=cpuacct_100_100 stime=1558507845 dur=9 exit=exited stat=2 core=no cu=30 cs=53

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-50-generic 4.15.0-50.54
  ProcVersionSignature: Ubuntu 4.15.0-50.54-generic 4.15.18
  Uname: Linux 4.15.0-50-generic x86_64
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 May 22 02:57 seq
   crw-rw 1 root audio 116, 33 May 22 02:57 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.6
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CurrentDmesg:
   
  Date: Wed May 22 06:59:27 2019
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
  MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
  PciMultimedia:
   
  ProcFB: 0 cirrusdrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-50-generic root=UUID=57e8-9e7f-40ee-934e-f1dce18323e5 ro
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-50-generic N/A
   linux-backports-modules-4.15.0-50-generic  N/A
   linux-firmware 1.173.6
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 04/01/2014
  dmi.bios.vendor: SeaBIOS
  dmi.bios.version: Ubuntu-1.8.2-1ubuntu1
  dmi.chassis.type: 1
  dmi.chassis.vendor: QEMU
  dmi.chassis.version: pc-i440fx-xenial
  dmi.modalias: dmi:bvnSeaBIOS:bvrUbuntu-1.8.2-1ubuntu1:bd04/01/2014:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-xenial:cvnQEMU:ct1:cvrpc-i440fx-xenial:
  dmi.product.name: Standard PC (i440FX + PIIX, 1996)
  dmi.product.version: pc-i440fx-xenial
  dmi.sys.vendor: QEMU

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1829978/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
