I have isolated the cause of this bug to this commit:
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble/commit/?h=Ubuntu-6.8.0-20.20&id=71eb6b6b0ba93b1467bccff57b5de746b09113d2

Every version from before this commit that I tested during my bisect
passed the aiol test at least 15 times in a row, and every version from
after this commit panicked during at least one test. To confirm, I
reverted this patch on the latest 6.8 Ubuntu kernel (which had been
panicking reliably within 5 test runs) and verified that, with the revert
applied, it passes the test at least 15 times in a row without any panics.
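
A pass/fail loop of this kind could look roughly like the sketch below.
It is illustrative only and is not the script used for the bisect: the
helper name run_repeatedly and the log file name are hypothetical, and
the test command is taken from the command line rather than assumed.
Progress is flushed to disk after every run because a kernel panic takes
the machine down before the script can report anything.

```python
import os
import subprocess
import sys


def run_repeatedly(cmd, runs=15, log_path=None):
    """Run cmd up to `runs` times; return the number of consecutive passes.

    Stops at the first non-zero exit status. If log_path is given, each
    pass is appended and fsync'ed so the count survives a panic and reboot.
    """
    passes = 0
    for attempt in range(1, runs + 1):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            break
        passes += 1
        if log_path is not None:
            with open(log_path, "a") as log:
                log.write(f"run {attempt} passed\n")
                log.flush()
                os.fsync(log.fileno())
    return passes


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 repeat.py <test command and arguments>
    count = run_repeatedly(sys.argv[1:], runs=15, log_path="repeat-runs.log")
    print(f"{count} consecutive passes")
    sys.exit(0 if count == 15 else 1)
```

If the machine panics partway through, the log left behind after reboot
shows how many runs completed, which is enough to mark that kernel as
bad for the bisect.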

The contents of the patch also support this conclusion: it changes the
Linux AIO interface by introducing new calls to spin_lock_irqsave() and
wake_up_process() inside aio_complete(), which matches the call traces I
have observed.

https://bugs.launchpad.net/bugs/2058557

Title:
  Kernel panic during checkbox stress_ng_test on Grace running noble 6.8
  (arm64+largemem) kernel

Status in linux package in Ubuntu:
  New

Bug description:
  A kernel oops and panic occurred during 22.04 SoC certification on
  Gunyolk (Grace/Grace) with the 6.8 kernel, arm64+largemem variant.

  Steps to reproduce:
  Run (as root) the following commands:

  add-apt-repository -y ppa:checkbox-dev/stable
  apt-add-repository -y ppa:firmware-testing-team/ppa-fwts-stable
  apt update
  apt install -y canonical-certification-server
  /usr/lib/checkbox-provider-base/bin/stress_ng_test.py disk --device dm-0 --base-time 240

  stress_ng_test caused a kernel panic after about 5 minutes. I have
  attached dmesg output from my reproducer to this report.

  Initially, this was identified via a panic during the above test,
  which was running as part of a run of certify-soc-22.04.

  Attached is a tarball containing:

  - apport.linux-image-6.8.0-11-generic-64k.kzsondji.apport: The output of
    `ubuntu-bug linux` on the machine (after reboot)
  - reproduced-dmesg.202403201942: The dmesg output captured by kdump when I
    reproduced my original issue by running only the single stress_ng_test.py
    command above (not the entire cert suite)
  - original-dmesg.txt: The dmesg output I captured when the stress_ng_test
    originally failed during the full cert suite run
