[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel
** Also affects: linux (Ubuntu Noble)
   Importance: Undecided
     Assignee: Mitchell Augustin (mitchellaugustin)
       Status: In Progress

** Changed in: linux (Ubuntu Noble)
       Status: In Progress => Fix Committed

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2058557

Title:
  Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Noble:
  Fix Committed

Bug description:
  A kernel oops and panic occurred during 22.04 SoC certification on Gunyolk (Grace/Grace) with 6.8 kernel, arm64+largemem variant

  Steps to reproduce:
  Run (as root) the following commands:

  add-apt-repository -y ppa:checkbox-dev/stable
  apt-add-repository -y ppa:firmware-testing-team/ppa-fwts-stable
  apt update
  apt install -y canonical-certification-server
  /usr/lib/checkbox-provider-base/bin/stress_ng_test.py disk --device dm-0 --base-time 240

  stress_ng_test caused a kernel panic after about 5 minutes. I have attached dmesg output from my reproducer to this report.

  Initially, this was identified via a panic during the above test, which was running as part of a run of certify-soc-22.04.

  Attached is a tarball containing:
  - apport.linux-image-6.8.0-11-generic-64k.kzsondji.apport: The output of `ubuntu-bug linux` on the machine (after reboot)
  - reproduced-dmesg.202403201942: The dmesg output captured by kdump when I reproduced my original issue by running only the single stress_ng_test.py command above (not the entire cert suite)
  - original-dmesg.txt: The dmesg output I captured when the stress_ng_test originally failed during the full cert suite run

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2058557/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel
** Changed in: linux (Ubuntu)
     Assignee: Jose Ogando Justo (joseogando) => Mitchell Augustin (mitchellaugustin)

** Changed in: linux (Ubuntu)
       Status: Fix Committed => In Progress
[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel
The fix has landed upstream: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/aio.c?h=v6.9-rc3&id=caeb4b0a11b3393e43f7fa8e0a5a18462acc66bd
[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel
A fix has been applied to vfs.fixes upstream and should land soon. I have tested this patch and verified that the panic no longer occurs.

** Changed in: linux (Ubuntu)
       Status: New => Fix Committed
[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel
This issue is still present upstream, so I reported it to the original committer of the patch.
[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel
I have isolated the cause of this bug to this commit: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble/commit/?h=Ubuntu-6.8.0-20.20&id=71eb6b6b0ba93b1467bccff57b5de746b09113d2

All versions that I tested before this commit during my bisect passed the aiol test at least 15 times in a row, and all versions after this commit panicked during at least one test.

To confirm, I reverted this patch on the latest 6.8 Ubuntu kernel (which was previously panicking reliably within 5 tests) and verified that, with that change, it passes the test at least 15 times in a row without any panics.

The contents of the patch also support this conclusion: it changes the Linux AIO interface and introduces new calls to spin_lock_irqsave() and wake_up_process() inside aio_complete(), which matches the traces I have observed.
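As a rough way to check a captured oops against that description, the call trace can be pulled out of the dmesg text and tested for the expected chain of functions. This is only a sketch: the helper names `call_trace_symbols` and `calls_in_order` are made up for illustration, and the frame pattern assumes the arm64 "symbol+0xoff/0xsize" layout seen in the trace posted earlier in this bug.

```python
import re

# One frame looks like "aio_complete+0x1c4/0x2b8" or
# "nvme_irq+0x9c/0xb0 [nvme]"; capture only the symbol name.
FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def call_trace_symbols(dmesg_text):
    """Return the symbols listed after the first 'Call trace:' marker.

    Frames are returned innermost first, in the order dmesg prints them.
    """
    _, marker, tail = dmesg_text.partition("Call trace:")
    if not marker:
        return []
    # Assumption: the trace is followed by a "Code:" line, as in this oops.
    tail = tail.split("Code:", 1)[0]
    return FRAME_RE.findall(tail)


def calls_in_order(symbols, expected):
    """True if `expected` occurs as an ordered subsequence of `symbols`."""
    remaining = iter(symbols)
    return all(name in remaining for name in expected)


# Example with the chain named in the comment above (innermost first):
#   text = open("reproduced-dmesg.202403201942").read()
#   calls_in_order(call_trace_symbols(text),
#                  ["_raw_spin_lock_irqsave", "wake_up_process", "aio_complete"])
```

Because the match is a subsequence test, intermediate frames such as try_to_wake_up do not need to be listed.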
[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel
It turns out that this issue does not appear with *every* run of the aiol test on affected kernels, so multiple runs of that test may be necessary for the panic to occur.
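Since a single passing run proves little, a small wrapper that repeats the test can help. The following is a minimal sketch, not the harness actually used for this bug: `run_repeatedly` and the progress file are hypothetical. The run number is synced to disk before each iteration because a kernel panic takes the wrapper down with the machine, so the file is the only record of which run was in progress.

```python
import os
import subprocess


def run_repeatedly(cmd, runs, progress_path):
    """Run `cmd` up to `runs` times; return how many runs exited with 0.

    Stops at the first non-zero exit status. Before each run, the iteration
    number is written to `progress_path` and fsynced, so the file still
    names the run in progress if the machine panics.
    """
    passed = 0
    for i in range(1, runs + 1):
        with open(progress_path, "w") as f:
            f.write(f"starting run {i} of {runs}\n")
            f.flush()
            os.fsync(f.fileno())
        if subprocess.run(cmd).returncode != 0:
            break
        passed += 1
    return passed


# Example (as root), using the reproducer command from the bug description;
# the progress file location is an arbitrary choice:
#   run_repeatedly(
#       ["/usr/lib/checkbox-provider-base/bin/stress_ng_test.py",
#        "disk", "--device", "dm-0", "--base-time", "240"],
#       runs=15,
#       progress_path="/var/tmp/stress_ng_progress.txt",
#   )
```

If the machine comes back up after a panic, the progress file shows which run triggered it; if the function returns `runs`, every iteration passed.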
[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel
I did some more version testing, and I have not been able to reproduce this bug with the "aiol" stressor on either upstream 6.5 or Ubuntu 6.5.0-26-generic-64k, so it was evidently introduced after that version.
[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel
Earlier, I said that the device mapper observation did not seem to be a hard line - however, further testing now indicates that the situations where I observed panics when stressing nvme0n1 were due to an unrelated bug that is present in the latest 6.5 mainline tree, but *not* the latest 6.5 Ubuntu kernel tree (6.5.0-26-generic-64k). Therefore, from the perspective of *this* bug report, it once again *does* appear that this issue is only present when stressing dm-0 and not present when stressing a non-device-mapper device.
[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel
I did not observe this issue with any other stress_ng disk tests on linux-image-6.8.0-11-generic-64k after 1 full run of the suite with the "aiol" test disabled. (When running the "aiol" test alone, it panicked reliably each time.)
[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel
Upon further investigation, the device mapper observation does not seem to be a hard line, as I was able to observe panics when stressing both dm-0 and nvme0n1 under different circumstances.

At the moment, it also seems like the specific part of stress_ng_test that is the culprit is the "stress-ng aiol stressor". When running only the "aiol" stressor in isolation on linux-image-6.8.0-11-generic-64k, the panic reliably happens in under 5 minutes.

Currently investigating to see if any other stress_ng tests cause the same issue on this kernel version, or if it is only aiol.
[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel
I have observed that this panic does not seem to happen when stressing non-device-mapper devices (ex: it panics when running /usr/lib/checkbox-provider-base/bin/stress_ng_test.py disk --device dm-0 --base-time 240, but completes successfully when running /usr/lib/checkbox-provider-base/bin/stress_ng_test.py disk --device nvme0n1 --base-time 240). I'm going to investigate this further to confirm.
[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel
** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Jose Ogando Justo (joseogando)
[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel
This is also reproducible on the latest mainline version (https://kernel.ubuntu.com/mainline/v6.8/arm64/, retrieved 20 Mar 2024 @ 5 PM):

20 Mar 22:54: Running stress-ng aiol stressor for 240 seconds...
[ 354.451450] Unable to handle kernel paging request at virtual address 17be9b4aa3e187be
[ 354.459580] Mem abort info:
[ 354.462439] ESR = 0x9621
[ 354.466274] EC = 0x25: DABT (current EL), IL = 32 bits
[ 354.471703] SET = 0, FnV = 0
[ 354.474819] EA = 0, S1PTW = 0
[ 354.478024] FSC = 0x21: alignment fault
[ 354.482118] Data abort info:
[ 354.485056] ISV = 0, ISS = 0x0021, ISS2 = 0x
[ 354.490662] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 354.495823] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 354.501251] [17be9b4aa3e187be] address between user and kernel address ranges
[ 354.508548] Internal error: Oops: 9621 [#1] SMP
[ 354.514245] Modules linked in: qrtr cfg80211 binfmt_misc nls_iso8859_1 input_leds dax_hmem cxl_acpi acpi_ipmi onboard_usb_hub nvidia_cspmu ipmi_ssif cxl_core ipmi_devintf arm_cspmu_module arm_smmuv3_pmu ipmi_msghandler uio_pdrv_genirq uio spi_nor cppc_cpufreq joydev mtd acpi_power_meter dm_multipath nvme_fabrics efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 hid_generic rndis_host usbhid cdc_ether hid usbnet uas usb_storage crct10dif_ce polyval_ce polyval_generic ghash_ce sm4_ce_gcm sm4_ce_ccm sm4_ce sm4_ce_cipher sm4 sm3_ce sm3 nvme sha3_ce i2c_smbus ixgbe sha2_ce nvme_core ast sha256_arm64 xhci_pci sha1_ce xfrm_algo xhci_pci_renesas i2c_algo_bit nvme_auth mdio spi_tegra210_quad i2c_tegra aes_neon_bs aes_neon_blk aes_ce_blk aes_ce_cipher
[ 354.594676] CPU: 61 PID: 0 Comm: swapper/61 Kdump: loaded Not tainted 6.8.0-060800-generic-64k #202403131158
[ 354.604728] Hardware name: Supermicro MBD-G1SMH/G1SMH, BIOS 1.0c 12/28/2023
[ 354.611844] pstate: 034000c9 (nzcv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[ 354.618962] pc : _raw_spin_lock_irqsave+0x44/0x100
[ 354.623863] lr : try_to_wake_up+0x68/0x758
[ 354.628053] sp : 8000807afaf0
[ 354.631436] x29: 8000807afaf0 x28: 0004 x27:
[ 354.638731] x26: a06103dc8a98 x25: 8000807afd98 x24: 0002
[ 354.646027] x23: f8156840 x22: 17be9b4aa3e187be x21:
[ 354.653323] x20: 0003 x19: 00c0 x18: 8000819a0098
[ 354.660619] x17: x16: x15: e97dca18
[ 354.667914] x14: x13: x12:
[ 354.675208] x11: x10: x9 : a06100ba6810
[ 354.682504] x8 : x7 : 0040 x6 : 9080
[ 354.689800] x5 : c2fb0dc488b0 x4 : x3 : 894178c0
[ 354.697096] x2 : 0001 x1 : x0 : 17be9b4aa3e187be
[ 354.704391] Call trace:
[ 354.706886] _raw_spin_lock_irqsave+0x44/0x100
[ 354.711426] try_to_wake_up+0x68/0x758
[ 354.715254] wake_up_process+0x24/0x50
[ 354.719082] aio_complete+0x1c4/0x2b8
[ 354.722825] aio_complete_rw+0x11c/0x2c8
[ 354.726831] iomap_dio_bio_end_io+0x1f0/0x248
[ 354.731282] bio_endio+0x170/0x270
[ 354.734758] __dm_io_complete+0x180/0x200
[ 354.738855] clone_endio+0xc8/0x288
[ 354.742416] bio_endio+0x170/0x270
[ 354.745889] blk_mq_end_request_batch+0x2e0/0x558
[ 354.750696] nvme_pci_complete_batch+0x94/0x118 [nvme]
[ 354.755958] nvme_irq+0x9c/0xb0 [nvme]
[ 354.759788] __handle_irq_event_percpu+0x68/0x2c0
[ 354.764595] handle_irq_event+0x58/0xe8
[ 354.768511] handle_fasteoi_irq+0xb0/0x218
[ 354.772695] generic_handle_domain_irq+0x38/0x70
[ 354.777411] __gic_handle_irq_from_irqson.isra.0+0x180/0x310
[ 354.783195] gic_handle_irq+0x2c/0xa0
[ 354.786935] call_on_irq_stack+0x3c/0x50
[ 354.790941] do_interrupt_handler+0xb0/0xc8
[ 354.795214] el1_interrupt+0x48/0xf0
[ 354.798866] el1h_64_irq_handler+0x1c/0x40
[ 354.803050] el1h_64_irq+0x7c/0x80
[ 354.806523] cpuidle_enter_state+0xd8/0x790
[ 354.810795] cpuidle_enter+0x44/0x78
[ 354.814446] cpuidle_idle_call+0x15c/0x210
[ 354.818631] do_idle+0xb0/0x130
[ 354.821837] cpu_startup_entry+0x44/0x50
[ 354.825845] secondary_start_kernel+0xec/0x130
[ 354.830386] __secondary_switched+0xc0/0xc8
[ 354.834661] Code: b9001041 d503201f 5281 52800022 (88e17c02)
[ 354.840893] SMP: stopping secondary CPUs
[ 355.897569] SMP: failed to stop secondary CPUs 0-60,62-143
[ 355.904206] Starting crashdump kernel...
[ 355.908214] ------------[ cut here ]------------
[ 355.912930] Some CPUs may be stale, kdump will be unreliable.
[ 355.918807] WARNING: CPU: 61 PID: 0 at arch/arm64/kernel/machine_kexec.c:174 machine_kexec+0x48/0x1f0
[ 355.928236] Modules linked in: qrtr cfg80211 binfmt_misc nls_iso8859_1 input_leds dax_hmem