Crash on Noble must be able to read kernel dumps up to the v6.14 kernel.
The following commits are needed to accomplish this:

[1] https://github.com/crash-utility/crash/commit/6752571d8d782d07537a258a1ec8919ebd1308ad
[2] https://github.com/crash-utility/crash/commit/3879e9104826d5ae14a0824ec47ab60056a249a7
[3] https://github.com/crash-utility/crash/commit/968debd0d5979dd9ddca3af0766bad714dbd51e3
[4] https://github.com/crash-utility/crash/commit/3d60d9d40457239683a5f20b01437db94f964fb8
[5] https://github.com/crash-utility/crash/commit/2795136a515446b798ebbfa257c97f0ca6ecb8ec

The commit at [3] required quite a bit of modification, as it builds on foundational work set up in similar commits for other architectures and in other code refactors. E.g. certain struct members and symbols were defined in earlier commits such as:
https://github.com/crash-utility/crash/commit/6dfda0d2235574cf80530ea92e0ddff270f9c039

Rather than bringing in the many patches upon which this one depended, I brought in only the necessary changes. In the debdiff you will notice that [4] is also a backport rather than a direct cherry-pick, because several line offsets had to be adjusted for it to apply cleanly.

** Patch added: "noble_crash.debdiff"
   https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/2125145/+attachment/5917829/+files/noble_crash.debdiff

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to crash in Ubuntu.
https://bugs.launchpad.net/bugs/2125145

Title:
  [SRU] Makedumpfile: Errors and Page Exclusions When Opening Kernel
  Crashdump Files Generated on the Latest HWE Kernel

Status in crash package in Ubuntu:
  Confirmed
Status in makedumpfile package in Ubuntu:
  Fix Released
Status in crash source package in Noble:
  New
Status in makedumpfile source package in Noble:
  New
Status in crash source package in Plucky:
  New
Status in makedumpfile source package in Plucky:
  Fix Released
Status in crash source package in Questing:
  New
Status in makedumpfile source package in Questing:
  Fix Released
Status in crash source package in Resolute:
  Confirmed
Status in makedumpfile source package in Resolute:
  Fix Released

Bug description:
Note: Original description is at the bottom of this report

[Impact]
The current versions of Makedumpfile and Crash in the -updates pocket on Noble do not support the latest hardware enablement kernel for that platform, which is 6.14.

There are several architecture-dependent and kernel flavor-dependent behaviours that I will outline below, but the steps to reproduce are the same.

Reproducer steps:
-----------------
Boot into a hardware enablement kernel.
For example, on arm64 use the 6.14.0-1008-nvidia-64k kernel:

KERNEL_VERSION=6.14.0-1008-nvidia-64k
DISTRO=noble
sudo apt update
sudo apt install ubuntu-dbgsym-keyring
echo "deb http://ddebs.ubuntu.com ${DISTRO} main restricted universe multiverse
deb http://ddebs.ubuntu.com ${DISTRO}-updates main restricted universe multiverse" | \
sudo tee /etc/apt/sources.list.d/ddebs.list
sudo apt update
sudo apt install linux-image-${KERNEL_VERSION}
sudo apt install linux-image-unsigned-${KERNEL_VERSION}-dbgsym

Modify grub's cmdline to specify a crashkernel:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash crashkernel=512M" # Or similar

sudo update-grub
sudo apt install kexec-tools kdump-tools crash makedumpfile
sudo systemctl enable kdump-tools
sudo systemctl start kdump-tools
sudo reboot
echo c | sudo tee /proc/sysrq-trigger

After the machine recovers:
crash /usr/lib/debug/boot/<kernel-dbgsym> /var/crash/<dump-dir>/<dump-file>

Results on Arm64
----------------
crash 8.0.4
Copyright (C) 2002-2022 Red Hat, Inc.
...
For help, type "help".
Type "apropos word" to search for commands related to "word"...

please wait... (gathering task table data)
crash: page excluded: kernel virtual address: ffff07ffa042d8e0 type: "xa_node.slots[off]"

Results on amd64
----------------
On an amd64 machine, using a kernel such as linux-image-6.14.0-29-generic results in crash failing to open. No error is printed, but we don't obtain the prompt:

crash 8.0.4
...
For help, type "help".
Type "apropos word" to search for commands related to "word"...
# Program exits and no prompt is presented

[Test Plan]
* Ensure that the proposed combination of makedumpfile and crash is capable of generating and subsequently opening crashdumps on the HWE and GA kernels available for that platform. Here is the mapping ATOW:
Noble GA: 6.8
Noble HWE: 6.14
Plucky (interim release, no HWE): 6.14
Questing (interim release, no HWE): 6.17
Resolute (development): 6.17 (as of Oct.
14th 2025)
* Ensure all of crash's commands produce the expected output (e.g. ps, mount, files, vm, vtop, runq, etc.)
* If bugs are found in generating and reading crashdumps on the HWE kernel on other architectures (s390x, etc.), this test plan can be expanded to include those.

[Where Problems Could Occur]
* Crash and Makedumpfile are designed to be backwards-compatible, so the risk of regression when backporting a commit is low, though not zero. This is why it is important to ensure that the proposed combination of Makedumpfile and crash does not break existing environments, e.g. the GA kernel.
* The matrix of hardware and kernel versions (including derivative / cloud kernels) to test against is extensive. It's possible that the commits identified to solve the known problems will not be comprehensive. For example, CPU architectures and kernels not in the test matrix may require additional commits to be backported.

[Other Info]
* Support/SEG are currently having conversations with the kernel team about the potential to proactively SRU / MRE the latest upstream crash version, and potentially Makedumpfile as well, alongside -hwe kernel releases to avoid this sort of regression in the future. We understand, though, that this would require an SRUExceptionPolicy to be approved and published.

[Investigation and summary of changes]
We have identified that for Makedumpfile at least two commits are needed:
[1] https://github.com/makedumpfile/makedumpfile/commit/985e575253f1c2de8d6876cfe685c68a24ee06e1
[2] https://github.com/makedumpfile/makedumpfile/commit/bad2a7c4fa75d37a41578441468584963028bdda

These are patches to compensate for a change in the kernel's mapping of memory. Using the patched Makedumpfile helps, but it is not sufficient. Including the patches in Makedumpfile (or using the tip of upstream master), but opening the dump with the currently distributed crash, results in the following errors, e.g.
Patched Makedumpfile with crash 8.0.4 on Arm64:
---------------------------------------------------
...
WARNING: cannot determine starting stack frame for task ffffd574e21b4800
WARNING: cannot determine starting stack frame for task ffff07ff83296300
WARNING: cannot determine starting stack frame for task ffff07ff83293f80
WARNING: cannot determine starting stack frame for task ffff07ff83a04700
WARNING: cannot determine starting stack frame for task ffff08010507c400

      KERNEL: /usr/lib/debug/boot/vmlinux-6.14.0-1008-nvidia-64k
    DUMPFILE: /var/crash/patched_mdf/dump.202509191531 [PARTIAL DUMP]
        CPUS: 128 [OFFLINE: 127]
        DATE: Thu Jan 1 00:00:00 UTC 1970
      UPTIME: 00:13:38
LOAD AVERAGE: 0.12, 0.16, 0.10
       TASKS: 1573
    NODENAME: penguru
     RELEASE: 6.14.0-1008-nvidia-64k
     VERSION: #8-Ubuntu SMP PREEMPT_DYNAMIC Sat Jul 26 02:43:53 UTC 2025
     MACHINE: aarch64 (unknown Mhz)
      MEMORY: 63.8 GB
       PANIC: "Kernel panic - not syncing: sysrq triggered crash"
         PID: 7886
     COMMAND: "tee"
        TASK: ffff08010507c400 [THREAD_INFO: ffff08010507c400]
         CPU: 85
       STATE: TASK_RUNNING (PANIC)

On Amd64
--------
Crash still fails to open.

Therefore, in addition to the above Makedumpfile commits, crash requires some patching. With the above two commits applied to Makedumpfile, I did a bisect on crash on amd64 and arm64.

On the amd64 side, I have identified that [3] applied in isolation (cherry-picked) is sufficient:
[3] https://github.com/crash-utility/crash/commit/6752571d8d782d07537a258a1ec8919ebd1308ad

I have also found that cherry-picking [4] and [5] resolves the issue on arm64 hardware in testflinger (using the machine agent penguru):
[4] https://github.com/crash-utility/crash/commit/3879e9104826d5ae14a0824ec47ab60056a249a7
[5] https://github.com/crash-utility/crash/commit/968debd0d5979dd9ddca3af0766bad714dbd51e3

At this point, crash's commands such as mount, files, vm, etc. were still broken.
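A rough way to see which commands are still affected is to drive crash non-interactively and scan its output for the messages quoted in this report. The sketch below assumes crash's -s (silent startup) and -i <file> (read commands from a file) options. The helper names make_cmdfile and has_known_errors, and the choice of error strings, are illustrative only; they are not part of crash or of the proposed patches.

```shell
#!/bin/sh
# Hypothetical helpers for smoke-testing crash commands (names are illustrative).

# Print one crash command per line, followed by "exit" so crash terminates.
make_cmdfile() {
    for cmd in "$@"; do
        printf '%s\n' "$cmd"
    done
    printf 'exit\n'
}

# Read crash output on stdin; succeed (exit 0) if it contains one of the
# error messages quoted in this bug report.
has_known_errors() {
    grep -Eq 'page excluded|cannot determine starting stack frame'
}

# Example usage (paths are placeholders, as in the reproducer steps):
#   make_cmdfile ps mount files vm runq > /tmp/crash-cmds
#   crash -s -i /tmp/crash-cmds /usr/lib/debug/boot/<kernel-dbgsym> \
#       /var/crash/<dump-dir>/<dump-file> > /tmp/crash-out 2>&1
#   echo "crash exit status: $?"
#   has_known_errors < /tmp/crash-out && echo "known errors present"
```

Since the amd64 failure described above prints no error at all, the exit status of crash is worth recording alongside the output scan; a clean scan does not by itself show that the dump opened.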
To resolve this, [6] and [7] are needed:
[6] https://github.com/crash-utility/crash/commit/3d60d9d40457239683a5f20b01437db94f964fb8
[7] https://github.com/crash-utility/crash/commit/2795136a515446b798ebbfa257c97f0ca6ecb8ec

To SRU for Noble, crash must also work on Plucky, Questing, and Resolute. The current version of makedumpfile on all of those series was found to be sufficient, so no SRU for makedumpfile is required on those. However, for crash:
* Plucky uses the 6.14 kernel, so no additional commits are needed. In fact, due to the newer version available on Plucky, only [7] is needed.
* Questing uses the 6.17 kernel. No issues other than [7] were observed on arm, but on amd64 an infinite loop was observed while gdb loaded module symbols. This is fixed in [8].
* Resolute will ship with a newer kernel than 6.17, but as of October 14th, 2025 it is based on 6.17. The package currently in Debian unstable, which will autosync to Resolute, does not contain the required fixes, so it will also require an SRU with [7] and [8] unless superseded by an upstream (Debian) version bump.
[8] https://github.com/crash-utility/crash/commit/e44a9a9d808c83fb846060f65e5aaa9d30b6e2c4

PPA with all of the packages built (except resolute):
https://launchpad.net/~bryanfraschetti/+archive/ubuntu/lp2125145

--------------------------------------------------------------

Original Description:
=====================

24.04 LTS, Linux 6.14.0-29-generic #29~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Aug 14 16:52:50 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

Problem Description:
crash utility is crashing (error code 1) when attempting to analyze kernel crash dumps.

Setup kdump & generated kernel panic using “echo 1 > /proc/sys/kernel/sysrq” but, crash cannot access it:

# crash /usr/lib/debug/boot/vmlinux-6.14.0-29-generic dump.202509161821

crash 8.0.4
Copyright (C) 2002-2022 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2022 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
Copyright (C) 2015, 2021 VMware, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...

# echo $?
1

running as root user and file is readable fine:
$ :/var/crash/202509161821# ls -l
total 299144
-rw------- 1 root whoopsie 119627 Sep 16 18:21 dmesg.202509161821
-rw-r--r-- 1 root whoopsie 306200163 Sep 16 18:21 dump.202509161821

symbol file is there:
# ls -l /usr/lib/debug/boot/vmlinux-6.14.0-29-generic*
-rw-r--r-- 1 root root 450705920 Aug 14 18:02 /usr/lib/debug/boot/vmlinux-6.14.0-29-generic

tail of strace:
14:06:20.661240 rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_NODEFER, sa_restorer=0x7b0841845330}, NULL, 8) = 0 <0.000008>
14:06:20.661281 rt_sigaction(SIGINT, {sa_handler=0x5ec383cbceb0, sa_mask=[], sa_flags=SA_RESTORER|SA_NODEFER, sa_restorer=0x7b0841845330}, NULL, 8) = 0 <0.000008>
14:06:20.661322 rt_sigaction(SIGSEGV, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER|SA_NODEFER, sa_restorer=0x7b0841845330}, NULL, 8) = 0 <0.000008>
14:06:20.661360 write(1, "\n", 1
) = 1 <0.000119>
14:06:20.661579 lseek(3, 10312, SEEK_SET) = 10312 <0.000010>
14:06:20.661617 read(3, "OSRELEASE=6.14.0-29-generic\nBUIL"..., 3276) = 3276 <0.000011>
14:06:20.661748 unlink("/var/tmp/ramdump_elf_XXXXXX") = -1 ENOENT (No such file or directory) <0.002921>
14:06:20.664817 exit_group(1) = ?
14:06:20.690105 +++ exited with 1 +++

full crash strace
https://filebin.net/custom-bin/crash.strace.1

ProblemType: Bug
DistroRelease: Ubuntu 24.04
Package: crash 8.0.4-1ubuntu2
ProcVersionSignature: Ubuntu 6.14.0-29.29~24.04.1-generic 6.14.8
Uname: Linux 6.14.0-29-generic x86_64
ApportVersion: 2.28.1-0ubuntu3.8
Architecture: amd64
CasperMD5CheckResult: pass
Date: Thu Sep 18 20:21:26 2025
InstallationDate: Installed on 2025-09-04 (14 days ago)
InstallationMedia: Ubuntu 24.04.2 LTS "Noble Numbat" - Release amd64 (20250215)
ProcEnviron:
 LANG=en_US.UTF-8
 PATH=(custom, no user)
 SHELL=/bin/bash
 TERM=xterm-256color
SourcePackage: crash
UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/crash/+bug/2125145/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp

