Performing verification for jammy
I started an i3.8xlarge instance on AWS and installed 5.15.0-144-generic from
-updates.
$ uname -rv
5.15.0-144-generic #157-Ubuntu SMP Mon Jun 16 07:33:10 UTC 2025
I ran through the reproducer:
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
xvda 202:0 0 8G 0 disk
├─xvda1 202:1 0 7.9G 0 part /
├─xvda14 202:14 0 4M 0 part
└─xvda15 202:15 0 106M 0 part /boot/efi
nvme0n1 259:0 0 1.7T 0 disk
nvme2n1 259:1 0 1.7T 0 disk
nvme1n1 259:2 0 1.7T 0 disk
nvme3n1 259:3 0 1.7T 0 disk
$ sudo mdadm --create --verbose /dev/md0 --level=10 --raid-devices=4 \
    /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
mdadm: layout defaults to n2
mdadm: layout defaults to n2
mdadm: chunk size defaults to 512K
mdadm: size set to 1855336448K
mdadm: automatically enabling write-intent bitmap on large array
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
$ sudo mkfs.xfs -K /dev/md0
log stripe unit (524288 bytes) is too large (maximum is 256KiB)
log stripe unit adjusted to 32KiB
meta-data=/dev/md0               isize=512    agcount=32, agsize=28989568 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=0 inobtcount=0
data     =                       bsize=4096   blocks=927666176, imaxpct=5
         =                       sunit=128    swidth=256 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=452968, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
$ sudo mkdir /mnt/disk
$ sudo mount /dev/md0 /mnt/disk
Ran the trim:
$ sudo fstrim /mnt/disk
Checked dmesg:
$ sudo dmesg
kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
kernel: #PF: supervisor instruction fetch in kernel mode
kernel: #PF: error_code(0x0010) - not-present page
kernel: PGD 0 P4D 0
kernel: Oops: 0010 [#1] SMP PTI
kernel: CPU: 2 PID: 1536 Comm: fstrim Not tainted 5.15.0-144-generic #157-Ubuntu
kernel: Hardware name: Xen HVM domU, BIOS 4.11.amazon 08/24/2006
kernel: RIP: 0010:0x0
kernel: Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
kernel: RSP: 0018:ffffafdec35eb768 EFLAGS: 00010206
kernel: RAX: 0000000000000000 RBX: 0000000000092800 RCX: 0000000000000001
kernel: RDX: ffff8fd6dcb066f0 RSI: 0000000000000000 RDI: 0000000000092800
kernel: RBP: ffffafdec35eb7d8 R08: ffff8fd6fa3806c0 R09: ffff8fd6c106e650
kernel: R10: 0000000000000246 R11: ffff8fd6c0210390 R12: 0000000000092c00
kernel: R13: 0000000000000400 R14: ffff8fd6dcb06708 R15: ffff8fd6ca8ee600
kernel: FS: 00007fe63cb48800(0000) GS:ffff901249e80000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: ffffffffffffffd6 CR3: 0000000135a1e003 CR4: 00000000001706e0
kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kernel: Call Trace:
kernel: <TASK>
kernel: mempool_alloc+0x64/0x1b0
kernel: ? __kmalloc+0x179/0x330
kernel: bio_alloc_bioset+0x9d/0x370
kernel: ? r10bio_pool_alloc+0x26/0x30 [raid10]
kernel: bio_clone_fast+0x1f/0x90
kernel: md_account_bio+0x42/0x80
kernel: raid10_handle_discard+0x56f/0x6b0 [raid10]
kernel: ? finish_wait+0x5b/0x80
kernel: ? wait_woken+0x70/0x70
kernel: raid10_make_request+0x147/0x180 [raid10]
kernel: md_handle_request+0x12d/0x1b0
kernel: ? submit_bio_checks+0x1a5/0x580
kernel: md_submit_bio+0x76/0xc0
kernel: __submit_bio+0x1a5/0x220
kernel: ? mempool_alloc_slab+0x17/0x20
kernel: __submit_bio_noacct+0x85/0x200
kernel: submit_bio_noacct+0x4e/0x120
kernel: ? bio_alloc_bioset+0x9d/0x370
kernel: submit_bio+0x4a/0x130
kernel: __blkdev_issue_discard+0x141/0x280
kernel: ? xfs_btree_lookup+0x22c/0x5c0 [xfs]
kernel: blkdev_issue_discard+0x65/0xd0
kernel: xfs_trim_extents+0x1cc/0x3b0 [xfs]
kernel: xfs_ioc_trim+0x19c/0x260 [xfs]
kernel: xfs_file_ioctl+0x7c3/0xb00 [xfs]
kernel: ? putname+0x59/0x70
kernel: ? kmem_cache_free+0x24f/0x290
kernel: ? putname+0x59/0x70
kernel: ? do_sys_openat2+0x8b/0x160
kernel: __x64_sys_ioctl+0x95/0xd0
kernel: x64_sys_call+0x1e5f/0x1fa0
kernel: do_syscall_64+0x56/0xb0
I can reproduce.
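For anyone checking other machines for the same crash, scanning a saved dmesg log for this oops signature can be sketched as follows (has_raid10_discard_oops is a hypothetical helper name of mine, not an existing tool; the grep patterns come from the trace above):

```shell
# Sketch: scan a saved dmesg dump for the signature of this particular oops.
# Matches both the NULL dereference banner and the raid10 discard frame.
has_raid10_discard_oops() {
    grep -q "BUG: kernel NULL pointer dereference" "$1" &&
        grep -q "raid10_handle_discard" "$1"
}

# Usage: sudo dmesg > /tmp/dmesg.txt && has_raid10_discard_oops /tmp/dmesg.txt
```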
I then rebooted and enabled the Kernel Team PPA2 for security-cycle kernels:
https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa2/+packages?field.name_filter=&field.status_filter=published&field.series_filter=jammy
I then installed 5.15.0-151-generic.
$ uname -rv
5.15.0-151-generic #161-Ubuntu SMP Tue Jul 22 14:25:40 UTC 2025
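For anyone verifying their own systems, comparing the running kernel against the first fixed version can be sketched like this (kernel_at_least is a hypothetical helper of mine, and the assumption that 5.15.0-151 is the first fixed jammy version follows from this verification):

```shell
# Sketch: return success if the running kernel version string is at least
# the first version assumed to carry the fix. Uses sort -V (version sort).
kernel_at_least() {
    fixed="$1"
    running="$2"
    # If the fix version sorts first (or equal), the running kernel is new enough.
    [ "$(printf '%s\n' "$fixed" "$running" | sort -V | head -n1)" = "$fixed" ]
}

kernel_at_least "5.15.0-151" "$(uname -r)" && echo "fixed" || echo "possibly affected"
```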
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
loop0 7:0 0 27.2M 1 loop /snap/amazon-ssm-agent/11320
loop1 7:1 0 63.8M 1 loop /snap/core20/2571
loop2 7:2 0 73.9M 1 loop /snap/core22/1963
loop3 7:3 0 89.4M 1 loop /snap/lxd/31333
loop4 7:4 0 50.9M 1 loop /snap/snapd/24671
xvda 202:0 0 8G 0 disk
├─xvda1 202:1 0 7.9G 0 part /
├─xvda14 202:14 0 4M 0 part
└─xvda15 202:15 0 106M 0 part /boot/efi
nvme0n1 259:0 0 1.7T 0 disk
nvme2n1 259:1 0 1.7T 0 disk
nvme3n1 259:2 0 1.7T 0 disk
nvme1n1 259:3 0 1.7T 0 disk
$ sudo mdadm --create --verbose /dev/md0 --level=10 --raid-devices=4 \
    /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
mdadm: layout defaults to n2
mdadm: layout defaults to n2
mdadm: chunk size defaults to 512K
mdadm: size set to 1855336448K
mdadm: automatically enabling write-intent bitmap on large array
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
$ sudo fstrim -v /mnt/disk
/mnt/disk: 3.5 TiB (3797863956480 bytes) trimmed
$ sudo dmesg
<clean>
The 5.15.0-151-generic kernel in -ppa2 fixes the issue. Happy to mark verified
for jammy.
** Tags added: verification-done-jammy-linux
** Tags added: sts
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2117395
Title:
raid10: block discard causes a NULL pointer dereference after
5.15.0-144-generic
Status in linux package in Ubuntu:
Fix Released
Status in linux source package in Jammy:
Fix Committed
Bug description:
BugLink: https://bugs.launchpad.net/bugs/2117395
[Impact]
The commit below was backported to 5.15.181 -stable and introduced a NULL
pointer dereference in the raid10 subsystem, because io_acct_set is only
initialised for raid0 and raid456, and not raid1 or raid10.
commit d05af90d6218e9c8f1c2026990c3f53c1b41bfb0
Author: Yu Kuai <[email protected]>
Date: Tue Mar 25 09:57:46 2025 +0800
Subject: md/raid10: fix missing discard IO accounting
Link:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d05af90d6218e9c8f1c2026990c3f53c1b41bfb0
Kernel oops:
kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
kernel: #PF: supervisor instruction fetch in kernel mode
kernel: #PF: error_code(0x0010) - not-present page
kernel: PGD 0 P4D 0
kernel: Oops: 0010 [#1] SMP PTI
kernel: CPU: 5 PID: 784107 Comm: fstrim Not tainted 5.15.0-144-generic #157-Ubuntu
kernel: RIP: 0010:0x0
kernel: Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
kernel: RSP: 0018:ffffb576409c7858 EFLAGS: 00010206
kernel: RAX: 0000000000000000 RBX: 0000000000092800 RCX: 0000000000000001
kernel: RDX: ffff8e7e012426f0 RSI: 0000000000000000 RDI: 0000000000092800
kernel: RBP: ffffb576409c78c8 R08: ffff8e884ec966c0 R09: ffff8e7e07c6b050
kernel: R10: 0000000000002ecb R11: 00000000000030c8 R12: 0000000000092c00
kernel: R13: 0000000000000400 R14: ffff8e7e01242708 R15: ffff8e7e10743400
kernel: FS: 00007f6fff9f0800(0000) GS:ffff8e8cee540000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: ffffffffffffffd6 CR3: 00000001090f6005 CR4: 00000000003706e0
kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kernel: Call Trace:
kernel: <TASK>
kernel: mempool_alloc+0x61/0x1b0
kernel: ? __kmalloc+0x179/0x330
kernel: bio_alloc_bioset+0x9d/0x370
kernel: ? r10bio_pool_alloc+0x26/0x30 [raid10]
kernel: bio_clone_fast+0x1f/0x90
kernel: md_account_bio+0x42/0x80
kernel: raid10_handle_discard+0x56f/0x6b0 [raid10]
kernel: raid10_make_request+0x147/0x180 [raid10]
kernel: md_handle_request+0x12a/0x1b0
kernel: ? submit_bio_checks+0x1a5/0x580
kernel: md_submit_bio+0x76/0xc0
kernel: __submit_bio+0x1a2/0x220
kernel: ? mempool_alloc_slab+0x17/0x20
kernel: ? mempool_alloc+0x61/0x1b0
kernel: ? schedule_timeout+0x91/0x140
kernel: __submit_bio_noacct+0x85/0x200
kernel: submit_bio_noacct+0x4e/0x120
kernel: ? __cond_resched+0x1a/0x60
kernel: submit_bio+0x4a/0x130
kernel: submit_bio_wait+0x5a/0xc0
kernel: blkdev_issue_discard+0x7e/0xd0
kernel: ext4_try_to_trim_range+0x2db/0x520
kernel: ? ext4_mb_load_buddy_gfp+0x91/0x3e0
kernel: ext4_trim_fs+0x313/0x510
kernel: __ext4_ioctl+0x82c/0xef0
kernel: ext4_ioctl+0xe/0x20
kernel: __x64_sys_ioctl+0x92/0xd0
kernel: x64_sys_call+0x1e5f/0x1fa0
kernel: do_syscall_64+0x56/0xb0
kernel: entry_SYSCALL_64_after_hwframe+0x6c/0xd6
A workaround is to disable the weekly systemd fstrim timer and to avoid
fstrim / block discard while the problem exists.
[Fix]
The commit below was mainlined in 6.6-rc1 and needs to be backported to jammy.
commit c567c86b90d4715081adfe5eb812141a5b6b4883
Author: Yu Kuai <[email protected]>
Date: Thu Jun 22 00:51:03 2023 +0800
Subject: md: move initialization and destruction of 'io_acct_set' to md.c
Link:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c567c86b90d4715081adfe5eb812141a5b6b4883
This needs a minor backport, adjusting __md_stop() to md_stop().
[Testcase]
You will need a machine with at least 4x NVMe drives that support block
discard. I use an i3.8xlarge instance on AWS, since it has all of these things.
$ lsblk
xvda 202:0 0 8G 0 disk
└─xvda1 202:1 0 8G 0 part /
nvme0n1 259:2 0 1.7T 0 disk
nvme1n1 259:0 0 1.7T 0 disk
nvme2n1 259:1 0 1.7T 0 disk
nvme3n1 259:3 0 1.7T 0 disk
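Before creating the array, it can help to confirm that four whole NVMe disks are actually present; a small sketch (count_nvme_disks is a hypothetical helper of mine, reading `lsblk -dn -o NAME,TYPE` output on stdin):

```shell
# Sketch: count whole NVMe disks (TYPE=disk, no partitions) from lsblk output.
count_nvme_disks() {
    awk '$2 == "disk" && $1 ~ /^nvme/' | wc -l
}

# Usage on the test machine (expect 4 on an i3.8xlarge):
#   lsblk -dn -o NAME,TYPE | count_nvme_disks
```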
Create a Raid10 array:
$ sudo mdadm --create --verbose /dev/md0 --level=10 --raid-devices=4 \
    /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
Format the array with XFS (use -K to disable initial discard):
$ sudo mkfs.xfs -K /dev/md0
$ sudo mkdir /mnt/disk
$ sudo mount /dev/md0 /mnt/disk
Do a fstrim:
$ sudo fstrim /mnt/disk
Test packages are available in the following PPA:
https://launchpad.net/~mruffell/+archive/ubuntu/sf414897-test
With the test kernel installed, the kernel no longer panics on fstrim.
[Where problems can occur]
This changes io_acct_set from being initialised only for some raid levels
(raid0 and raid456) to being always initialised for all raid types.
If a regression were to occur, it would likely impact block discard on any
raid type, not just raid10. raid10 carries more risk, since we may be missing
more patches: discard support on raid10 is relatively new (roughly the last 5
years), whereas raid0 and raid456 have had full discard support for a decade
or more.
The workaround would be the same: disable the systemd fstrim timer or avoid
running fstrim.
[Other info]
Upstream bug:
https://lists.linaro.org/archives/list/[email protected]/thread/TM2PPS3XKE6M5H2FW63MLZV2T7HTM3QJ/
Debian bug:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1104460
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2117395/+subscriptions
--
Mailing list: https://launchpad.net/~kernel-packages
Post to : [email protected]
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp