Control: reassign -1 src:linux
Dear Håkan,
thanks for reporting back and testing!
* Håkan T Johansson [220801 19:31]:
> On Sun, 31 Jul 2022, Chris Hofstaedtler wrote:
>
> > I can't see a difference that should matter from userspace.
> >
> > I have stared a bit at the kernel code... there have been quite some
> > changes and fixes in this area. Which kernel version were you
> > running when testing this?
> >
> > Could you retry on something >= 5.9? I.e. some version with patch
> > 08fc1ab6d748ab1a690fd483f41e2938984ce353.
>
> I believe that I was running 5.10 (bullseye).
>
> It looks like 5.18 (from backports) does not show the issue! (i.e. works)
Okay, I think we are now clearly in "this is not an mdadm bug per
se" territory (-> reassigning to src:linux).
[..]
> This time I did get some dmesg BUG output as well (attached).
> It does not seem to be the same backtrace on two occurrences.
>
> I also noticed that the BUG: report in dmesg does not happen directly
> when doing 'mdadm --examine --scan --config=partitions'. It rather
> occurs when some activity happens on the host filesystem, e.g.
> a 'touch /root/a' command.
>
> host:
> linux-image-5.18.0-0.bpo.1-amd64 5.18.2-1~bpo11+1
>
> (did not re-install anything else, except upgraded zfs, also from
> backports (since pure bullseye would not compile with 5.18))
>
> Does not exhibit the problem.
>
> I have tried with both kernels several times, and it was repeatable that
> 5.10 got stuck while 5.18 does not show issues.
It's good that this now works in 5.18. However, I'm not sure how we
should find the commit fixing this - in 5.14 lots of block layer
code was shuffled around/refactored.
If you have the time, maybe trying the various kernel versions
between 5.10 and 5.18 would be a good start. If they are not in
backports anymore, they should still be at
http://snapshot.debian.org/package/linux/
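
To keep the number of reboots down, the versions could be tried in
bisection order rather than one after another. A minimal sketch,
assuming the list is sorted oldest to newest, the first entry is known
bad, the last is known good, and the fix was never reverted in between.
The function name and the callback are made up for illustration; in
practice "works" means booting that kernel and running the test by hand.

```python
from typing import Callable, Sequence


def first_fixed(versions: Sequence[str], works: Callable[[str], bool]) -> str:
    """Return the oldest version for which works() reports success.

    Assumptions (not checked): versions[0] is bad, versions[-1] is good,
    and every version after the first good one is also good.
    """
    lo, hi = 0, len(versions) - 1  # lo: known bad, hi: known good
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if works(versions[mid]):
            hi = mid  # fix is at mid or earlier
        else:
            lo = mid  # fix is after mid
    return versions[hi]
```

With six candidate versions this needs at most three test boots. The
result only names the first working version; finding the actual commit
would still need a git bisect between that version and its predecessor.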
> Reminder: to get the issue, /dev/ should not be mounted in the chroot.
> With /dev/ mounted, 5.10 also works.
I'll see if I can repro this on 5.10, but need to find a box first.
Best,
Chris
> [mån aug 1 15:53:08 2022] BUG: kernel NULL pointer dereference, address:
> 0010
> [mån aug 1 15:53:08 2022] #PF: supervisor read access in kernel mode
> [mån aug 1 15:53:08 2022] #PF: error_code(0x) - not-present page
> [mån aug 1 15:53:08 2022] PGD 0 P4D 0
> [mån aug 1 15:53:08 2022] Oops: [#1] SMP PTI
> [mån aug 1 15:53:08 2022] CPU: 2 PID: 284256 Comm: cron Tainted: P
> OE 5.10.0-16-amd64 #1 Debian 5.10.127-2
> [mån aug 1 15:53:08 2022] Hardware name: Dell Computer Corporation PowerEdge
> 2850/0T7971, BIOS A04 09/22/2005
> [mån aug 1 15:53:08 2022] RIP:
> 0010:__ext4_journal_get_write_access+0x29/0x120 [ext4]
> [mån aug 1 15:53:08 2022] Code: 00 0f 1f 44 00 00 41 57 41 56 41 89 f6 41 55
> 41 54 49 89 d4 55 48 89 cd 53 48 83 ec 10 48 89 3c 24 e8 ab d7 bb e1 48 8b 45
> 30 <4c> 8b 78 10 4d 85 ff 74 2f 49 8b 87 e0 00 00 00 49 8b 9f 88 03 00
> [mån aug 1 15:53:08 2022] RSP: 0018:ae27c059fd60 EFLAGS: 00010246
> [mån aug 1 15:53:08 2022] RAX: RBX: 9d1b94505480 RCX:
> 9d1bc52e5e38
> [mån aug 1 15:53:08 2022] RDX: 9d1bc13782d8 RSI: 0c14 RDI:
> c096feb0
> [mån aug 1 15:53:08 2022] RBP: 9d1bc52e5e38 R08: 9d1be04d5230 R09:
> 0001
> [mån aug 1 15:53:08 2022] R10: 9d1bc985f000 R11: 001d R12:
> 9d1bc13782d8
> [mån aug 1 15:53:08 2022] R13: 9d1be04d5000 R14: 0c14 R15:
> 9d1bc13782d8
> [mån aug 1 15:53:08 2022] FS: 7fed5ecb1840()
> GS:9d1cd7c8() knlGS:
> [mån aug 1 15:53:08 2022] CS: 0010 DS: ES: CR0: 80050033
> [mån aug 1 15:53:08 2022] CR2: 0010 CR3: 0001a46d8000 CR4:
> 06e0
> [mån aug 1 15:53:08 2022] Call Trace:
> [mån aug 1 15:53:08 2022] ext4_orphan_del+0x23f/0x290 [ext4]
> [mån aug 1 15:53:08 2022] ext4_evict_inode+0x31f/0x630 [ext4]
> [mån aug 1 15:53:08 2022] evict+0xd1/0x1a0
> [mån aug 1 15:53:08 2022] __dentry_kill+0xe4/0x180
> [mån aug 1 15:53:08 2022] dput+0x149/0x2f0
> [mån aug 1 15:53:08 2022] __fput+0xe4/0x240
> [mån aug 1 15:53:08 2022] task_work_run+0x65/0xa0
> [mån aug 1 15:53:08 2022] exit_to_user_mode_prepare+0x111/0x120
> [mån aug 1 15:53:08 2022] syscall_exit_to_user_mode+0x28/0x140
> [mån aug 1 15:53:08 2022] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [mån aug 1 15:53:08 2022] RIP: 0033:0x7fed5eea2d77
> [mån aug 1 15:53:08 2022] Code: 44 00 00 48 8b 15 19 a1 0c 00 f7 d8 64 89 02
> b8 ff ff ff ff eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 03 00 00 00 0f
> 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 e9 a0 0c 00 f7 d8 64 89 02 b8
> [mån aug 1 15:53:08 2022] RSP: 002b:7ffd50452818 EFLAGS: 0202
> ORIG_RAX: 0003
> [mån aug 1 15:53:08 2022] RAX: RBX: 55dab4578910 RCX:
> 7fed5eea2d77
> [mån au