Control: retitle -1 linux-image-6.5.0-1-amd64: Kernel page fault in process exit due to bit flip
Control: tag -1 moreinfo

On Wed, 2023-09-27 at 20:45 +0200, Gabriel Francisco wrote:
> Package: src:linux
> Version: 6.5.3-1
> Severity: important
> Tags: upstream
> X-Debbugs-Cc: frc.gabr...@gmail.com
> 
> Dear Maintainer,
> 
> First of all thanks for your hard work!
> 
> I noticed my computer started freezing for a few seconds when entering/exiting
> full screen videos in youtube using firefox and while trying to check if the
> issue also affected chromium I saw the following message in dmesg:
> 
> [12569.564300] BUG: unable to handle page fault for address: ffff991989e936b8
> [12569.564304] #PF: supervisor write access in kernel mode
> [12569.564306] #PF: error_code(0x0002) - not-present page

The first BUG message should be more meaningful than what comes after.
It shows that the kernel tried to write to an address with no page
mapped.
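
For reference, the "#PF" lines are just a decoding of the low bits of
the error code.  A minimal sketch of that decoding follows; the bit
meanings are the architectural x86 ones, while the constant names and
the describe_pf() helper are made up here for illustration and are not
kernel code:

```python
# Low three bits of the x86 page-fault error code.
# Names are illustrative, not the kernel's own identifiers.
PF_PRESENT = 1 << 0  # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1    # 0: read access,      1: write access
PF_USER = 1 << 2     # 0: supervisor mode,  1: user mode


def describe_pf(error_code: int) -> str:
    """Describe a page-fault error code in the style of the #PF lines."""
    mode = "user" if error_code & PF_USER else "supervisor"
    access = "write" if error_code & PF_WRITE else "read"
    cause = "protection violation" if error_code & PF_PRESENT else "not-present page"
    return f"{mode} {access} access, {cause}"


print(describe_pf(0x0002))  # supervisor write access, not-present page
```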

> [12569.564308] PGD 0 P4D 0 
> [12569.564311] Oops: 0002 [#1] PREEMPT SMP NOPTI
> [12569.564314] CPU: 10 PID: 328649 Comm: Chroot Helper Not tainted 6.5.0-1-amd64 #1  Debian 6.5.3-1
> [12569.564317] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING WIFI II, BIOS 3205 08/14/2023
> [12569.564318] RIP: 0010:down_write+0x23/0x70
> [12569.564324] Code: 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 53 48 89 fb e8 2e bc ff ff bf 01 00 00 00 e8 74 3a 53 ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 33 65 48 8b 04 25 80 29 03 00 48 89 43 08 bf 01
> [12569.564326] RSP: 0018:ffffa189d736fc70 EFLAGS: 00010246
> [12569.564328] RAX: 0000000000000000 RBX: ffff991989e936b8 RCX: ffff891797aaef00
> [12569.564330] RDX: 0000000000000001 RSI: ffff891989e645c0 RDI: ffffffff8e7c95dc
> [12569.564331] RBP: ffffffffffffffff R08: 0000000000000060 R09: 0000000080400014
> [12569.564333] R10: ffff8918cbfeb7f8 R11: 0000000000000006 R12: 00007f7e5fd00000
> [12569.564334] R13: 0000000000000001 R14: ffff891989e645c0 R15: ffff891989e64958

The CPU registers contain several addresses starting with ffff89,
except for RBX, which starts with ffff99 (and is the faulting address).
0x89 and 0x99 differ in only one bit, so that looks like a single bit
got flipped.
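
The comparison can be checked mechanically.  The sketch below takes RBX
and RSI from the dump above and asks which single-bit flips of RBX
would make its upper bits agree with RSI.  The helper name and the
choice to compare everything above bit 36 are assumptions made for
illustration; the result shows the value is consistent with a one-bit
error, not what the pointer was actually meant to be.

```python
RBX = 0xffff991989e936b8  # faulting address from the oops
RSI = 0xffff891989e645c0  # a plausible nearby pointer from the same dump


def single_bit_repairs(bad: int, good: int, shift: int = 36) -> list[int]:
    """Return bit positions whose flip makes bad's upper bits match good's.

    Only bits at or above `shift` are compared; 36 is an arbitrary
    choice that keeps the top 28 bits of each 64-bit value.
    """
    return [
        bit
        for bit in range(64)
        if (bad ^ (1 << bit)) >> shift == good >> shift
    ]


bits = single_bit_repairs(RBX, RSI)
print(bits)                                # [44]
print([hex(RBX ^ (1 << b)) for b in bits])  # ['0xffff891989e936b8']
```

Exactly one bit (bit 44) has to change for RBX to land in the same
region as RSI, R14 and R15.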

This could be due to a kernel bug, but is more likely a hardware
problem.  Please test the RAM with memtest86+.  Also if you've enabled
any overclocking options, turn those off.

[...]
> After that the computer can't shutdown and systemd keeps waiting on process 
> PID 328649 (Chroot Helper).

This, and the other BUG messages, happen because that process crashed
in kernel mode and couldn't exit properly.

Ben.

-- 
Ben Hutchings
Beware of bugs in the above code;
I have only proved it correct, not tried it. - Donald Knuth
