Just an update on this bug.

I found this link: https://gitlab.freedesktop.org/drm/amd/-/issues/2877
where other people are facing the same issue as me. I can confirm that
disabling FreeSync in my monitor settings makes the freezes/hangs
disappear.

Best,

*Gabriel Francisco*
Linux User #507840
email: frc.gabriel[at]gmail.com <[email protected]>


On Thu, Oct 12, 2023 at 8:36 PM Gabriel Francisco <[email protected]>
wrote:

> ---------- Forwarded message ---------
> From: Gabriel Francisco <[email protected]>
> Date: Thu, Oct 12, 2023 at 8:23 PM
> Subject: Re: Bug#1053122: linux-image-6.5.0-1-amd64: using
> smp_processor_id() in preemptible
> To: Ben Hutchings <[email protected]>
>
>
> Hi,
>
> > The CPU registers contain several addresses starting ffff89, except for
> > rbx which starts ffff99 (and is the faulting address).  That looks like
> > a single bit got flipped.
>
> Thanks for the explanation! (now I know how to detect bit flips) :D
>
> > The first BUG message should be more meaningful than what comes after.
> > This shows the kernel tried to access non-existent memory.
>
> Yes, I should indeed have reported the first one. I overthought it and
> ended up reporting the second one. Sorry about that.
>
> > This could be due to a kernel bug, but is more likely a hardware
> > problem.  Please test the RAM with memtest86+.  Also if you've enabled
> > any overclocking options, turn those off.
>
> Even with XMP([email protected]) enabled (F4-3000C16-16GISB), memtest86+ ran for
> 3 hours and printed PASS on the screen.
> I have since disabled the XMP profile for my memory and ordered new RAM to
> check whether my current modules are faulty.
>
> The message in dmesg appeared on only one occasion (but I reported it anyway).
>
> The hang still happens, with or without XMP, when running the 6.5.x kernel
> series. It happens when maximizing a video (or from time to time when my
> cursor enters the video area). It does not happen with the 6.1.x kernel
> series.
>
> I'm using the amdgpu module.
>
> Greetings,
>
> *Gabriel Francisco*
> Linux User #507840
> email: frc.gabriel[at]gmail.com <[email protected]>
>
>
> On Thu, Oct 5, 2023 at 1:15 AM Ben Hutchings <[email protected]> wrote:
>
>> Control: retitle -1 linux-image-6.5.0-1-amd64: Kernel page fault in process exit due to bit flip
>> Control: tag -1 moreinfo
>>
>> On Wed, 2023-09-27 at 20:45 +0200, Gabriel Francisco wrote:
>> > Package: src:linux
>> > Version: 6.5.3-1
>> > Severity: important
>> > Tags: upstream
>> > X-Debbugs-Cc: [email protected]
>> >
>> > Dear Maintainer,
>> >
>> > First of all thanks for your hard work!
>> >
>> > I noticed my computer started freezing for a few seconds when entering/exiting
>> > full-screen videos on YouTube using Firefox, and while trying to check if the
>> > issue also affected Chromium I saw the following message in dmesg:
>> >
>> > [12569.564300] BUG: unable to handle page fault for address: ffff991989e936b8
>> > [12569.564304] #PF: supervisor write access in kernel mode
>> > [12569.564306] #PF: error_code(0x0002) - not-present page
>>
>> The first BUG message should be more meaningful than what comes after.
>> This shows the kernel tried to access non-existent memory.
>>
>> > [12569.564308] PGD 0 P4D 0
>> > [12569.564311] Oops: 0002 [#1] PREEMPT SMP NOPTI
>> > [12569.564314] CPU: 10 PID: 328649 Comm: Chroot Helper Not tainted 6.5.0-1-amd64 #1  Debian 6.5.3-1
>> > [12569.564317] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING WIFI II, BIOS 3205 08/14/2023
>> > [12569.564318] RIP: 0010:down_write+0x23/0x70
>> > [12569.564324] Code: 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 53 48 89 fb e8 2e bc ff ff bf 01 00 00 00 e8 74 3a 53 ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 33 65 48 8b 04 25 80 29 03 00 48 89 43 08 bf 01
>> > [12569.564326] RSP: 0018:ffffa189d736fc70 EFLAGS: 00010246
>> > [12569.564328] RAX: 0000000000000000 RBX: ffff991989e936b8 RCX: ffff891797aaef00
>> > [12569.564330] RDX: 0000000000000001 RSI: ffff891989e645c0 RDI: ffffffff8e7c95dc
>> > [12569.564331] RBP: ffffffffffffffff R08: 0000000000000060 R09: 0000000080400014
>> > [12569.564333] R10: ffff8918cbfeb7f8 R11: 0000000000000006 R12: 00007f7e5fd00000
>> > [12569.564334] R13: 0000000000000001 R14: ffff891989e645c0 R15: ffff891989e64958
>>
>> The CPU registers contain several addresses starting ffff89, except for
>> rbx which starts ffff99 (and is the faulting address).  That looks like
>> a single bit got flipped.
>>
>> This could be due to a kernel bug, but is more likely a hardware
>> problem.  Please test the RAM with memtest86+.  Also if you've enabled
>> any overclocking options, turn those off.
>>
>> [...]
>> > After that the computer can't shut down and systemd keeps waiting on
>> > process PID 328649 (Chroot Helper).
>>
>> This (and the other BUG messages) are because that process crashed in
>> kernel mode and couldn't properly exit.
>>
>> Ben.
>>
>> --
>> Ben Hutchings
>> Beware of bugs in the above code;
>> I have only proved it correct, not tried it. - Donald Knuth
>>
>>
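
The single-bit-flip reading of the register dump quoted above can be checked
with a few lines of Python. This is only a sketch: the register value is
copied from the oops, the ffff89 prefix is the one seen in the other pointer
registers, and `candidate` is a hypothetical "intended" pointer, since the
log does not show what the correct value of rbx actually was.

```python
def differing_bits(a, b):
    """Return the positions (0 = least significant) of the bits where a and b differ."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# Faulting address (RBX) copied from the oops output.
rbx = 0xffff991989e936b8

# Prefix shared by the other pointer registers (RCX, RSI, R10, R14, R15).
healthy_prefix = 0xffff89

# Compare only the top 24 bits of the 64-bit address against that prefix.
PREFIX_SHIFT = 40
faulty_prefix = rbx >> PREFIX_SHIFT

# Bit positions expressed relative to the full 64-bit address.
bits = [b + PREFIX_SHIFT for b in differing_bits(faulty_prefix, healthy_prefix)]

# Assumption: the intended pointer is RBX with those bits flipped back.
candidate = rbx
for b in bits:
    candidate ^= 1 << b

print("differing bit positions:", bits)
print("hypothetical intended pointer:", hex(candidate))
```

If the two prefixes differ in exactly one position, that is consistent with
a single flipped bit, although it does not by itself distinguish a hardware
fault from a kernel bug.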
