Hello there,

> From: "Yann Dirson" <[email protected]>
> To: "Alex Deucher" <[email protected]>
> Cc: "amd-gfx list" <[email protected]>
> Sent: Tuesday 29 July 2025 11:49:55
> Subject: Re: Need help to dig into X11 display freezing (Renoir, Xen/QubesOS)
>
> > From: "Alex Deucher" <[email protected]>
> > To: "Yann Dirson" <[email protected]>
> > Cc: "amd-gfx list" <[email protected]>
> > Sent: Monday 28 July 2025 19:20:13
> > Subject: Re: Need help to dig into X11 display freezing (Renoir, Xen/QubesOS)
> >
> > On Sun, Jul 20, 2025 at 10:39 AM Yann Dirson <[email protected]> wrote:
> > >
> > > Hello there,
> > >
> > > For a few months I've been experiencing occasional freezes of the X11 display
> > > on my QubesOS RENOIR laptop. The setup is pretty much standard for QubesOS,
> > > with both GPUs attached to dom0 and XFCE running there (and the dGPU being
> > > mostly not used). Kernel is QubesOS' kernel-latest-6.15.4.
> > >
> > > Those freezes most often occur while the screen is blanked
> > > by xscreensaver (suspend options fully deactivated here, as suspend is broken
> > > on this platform): in this case moving the mouse does not get the unlock banner
> > > displayed, the screen stays black... except the mouse pointer is visible. I can
> > > also switch to other virtual consoles and interact with the system, but was
> > > never able to collect any evidence of something wrong being detected.
> > >
> > > Occasionally it also happens (like yesterday) while I'm working, and the X11
> > > display just seems frozen, no clue if the input devices trigger anything in
> > > there.
> > >
> > > I guess something goes wrong but gets undetected by the driver. Any suggestion
> > > as to extra logging/debug features to enable?
> >
> > Is this specific to QubesOS or a general problem even on bare metal?
>
> Actually this is my main machine so QubesOS is running most of the time.
> I'm only booting bare metal on this box for targeted tests, and
> never use it long enough then to see the problem trigger.

There is some possibly good news about this issue: while it did not hit me much at the start of the year, it has started to occur more often recently, and there is now a recurring kernel warning, reproduced below, that is logged every second while X11 stays black. There are a few variations between occurrences (every few seconds it is another kworker on a different core), but "vblank wait timed out on crtc 0" is a constant. The first occurrence happens when I try to wake the screensaver.

Does that give any idea of where to dig further?

(from kernel 6.19.14-1.qubes.fc37.x86_64)

May 13 18:44:17 dom0 kernel: ------------[ cut here ]------------
May 13 18:44:17 dom0 kernel: amdgpu 0000:07:00.0: [drm] vblank wait timed out on crtc 0
May 13 18:44:17 dom0 kernel: WARNING: drivers/gpu/drm/drm_vblank.c:1318 at drm_wait_one_vblank+0x179/0x230, CPU#3: kworker/3:1/106
May 13 18:44:17 dom0 kernel: Modules linked in: snd_seq_dummy snd_hrtimer vfat fat snd_sof_amd_acp70 snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_a>
May 13 18:44:17 dom0 kernel: xenfs dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt amdgpu amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec drm_panel_backlight_quirks gpu_sched nvme d>
May 13 18:44:17 dom0 kernel: CPU: 3 UID: 0 PID: 106 Comm: kworker/3:1 Not tainted 6.19.14-1.qubes.fc37.x86_64 #1 PREEMPT(full)
May 13 18:44:17 dom0 kernel: Hardware name: Micro-Star International Co., Ltd. Bravo 17 A4DDK/MS-17FK, BIOS E17FKAMS.117 10/29/2020
May 13 18:44:17 dom0 kernel: Workqueue: events drm_fb_helper_damage_work
May 13 18:44:17 dom0 kernel: RIP: e030:drm_wait_one_vblank+0x17e/0x230
May 13 18:44:17 dom0 kernel: Code: 84 c5 00 00 00 48 8b 7b 08 4c 8b 67 50 4d 85 e4 0f 84 ac 00 00 00 e8 61 00 03 00 48 89 c6 48 8d 3d f7 bf 76 01 89 e9 4c 89 e2 <67> 48 0f b9 3a e9 e5 fe ff>
May 13 18:44:17 dom0 kernel: RSP: e02b:ffffc9004060fda0 EFLAGS: 00010286
May 13 18:44:17 dom0 kernel: RAX: ffffffffc16fb03b RBX: ffff88811ba00010 RCX: 0000000000000000
May 13 18:44:17 dom0 kernel: RDX: ffff8881017863f0 RSI: ffffffffc16fb03b RDI: ffffffff828c7f60
May 13 18:44:17 dom0 kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
May 13 18:44:17 dom0 kernel: R10: 0000000000007ff0 R11: ffff88815579e400 R12: ffff8881017863f0
May 13 18:44:17 dom0 kernel: R13: 00000000004bc778 R14: ffff888101b1f830 R15: ffff8881030a29c0
May 13 18:44:17 dom0 kernel: FS: 0000000000000000(0000) GS:ffff8881d2257000(0000) knlGS:0000000000000000
May 13 18:44:17 dom0 kernel: CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
May 13 18:44:17 dom0 kernel: CR2: 000059ecfa9f3640 CR3: 0000000146bc5000 CR4: 0000000000050660
May 13 18:44:17 dom0 kernel: Call Trace:
May 13 18:44:17 dom0 kernel: <TASK>
May 13 18:44:17 dom0 kernel: ? __pfx_autoremove_wake_function+0x10/0x10
May 13 18:44:17 dom0 kernel: drm_client_modeset_wait_for_vblank+0x5b/0x70
May 13 18:44:17 dom0 kernel: drm_fb_helper_damage_work+0x7e/0x190
May 13 18:44:17 dom0 kernel: process_one_work+0x19b/0x3c0
May 13 18:44:17 dom0 kernel: worker_thread+0x196/0x300
May 13 18:44:17 dom0 kernel: ? __pfx_worker_thread+0x10/0x10
May 13 18:44:17 dom0 kernel: kthread+0xfe/0x240
May 13 18:44:17 dom0 kernel: ? __pfx_kthread+0x10/0x10
May 13 18:44:17 dom0 kernel: ? __pfx_kthread+0x10/0x10
May 13 18:44:17 dom0 kernel: ret_from_fork+0x14a/0x190
May 13 18:44:17 dom0 kernel: ? __pfx_kthread+0x10/0x10
May 13 18:44:17 dom0 kernel: ret_from_fork_asm+0x1a/0x30
May 13 18:44:17 dom0 kernel: </TASK>
May 13 18:44:17 dom0 kernel: ---[ end trace 0000000000000000 ]---
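
To check over a longer log that the crtc index really stays constant while the worker and CPU vary, the warnings could be tallied with a small script along these lines. This is only a sketch: the two regular expressions are modelled on the "vblank wait timed out" and "WARNING:" lines of the trace above and may not match other kernel versions, and the names summarize and SAMPLE are purely illustrative.

```python
import re
from collections import Counter

# Patterns modelled on the two log lines quoted in the trace above.
# Assumption: other kernel versions may format these lines differently.
TIMEOUT_RE = re.compile(r"vblank wait timed out on crtc (\d+)")
WARNING_RE = re.compile(
    r"WARNING: .*drm_wait_one_vblank.*CPU#(\d+): (\S+)/(\d+)\s*$"
)


def summarize(lines):
    """Count vblank timeouts per CRTC and warnings per (CPU, worker name)."""
    crtcs = Counter()
    workers = Counter()
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            crtcs[int(m.group(1))] += 1
            continue
        m = WARNING_RE.search(line)
        if m:
            # group(2) is the task name (e.g. "kworker/3:1"), group(3) its PID.
            workers[(int(m.group(1)), m.group(2))] += 1
    return crtcs, workers


# Two lines copied from the trace above, used as a self-check.
SAMPLE = [
    "May 13 18:44:17 dom0 kernel: amdgpu 0000:07:00.0: [drm] vblank wait "
    "timed out on crtc 0",
    "May 13 18:44:17 dom0 kernel: WARNING: drivers/gpu/drm/drm_vblank.c:1318 "
    "at drm_wait_one_vblank+0x179/0x230, CPU#3: kworker/3:1/106",
]

crtcs, workers = summarize(SAMPLE)
print("timeouts per crtc:", dict(crtcs))
print("warnings per (cpu, worker):", dict(workers))
```

Feeding it the lines of a saved kernel log (for instance the output of journalctl -k written to a file) instead of SAMPLE would show whether any crtc other than 0 ever appears, and how the warnings spread across cores.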
