Hi,

thanks for the patch.

On 15.12.25 at 09:18, Chengjun Yao wrote:
During GPU reset, VBlank interrupts are disabled, which causes
drm_fb_helper_fb_dirty() to block until its VBlank wait times out. This
produces call traces like the following (seen on an RX7900 series dGPU):

[  101.313646] ------------[ cut here ]------------
[  101.313648] amdgpu 0000:03:00.0: [drm] vblank wait timed out on crtc 0
[  101.313657] WARNING: CPU: 0 PID: 461 at drivers/gpu/drm/drm_vblank.c:1320 drm_wait_one_vblank+0x176/0x220
[  101.313663] Modules linked in: amdgpu amdxcp drm_panel_backlight_quirks 
gpu_sched drm_buddy drm_ttm_helper ttm drm_exec drm_suballoc_helper 
drm_display_helper cec rc_core i2c_algo_bit nf_conntrack_netlink xt_nat 
xt_tcpudp veth xt_conntrack xt_MASQUERADE bridge stp llc xfrm_user xfrm_algo 
xt_set ip_set nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 
xt_addrtype nft_compat x_tables nf_tables overlay qrtr sunrpc 
snd_hda_codec_alc882 snd_hda_codec_realtek_lib snd_hda_codec_generic 
snd_hda_codec_atihdmi snd_hda_codec_hdmi snd_hda_intel snd_hda_codec 
snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep snd_pcm amd_atl 
intel_rapl_msr snd_seq_midi intel_rapl_common asus_ec_sensors 
snd_seq_midi_event snd_rawmidi snd_seq eeepc_wmi snd_seq_device edac_mce_amd 
asus_wmi polyval_clmulni ghash_clmulni_intel snd_timer platform_profile 
aesni_intel wmi_bmof sparse_keymap joydev snd rapl input_leds i2c_piix4 
soundcore ccp k10temp i2c_smbus gpio_amdpt mac_hid binfmt_misc sch_fq_codel msr 
parport_pc ppdev lp parport
[  101.313745]  efi_pstore nfnetlink dmi_sysfs autofs4 hid_generic usbhid hid 
r8169 realtek ahci libahci video wmi
[  101.313760] CPU: 0 UID: 0 PID: 461 Comm: kworker/0:2 Not tainted 6.18.0-rc6-174403b3b920 #1 PREEMPT(voluntary)
[  101.313763] Hardware name: ASUS System Product Name/TUF GAMING X670E-PLUS, BIOS 0821 11/15/2022
[  101.313765] Workqueue: events drm_fb_helper_damage_work
[  101.313769] RIP: 0010:drm_wait_one_vblank+0x176/0x220
[  101.313772] Code: 7c 24 08 4c 8b 77 50 4d 85 f6 0f 84 a1 00 00 00 e8 2f 11 03 00 44 89 e9 4c 89 f2 48 c7 c7 d0 ad 0d a8 48 89 c6 e8 2a e0 4a ff <0f> 0b e9 f2 fe ff ff 48 85 ff 74 04 4c 8b 67 08 4d 8b 6c 24 50 4d
[  101.313774] RSP: 0018:ffffc99c00d47d68 EFLAGS: 00010246
[  101.313777] RAX: 0000000000000000 RBX: 000000000200038a RCX: 0000000000000000
[  101.313778] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  101.313779] RBP: ffffc99c00d47dc0 R08: 0000000000000000 R09: 0000000000000000
[  101.313781] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8948c4280010
[  101.313782] R13: 0000000000000000 R14: ffff894883263a50 R15: ffff89488c384830
[  101.313784] FS:  0000000000000000(0000) GS:ffff895424692000(0000) knlGS:0000000000000000
[  101.313785] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  101.313787] CR2: 00007773650ee200 CR3: 0000000588e40000 CR4: 0000000000f50ef0
[  101.313788] PKRU: 55555554
[  101.313790] Call Trace:
[  101.313791]  <TASK>
[  101.313795]  ? __pfx_autoremove_wake_function+0x10/0x10
[  101.313800]  drm_crtc_wait_one_vblank+0x17/0x30
[  101.313802]  drm_client_modeset_wait_for_vblank+0x61/0x80
[  101.313805]  drm_fb_helper_damage_work+0x46/0x1a0
[  101.313808]  process_one_work+0x1a1/0x3f0
[  101.313812]  worker_thread+0x2ba/0x3d0
[  101.313816]  kthread+0x107/0x220
[  101.313818]  ? __pfx_worker_thread+0x10/0x10
[  101.313821]  ? __pfx_kthread+0x10/0x10
[  101.313823]  ret_from_fork+0x202/0x230
[  101.313826]  ? __pfx_kthread+0x10/0x10
[  101.313828]  ret_from_fork_asm+0x1a/0x30
[  101.313834]  </TASK>
[  101.313835] ---[ end trace 0000000000000000 ]---

Cancel pending damage work synchronously before console_lock() to ensure
any in-flight framebuffer damage operations complete before suspension.

Also check for FBINFO_STATE_RUNNING in drm_fb_helper_damage_work() to
avoid executing damage work if it is rescheduled while the device is suspended.

Fixes: d8c4bddcd8bc ("drm/fb-helper: Synchronize dirty worker with vblank")
Signed-off-by: Aurabindo Pillai <[email protected]>
Signed-off-by: Chengjun Yao <[email protected]>

Reviewed-by: Thomas Zimmermann <[email protected]>

---
  drivers/gpu/drm/drm_fb_helper.c | 10 ++++++++++
  1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index c0343ec16a57..199cca1b5bdd 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -402,6 +402,9 @@ static void drm_fb_helper_damage_work(struct work_struct *work)
  {
        struct drm_fb_helper *helper = container_of(work, struct drm_fb_helper, damage_work);
+       if (helper->info->state != FBINFO_STATE_RUNNING)
+               return;
+
        drm_fb_helper_fb_dirty(helper);
  }
@@ -794,6 +797,13 @@ void drm_fb_helper_set_suspend_unlocked(struct drm_fb_helper *fb_helper,
                if (fb_helper->info->state != FBINFO_STATE_RUNNING)
                        return;
+               /*
+                * Cancel pending damage work. During GPU reset, VBlank
+                * interrupts are disabled and drm_fb_helper_fb_dirty()
+                * would wait for VBlank timeout otherwise.
+                */
+               cancel_work_sync(&fb_helper->damage_work);
+
                console_lock();
        } else {
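
For illustration only, the interplay of the two hunks can be modelled in
plain userspace C. This is a rough sketch, not kernel code: every name
prefixed with model_ is hypothetical, the "workqueue" is a single pending
flag, and the model is single-threaded, so it cannot show that the real
cancel_work_sync() also waits for a work item that is already running. It
also folds the state change and the VBlank disable into the suspend helper,
which is a simplification of what the fbdev core and the driver do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the fbdev state; not kernel definitions. */
enum model_state { MODEL_STATE_RUNNING, MODEL_STATE_SUSPENDED };

struct model_helper {
	enum model_state state;
	bool vblank_enabled;  /* false while the (modelled) GPU reset runs */
	bool work_pending;    /* damage work queued but not yet executed */
	int flushes;          /* damage flushes that actually ran */
	int vblank_timeouts;  /* flushes that would have hit the WARN */
};

/* Models the worker with the added state check (first hunk). */
static void model_damage_work(struct model_helper *h)
{
	assert(h != NULL);
	h->work_pending = false;
	if (h->state != MODEL_STATE_RUNNING)
		return; /* suspended: skip the flush and the vblank wait */
	if (!h->vblank_enabled)
		h->vblank_timeouts++; /* corresponds to the timeout in the trace */
	h->flushes++;
}

/* Models queueing the damage work. */
static void model_schedule_damage(struct model_helper *h)
{
	assert(h != NULL);
	h->work_pending = true;
}

/* Models only the "drop pending work" half of cancel_work_sync(). */
static bool model_cancel_work_sync(struct model_helper *h)
{
	bool was_pending = h->work_pending;

	h->work_pending = false;
	return was_pending;
}

/* Models the suspend/resume path with the added cancel (second hunk). */
static void model_set_suspend(struct model_helper *h, bool suspend)
{
	assert(h != NULL);
	if (suspend) {
		if (h->state != MODEL_STATE_RUNNING)
			return;
		(void)model_cancel_work_sync(h);
		h->state = MODEL_STATE_SUSPENDED;
		h->vblank_enabled = false;
	} else {
		if (h->state == MODEL_STATE_RUNNING)
			return;
		h->state = MODEL_STATE_RUNNING;
		h->vblank_enabled = true;
	}
}

/* Runs the pending work item, as a workqueue thread would. */
static void model_run_pending(struct model_helper *h)
{
	assert(h != NULL);
	if (h->work_pending)
		model_damage_work(h);
}
```

In this model, work queued before suspend is dropped by the cancel, and
work queued while suspended returns early from the worker, so the timeout
counter stays at zero until the helper is resumed.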

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)

