Hi,
thanks for the patch.
On 15.12.25 at 09:18, Chengjun Yao wrote:
During GPU reset, VBlank interrupts are disabled, which causes
drm_fb_helper_fb_dirty() to block until its VBlank wait times out. This
produces call traces like the following (seen on an RX7900 series dGPU):
[ 101.313646] ------------[ cut here ]------------
[ 101.313648] amdgpu 0000:03:00.0: [drm] vblank wait timed out on crtc 0
[ 101.313657] WARNING: CPU: 0 PID: 461 at drivers/gpu/drm/drm_vblank.c:1320
drm_wait_one_vblank+0x176/0x220
[ 101.313663] Modules linked in: amdgpu amdxcp drm_panel_backlight_quirks
gpu_sched drm_buddy drm_ttm_helper ttm drm_exec drm_suballoc_helper
drm_display_helper cec rc_core i2c_algo_bit nf_conntrack_netlink xt_nat
xt_tcpudp veth xt_conntrack xt_MASQUERADE bridge stp llc xfrm_user xfrm_algo
xt_set ip_set nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
xt_addrtype nft_compat x_tables nf_tables overlay qrtr sunrpc
snd_hda_codec_alc882 snd_hda_codec_realtek_lib snd_hda_codec_generic
snd_hda_codec_atihdmi snd_hda_codec_hdmi snd_hda_intel snd_hda_codec
snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep snd_pcm amd_atl
intel_rapl_msr snd_seq_midi intel_rapl_common asus_ec_sensors
snd_seq_midi_event snd_rawmidi snd_seq eeepc_wmi snd_seq_device edac_mce_amd
asus_wmi polyval_clmulni ghash_clmulni_intel snd_timer platform_profile
aesni_intel wmi_bmof sparse_keymap joydev snd rapl input_leds i2c_piix4
soundcore ccp k10temp i2c_smbus gpio_amdpt mac_hid binfmt_misc sch_fq_codel msr
parport_pc ppdev lp parport
[ 101.313745] efi_pstore nfnetlink dmi_sysfs autofs4 hid_generic usbhid hid
r8169 realtek ahci libahci video wmi
[ 101.313760] CPU: 0 UID: 0 PID: 461 Comm: kworker/0:2 Not tainted
6.18.0-rc6-174403b3b920 #1 PREEMPT(voluntary)
[ 101.313763] Hardware name: ASUS System Product Name/TUF GAMING X670E-PLUS,
BIOS 0821 11/15/2022
[ 101.313765] Workqueue: events drm_fb_helper_damage_work
[ 101.313769] RIP: 0010:drm_wait_one_vblank+0x176/0x220
[ 101.313772] Code: 7c 24 08 4c 8b 77 50 4d 85 f6 0f 84 a1 00 00 00 e8 2f 11 03 00
44 89 e9 4c 89 f2 48 c7 c7 d0 ad 0d a8 48 89 c6 e8 2a e0 4a ff <0f> 0b e9 f2 fe
ff ff 48 85 ff 74 04 4c 8b 67 08 4d 8b 6c 24 50 4d
[ 101.313774] RSP: 0018:ffffc99c00d47d68 EFLAGS: 00010246
[ 101.313777] RAX: 0000000000000000 RBX: 000000000200038a RCX: 0000000000000000
[ 101.313778] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 101.313779] RBP: ffffc99c00d47dc0 R08: 0000000000000000 R09: 0000000000000000
[ 101.313781] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8948c4280010
[ 101.313782] R13: 0000000000000000 R14: ffff894883263a50 R15: ffff89488c384830
[ 101.313784] FS: 0000000000000000(0000) GS:ffff895424692000(0000)
knlGS:0000000000000000
[ 101.313785] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 101.313787] CR2: 00007773650ee200 CR3: 0000000588e40000 CR4: 0000000000f50ef0
[ 101.313788] PKRU: 55555554
[ 101.313790] Call Trace:
[ 101.313791] <TASK>
[ 101.313795] ? __pfx_autoremove_wake_function+0x10/0x10
[ 101.313800] drm_crtc_wait_one_vblank+0x17/0x30
[ 101.313802] drm_client_modeset_wait_for_vblank+0x61/0x80
[ 101.313805] drm_fb_helper_damage_work+0x46/0x1a0
[ 101.313808] process_one_work+0x1a1/0x3f0
[ 101.313812] worker_thread+0x2ba/0x3d0
[ 101.313816] kthread+0x107/0x220
[ 101.313818] ? __pfx_worker_thread+0x10/0x10
[ 101.313821] ? __pfx_kthread+0x10/0x10
[ 101.313823] ret_from_fork+0x202/0x230
[ 101.313826] ? __pfx_kthread+0x10/0x10
[ 101.313828] ret_from_fork_asm+0x1a/0x30
[ 101.313834] </TASK>
[ 101.313835] ---[ end trace 0000000000000000 ]---
Cancel pending damage work synchronously before console_lock() to ensure
any in-flight framebuffer damage operations complete before suspension.
Also check for FBINFO_STATE_RUNNING in drm_fb_helper_damage_work() to
avoid executing damage work if it is rescheduled while the device is suspended.
Fixes: d8c4bddcd8bc ("drm/fb-helper: Synchronize dirty worker with vblank")
Signed-off-by: Aurabindo Pillai <[email protected]>
Signed-off-by: Chengjun Yao <[email protected]>
Reviewed-by: Thomas Zimmermann <[email protected]>
---
drivers/gpu/drm/drm_fb_helper.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index c0343ec16a57..199cca1b5bdd 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -402,6 +402,9 @@ static void drm_fb_helper_damage_work(struct work_struct *work)
{
struct drm_fb_helper *helper = container_of(work, struct drm_fb_helper,
damage_work);
+ if (helper->info->state != FBINFO_STATE_RUNNING)
+ return;
+
drm_fb_helper_fb_dirty(helper);
}
@@ -794,6 +797,13 @@ void drm_fb_helper_set_suspend_unlocked(struct drm_fb_helper *fb_helper,
if (fb_helper->info->state != FBINFO_STATE_RUNNING)
return;
+ /*
+ * Cancel pending damage work. During GPU reset, VBlank
+ * interrupts are disabled and drm_fb_helper_fb_dirty()
+ * would wait for VBlank timeout otherwise.
+ */
+ cancel_work_sync(&fb_helper->damage_work);
+
console_lock();
} else {
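
For readers following the thread, the interplay of the two hunks can be sketched as a small single-threaded userspace model. This is only an illustration under stated assumptions: every name below (fb_model, model_damage_work(), model_cancel_work(), and so on) is hypothetical and not a kernel API, model_cancel_work() merely stands in for cancel_work_sync() by dropping queued work, and the real function additionally waits for a handler that is already running, which a single-threaded model cannot show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; names mirror the patch but are not kernel APIs. */
enum fb_state { FB_STATE_RUNNING, FB_STATE_SUSPENDED };

struct fb_model {
	enum fb_state state;
	bool work_pending;	/* damage work queued but not yet executed */
	int dirty_calls;	/* how often the (vblank-waiting) flush ran */
};

/* Stands in for drm_fb_helper_fb_dirty(), which may wait for a vblank. */
static void model_fb_dirty(struct fb_model *h)
{
	h->dirty_calls++;
}

/* Work handler: bail out early unless the framebuffer is running. */
static void model_damage_work(struct fb_model *h)
{
	assert(h != NULL);
	h->work_pending = false;
	if (h->state != FB_STATE_RUNNING)
		return;
	model_fb_dirty(h);
}

/* Queue damage work, as a write to the framebuffer would. */
static void model_schedule_damage(struct fb_model *h)
{
	assert(h != NULL);
	h->work_pending = true;
}

/* Simplified stand-in for cancel_work_sync(): drops queued work only. */
static bool model_cancel_work(struct fb_model *h)
{
	bool was_pending = h->work_pending;

	h->work_pending = false;
	return was_pending;
}

/* Suspend path: cancel queued work first, then leave the running state. */
static void model_set_suspend(struct fb_model *h, bool suspend)
{
	assert(h != NULL);
	if (suspend) {
		if (h->state != FB_STATE_RUNNING)
			return;
		model_cancel_work(h);
		h->state = FB_STATE_SUSPENDED;
	} else {
		h->state = FB_STATE_RUNNING;
	}
}

/* Execute queued work, if any, as a workqueue thread would. */
static void model_run_pending(struct fb_model *h)
{
	assert(h != NULL);
	if (h->work_pending)
		model_damage_work(h);
}
```

In this model, work queued before model_set_suspend(h, true) never reaches model_fb_dirty(), and work that is rescheduled while suspended is filtered by the state check in the handler. Those are the two cases the commit message describes.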
--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)