Hi Dmitry, This is great. I will test this and include the fix in the patch.
> -----Original Message----- > From: Dmitry Osipenko <dmitry.osipe...@collabora.com> > Sent: Monday, June 30, 2025 7:00 PM > To: Kim, Dongwon <dongwon....@intel.com>; dri- > de...@lists.freedesktop.org > Cc: Kasireddy, Vivek <vivek.kasire...@intel.com> > Subject: Re: [RFC PATCH v2 0/2] Virtio-GPU suspend and resume > > On 6/18/25 01:41, Kim, Dongwon wrote: > ... > >> Have you figured out why 10ms workaround is needed? > > > > [Kim, Dongwon] Unfortunately, I don't know why it fails without the > > delay. I wanted to narrow down further so enabled printk during > > suspend and resume but hang didn't occur with the timing changes > > caused by printks. I've also tried more deterministic methods that > > make it wait based on some kinds of "status" but none of them have > worked so far. If you have any suggestions on possible condition we can > check instead of just sleeping, please let me know. > > 10ms seems to be close to minimum to make it work 100% for several > > days (rtcwake sleep and wake up every 5 sec). > > Was able to reproduce the hang and got a crash backtrace with > no_console_suspend: > > [ 63.824827] PM: suspend entry (deep) > [ 63.825041] Filesystems sync: 0.000 seconds > [ 63.990951] Freezing user space processes > [ 63.992488] Freezing user space processes completed (elapsed 0.001 > seconds) > [ 63.992775] OOM killer disabled. > [ 63.992902] Freezing remaining freezable tasks > [ 63.994099] Freezing remaining freezable tasks completed (elapsed 0.001 > seconds) > [ 64.002183] Oops: general protection fault, probably for non-canonical > address 0x2abe0ea26847fb08: 0000 [#1] SMP NOPTI > [ 64.003172] CPU: 9 UID: 0 PID: 178 Comm: kworker/9:2 Not tainted 6.15.4- > 00002-g01117b4373b2-dirty #123 PREEMPT(voluntary) > [ 64.003614] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS > rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 > [ 64.004036] Workqueue: events virtio_gpu_dequeue_ctrl_func > [ 64.004280] RIP: 0010:virtqueue_get_buf_ctx_split+0x86/0x130 > [ 64.004515] Code: 01 66 23 43 50 0f b7 c0 8b 74 c1 04 8b 44 c1 08 41 89 45 > 00 3b 73 58 0f 83 96 d7 20 ff 89 f0 48 c1 e0 04 48 03 83 80 00 00 00 <4c> 8b > 20 > 4d 85 e4 0f 84 5a d7 20 ff 48 89 df e8 46 fc ff ff 0f b7 > [ 64.005227] RSP: 0018:ffffc90000b53d90 EFLAGS: 00010202 > [ 64.005430] RAX: 2abe0ea26847fb08 RBX: ffff888102d58a00 RCX: > ffff8881255314c0 > [ 64.005698] RDX: 0000000000000000 RSI: 0000000000000008 RDI: > ffff888102d58a00 > [ 64.005975] RBP: ffffc90000b53db0 R08: 8080808080808080 R09: > ffff88885b470b40 > [ 64.006273] R10: ffff8881000508c8 R11: fefefefefefefeff R12: > 0000000000000001 > [ 64.006907] R13: ffffc90000b53dfc R14: ffffc90000b53dfc R15: > ffff8881032d0568 > [ 64.007205] FS: 0000000000000000(0000) GS:ffff8888d6650000(0000) > knlGS:0000000000000000 > [ 64.007511] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 64.007732] CR2: 00007efedc4d3000 CR3: 00000001056e9000 CR4: > 0000000000750ef0 > [ 64.008014] PKRU: 55555554 > [ 64.008123] Call Trace: > [ 64.008223] <TASK> > [ 64.008314] virtqueue_get_buf+0x46/0x60 > [ 64.008465] virtio_gpu_dequeue_ctrl_func+0x86/0x2a0 > [ 64.008655] process_one_work+0x18a/0x370 > [ 64.008823] worker_thread+0x31a/0x460 > [ 64.008971] ? _raw_spin_unlock_irqrestore+0x27/0x50 > [ 64.009176] ? srso_alias_return_thunk+0x5/0xfbef5 > [ 64.009369] ? __pfx_worker_thread+0x10/0x10 > [ 64.009532] kthread+0x126/0x230 > [ 64.009662] ? _raw_spin_unlock_irq+0x1f/0x40 > [ 64.009836] ? __pfx_kthread+0x10/0x10 > [ 64.009986] ret_from_fork+0x3a/0x60 > [ 64.010156] ? __pfx_kthread+0x10/0x10 > [ 64.010318] ret_from_fork_asm+0x1a/0x30 > [ 64.010507] </TASK> > [ 64.010616] Modules linked in: > [ 64.010785] ---[ end trace 0000000000000000 ]--- > > == > > The trace tells that virtio queue is active after it has been removed. This > change fixes the crash, please test: > > diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c > b/drivers/gpu/drm/virtio/virtgpu_drv.c > index 03ab78b44ab3..48bb21f33306 100644 > --- a/drivers/gpu/drm/virtio/virtgpu_drv.c > +++ b/drivers/gpu/drm/virtio/virtgpu_drv.c > @@ -187,6 +187,10 @@ static int virtgpu_freeze(struct virtio_device *vdev) > flush_work(&vgdev->ctrlq.dequeue_work); > flush_work(&vgdev->cursorq.dequeue_work); > flush_work(&vgdev->config_changed_work); > + wait_event(vgdev->ctrlq.ack_queue, > + vgdev->ctrlq.vq->num_free == vgdev->ctrlq.vq->num_max); > + wait_event(vgdev->cursorq.ack_queue, > + vgdev->cursorq.vq->num_free == > + vgdev->cursorq.vq->num_max); > vdev->config->del_vqs(vdev); > > return 0; > > -- > Best regards, > Dmitry