On 29/06/2026 15:11, Dmitry Osipenko wrote:
> Hi,
>
> On 6/18/26 10:18, Ryosuke Yasuoka wrote:
>> virtio_gpu_queue_ctrl_sgs() and virtio_gpu_queue_cursor() wait for
>> virtqueue space using wait_event() with vqs_released as the only abort
>> condition. This covers the device removal path, where
>> virtio_gpu_release_vqs() sets the flag, but does not help when the host
>> simply stops processing the virtqueue while the device remains present.
>>
>> In that case, the virtqueue fills up and subsequent command submissions
>> block indefinitely in D state with no diagnostic output, making the root
>> cause difficult to identify.
>>
>> Replace the bare wait_event() with a wait_event_timeout() loop that
>> prints a warning every 5 seconds while the virtqueue remains full. The
>> wait still blocks indefinitely so driver behavior is unchanged. The
>> warnings help identify an unresponsive host device during
>> troubleshooting.
>>
>> Signed-off-by: Ryosuke Yasuoka <[email protected]>
>> ---
>> When the host stops processing the virtio-gpu virtqueue without
>> triggering device removal, the bare wait_event() in
>> virtio_gpu_queue_ctrl_sgs() and virtio_gpu_queue_cursor() blocks
>> indefinitely with no diagnostic output. A DRM atomic commit worker
>> blocks in virtio_gpu_queue_fenced_ctrl_buffer() while holding the
>> modeset lock. During graceful shutdown, systemd (PID 1) needs the same
>> lock — either by writing to the console via fbcon, or by closing a DRM
>> file descriptor that triggers framebuffer cleanup — and blocks as well,
>> making the VM unrecoverable without a forced power-off.
>>
>> PID: 553 COMMAND: "kworker/u4:3"
>> #0 __schedule
>> #1 schedule
>> #2 virtio_gpu_queue_fenced_ctrl_buffer [virtio_gpu]
>> #3 virtio_gpu_primary_plane_update [virtio_gpu]
>> ...
>>
>> PID: 1 COMMAND: "systemd" (console write path)
>> #0 __schedule
>> #1 schedule
>> #2 schedule_preempt_disabled
>> #3 __ww_mutex_lock
>> #4 drm_modeset_lock [drm]
>> #5 drm_atomic_get_plane_state [drm]
>> #6 drm_client_modeset_commit_atomic [drm]
>> #7 drm_client_modeset_commit_locked [drm]
>> #8 drm_fb_helper_pan_display [drm_kms_helper]
>> #9 fb_pan_display
>> #10 bit_update_start
>> #11 fbcon_switch
>> #12 redraw_screen
>> ...
>>
>> Reproduction steps:
>> 1. Build QEMU with the fault injection patch [1] that adds an
>> x-ctrl-queue-broken property to virtio-gpu.
>> 2. Boot the VM and trigger the fault injection from the host.
>> 3. Fill the ctrlq (e.g., move the mouse on the guest's display).
>> The process gets stuck in virtio_gpu_queue_fenced_ctrl_buffer()
>> in D state.
>> 4. Run a graceful shutdown command (shutdown now or reboot).
>> 5. The shutdown process hangs.
>>
>> My earlier patch a46991b334f6 ("drm/virtio: abort virtqueue wait on
>> device removal to avoid hung task") covers the case where the shutdown
>> process reaches the device_shutdown() call path, which sets vqs_released
>> to unblock the wait. However, during graceful shutdown, systemd (PID 1)
>> gets stuck on the modeset lock before ever reaching device_shutdown(),
>> so vqs_released is never set and the wait is never unblocked.
>>
>> I initially considered adding a module parameter to abort the wait with
>> -ENODEV on timeout:
>>
>> +static unsigned int virtio_gpu_vq_timeout;
>> +MODULE_PARM_DESC(vq_timeout,
>> + "Timeout in seconds for virtqueue wait (0 = no timeout, default)");
>> +module_param_named(vq_timeout, virtio_gpu_vq_timeout, uint, 0444);
>> ...
>> + if (virtio_gpu_vq_timeout) {
>> + if (!wait_event_timeout(vgdev->ctrlq.ack_queue,
>> + vq->num_free >= elemcnt ||
>> + vgdev->vqs_released,
>> +
>> secs_to_jiffies(virtio_gpu_vq_timeout))) {
>> + if (fence && vbuf->objs)
>> +
>> virtio_gpu_array_unlock_resv(vbuf->objs);
>> + free_vbuf(vgdev, vbuf);
>> + drm_dev_exit(idx);
>> + return -ENODEV;
>> + }
>> + } else {
>> + wait_event(vgdev->ctrlq.ack_queue,
>> + vq->num_free >= elemcnt ||
>> + vgdev->vqs_released);
>> + }
>>
>> This approach aborts the wait and allows the graceful shutdown process
>> to eventually proceed, albeit with a delay.
>>
>> But that approach has drawbacks: it allows users to set arbitrarily
>> short timeouts that could destabilize the driver, and aborting commands
>> mid-flight is a rough recovery path. An unconditional timeout was also
>> discussed previously [2] but is not appropriate without virtio
>> specification support.
>>
>> This patch takes a safer approach: replace the bare wait_event() with
>> wait_event_timeout() and print a warning every 5 seconds while the
>> virtqueue remains full. The wait still blocks indefinitely and no
>> commands are aborted, so driver behavior is unchanged. The warnings
>> help identify an unresponsive host device during troubleshooting.
>> Once the user notices the warning, they can work around the hang by
>> unbinding the VT from fbcon, removing the device, or forcing a shutdown
>> via SysRq.
>>
>> [1] https://gist.github.com/YsuOS/fbcd181752594af35f954953a1d260b8
>> [2]
>> https://lore.kernel.org/all/[email protected]/
>> ---
>> drivers/gpu/drm/virtio/virtgpu_vq.c | 18 ++++++++++++++----
>> 1 file changed, 14 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c
>> b/drivers/gpu/drm/virtio/virtgpu_vq.c
>> index 68d097ad9d1d..a546130d3b6a 100644
>> --- a/drivers/gpu/drm/virtio/virtgpu_vq.c
>> +++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
>> @@ -410,8 +410,13 @@ static int virtio_gpu_queue_ctrl_sgs(struct
>> virtio_gpu_device *vgdev,
>> if (vq->num_free < elemcnt) {
>> spin_unlock(&vgdev->ctrlq.qlock);
>> virtio_gpu_notify(vgdev);
>> - wait_event(vgdev->ctrlq.ack_queue,
>> - vq->num_free >= elemcnt || vgdev->vqs_released);
>> + while (!wait_event_timeout(vgdev->ctrlq.ack_queue,
>> + vq->num_free >= elemcnt ||
>> + vgdev->vqs_released,
>> + 5 * HZ) && !vgdev->vqs_released)
>> + DRM_WARN("ctrlq waiting for host: no free space for %d
>> secs\n",
>> + 5);
>> +
>> /*
>> * Set by virtio_gpu_release_vqs() to unblock
>> * synchronize_srcu() wait in drm_dev_unplug().
>> @@ -592,8 +597,13 @@ static void virtio_gpu_queue_cursor(struct
>> virtio_gpu_device *vgdev,
>> ret = virtqueue_add_sgs(vq, sgs, outcnt, 0, vbuf, GFP_ATOMIC);
>> if (ret == -ENOSPC) {
>> spin_unlock(&vgdev->cursorq.qlock);
>> - wait_event(vgdev->cursorq.ack_queue,
>> - vq->num_free >= outcnt || vgdev->vqs_released);
>> + while (!wait_event_timeout(vgdev->cursorq.ack_queue,
>> + vq->num_free >= outcnt ||
>> + vgdev->vqs_released,
>> + 5 * HZ) && !vgdev->vqs_released)
>> + DRM_WARN("cursorq waiting for host: no free space for
>> %d secs\n",
>> + 5);
>> +
>> /* See comment in virtio_gpu_queue_ctrl_sgs(). */
>> if (vgdev->vqs_released) {
>> free_vbuf(vgdev, vbuf);
>
> The [1] has a valid point about the hangcheck warning. Is
> wait_event_timeout() worth the change then?
>
> [1]
> https://sashiko.dev/#/patchset/20260618-virtiogpu_add_timeout-v1-1-dc36cef609d9%40redhat.com
Thanks for pointing this out.
You're right — the wait_event_timeout() loop wakes the task every
5 seconds, resetting the context switch counter each time, so
khungtaskd will never detect these tasks as hung.
I originally thought the earlier, virtio-specific warning would be
more useful for troubleshooting than the generic hung task output.
But reconsidering, the practical workflow for the user doesn't
really change either way: the device gets stuck, the user sees a
diagnostic message — whether that's the hung task warning or my
proposed DRM_WARN — and then manually intervenes with SysRq or a
forced shutdown.
The difference is what diagnostic information is available. The
hung task warning includes a full stack trace pointing directly at
the virtqueue wait, which is more useful for identifying the root
cause than a short warning line. Suppressing that by resetting the
context switch counter every 5 seconds makes the diagnostics
strictly worse.
More fundamentally, the guest can't recover from this situation
regardless — the root cause is an unresponsive host, and the fix
has to happen on the host side. The existing wait_event() with the
kernel's built-in hung task detection is the better behavior here.
I'll drop this patch.
Best regards,
Ryosuke