Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
New issues: - [High] Replacing `wait_event()` with a 5-second `wait_event_timeout()` loop defeats the kernel's hung task detector for this wait, breaking automated recovery mechanisms. - [Medium] The `wait_event_timeout()` loop will generate severe log spam if multiple userspace threads block concurrently. Pre-existing issues: - [High] The `virtio_gpu_object_array` (`vbuf->objs`) and its contained GEM objects are leaked on error paths where `virtio_gpu_queue_ctrl_sgs()` fails to queue the buffer. - [High] The virtio-gpu driver sleeps in plane atomic update callbacks, violating DRM subsystem guidelines against sleeping in atomic context. -- commit 736ae781cca53398969e63f0d5a23d6e3f738db0 Author: Ryosuke Yasuoka <[email protected]> drm/virtio: warn when virtqueue has no free space for too long This commit replaces the bare wait_event() in virtqueue command submission with a wait_event_timeout() loop. The loop prints a warning every 5 seconds while the virtqueue remains full to help identify unresponsive host devices. > diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c > b/drivers/gpu/drm/virtio/virtgpu_vq.c > index 68d097ad9d1d0..a546130d3b6a5 100644 > --- a/drivers/gpu/drm/virtio/virtgpu_vq.c > +++ b/drivers/gpu/drm/virtio/virtgpu_vq.c > @@ -410,8 +410,13 @@ static int virtio_gpu_queue_ctrl_sgs(struct > virtio_gpu_device *vgdev, > if (vq->num_free < elemcnt) { > spin_unlock(&vgdev->ctrlq.qlock); > virtio_gpu_notify(vgdev); > - wait_event(vgdev->ctrlq.ack_queue, > - vq->num_free >= elemcnt || vgdev->vqs_released); > + while (!wait_event_timeout(vgdev->ctrlq.ack_queue, > + vq->num_free >= elemcnt || > + vgdev->vqs_released, > + 5 * HZ) && !vgdev->vqs_released) [Severity: High] Does this 5-second polling loop evade the kernel's hung task detector? The kernel's khungtaskd detects tasks stuck in TASK_UNINTERRUPTIBLE by verifying their context switch count does not change over the configured timeout. By waking the task up every 5 seconds, it will continually context switch, causing khungtaskd to see a changing context switch count. Will this prevent the system from ever flagging these tasks as hung or triggering automatic recovery panics? > + DRM_WARN("ctrlq waiting for host: no free space for %d > secs\n", > + 5); [Severity: Medium] Could this cause severe kernel log spam if multiple userspace threads are blocked concurrently? Since the 5-second timeout and warning are local to each thread's wait loop, if multiple threads attempt to submit commands while the host is unresponsive, will each thread emit a DRM_WARN message every 5 seconds? > + > /* > * Set by virtio_gpu_release_vqs() to unblock > * synchronize_srcu() wait in drm_dev_unplug(). [Severity: High] This is a pre-existing issue, but does this error path permanently leak the virtio_gpu_object_array and its pinned GEM objects? When vgdev->vqs_released is true, the code below this comment calls free_vbuf(vgdev, vbuf) and returns -ENODEV. However, free_vbuf() only frees the buffer and does not unreference or free vbuf->objs. Since the buffer was never queued, the completion callback that normally cleans up vbuf->objs will never run. [ ... ] > @@ -592,8 +597,13 @@ static void virtio_gpu_queue_cursor(struct > virtio_gpu_device *vgdev, > ret = virtqueue_add_sgs(vq, sgs, outcnt, 0, vbuf, GFP_ATOMIC); > if (ret == -ENOSPC) { > spin_unlock(&vgdev->cursorq.qlock); > - wait_event(vgdev->cursorq.ack_queue, > - vq->num_free >= outcnt || vgdev->vqs_released); > + while (!wait_event_timeout(vgdev->cursorq.ack_queue, > + vq->num_free >= outcnt || > + vgdev->vqs_released, > + 5 * HZ) && !vgdev->vqs_released) [Severity: High] This isn't a bug introduced by this patch, but does waiting here violate DRM subsystem atomic context rules? This function is reached via the plane atomic update callback: virtio_gpu_cursor_plane_update() -> virtio_gpu_cursor_ping() -> virtio_gpu_queue_cursor() DRM plane atomic update callbacks execute in atomic context where sleeping functions are forbidden. Using wait_event_timeout() (or previously wait_event()) sets the task state to TASK_UNINTERRUPTIBLE and schedules out. Could this scheduling while atomic cause system instability? > + DRM_WARN("cursorq waiting for host: no free space for > %d secs\n", > + 5); > + > /* See comment in virtio_gpu_queue_ctrl_sgs(). */ > if (vgdev->vqs_released) { > free_vbuf(vgdev, vbuf); -- Sashiko AI review ยท https://sashiko.dev/#/patchset/[email protected]?part=1
