On 30/06/2026 16:46, Dmitry Osipenko wrote:
> Hi,
> 
> On 6/30/26 12:16, Ryosuke Yasuoka wrote:
>> A probe-time deadlock can occur between the dequeue worker and
>> drm_client_register(). During probe, drm_client_register() holds
>> clientlist_mutex and calls the fbdev hotplug callback, which triggers an
>> atomic commit that ends up sleeping in virtio_gpu_queue_ctrl_sgs()
>> waiting for virtqueue space. The dequeue worker that would free that
>> space calls virtio_gpu_cmd_get_display_info_cb(), which invokes
>> drm_kms_helper_hotplug_event() -> drm_client_dev_hotplug(), attempting
>> to acquire the same clientlist_mutex. Since wake_up() is only called
>> after the resp_cb loop, the probe thread is never woken and both threads
>> deadlock.
>> 
>> Fix this by deferring the hotplug notification from
>> virtio_gpu_cmd_get_display_info_cb() to a separate work item. The
>> display data (outputs[i].info) is still updated synchronously in the
>> callback, and the deferred work only triggers a re-probe notification to
>> DRM clients.
>> 
>> Fixes: 27655b9bb9f0 ("drm/client: Send hotplug event after registering a 
>> client")
>> Closes: 
>> https://syzkaller.appspot.com/bug?id=d6dd6f86d3aaf7eebe7406e45c1c6e549453f224
>> Closes: 
>> https://syzkaller.appspot.com/bug?id=908bd910da5dd79b88de4cf7baf376cc873a922e
>> Signed-off-by: Ryosuke Yasuoka <[email protected]>
>> ---
>>  drivers/gpu/drm/virtio/virtgpu_drv.h |  3 +++
>>  drivers/gpu/drm/virtio/virtgpu_kms.c |  3 +++
>>  drivers/gpu/drm/virtio/virtgpu_vq.c  | 12 ++++++++++--
>>  3 files changed, 16 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
>> b/drivers/gpu/drm/virtio/virtgpu_drv.h
>> index 7449907754a4..27ffa4697ae9 100644
>> --- a/drivers/gpu/drm/virtio/virtgpu_drv.h
>> +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
>> @@ -264,6 +264,8 @@ struct virtio_gpu_device {
>>  
>>      struct work_struct config_changed_work;
>>  
>> +    struct work_struct hotplug_work;
>> +
>>      struct work_struct obj_free_work;
>>      spinlock_t obj_free_lock;
>>      struct list_head obj_free_list;
>> @@ -350,6 +352,7 @@ void virtio_gpu_cmd_transfer_to_host_2d(struct 
>> virtio_gpu_device *vgdev,
>>                                      uint32_t x, uint32_t y,
>>                                      struct virtio_gpu_object_array *objs,
>>                                      struct virtio_gpu_fence *fence);
>> +void virtio_gpu_hotplug_work_func(struct work_struct *work);
>>  void virtio_gpu_panic_cmd_resource_flush(struct virtio_gpu_device *vgdev,
>>                                       uint32_t resource_id,
>>                                       uint32_t x, uint32_t y,
>> diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c 
>> b/drivers/gpu/drm/virtio/virtgpu_kms.c
>> index cfde9f573df6..cfb532ba43a4 100644
>> --- a/drivers/gpu/drm/virtio/virtgpu_kms.c
>> +++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
>> @@ -154,6 +154,8 @@ int virtio_gpu_init(struct virtio_device *vdev, struct 
>> drm_device *dev)
>>      INIT_WORK(&vgdev->config_changed_work,
>>                virtio_gpu_config_changed_work_func);
>>  
>> +    INIT_WORK(&vgdev->hotplug_work, virtio_gpu_hotplug_work_func);
>> +
>>      INIT_WORK(&vgdev->obj_free_work,
>>                virtio_gpu_array_put_free_work);
>>      INIT_LIST_HEAD(&vgdev->obj_free_list);
>> @@ -293,6 +295,7 @@ void virtio_gpu_deinit(struct drm_device *dev)
>>      flush_work(&vgdev->obj_free_work);
>>      flush_work(&vgdev->ctrlq.dequeue_work);
>>      flush_work(&vgdev->cursorq.dequeue_work);
>> +    flush_work(&vgdev->hotplug_work);
>>      flush_work(&vgdev->config_changed_work);
>>      virtio_reset_device(vgdev->vdev);
>>      vgdev->vdev->config->del_vqs(vgdev->vdev);
>> diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
>> b/drivers/gpu/drm/virtio/virtgpu_vq.c
>> index 67865810a2e7..084d98f5dc7b 100644
>> --- a/drivers/gpu/drm/virtio/virtgpu_vq.c
>> +++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
>> @@ -816,6 +816,15 @@ virtio_gpu_cmd_resource_detach_backing(struct 
>> virtio_gpu_device *vgdev,
>>      virtio_gpu_queue_fenced_ctrl_buffer(vgdev, vbuf, fence);
>>  }
>>  
>> +void virtio_gpu_hotplug_work_func(struct work_struct *work)
>> +{
>> +    struct virtio_gpu_device *vgdev =
>> +            container_of(work, struct virtio_gpu_device, hotplug_work);
>> +
>> +    if (!drm_helper_hpd_irq_event(vgdev->ddev))
>> +            drm_kms_helper_hotplug_event(vgdev->ddev);
>> +}
>> +
>>  static void virtio_gpu_cmd_get_display_info_cb(struct virtio_gpu_device 
>> *vgdev,
>>                                             struct virtio_gpu_vbuffer *vbuf)
>>  {
>> @@ -841,8 +850,7 @@ static void virtio_gpu_cmd_get_display_info_cb(struct 
>> virtio_gpu_device *vgdev,
>>      spin_unlock(&vgdev->display_info_lock);
>>      wake_up(&vgdev->resp_wq);
>>  
>> -    if (!drm_helper_hpd_irq_event(vgdev->ddev))
>> -            drm_kms_helper_hotplug_event(vgdev->ddev);
>> +    schedule_work(&vgdev->hotplug_work);
>>  }
>>  
>>  static void virtio_gpu_cmd_get_capset_info_cb(struct virtio_gpu_device 
>> *vgdev,
> 

Hi,
Thank you for your review.

> Could you please move drm_kms_helper_hotplug_event() to virtio_gpu_init(), 
> placing it after wait_event_timeout(display_info_pending)? This will avoid 
> additional work_struct that otherwise needs to be cancelled in 
> virtio_gpu_init() on the timeout.

IIUC, moving the drm_kms_helper_hotplug_event() and _hpd_irq_event()
into virtio_gpu_init() after wait_event_timeout() would not prevent the
issue.

Looking at the syzbot call traces[1][2], the deadlock occurs during
drm_client_setup(), which runs after virtio_gpu_init() has already
returned. The display_info_cb that triggers the deadlock is called from
the dequeue worker while drm_client_register() holds clientlist_mutex.

Thread A:
virtio_gpu_probe()
 -> virtio_gpu_init()   // sends GET_DISPLAY_INFO and waits up to 5s
 -> drm_dev_register()
 -> drm_client_setup()   // deadlock happens HERE
     -> drm_client_register()   // holds clientlist_mutex
     ...
     -> virtio_gpu_queue_fenced_ctrl_buffer()
         -> wait_event() // waits for free space

Thread B:
virtio_gpu_dequeue_ctrl_func()
 -> reclaim_vbufs()   // make free space
 -> resp_cb()
     -> virtio_gpu_cmd_get_display_info_cb
         -> drm_helper_hpd_irq_event()
         -> drm_kms_helper_hotplug_event()
             -> drm_client_dev_hotplug() // need to lock clientlist_mutex
 -> wake_up()   // never reached

IIUC the hotplug notification in display_info_cb is needed because it
notifies DRM after updating by the callback with fresh data from the host.

This work_struct ensures display_info_cb never blocks on
clientlist_mutex in the dequeue worker, while preserving the hotplug
notification with fresh data.

[1] 
https://syzkaller.appspot.com/bug?id=d6dd6f86d3aaf7eebe7406e45c1c6e549453f224
[2] 
https://syzkaller.appspot.com/bug?id=908bd910da5dd79b88de4cf7baf376cc873a922e

Best regards,
Ryosuke


Reply via email to