On 02/07/2026 14:26, Dmitry Osipenko wrote:
> On 7/1/26 12:23, Ryosuke Yasuoka wrote:
>> 
>> 
>> On 30/06/2026 16:46, Dmitry Osipenko wrote:
>>> Hi,
>>>
>>> On 6/30/26 12:16, Ryosuke Yasuoka wrote:
>>>> A probe-time deadlock can occur between the dequeue worker and
>>>> drm_client_register(). During probe, drm_client_register() holds
>>>> clientlist_mutex and calls the fbdev hotplug callback, which triggers an
>>>> atomic commit that ends up sleeping in virtio_gpu_queue_ctrl_sgs()
>>>> waiting for virtqueue space. The dequeue worker that would free that
>>>> space calls virtio_gpu_cmd_get_display_info_cb(), which invokes
>>>> drm_kms_helper_hotplug_event() -> drm_client_dev_hotplug(), attempting
>>>> to acquire the same clientlist_mutex. Since wake_up() is only called
>>>> after the resp_cb loop, the probe thread is never woken and both threads
>>>> deadlock.
>>>>
>>>> Fix this by deferring the hotplug notification from
>>>> virtio_gpu_cmd_get_display_info_cb() to a separate work item. The
>>>> display data (outputs[i].info) is still updated synchronously in the
>>>> callback, and the deferred work only triggers a re-probe notification to
>>>> DRM clients.
>>>>
>>>> Fixes: 27655b9bb9f0 ("drm/client: Send hotplug event after registering a 
>>>> client")
>>>> Closes: 
>>>> https://syzkaller.appspot.com/bug?id=d6dd6f86d3aaf7eebe7406e45c1c6e549453f224
>>>> Closes: 
>>>> https://syzkaller.appspot.com/bug?id=908bd910da5dd79b88de4cf7baf376cc873a922e
>>>> Signed-off-by: Ryosuke Yasuoka <[email protected]>
>>>> ---
>>>>  drivers/gpu/drm/virtio/virtgpu_drv.h |  3 +++
>>>>  drivers/gpu/drm/virtio/virtgpu_kms.c |  3 +++
>>>>  drivers/gpu/drm/virtio/virtgpu_vq.c  | 12 ++++++++++--
>>>>  3 files changed, 16 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
>>>> b/drivers/gpu/drm/virtio/virtgpu_drv.h
>>>> index 7449907754a4..27ffa4697ae9 100644
>>>> --- a/drivers/gpu/drm/virtio/virtgpu_drv.h
>>>> +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
>>>> @@ -264,6 +264,8 @@ struct virtio_gpu_device {
>>>>  
>>>>    struct work_struct config_changed_work;
>>>>  
>>>> +  struct work_struct hotplug_work;
>>>> +
>>>>    struct work_struct obj_free_work;
>>>>    spinlock_t obj_free_lock;
>>>>    struct list_head obj_free_list;
>>>> @@ -350,6 +352,7 @@ void virtio_gpu_cmd_transfer_to_host_2d(struct 
>>>> virtio_gpu_device *vgdev,
>>>>                                    uint32_t x, uint32_t y,
>>>>                                    struct virtio_gpu_object_array *objs,
>>>>                                    struct virtio_gpu_fence *fence);
>>>> +void virtio_gpu_hotplug_work_func(struct work_struct *work);
>>>>  void virtio_gpu_panic_cmd_resource_flush(struct virtio_gpu_device *vgdev,
>>>>                                     uint32_t resource_id,
>>>>                                     uint32_t x, uint32_t y,
>>>> diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c 
>>>> b/drivers/gpu/drm/virtio/virtgpu_kms.c
>>>> index cfde9f573df6..cfb532ba43a4 100644
>>>> --- a/drivers/gpu/drm/virtio/virtgpu_kms.c
>>>> +++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
>>>> @@ -154,6 +154,8 @@ int virtio_gpu_init(struct virtio_device *vdev, struct 
>>>> drm_device *dev)
>>>>    INIT_WORK(&vgdev->config_changed_work,
>>>>              virtio_gpu_config_changed_work_func);
>>>>  
>>>> +  INIT_WORK(&vgdev->hotplug_work, virtio_gpu_hotplug_work_func);
>>>> +
>>>>    INIT_WORK(&vgdev->obj_free_work,
>>>>              virtio_gpu_array_put_free_work);
>>>>    INIT_LIST_HEAD(&vgdev->obj_free_list);
>>>> @@ -293,6 +295,7 @@ void virtio_gpu_deinit(struct drm_device *dev)
>>>>    flush_work(&vgdev->obj_free_work);
>>>>    flush_work(&vgdev->ctrlq.dequeue_work);
>>>>    flush_work(&vgdev->cursorq.dequeue_work);
>>>> +  flush_work(&vgdev->hotplug_work);
>>>>    flush_work(&vgdev->config_changed_work);
>>>>    virtio_reset_device(vgdev->vdev);
>>>>    vgdev->vdev->config->del_vqs(vgdev->vdev);
>>>> diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
>>>> b/drivers/gpu/drm/virtio/virtgpu_vq.c
>>>> index 67865810a2e7..084d98f5dc7b 100644
>>>> --- a/drivers/gpu/drm/virtio/virtgpu_vq.c
>>>> +++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
>>>> @@ -816,6 +816,15 @@ virtio_gpu_cmd_resource_detach_backing(struct 
>>>> virtio_gpu_device *vgdev,
>>>>    virtio_gpu_queue_fenced_ctrl_buffer(vgdev, vbuf, fence);
>>>>  }
>>>>  
>>>> +void virtio_gpu_hotplug_work_func(struct work_struct *work)
>>>> +{
>>>> +  struct virtio_gpu_device *vgdev =
>>>> +          container_of(work, struct virtio_gpu_device, hotplug_work);
>>>> +
>>>> +  if (!drm_helper_hpd_irq_event(vgdev->ddev))
>>>> +          drm_kms_helper_hotplug_event(vgdev->ddev);
>>>> +}
>>>> +
>>>>  static void virtio_gpu_cmd_get_display_info_cb(struct virtio_gpu_device 
>>>> *vgdev,
>>>>                                           struct virtio_gpu_vbuffer *vbuf)
>>>>  {
>>>> @@ -841,8 +850,7 @@ static void virtio_gpu_cmd_get_display_info_cb(struct 
>>>> virtio_gpu_device *vgdev,
>>>>    spin_unlock(&vgdev->display_info_lock);
>>>>    wake_up(&vgdev->resp_wq);
>>>>  
>>>> -  if (!drm_helper_hpd_irq_event(vgdev->ddev))
>>>> -          drm_kms_helper_hotplug_event(vgdev->ddev);
>>>> +  schedule_work(&vgdev->hotplug_work);
>>>>  }
>>>>  
>>>>  static void virtio_gpu_cmd_get_capset_info_cb(struct virtio_gpu_device 
>>>> *vgdev,
>>>
>> 
>> Hi,
>> Thank you for your review.
>> 
>>> Could you please move drm_kms_helper_hotplug_event() to virtio_gpu_init(), 
>>> placing it after wait_event_timeout(display_info_pending)? This will avoid 
>>> additional work_struct that otherwise needs to be cancelled in 
>>> virtio_gpu_init() on the timeout.
>> 
>> IIUC, moving the drm_kms_helper_hotplug_event() and _hpd_irq_event()
>> into virtio_gpu_init() after wait_event_timeout() would not prevent the
>> issue.
>> 
>> Looking at the syzbot call traces[1][2], the deadlock occurs during
>> drm_client_setup(), which runs after virtio_gpu_init() has already
>> returned. The display_info_cb that triggers the deadlock is called from
>> the dequeue worker while drm_client_register() holds clientlist_mutex.
>> 
>> Thread A:
>> virtio_gpu_probe()
>>  -> virtio_gpu_init()   // sends GET_DISPLAY_INFO and waits up to 5s
> 
> You mean that the timeout happens and it's again syzkaller report for a
> broken host. A day ago I applied [1] that should fix this "bogus"
> syzkaller report.
> 
> [1] https://patchwork.freedesktop.org/patch/735301/

Thank you for [1] and the clarificaiton. I agree that [1] addresses the
broken host scenario where virtio_gpu_init() times out and probe
continues with a pending display_info response. I think there is another
scenario causing the same deadlock reported by syzbot and my patch
addresses both cases.

The deadlock can also occur with the following scenario:
after virtio_gpu_init() completes successfully (wait_event_timeout
returns), if a config_changed event arrives before drm_client_setup()
completes, config_changed_work_func() sends a new GET_DISPLAY_INFO. Its
display_info_cb fires in the dequeue worker and blocks on
clientlist_mutex held by drm_client_register(). This is the same
deadlock path reported by syzbot.

Since [1] only prevents the timeout case, I believe the schedule_work()
approach is still needed to prevent display_info_cb from blocking on
clientlist_mutex in the dequeue worker context, regardless of the
trigger.

Best regards,
Ryosuke


Reply via email to