On Fri, Oct 17, 2025 at 11:43:30AM +0200, Fiona Ebner wrote:
> When scsi_req_dequeue() is reached via
> scsi_req_cancel_async()
> virtio_scsi_tmf_cancel_req()
> virtio_scsi_do_tmf_aio_context(),
> there is a deadlock when trying to acquire the SCSI device's requests
> lock, because it was already acquired in
> virtio_scsi_do_tmf_aio_context().
> 
> In particular, the issue happens with a FreeBSD guest (13, 14, 15,
> maybe more), when it cancels SCSI requests, because of timeout.
> 
> This is a regression caused by commit da6eebb33b ("virtio-scsi:
> perform TMFs in appropriate AioContexts") and the introduction of the
> requests_lock earlier.
> 
> To fix the issue, only cancel the requests after releasing the
> requests_lock. For this, the SCSI device's requests are iterated while
> holding the requests_lock and the requests to be cancelled are
> collected in a list. Then, the collected requests are cancelled
> one by one while not holding the requests_lock. This is safe, because
> only requests from the current AioContext are collected and acted
> upon.
> 
> Originally reported by Proxmox VE users:
> https://bugzilla.proxmox.com/show_bug.cgi?id=6810
> https://forum.proxmox.com/threads/173914/
> 
> Fixes: da6eebb33b ("virtio-scsi: perform TMFs in appropriate AioContexts")
> Suggested-by: Stefan Hajnoczi <[email protected]>
> Signed-off-by: Fiona Ebner <[email protected]>
> ---
> 
> Changes in v2:
> * Different approach, collect requests for cancelling in a list for a
>   localized solution rather than keeping track of the lock status via
>   function arguments.
> 
>  hw/scsi/virtio-scsi.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)

Thanks, applied to my block tree:
https://gitlab.com/stefanha/qemu/commits/block

I replaced g_list_append() with g_list_prepend() like in
scsi_device_for_each_req_async_bh(). The GLib documentation says the
following (https://docs.gtk.org/glib/type_func.List.append.html):

  g_list_append() has to traverse the entire list to find the end, which
  is inefficient when adding multiple elements. A common idiom to avoid
  the inefficiency is to use g_list_prepend() and reverse the list with
  g_list_reverse() when all elements have been added.

We don't call g_list_reverse() in scsi_device_for_each_req_async_bh()
and I don't think it's necessary here either.
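
For readers following along, here is a minimal standalone sketch of the
collect-then-cancel pattern with O(1) prepends. It is not the applied
patch: Req, Dev, req_cancel() and cancel_matching() are hypothetical
stand-ins, a plain pthread mutex plays the role of requests_lock, a
hand-rolled list replaces GList, and matching on a tag is only an
assumption made for illustration.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not QEMU's real types. */
typedef struct Req {
    int tag;
    bool cancelled;
    struct Req *next;               /* link in the device's request list */
} Req;

typedef struct Node {
    Req *req;
    struct Node *next;
} Node;

typedef struct Dev {
    pthread_mutex_t requests_lock;  /* non-recursive, so re-locking deadlocks */
    Req *requests;
} Dev;

/* Takes requests_lock itself, so it must not be called with the lock held. */
static void req_cancel(Dev *d, Req *r)
{
    pthread_mutex_lock(&d->requests_lock);
    r->cancelled = true;
    pthread_mutex_unlock(&d->requests_lock);
}

/* Cancel every request whose tag matches; returns how many were cancelled. */
static int cancel_matching(Dev *d, int tag)
{
    Node *to_cancel = NULL;
    int n = 0;

    /* Phase 1: only collect matches while holding the lock. */
    pthread_mutex_lock(&d->requests_lock);
    for (Req *r = d->requests; r; r = r->next) {
        if (r->tag == tag) {
            Node *node = malloc(sizeof(*node));
            if (!node) {
                abort();
            }
            node->req = r;
            node->next = to_cancel; /* prepend: O(1), no walk to the tail */
            to_cancel = node;
        }
    }
    pthread_mutex_unlock(&d->requests_lock);

    /* Phase 2: cancel with the lock released. The collected list is in
     * reverse order, which does not matter because each request is
     * cancelled independently of the others. */
    while (to_cancel) {
        Node *node = to_cancel;
        to_cancel = node->next;
        req_cancel(d, node->req);
        free(node);
        n++;
    }
    return n;
}
```

Calling req_cancel() from inside the first loop would hang on the
second pthread_mutex_lock(), which is the same shape as the reported
deadlock. As in the commit message, the second phase is only safe if
nothing else can remove the collected requests in between.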

Stefan
