Akihiko Odaki <akihiko.od...@daynix.com> writes: > On 2025/05/22 1:42, Alex Bennée wrote: >> From: Manos Pitsidianakis <manos.pitsidiana...@linaro.org> >> This commit fixes an indefinite hang when using VIRTIO GPU blob >> objects >> under TCG in certain conditions. >> The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a >> MemoryRegion and attaches it to an offset on a PCI BAR of the >> VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps >> it. >> Because virglrenderer commands are not thread-safe they are only >> called on the main context and QEMU performs the cleanup in three steps >> to prevent a use-after-free scenario where the guest can access the >> region after it’s unmapped: >> 1. From the main context, the region’s field finish_unmapping is >> false >> by default, so it sets a variable cmd_suspended, increases the >> renderer_blocked variable, deletes the blob subregion, and unparents >> the blob subregion causing its reference count to decrement. >> 2. From an RCU context, the MemoryView gets freed, the FlatView gets >> recalculated, the free callback of the blob region >> virtio_gpu_virgl_hostmem_region_free is called which sets the >> region’s field finish_unmapping to true, allowing the main thread >> context to finish replying to the command >> 3. From the main context, the command is processed again, but this >> time >> finish_unmapping is true, so virgl_renderer_resource_unmap can be >> called and a response is sent to the guest. >> It happens so that under TCG, if the guest has no timers configured >> (and >> thus no interrupt will cause the CPU to exit), the RCU thread does not >> have enough time to grab the locks and recalculate the FlatView. >> That’s not a big problem in practice since most guests will assume a >> response will happen later in time and go on to do different things, >> potentially triggering interrupts and allowing the RCU context to run. >> If the guest waits for the unmap command to complete though, it blocks >> indefinitely. Attaching to the QEMU monitor and force quitting the guest >> allows the cleanup to continue. >> There's no reason why the FlatView recalculation can't occur right >> away >> when we delete the blob subregion, however. It does not, because when we >> create the subregion we set the object as its own parent: >> memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data); >> The extra reference is what prevents freeing the memory region >> object in >> the memory transaction of deleting the subregion. >> This commit changes the owner object to the device, which removes >> the >> extra owner reference in the memory region and causes the MR to be >> freed right away in the main context. >> Acked-by: Michael S. Tsirkin <m...@redhat.com> >> Signed-off-by: Manos Pitsidianakis <manos.pitsidiana...@linaro.org> >> Reviewed-by: Alex Bennée <alex.ben...@linaro.org> >> Tested-by: Alex Bennée <alex.ben...@linaro.org> >> Message-Id: <20250410122643.1747913-3-manos.pitsidiana...@linaro.org> >> Cc: qemu-sta...@nongnu.org >> --- >> hw/display/virtio-gpu-virgl.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> diff --git a/hw/display/virtio-gpu-virgl.c >> b/hw/display/virtio-gpu-virgl.c >> index 71a7500de9..8fbe4e70cc 100644 >> --- a/hw/display/virtio-gpu-virgl.c >> +++ b/hw/display/virtio-gpu-virgl.c >> @@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g, >> vmr->g = g; >> mr = g_new0(MemoryRegion, 1); >> - memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, >> data); >> + memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data); >> memory_region_add_subregion(&b->hostmem, offset, mr); >> memory_region_set_enabled(mr, true); >> > > I suggest dropping this patch for now due to the reason I pointed out > for the first version of this series.
This fixes an actual bug - without it we get a hang. -- Alex Bennée Virtualisation Tech Lead @ Linaro