Akihiko Odaki <akihiko.od...@daynix.com> writes:

> On 2025/05/22 1:42, Alex Bennée wrote:
>> From: Manos Pitsidianakis <manos.pitsidiana...@linaro.org>
>> This commit fixes an indefinite hang when using VIRTIO GPU blob
>> objects
>> under TCG in certain conditions.
>> The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a
>> MemoryRegion and attaches it to an offset on a PCI BAR of the
>> VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps
>> it.
>> Because virglrenderer commands are not thread-safe they are only
>> called on the main context and QEMU performs the cleanup in three steps
>> to prevent a use-after-free scenario where the guest can access the
>> region after it’s unmapped:
>> 1. From the main context, the region’s field finish_unmapping is
>> false
>>     by default, so it sets a variable cmd_suspended, increases the
>>     renderer_blocked variable, deletes the blob subregion, and unparents
>>     the blob subregion causing its reference count to decrement.
>> 2. From an RCU context, the MemoryView gets freed, the FlatView gets
>>     recalculated, the free callback of the blob region
>>     virtio_gpu_virgl_hostmem_region_free is called which sets the
>>     region’s field finish_unmapping to true, allowing the main thread
>>     context to finish replying to the command
>> 3. From the main context, the command is processed again, but this
>> time
>>     finish_unmapping is true, so virgl_renderer_resource_unmap can be
>>     called and a response is sent to the guest.
>> It happens so that under TCG, if the guest has no timers configured
>> (and
>> thus no interrupt will cause the CPU to exit), the RCU thread does not
>> have enough time to grab the locks and recalculate the FlatView.
>> That’s not a big problem in practice since most guests will assume a
>> response will happen later in time and go on to do different things,
>> potentially triggering interrupts and allowing the RCU context to run.
>> If the guest waits for the unmap command to complete though, it blocks
>> indefinitely. Attaching to the QEMU monitor and force quitting the guest
>> allows the cleanup to continue.
>> There's no reason why the FlatView recalculation can't occur right
>> away
>> when we delete the blob subregion, however. It does not, because when we
>> create the subregion we set the object as its own parent:
>>      memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
>> The extra reference is what prevents freeing the memory region
>> object in
>> the memory transaction of deleting the subregion.
>> This commit changes the owner object to the device, which removes
>> the
>> extra owner reference in the memory region and causes the MR to be
>> freed right away in the main context.
>> Acked-by: Michael S. Tsirkin <m...@redhat.com>
>> Signed-off-by: Manos Pitsidianakis <manos.pitsidiana...@linaro.org>
>> Reviewed-by: Alex Bennée <alex.ben...@linaro.org>
>> Tested-by: Alex Bennée <alex.ben...@linaro.org>
>> Message-Id: <20250410122643.1747913-3-manos.pitsidiana...@linaro.org>
>> Cc: qemu-sta...@nongnu.org
>> ---
>>   hw/display/virtio-gpu-virgl.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>> diff --git a/hw/display/virtio-gpu-virgl.c
>> b/hw/display/virtio-gpu-virgl.c
>> index 71a7500de9..8fbe4e70cc 100644
>> --- a/hw/display/virtio-gpu-virgl.c
>> +++ b/hw/display/virtio-gpu-virgl.c
>> @@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
>>       vmr->g = g;
>>       mr = g_new0(MemoryRegion, 1);
>>   -    memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size,
>> data);
>> +    memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data);
>>       memory_region_add_subregion(&b->hostmem, offset, mr);
>>       memory_region_set_enabled(mr, true);
>>   
>
> I suggest dropping this patch for now due to the reason I pointed out
> for the first version of this series.

This fixes an actual bug - without it we get a hang. 

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Reply via email to