Re: [PATCH v3 12/20] virtio-gpu: fix hang under TCG when unmapping blob

Akihiko Odaki Tue, 27 May 2025 04:10:56 -0700

On 2025/05/27 19:05, Alex Bennée wrote:

Akihiko Odaki <akihiko.od...@daynix.com> writes:

On 2025/05/22 18:28, Alex Bennée wrote:

Akihiko Odaki <akihiko.od...@daynix.com> writes:

On 2025/05/22 16:31, Manos Pitsidianakis wrote:

On Thu, May 22, 2025 at 10:03 AM Akihiko Odaki <akihiko.od...@daynix.com> wrote:


On 2025/05/22 15:45, Alex Bennée wrote:

Akihiko Odaki <akihiko.od...@daynix.com> writes:

On 2025/05/22 1:42, Alex Bennée wrote:

From: Manos Pitsidianakis <manos.pitsidiana...@linaro.org>

<snip>

In such a case, a bug should be fixed minimizing the regression and the
documentation of the regression should be left in the code.

In particular, this patch can cause use-after-free whether TCG is used
or not. Instead, I suggest to avoid freeing memory regions at all on
TCG. It will surely leak memory, but won't result in use-after-free at
least and the other accelerators will be unaffected.

Regards,
Akihiko Odaki

We tested this fix with ASAN and didn't see anything. Do you have a
test case in mind that can reproduce this use-after-free? It'd help
make a certain decision on whether to drop this patch or not. I'm not
doubting that this can cause a use-after-free by the way, it's just
that it is hypothetical only. If it causes a use-after-free for sure
we should definitely drop it.


No, I don't have a test case and it should rarely occur. More
concretely, a UAF occurs if the guest accesses the memory region while
trying to unmap it. It is just a theory indeed, but the theory says
the UAF is possible.

I have a test case this fixes which I think trumps a theoretical UAF
without a test case.
Why would the guest attempt to access after triggering the free
itself?
Wouldn't it be correct to fault the guest for violating its own memory
safety rules?


docs/devel/secure-coding-practices.rst says "Unexpected accesses must
not cause memory corruption or leaks in QEMU".


Agreed.

I'm not completely sure whether it is safe without concurrent accesses
either. In particular, KVM does not immediately update the guest
memory mapping, so it may result in a time window where the guest
memory is mapped to an unmapped host memory region, and I suspect that
could cause a problem. That also motivates limiting the scope of the
change to TCG.


Surely it does:

         memory_region_set_enabled(mr, false);
         memory_region_del_subregion(&b->hostmem, mr);

will trigger a rebuilding of the flatview - so after the memory region
is deleted any guest access should trigger a fault to the guest. Only
then do we unparent the memory region and finish the clean-up.


I wrongly assumed it waits for RCU, but I found it is not true.

It is still not true that any guest access will trigger a fault to theguest. QEMU's internal AddressSpace keeps the old FlatView until the RCUchanges, and some device may still have a DMA mapping.


I don't think we want to have different paths for KVM and TCG here as it
will further complicate already complicated code.

Complexity is not a problem here. Any concurrent code has complexityneeded for correctness and reliability. We should add complexity ifneeded for reliablity; otherwise, we should remove it (even if it'salready simple).

Instead, I suggest to avoid freeing memory regions at all on
TCG. It will surely leak memory, but won't result in use-after-free at
least and the other accelerators will be unaffected.

Leaking memory for blob objects is also not ideal, since they are
frequently allocated. It's memory-safe but the leak can accumulate
over time.


Memory safety and leak free cannot be compatible unless RCU is fixed.
We need to choose either of them.

How can the guest access something that is now unmapped? The RCU
should
only run after the flatview has been updated.


This patch bypasses RCU. That's why it solves the hang even though the
RCU itself is not fixed.

Let me summarize the theory and the actual behavior below:

The theory is that RCU satisfies the common requirement of concurrent
algorithms. More concretely:
1) It is race-free; for RCU, it means it prevents use-after-free.
2) It does not prevent forward progress.

The patch message suggests 2) is not satisfied. A proper fix would be
to change RCU to satisfy 2).


Its mutually incompatible with virglrenderer - we have to block all
commands until the virgl resource is freed and we can't do that until
the memory region is unplugged.

So yes we do bypass RCU for this but by explicitly un-parenting the
resource once the mapping has been removed.

The problem is that RCU doesn't free the memory region when all commandsare blocked.

If we fix RCU, RCU will free the memory region, the virgl resource willbe freed then, and the commands will be unblocked.

However, this patch workarounds the problem by circumventing RCU,
which solves 2) but it regresses 1).


I'm still not seeing how this happens and without a test case to
demonstrate it happening I can't hold this patch in limbo forever.

Not a test case that runs on the guest, but adding the following changeto QEMU will demonstrate the issue:


diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 145a0b387956..c97cd2dfd7b3 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -182,7 +182,11 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,

/* memory region owns self res->mr object and frees it byitself */

         memory_region_set_enabled(mr, false);
         memory_region_del_subregion(&b->hostmem, mr);
+        memory_region_ref(OBJECT(mr));
+        int k = b->renderer_blocked;
         object_unparent(OBJECT(mr));
+        assert(k == b->renderer_blocked);
+        memory_region_unref(OBJECT(mr));
     }

     return 0;

memory_region_ref() and memory_region_unref() emulates the begin and endof a DMA operation on mr, respectively. In a real world scenario,address_space_map() and address_space_unmap() call these functions.

If the code is correct, object_unparent() should not free the memoryregion while the DMA operation is ongoing, so the renderer will be keptblocked. assert(k == b->renderer_blocked) checks that.

My proposal is to limit the impact of the regression as possible as wecan and to document it, not to postpone the solution until figuring outthe problem in RCU.

My suggestion is to document and to limit the impact of 1) by:
a) Limiting the scope of the change to TCG.
b) Not freeing memory regions, which will prevent use-after-free while
leaking memory.

Manos said b) can be problematic because mappings are frequently
created. Whether b) makes sense or not depends on the probability and
impact of UAF and memory leak


Not freeing memory regions would lead to a DoS attack instead. I don't
think we can just accumulate region like that.

UAF also results in DoS, and it can have a bigger consequence due tomemory corruption. Memory leak is usually safer than UAF.

The question is whether memory leak causes a practical problem; it makessense to opt to UAF if the accumulation happens too quickly; we at leastknow that UAF is rare.


Regards,
Akihiko Odaki

Re: [PATCH v3 12/20] virtio-gpu: fix hang under TCG when unmapping blob

Reply via email to