On 2025/10/29 7:09, Peter Xu wrote:
On Mon, Oct 27, 2025 at 02:56:53PM +0900, Akihiko Odaki wrote:
docs/devel/memory.rst says "memory_region_ref and memory_region_unref
are never called on aliases", but these functions are called for
FlatView roots, which can be aliases.

IMHO the quoted doc was in a special context, where it was talking about
the example of address_space_map() holding adhoc refcounts of a MR, in
which case "memory_region_ref and memory_region_unref are never called on
aliases" was correct..

The full context:

   ...
   If you break this rule, the following situation can happen:
- the memory region's owner had a reference taken via memory_region_ref
     (for example by address_space_map)
- the region is unparented, and has no owner anymore - when address_space_unmap is called, the reference to the memory region's
     owner is leaked.
There is an exception to the above rule: it is okay to call
   object_unparent at any time for an alias or a container region.  It is
   therefore also okay to create or destroy alias and container regions
   dynamically during a device's lifetime.
This exceptional usage is valid because aliases and containers only help
   QEMU building the guest's memory map; they are never accessed directly.
   memory_region_ref and memory_region_unref are never called on aliases
   or containers, and the above situation then cannot happen.  Exploiting
   this exception is rarely necessary, and therefore it is discouraged,
   but nevertheless it is used in a few places.
   ...

So I can't say the doc is wrong, but maybe it can be at least be clearer on
the scope of that sentence.. indeed.

I think statement "it is okay to call object_unparent at any time for an alias or a container region" can be corrected. Practically, developers will want call object_unparent() only when:
- the memory region is not added to a container and
- there is no manual references created with memory_region_ref().

These two conditions can be satisfied by auditing the device code that owns the memory region instead of multiple devices.



This causes object overwrite hazard in pci-bridge. Specifically,
pci_bridge_region_init() expects that there are no references to
w->alias_io after object_unparent() is called, allowing it to reuse the
associated storage. However, if a parent bus still holds a reference to
the existing object as a FlatView's root, the storage is still in use,
leading to an overwrite. This hazard can be confirmed by adding the
following code to pci_bridge_region_init():

PCIDevice *parent_dev = parent->parent_dev;
assert(!object_dynamic_cast(OBJECT(parent_dev), TYPE_PCI_BRIDGE) ||
        PCI_BRIDGE(parent_dev)->as_io.current_map->root != &w->alias_io);

What's interesting is I found PCIBridge.as_io / PCIBridge.as_mem are not
used anywhere..  because it looks like the bridge code uses MRs to operate
rather than address spaces.

Does it mean we can drop the two ASes?  Then if they're the only holder of
the refcounts of these problematic MRs, does it solve the problem too in an
easier way?  Maybe there're other issues you want to fix too with this patch?

Apparently we cannot drop the ASes. See commit 55fa4be6f76a ("virtio-pci: fix memory_region_find for VirtIOPCIRegion's MR"), which introduced them.

I don't know any other existing devices affected by this FlatView behavior, but it is also difficult to show that they are *not* affected because it requires traversing MemoryRegion graphs that span across several devices.

We will also need to update the documentation for future devices, but it is not trivial either as the condition where aliases are referenced from FlatView is complex.

Considering that, I think this patch is a pragmatic solution that ensures correctness of object_unparent() on aliases.



This assertion fails when running:
meson test -C build qtest-x86_64/bios-tables-test \
     '--test-args=-p /x86_64/acpi/piix4/pci-hotplug/no_root_hotplug'

Make the references of FlatView roots "weak" (i.e., remove the
reference to a root automatically removed when it is finalized) to
avoid calling memory_region_ref and memory_region_unref and fix the
hazard with pci-bridge.

Alternative solutions (like removing the "never called on aliases"
statement or detailing the exception) were rejected because the alias
invariant is still relied upon in several parts of the codebase, and
updating existing code to align with a new condition is non-trivial.

Signed-off-by: Akihiko Odaki <[email protected]>
---
  system/memory.c | 8 ++++++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/system/memory.c b/system/memory.c
index 8b84661ae36c..08fe5e791224 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -266,7 +266,6 @@ static FlatView *flatview_new(MemoryRegion *mr_root)
      view = g_new0(FlatView, 1);
      view->ref = 1;
      view->root = mr_root;
-    memory_region_ref(mr_root);
      trace_flatview_new(view, mr_root);
return view;
@@ -301,7 +300,6 @@ static void flatview_destroy(FlatView *view)
          memory_region_unref(view->ranges[i].mr);
      }
      g_free(view->ranges);
-    memory_region_unref(view->root);
      g_free(view);
  }
@@ -1796,6 +1794,12 @@ void memory_region_init_iommu(void *_iommu_mr,
  static void memory_region_finalize(Object *obj)
  {
      MemoryRegion *mr = MEMORY_REGION(obj);
+    gpointer key;
+    gpointer value;
+
+    if (g_hash_table_steal_extended(flat_views, mr, &key, &value)) {
+        ((FlatView *)value)->root = NULL;
+    }

This is definitely very tricky.. The translation path (from
AddressSpaceDispatch) indeed looks ok as of now, which doesn't looks at
view->root.. however at least I saw this:

void flatview_unref(FlatView *view)
{
     if (qatomic_fetch_dec(&view->ref) == 1) {
         trace_flatview_destroy_rcu(view, view->root);
         assert(view->root);                            <-------------------
         call_rcu(view, flatview_destroy, rcu);
     }
}

I wonder how it didn't already crash.

In case of pci-bridge, I guess flatview_unref() is synchronously called, but memory_region_unref(view->root) is not because of flatview_destroy() is delayed with RCU.


The other stupid but working solution is we can always make the 6 aliases
to not be reused, IOW we can always use dynamic MRs considering
pci_bridge_update_mappings() should be rare?

Perhaps we may introduce memory_region_new_alias() (that calls object_new()) and allow calling object_unparent() only for aliases created with the function.

Regards,
Akihiko Odaki

Reply via email to