On 4/17/25 11:14 AM, Ethan MILON wrote:
Hi,

On 4/13/25 10:02 PM, Alejandro Jimenez wrote:
For the specified address range, walk the page table identifying regions
as mapped or unmapped and invoke registered notifiers with the
corresponding event type.

Signed-off-by: Alejandro Jimenez <alejandro.j.jime...@oracle.com>
---
  hw/i386/amd_iommu.c | 74 +++++++++++++++++++++++++++++++++++++++++++++
  1 file changed, 74 insertions(+)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index d089fdc28ef1..6789e1e9b688 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1688,6 +1688,80 @@ fetch_pte(AMDVIAddressSpace *as, const hwaddr address, uint64_t dte,
      return pte;
  }
+/*
+ * Walk the guest page table for an IOVA and range and signal the registered
+ * notifiers to sync the shadow page tables in the host.
+ * Must be called with a valid DTE for DMA remapping i.e. V=1,TV=1
+ */
+static void __attribute__((unused))
+amdvi_sync_shadow_page_table_range(AMDVIAddressSpace *as, uint64_t *dte,
+                                   hwaddr addr, uint64_t size, bool send_unmap)
+{
+    IOMMUTLBEvent event;
+
+    hwaddr iova_next, page_mask, pagesize;
+    hwaddr iova = addr;
+    hwaddr end = iova + size - 1;
+
+    uint64_t pte;
+
+    while (iova < end) {
+
+        pte = fetch_pte(as, iova, dte[0], &pagesize);
+
+        if (pte == (uint64_t)-2) {
+            /*
+             * Invalid conditions such as the IOVA being larger than supported
+             * by current page table mode as configured in the DTE, or a failure
+             * to fetch the Page Table from the Page Table Root Pointer in DTE.
+             */
+            assert(pagesize == 0);
+            return;
+        }
+        /* PTE has been validated for major errors and pagesize is set */
+        assert(pagesize);
+        page_mask = ~(pagesize - 1);
+        iova_next = (iova & page_mask) + pagesize;
+
+        if (pte == (uint64_t)-1) {
+            /*
+             * Failure to read PTE from memory, the pagesize matches the current
+             * level. Unable to determine the region type, so a safe strategy is
+             * to skip the range and continue the page walk.
+             */
+            goto next;
+        }
+
+        event.entry.target_as = &address_space_memory;
+        event.entry.iova = iova & page_mask;
+        /* translated_addr is irrelevant for the unmap case */
+        event.entry.translated_addr = (pte & AMDVI_DEV_PT_ROOT_MASK) &
+                                      page_mask;
+        event.entry.addr_mask = ~page_mask;
+        event.entry.perm = amdvi_get_perms(pte);

Is it possible for the dte permissions to be more restrictive than the
permissions of the fetched pte?

No. My understanding of the documentation is that permissions can only get more restrictive as you go down the page walk, because they are logically ANDed with the permissions of the levels above (including the DTE). This is more or less verbatim what the spec says in Table 17: I/O Page Translation Entry (PTE) Fields, PR=1.
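
To make the "ANDed down the walk" rule concrete, here is a minimal sketch (illustration only, not QEMU code; the IR/IW bit positions follow the spec and the helper name is made up):

#include <stdint.h>

#define LVL_IR  (1ULL << 61)   /* read permission bit, per the spec  */
#define LVL_IW  (1ULL << 62)   /* write permission bit, per the spec */

/*
 * Effective permissions are the logical AND of the IR/IW bits of the DTE
 * and of every PDE/PTE visited on the way down; a lower level can only
 * clear a permission, it can never grant one back.
 */
static uint64_t effective_perms(uint64_t dte, const uint64_t *entries, int n)
{
    uint64_t perms = dte & (LVL_IR | LVL_IW);

    for (int i = 0; i < n; i++) {
        perms &= entries[i] & (LVL_IR | LVL_IW);
    }
    return perms;
}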

More details:
I haven't found any place where the Linux driver modifies intermediate permissions. As far as I can tell, alloc_pte() creates all the PDEs with RW permissions and only applies the permissions/prot requested in map_pages() to the leaf PTE. So the effective permissions during a page walk are really determined by the leaf PTE.

The above is why my initial prototype didn't bother to check the intermediate permissions in fetch_pte() and only checked the returned PTE. But I had to implement the intermediate checks since this code emulates a hardware page walk and must comply with the specification.
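
As an illustration, this is roughly the per-level check a spec-compliant walker has to perform anyway, even though Linux never narrows permissions at the PDE level in practice (sketch only with hypothetical names, not the actual fetch_pte() code; PR/IR/IW bit positions per the spec):

#include <stdint.h>

#define ENT_PR  (1ULL << 0)    /* present           */
#define ENT_IR  (1ULL << 61)   /* read permission   */
#define ENT_IW  (1ULL << 62)   /* write permission  */

/*
 * Returns 0 if the walk must stop at this level: either the entry is not
 * present, or it removes whatever access was still allowed by the levels
 * above. The caller can then treat the region as unmapped.
 */
static int check_level(uint64_t entry, uint64_t *perms)
{
    if (!(entry & ENT_PR)) {
        return 0;
    }
    *perms &= entry & (ENT_IR | ENT_IW);
    return *perms != 0;
}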

Thank you,
Alejandro


+
+        /*
+         * In cases where the leaf PTE is not found, or it has invalid
+         * permissions, an UNMAP type notification is sent, but only if the
+         * caller requested it.
+         */
+        if (!IOMMU_PTE_PRESENT(pte) || (event.entry.perm == IOMMU_NONE)) {
+            if (!send_unmap) {
+                goto next;
+            }
+            event.type = IOMMU_NOTIFIER_UNMAP;
+        } else {
+            event.type = IOMMU_NOTIFIER_MAP;
+        }
+
+        /* Invoke the notifiers registered for this address space */
+        memory_region_notify_iommu(&as->iommu, 0, event);
+
+next:
+        iova = iova_next;
+    }
+}
+
  /*
   * Toggle between address translation and passthrough modes by enabling the
   * corresponding memory regions.

