On 4/17/25 11:14 AM, Ethan MILON wrote:
Hi,
On 4/13/25 10:02 PM, Alejandro Jimenez wrote:
For the specified address range, walk the page table identifying regions
as mapped or unmapped and invoke registered notifiers with the
corresponding event type.
Signed-off-by: Alejandro Jimenez <alejandro.j.jime...@oracle.com>
---
hw/i386/amd_iommu.c | 74 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 74 insertions(+)
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index d089fdc28ef1..6789e1e9b688 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1688,6 +1688,80 @@ fetch_pte(AMDVIAddressSpace *as, const hwaddr address, uint64_t dte,
return pte;
}
+/*
+ * Walk the guest page table for an IOVA and range and signal the registered
+ * notifiers to sync the shadow page tables in the host.
+ * Must be called with a valid DTE for DMA remapping, i.e. V=1, TV=1.
+ */
+static void __attribute__((unused))
+amdvi_sync_shadow_page_table_range(AMDVIAddressSpace *as, uint64_t *dte,
+ hwaddr addr, uint64_t size, bool send_unmap)
+{
+ IOMMUTLBEvent event;
+
+ hwaddr iova_next, page_mask, pagesize;
+ hwaddr iova = addr;
+ hwaddr end = iova + size - 1;
+
+ uint64_t pte;
+
+ while (iova < end) {
+
+ pte = fetch_pte(as, iova, dte[0], &pagesize);
+
+ if (pte == (uint64_t)-2) {
+ /*
+ * Invalid conditions such as the IOVA being larger than supported
+ * by current page table mode as configured in the DTE, or a failure
+ * to fetch the Page Table from the Page Table Root Pointer in DTE.
+ */
+ assert(pagesize == 0);
+ return;
+ }
+ /* PTE has been validated for major errors and pagesize is set */
+ assert(pagesize);
+ page_mask = ~(pagesize - 1);
+ iova_next = (iova & page_mask) + pagesize;
+
+ if (pte == (uint64_t)-1) {
+ /*
+ * Failure to read the PTE from memory; the pagesize matches the
+ * current level. Unable to determine the region type, so a safe
+ * strategy is to skip the range and continue the page walk.
+ */
+ goto next;
+ }
+
+ event.entry.target_as = &address_space_memory;
+ event.entry.iova = iova & page_mask;
+ /* translated_addr is irrelevant for the unmap case */
+ event.entry.translated_addr = (pte & AMDVI_DEV_PT_ROOT_MASK) &
+ page_mask;
+ event.entry.addr_mask = ~page_mask;
+ event.entry.perm = amdvi_get_perms(pte);
Is it possible for the DTE permissions to be more restrictive than the
permissions of the fetched PTE?
No. My understanding of the documentation is that permissions can only
get more restrictive as you go down the page walk, because they are
logically ANDed with the permissions of the levels above (including the
DTE). This is more or less verbatim what the spec says in Table 17:
I/O Page Translation Entry (PTE) Fields, PR=1.
More details:
I haven't found any place where the Linux driver modifies intermediate
permissions. As far as I can tell, alloc_pte() creates all the PDEs
with RW permissions and only applies the permissions/prot requested in
map_pages() to the leaf PTE. So the effective permissions during a page
walk are really determined by the leaf PTE.
The above is why my initial prototype didn't bother to check the
intermediate permissions in fetch_pte() and only checked the returned
PTE. But I had to implement the intermediate checks, since this code
emulates a hardware page walk and has to comply with the specification.
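
To make this concrete, here is a rough sketch (hypothetical helper, not
part of the patch) of what "logically ANDed" means; the function name
and its parameters are made up for illustration, and it assumes
amdvi_get_perms() extracts the IR/IW bits the same way for the DTE and
for the page table entries:

/*
 * Hypothetical: compute the effective permissions of a translation by
 * ANDing the IR/IW permissions of the DTE with those of every PDE/PTE
 * visited during the walk.
 */
static IOMMUAccessFlags
amdvi_effective_perms(uint64_t dte, const uint64_t *entries, int n_levels)
{
    uint64_t perms = amdvi_get_perms(dte);  /* start from the DTE IR/IW */

    for (int i = 0; i < n_levels; i++) {
        /* Each level can only clear permission bits, never add them */
        perms &= amdvi_get_perms(entries[i]);
    }

    return (IOMMUAccessFlags)perms;
}
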
Thank you,
Alejandro
+
+ /*
+ * In cases where the leaf PTE is not found, or it has invalid
+ * permissions, an UNMAP type notification is sent, but only if the
+ * caller requested it.
+ */
+ if (!IOMMU_PTE_PRESENT(pte) || (event.entry.perm == IOMMU_NONE)) {
+ if (!send_unmap) {
+ goto next;
+ }
+ event.type = IOMMU_NOTIFIER_UNMAP;
+ } else {
+ event.type = IOMMU_NOTIFIER_MAP;
+ }
+
+ /* Invoke the notifiers registered for this address space */
+ memory_region_notify_iommu(&as->iommu, 0, event);
+
+next:
+ iova = iova_next;
+ }
+}
+
/*
* Toggle between address translation and passthrough modes by enabling the
* corresponding memory regions.