On 5/30/25 7:41 AM, Michael S. Tsirkin wrote:
On Fri, May 02, 2025 at 02:15:45AM +0000, Alejandro Jimenez wrote:
This series adds support for guests using the AMD vIOMMU to enable DMA
remapping for VFIO devices. In addition to the currently supported
passthrough (PT) mode, guest kernels are now able to provide DMA
address translation and access permission checking to VFs attached to
paging domains, using the AMD v1 I/O page table format.

Please see the v1[0] cover letter for additional details such as example
QEMU command line parameters used in testing.

are you working on v3?

Yes, there are suggestions from Sairaj that I will address in v3. I am also planning to include two small patches from Joao Martins that add support for the HATDis feature (this is something that Sairaj suggested earlier). The Linux changes are being reviewed here:
https://lore.kernel.org/all/cover.1746613368.git.ankit.s...@amd.com/

I will be offline from 6/2 to 6/6, so I didn't want to send a new revision and disappear. In general, the changes from v2->v3 are minor and well contained, so any reviews I receive for v2 will be valid. That being said, I can send v3 today if you'd prefer that. Please let me know.

there was a bug you wanted to fix.


I assume the bug is Sairaj's report of a dmesg warning with an NVME passthrough on a 4.15 kernel, but unfortunately I have not been able to reproduce that problem. We agreed that given the age of the kernel (and reports of the same warning on NVME devices in unrelated scenarios), this is likely a guest driver issue, and should not be a blocker.

More details:
I have tested an Ubuntu image with a 4.15 kernel, but I cannot hit any issues when I pass through a CX-6 VF (I don't have access to an NVME VF). The kernel is old enough that I have to force bind the mlx5_core driver to the VF on the guest, but once I do, the VF comes up with no errors and I can see DMA map/unmap activity in the traces.

Sairaj: Are you passing a full NVME device to the guest (i.e. a PF)? I ask because the BDF in '-device vfio-pci,host=0000:44:00.0' doesn't look like a typical VF...

Thank you,
Alejandro

Changes since v1[0]:
- Added documentation entry for '-device amd-iommu'
- Code movement with no functional changes to avoid use of forward
   declarations in later patches [Sairaj, mst]
- Moved addr_translation and dma-remap property to separate commits.
   The dma-remap feature is only available for users to enable after
   all required functionality is implemented [Sairaj]
- Explicit initialization of significant fields like addr_translation
   and notifier_flags [Sairaj]
- Fixed bug in decoding of invalidation size [Sairaj]
- Changed fetch_pte() to use an out parameter for pte, so that callers can
   check for error conditions via a negative return value [Clement]
- Removed UNMAP-only notifier optimization, leaving vhost support for
   a later series [Sairaj]
- Fixed ordering between address space unmap and memory region activation
   on devtab invalidation [Sairaj]
- Fixed commit message with "V=1, TV=0" [Sairaj]
- Dropped patch removing the page_fault event. That area is better
   addressed in a separate series.
- Independent testing by Sairaj (thank you!)

Thank you,
Alejandro

[0] https://lore.kernel.org/all/20250414020253.443831-1-alejandro.j.jime...@oracle.com/

Alejandro Jimenez (20):
   memory: Adjust event ranges to fit within notifier boundaries
   amd_iommu: Document '-device amd-iommu' common options
   amd_iommu: Reorder device and page table helpers
   amd_iommu: Helper to decode size of page invalidation command
   amd_iommu: Add helper function to extract the DTE
   amd_iommu: Return an error when unable to read PTE from guest memory
   amd_iommu: Add helpers to walk AMD v1 Page Table format
   amd_iommu: Add a page walker to sync shadow page tables on
     invalidation
   amd_iommu: Add basic structure to support IOMMU notifier updates
   amd_iommu: Sync shadow page tables on page invalidation
   amd_iommu: Use iova_tree records to determine large page size on UNMAP
   amd_iommu: Unmap all address spaces under the AMD IOMMU on reset
   amd_iommu: Add replay callback
   amd_iommu: Invalidate address translations on INVALIDATE_IOMMU_ALL
   amd_iommu: Toggle memory regions based on address translation mode
   amd_iommu: Set all address spaces to default translation mode on reset
   amd_iommu: Add dma-remap property to AMD vIOMMU device
   amd_iommu: Toggle address translation mode on devtab entry
     invalidation
   amd_iommu: Do not assume passthrough translation when DTE[TV]=0
   amd_iommu: Refactor amdvi_page_walk() to use common code for page walk

  hw/i386/amd_iommu.c | 1005 ++++++++++++++++++++++++++++++++++++-------
  hw/i386/amd_iommu.h |   52 +++
  qemu-options.hx     |   23 +
  system/memory.c     |   10 +-
  4 files changed, 934 insertions(+), 156 deletions(-)


base-commit: 5134cf9b5d3aee4475fe7e1c1c11b093731073cf
--
2.43.5


