This patch series introduces a para-virtualized IOMMU driver for
Linux guests running on Microsoft Hyper-V. The driver enables two
primary use cases:
1) In-kernel DMA protection for devices assigned to the guest.
2) Device assignment to guest user space (e.g., via VFIO).
The driver implements the following core functionality:
* Hypercall-based Enumeration
Unlike traditional ACPI-based discovery (e.g., DMAR/IVRS),
this driver enumerates the Hyper-V IOMMU capabilities directly
via hypercalls. This approach allows the guest to discover
IOMMU presence and features without requiring specific virtual
firmware extensions or modifications.
* Domain Management
The driver manages IOMMU domains through a new set of Hyper-V
hypercall interfaces, handling domain allocation and attachment
for endpoint devices.
* Nested Translation Support
This implementation leverages guest-managed stage-1 I/O page
tables nested with host stage-2 translations. It is built
upon the consolidated IOMMU page table framework (IOMMU_PT).
Because the guest updates its stage-1 tables directly, the
hypervisor does not need to emulate map operations.
Both Intel VT-d and AMD IOMMU platforms are supported.
* IOTLB Invalidation
IOTLB invalidation requests are marshaled and issued to the
hypervisor through the same hypercall mechanism. Both domain-
selective and page-selective flushes are supported.
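
As a rough illustration of a page-selective flush that uses one
power-of-two descriptor (see the v2 changelog), the sketch below computes
the smallest naturally aligned power-of-two block of pages covering an
inclusive IOVA range. It is not the driver's code: struct flush_desc,
flush_range_to_desc() and the 4 KiB SKETCH_PAGE_SHIFT are hypothetical
names and assumptions chosen for the example, and the real hypercall
descriptor layout may differ.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Hypothetical descriptor: a naturally aligned block of 2^order pages. */
struct flush_desc {
	uint64_t base_pfn;	/* first page frame, aligned to 2^order */
	unsigned int order;	/* log2 of the number of pages covered */
};

/*
 * Cover [iova, iova + size - 1] with a single aligned power-of-two block.
 * The end is treated as inclusive, so a range that stops exactly on a
 * page boundary does not pull in the following page.
 */
static struct flush_desc flush_range_to_desc(uint64_t iova, uint64_t size)
{
	struct flush_desc d;
	uint64_t first, last, diff;

	/* Reject empty ranges and ranges that wrap around. */
	assert(size != 0 && iova + size - 1 >= iova);

	first = iova >> SKETCH_PAGE_SHIFT;
	last = (iova + size - 1) >> SKETCH_PAGE_SHIFT;

	/*
	 * The highest bit in which first and last differ determines the
	 * block size: both share every bit at or above 'order'.
	 */
	diff = first ^ last;
	d.order = 0;
	while (diff) {
		d.order++;
		diff >>= 1;
	}

	d.base_pfn = first & ~((UINT64_C(1) << d.order) - 1);
	return d;
}
```

For example, a two-page range starting at 0x3000 straddles an 8-page
alignment boundary, so the sketch returns base_pfn 0 with order 3. This
over-invalidates a few pages in exchange for issuing one descriptor.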
Implementation Notes:
* Platform Support
The current implementation targets x86 platforms with Intel
VT-d and AMD IOMMU hardware.
* MSI Region Handling
The hardware MSI region is hard-coded to the standard x86
interrupt range (0xfee00000 - 0xfeefffff). A future update may
query this range via hypercall if other hardware platforms need
to be supported.
* Reserved Regions (RMRR)
There is currently no requirement to support assigned devices with
ACPI RMRR limitations. Consequently, this patch series does not
specify or query reserved memory regions.
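
To make the hard-coded MSI window concrete, the following stand-alone
sketch tests whether an IOVA range overlaps 0xfee00000 - 0xfeefffff. The
macro and function names are hypothetical and are not taken from the
series; only the two addresses come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standard x86 interrupt range, as quoted in the notes above. */
#define MSI_BASE UINT64_C(0xfee00000)
#define MSI_LAST UINT64_C(0xfeefffff)	/* inclusive last byte */

/* Return true if [iova, iova + size - 1] touches the MSI window. */
static bool iova_hits_msi_region(uint64_t iova, uint64_t size)
{
	uint64_t last;

	assert(size != 0);
	last = iova + size - 1;
	assert(last >= iova);	/* no wrap-around */

	return iova <= MSI_LAST && last >= MSI_BASE;
}
```

A range ending at 0xfedfffff does not overlap the window, while one that
extends a single byte further does.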
Testing:
This series has been validated with the following configurations:
- Intel DSA devices assigned to the guest, tested with dmatest.
- NVMe devices assigned to the guest on AMD platforms, tested
with fio.
- dma_map_benchmark for DMA mapping performance evaluation.
Changelog:
v1[1] -> v2:
- Dropped the "move to subdirectory" patch; the directory now exists
  upstream.
- hv: logical device ID registry:
  - Moved the registry to hv_common.c so it can be shared, and derived
    the prefix via a shared helper instead of caching it in pci-hyperv's
    private struct.
  - Moved the lookup out of the irq-disabled region (PREEMPT_RT).
- iommu/hyperv: para-virtualized IOMMU:
  - Removed the unused detach_dev op.
  - Rejected a hypervisor not advertising x86 page sizes instead of
    masking and warning.
  - Statically initialized the identity and blocking domains.
  - Gave the blocking domain its own attach op, which returns the hypercall
    status and WARNs on failure.
- iommu/hyperv: page-selective IOTLB flush:
  - Used a single descriptor covering a slightly larger power-of-two
    range, instead of splitting the range into multiple descriptors.
  - Fixed the inclusive-end corner case in the flush range calculation.
RFC v1[2] -> v1[1]:
- Scoped platform support to x86 only (Intel VT-d and AMD IOMMU);
initialization now uses x86_init.iommu.iommu_init
- Added page-selective IOTLB flush support
- Disabled device ATS in hv_iommu_release_device()
- Addressed review comments from Michael Kelley:
  - Reversed dependency: pvIOMMU exports registration API for
    pci-hyperv to call, instead of pci-hyperv exporting
    hv_build_logical_dev_id()
  - Dropped separate output page allocation patch; hypercall input
    and output now share the same per-CPU page
  - Cleaned up Kconfig (removed PCI_HYPERV dependency, unnecessary
    selects)
  - Removed dev_list, per-domain spinlock, and syscore_ops
  - Removed forward declarations by reordering functions
  - Fixed typos, cleaned up Kconfig selects, improved pr_info
    messages, etc.
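
The v2 change from "mask and warn" to "reject" for page sizes can be
pictured with the check below. It is only a sketch: the series does not
list the required sizes, so 4 KiB, 2 MiB and 1 GiB are assumed here, and
the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the x86 page sizes of interest are 4 KiB, 2 MiB and 1 GiB. */
#define PGSZ_4K (UINT64_C(1) << 12)
#define PGSZ_2M (UINT64_C(1) << 21)
#define PGSZ_1G (UINT64_C(1) << 30)
#define REQUIRED_PGSIZES (PGSZ_4K | PGSZ_2M | PGSZ_1G)

/*
 * Accept the advertised page-size bitmap only if every required size is
 * present. Extra sizes are tolerated; a missing size means rejection
 * rather than silently masking the bitmap down.
 */
static bool pgsizes_supported(uint64_t advertised)
{
	return (advertised & REQUIRED_PGSIZES) == REQUIRED_PGSIZES;
}
```

With this shape, a caller would fail initialization when the check
returns false instead of continuing with a reduced bitmap.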
[1]
https://lore.kernel.org/linux-hyperv/[email protected]/
[2]
https://lore.kernel.org/linux-hyperv/[email protected]/
Easwar Hariharan (1):
Drivers: hv: Add logical device ID registry for vPCI devices
Wei Liu (1):
hyperv: Introduce new hypercall interfaces used by Hyper-V guest IOMMU
Yu Zhang (2):
iommu/hyperv: Add para-virtualized IOMMU support for Hyper-V guest
iommu/hyperv: Add page-selective IOTLB flush support
arch/x86/hyperv/hv_init.c | 4 +
arch/x86/include/asm/mshyperv.h | 4 +
drivers/hv/hv_common.c | 95 ++++
drivers/iommu/Kconfig | 1 +
drivers/iommu/hyperv/Kconfig | 16 +
drivers/iommu/hyperv/Makefile | 1 +
drivers/iommu/hyperv/iommu.c | 686 ++++++++++++++++++++++++++++
drivers/iommu/hyperv/iommu.h | 51 +++
drivers/pci/controller/pci-hyperv.c | 21 +-
include/asm-generic/mshyperv.h | 13 +
include/hyperv/hvgdk_mini.h | 9 +
include/hyperv/hvhdk_mini.h | 141 ++++++
include/linux/hyperv.h | 8 +
13 files changed, 1045 insertions(+), 5 deletions(-)
create mode 100644 drivers/iommu/hyperv/Kconfig
create mode 100644 drivers/iommu/hyperv/iommu.c
create mode 100644 drivers/iommu/hyperv/iommu.h
--
2.52.0