On Fri, May 15, 2026 at 05:11:19PM -0700, Mukesh R wrote:
> On 5/15/26 09:53, Yu Zhang wrote:
> > On Fri, May 15, 2026 at 02:51:38PM +0000, Michael Kelley wrote:
> > > > From: Yu Zhang <[email protected]> Sent: Friday, May 15, 2026 7:00 AM
> > > >
> > > > On Thu, May 14, 2026 at 06:13:24PM +0000, Michael Kelley wrote:
> > > > > > From: Yu Zhang <[email protected]> Sent: Monday, May 11, 2026 9:24 AM
> > > > > >
> > > > > > > Add a para-virtualized IOMMU driver for Linux guests running on Hyper-V.
> > > > > > > This driver implements stage-1 IO translation within the guest OS.
> > > > > > > It integrates with the Linux IOMMU core, utilizing Hyper-V hypercalls
> > > > > > > for:
> > > > > > - Capability discovery
> > > > > > - Domain allocation, configuration, and deallocation
> > > > > > - Device attachment and detachment
> > > > > > - IOTLB invalidation
> > > > > >
> > > > > > The driver constructs x86-compatible stage-1 IO page tables in the
> > > > > > guest memory using consolidated IO page table helpers. This allows
> > > > > > the guest to manage stage-1 translations independently of vendor-
> > > > > > specific drivers (like Intel VT-d or AMD IOMMU).
> > > > > >
> > > > > > Hyper-V consumes this stage-1 IO page table when a device domain is
> > > > > > created and configured, and nests it with the host's stage-2 IO page
> > > > > > tables, therefore eliminating the VM exits for guest IOMMU mapping
> > > > > > operations. For unmapping operations, VM exits to perform the IOTLB
> > > > > > flush are still unavoidable.
> > > > > >
> > > > > > > Hyper-V identifies each PCI pass-thru device by a logical device ID
> > > > > > > in its hypercall interface. The vPCI driver (pci-hyperv) registers the
> > > > > > > per-bus portion of this ID with the pvIOMMU driver during bus probe.
> > > > > > > The pvIOMMU driver stores this mapping and combines it with the function
> > > > > > > number of the endpoint PCI device to form the complete ID for hypercalls.
> > > > >
> > > > > As you are probably aware, Mukesh's patch series to support PCI
> > > > > pass-thru devices also needs to get the logical device ID. Maybe the
> > > > > registration mechanism needs to move somewhere that can be shared
> > > > > with his code.
> > > > >
> > > >
> > > > Thank you so much for the review, Michael!
> > > >
> > > > Yes, I looked at Mukesh's series and noticed his hv_pci_vmbus_device_id()
> > > > in pci-hyperv.c has the same dev_instance byte manipulation. We do need
> > > > a common registration mechanism.
> > > >
> > > > Any suggestion on where to put it? drivers/hv/hv_common.c seems like a
> > > > natural place, but the register/lookup functions are currently only
> > > > meaningful when CONFIG_HYPERV_PVIOMMU is set. If Mukesh's pass-thru
> > > > code also needs them, we might need a new shared Kconfig option that
> > > > both can select. Open to better ideas.
> > >
> > > Unfortunately, I have not looked at Mukesh's series in detail yet, so
> > > I don't have enough knowledge of the full situation to offer a good
> > > recommendation.
> > >
> >
> > Sorry I forgot to Cc Mukesh in the previous reply. :(
> > @Mukesh, any thoughts on sharing the logical device ID registration
> > mechanism?
>
> Yeah, I went round and round trying to find the best place. I almost
> created virt/hyperv/hv_utils.c file. Maybe that is the best place?

Thanks for thinking about this, Mukesh!

I'm a bit hesitant about introducing virt/hyperv/. Currently virt/
only hosts KVM's architecture-neutral hypervisor core, so it feels
like the wrong layer for driver-level utility code. Wouldn't
drivers/hv/ be a more natural fit?

I'm also thinking about which config option should gate these new
interfaces (register/lookup etc.). I am using CONFIG_HYPERV_PVIOMMU,
and I guess you will probably propose another one for the host-side
change (or just use CONFIG_MSHV_ROOT)?
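
To make the discussion concrete, here is a rough userspace sketch of
what a shared register/lookup interface could look like. It is only an
illustration: every name in it (hv_logical_devid_register() and
friends, the table size, keying by PCI domain number) is hypothetical
and taken from neither series, and OR-ing the function number into the
low three bits of the per-bus portion is an assumption about the ID
layout. A real kernel version would also need locking and would use
the kernel's own PCI_FUNC().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define HV_MAX_BUSES	16			/* arbitrary bound for this sketch */
#define PCI_FUNC(devfn)	((devfn) & 0x07)	/* mirrors the kernel macro */

/* Hypothetical record: one entry per vPCI bus. */
struct hv_bus_devid {
	int domain_nr;		/* PCI domain number of the vPCI bus */
	uint32_t bus_prefix;	/* per-bus portion of the logical device ID */
	int in_use;
};

static struct hv_bus_devid hv_bus_table[HV_MAX_BUSES];

static struct hv_bus_devid *hv_find_bus(int domain_nr)
{
	size_t i;

	for (i = 0; i < HV_MAX_BUSES; i++)
		if (hv_bus_table[i].in_use &&
		    hv_bus_table[i].domain_nr == domain_nr)
			return &hv_bus_table[i];
	return NULL;
}

/* Would be called by the vPCI driver during bus probe. */
int hv_logical_devid_register(int domain_nr, uint32_t bus_prefix)
{
	size_t i;

	if (hv_find_bus(domain_nr))
		return -EEXIST;

	for (i = 0; i < HV_MAX_BUSES; i++) {
		if (!hv_bus_table[i].in_use) {
			hv_bus_table[i].domain_nr = domain_nr;
			/* Assumption: low 3 bits are reserved for the function. */
			hv_bus_table[i].bus_prefix = bus_prefix & ~0x7u;
			hv_bus_table[i].in_use = 1;
			return 0;
		}
	}
	return -ENOSPC;
}

/* Would be called by the vPCI driver when the bus goes away. */
void hv_logical_devid_unregister(int domain_nr)
{
	struct hv_bus_devid *bus = hv_find_bus(domain_nr);

	if (bus)
		bus->in_use = 0;
}

/* Would be called by any consumer that needs the complete ID. */
int hv_logical_devid_lookup(int domain_nr, uint8_t devfn, uint64_t *devid)
{
	struct hv_bus_devid *bus = hv_find_bus(domain_nr);

	if (!bus)
		return -ENOENT;

	*devid = (uint64_t)bus->bus_prefix | PCI_FUNC(devfn);
	return 0;
}
```

With something of this shape, both the pvIOMMU driver and the
pass-thru code would only call the lookup helper, and the choice of
Kconfig symbol reduces to which option builds this one file.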

B.R.
Yu
>
> Thanks,
> -Mukesh
>
>
> > > >
> > > > [...]
> > > >
> > > > > > > +static void hv_flush_device_domain(struct hv_iommu_domain *hv_domain)
> > > > > > +{
> > > > > > + u64 status;
> > > > > > + unsigned long flags;
> > > > > > + struct hv_input_flush_device_domain *input;
> > > > > > +
> > > > > > + local_irq_save(flags);
> > > > > > +
> > > > > > + input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> > > > > > + memset(input, 0, sizeof(*input));
> > > > > > + input->device_domain = hv_domain->device_domain;
> > > > >
> > > > > > The previous version of this patch had code to set several other fields in
> > > > > > the input. I wanted to confirm that not setting them in this version is
> > > > > > intentional. Were they not needed?
> > > > >
> > > >
> > > > > Oh. The RFC v1 set partition_id, owner_vtl, domain_id.type, and domain_id.id
> > > > > individually. In this version, I just simplified it to a struct assignment.
> > > > > No functional change.
> > >
> > > Of course! I should have looked more closely at the details before making
> > > this comment. :-(
> > >
> > > [...]
> > >
> > > > >
> > > > > > Previous versions of this function did hv_iommu_detach_dev(). With that call
> > > > > > removed from here, hv_iommu_detach_dev() is only called when attaching a
> > > > > > domain to a device that already has a domain attached. Is it the case that
> > > > > > Hyper-V doesn't require the detach as a cleanup step?
> > > > >
> > > >
> > > > > The IOMMU core attaches the device to release_domain (our blocking domain)
> > > > > before calling release_device(), so I believe the explicit detach in the RFC
> > > > > was redundant. I simply didn't realize that at the time.
> > > >
> > >
> > > Got it. But after the IOMMU core attaches the device to the blocking
> > > domain, there's the possibility that the vPCI device is rescinded by
> > > Hyper-V and it goes away entirely. Or the device might be subjected
> > > to an "unbind/bind" cycle in Linux. Does the detach need to be done
> > > on the blocking domain in such cases? In this version of the patches, the
> > > Hyper-V "attach" and "detach" hypercalls still end up unbalanced. That
> > > seems a bit untidy at best, and I wonder if there are scenarios where
> > > Hyper-V will complain about the lack of balance.
> > >
> >
> > Thank you, Michael. May I ask what "the vPCI device is rescinded by
> > Hyper-V and it goes away entirely" means?
> >
> > I realized it's a bit untidy. But I want to understand this issue more
> > clearly first. :)
> >
> > B.R.
> > Yu
>