smmuv3-accel: Restrict accelerated SMMUv3 to vfio-pci endpoints with iommufd

Shameer Kolothum Mon, 08 Sep 2025 00:57:38 -0700

Hi Eric,

> -----Original Message-----
> From: Eric Auger <eric.au...@redhat.com>
> Sent: 05 September 2025 09:30
> To: Nicolin Chen <nicol...@nvidia.com>; Duan, Zhenzhong
> <zhenzhong.d...@intel.com>; Shameer Kolothum
> <skolothum...@nvidia.com>
> Cc: qemu-...@nongnu.org; qemu-devel@nongnu.org;
> peter.mayd...@linaro.org; Jason Gunthorpe <j...@nvidia.com>;
> ddut...@redhat.com; berra...@redhat.com; Nathan Chen
> <nath...@nvidia.com>; Matt Ochs <mo...@nvidia.com>;
> smost...@google.com; Linuxarm <linux...@huawei.com>; Wangzhou (B)
> <wangzh...@hisilicon.com>; jiangkunkun <jiangkun...@huawei.com>;
> Jonathan Cameron <jonathan.came...@huawei.com>;
> zhangfei....@linaro.org; shameerkolot...@gmail.com
> Subject: Re: [RFC PATCH v3 06/15] hw/arm/smmuv3-accel: Restrict
> accelerated SMMUv3 to vfio-pci endpoints with iommufd
> 
> Hi Shameer,
> 
> On 7/16/25 10:06 AM, Shameerali Kolothum Thodi wrote:
> >
> >> -----Original Message-----
> >> From: Nicolin Chen <nicol...@nvidia.com>
> >> Sent: Tuesday, July 15, 2025 6:59 PM
> >> To: Duan, Zhenzhong <zhenzhong.d...@intel.com>
> >> Cc: Shameerali Kolothum Thodi
> >> <shameerali.kolothum.th...@huawei.com>; qemu-...@nongnu.org;
> >> qemu-devel@nongnu.org; eric.au...@redhat.com;
> >> peter.mayd...@linaro.org; j...@nvidia.com; ddut...@redhat.com;
> >> berra...@redhat.com; nath...@nvidia.com; mo...@nvidia.com;
> >> smost...@google.com; Linuxarm <linux...@huawei.com>; Wangzhou (B)
> >> <wangzh...@hisilicon.com>; jiangkunkun <jiangkun...@huawei.com>;
> >> Jonathan Cameron <jonathan.came...@huawei.com>;
> >> zhangfei....@linaro.org; shameerkolot...@gmail.com
> >> Subject: Re: [RFC PATCH v3 06/15] hw/arm/smmuv3-accel: Restrict
> >> accelerated SMMUv3 to vfio-pci endpoints with iommufd
> > ...
> >
> >>>> +    if (pdev && !smmuv3_accel_pdev_allowed(pdev, &vfio_pci)) {
> >>>> +        error_report("Device(%s) not allowed. Only PCIe root complex devices "
> >>>> +                     "or PCI bridge devices or vfio-pci endpoint devices with "
> >>>> +                     "iommufd as backend is allowed with arm-smmuv3,accel=on",
> >>>> +                     pdev->name);
> >>>> +        exit(1);
> >>> Seems aggressive for a hotplug, could we fail hotplug instead of kill QEMU?
> > That's right. I will try to see whether it is possible to do a
> > dev->hotplugged check here.
> >
> >> Hotplug is unlikely to be supported well, as it would introduce
> >> too much complication.
> >>
> >> With iommufd, a vIOMMU object is allocated per device (vfio). If
> >> the device fd (cdev) is not yet given to QEMU, it isn't able
> >> to allocate a vIOMMU object when creating a VM.
> >>
> >> A vIOMMU object can be allocated at a later stage once the
> >> device is hotplugged, but things like IORT mappings can't be
> >> refreshed since the OS is likely already booted.
> > Why do we need IORT mappings to be refreshed during hotplug?
> > AFAICS, the mappings are created per host bridge Ids. And how is this
> > different from a host machine doing hotplug?
> >
> >> Even an IOMMU capability sync via the hw_info ioctl will be difficult
> >> to do at runtime after the guest iommu driver's initialization.
> > We had some discussion on this "at least one vfio-pci" restriction
> > for accelerated mode previously here.
> > https://lore.kernel.org/qemu-devel/z6ttclq35ui12...@redhat.com/#t
> >
> > I am not sure we reached any consensus on that. The 3 different approaches
> > discussed were,
> >
> > 1. The current one used here. At least one cold plugged vfio-pci device
> >    so that we can retrieve the host SMMUV3 HW_INFO as per current
> >    IOMMUFD APIs.
> 
> I do not get why you can't wait for the 1st device to be attached to
> "freeze" the settings. Is it because you may also have some bridges /
> root ports attached to the same viommu? As the latter do not have
> any adherence to the host SMMU, is that a problem?


We need to initialise the registers for SMMUv3 before the guest boots, as
SMMUv3 is a platform device and can't be hotplugged later.

This is where we do it now:

smmu_reset_exit()
  --> smmuv3_init_regs(s);
      if (sys->accel) {
          smmuv3_accel_init_regs(s);
      }

I am not sure how we can update the guest SMMUv3 features
after boot.

And the only way to retrieve the host SMMUv3 HW features is
through a device bound to iommufd (IOMMU_GET_HW_INFO).
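
To make the ordering constraint above concrete, here is a rough standalone
model. It is not QEMU code: ModelSmmu, BoundDev, model_reset_exit() and the
feature values are all made-up stand-ins, and OR-ing the host bits into the
emulated ones is only a simplification. It just shows that the guest-visible
features are fixed once at reset, and that the accel path has nothing to read
from unless a device is already bound at that point.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are QEMU or iommufd definitions. */
#define MODEL_EMULATED_FEATURES 0x1u

typedef struct BoundDev {
    uint32_t host_features;   /* what a HW_INFO query would report (assumed) */
} BoundDev;

typedef struct ModelSmmu {
    bool accel;               /* models arm-smmuv3,accel=on */
    const BoundDev *dev;      /* a cold-plugged device, or NULL if none */
    uint32_t features;        /* what the guest sees after reset */
} ModelSmmu;

/*
 * Models the reset-time register init. Returns false when accel mode is
 * requested but no bound device exists to query the host features from.
 */
static bool model_reset_exit(ModelSmmu *s)
{
    /* Emulated defaults are always set up first. */
    s->features = MODEL_EMULATED_FEATURES;

    if (s->accel) {
        if (s->dev == NULL) {
            return false;     /* nothing to read host features from */
        }
        /* Simplification: fold the host-reported bits in. */
        s->features |= s->dev->host_features;
    }
    return true;
}
```

In this model an accel instance with dev == NULL fails at reset, which is the
reason for requiring at least one cold-plugged vfio-pci device today.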

> >
> > 2. A new IOMMUFD API to retrieve HW_INFO without a device.
> This can only be possible if, on the command line, you connect the vsmmu
> to a sysfs path to the host iommu (or maybe this is what you meant in
> 3). This would be another option we also evoked in the past. But it is
> not very user friendly for the person who launches the VM to have to care
> about both the device and the associated physical SMMU. Logically we could
> build that relationship automatically.
> >
> > 3. A fully specified vSMMUv3 through Qemu command line so that we
> >    don't need HW_INFO from kernel.
> 
> I don't think this is sensible as there may be plenty of those, each
> requiring a libvirt adaptation.

As mentioned in my previous reply, the idea is to initialize the accel SMMUv3
with all the features of the emulated SMMUv3 and add support for any additional
features required for the accel case through the command line (like STALL,
PASID etc.).
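
A minimal sketch of that idea, purely illustrative: start from whatever the
emulated SMMUv3 advertises and add only what the user opts into. The flag
names, the ModelAccelOpts struct and model_accel_features() are hypothetical
and do not correspond to existing QEMU properties or register fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; not real SMMUv3 IDR field encodings. */
enum {
    MODEL_FEAT_BASE  = 1u << 0,   /* stands for the emulated feature set */
    MODEL_FEAT_STALL = 1u << 1,
    MODEL_FEAT_PASID = 1u << 2,
};

/* Hypothetical per-instance options, as if parsed from the command line. */
typedef struct ModelAccelOpts {
    bool stall;
    bool pasid;
} ModelAccelOpts;

/* Start from the emulated features and add only the opted-in extras. */
static uint32_t model_accel_features(uint32_t emulated,
                                     const ModelAccelOpts *opts)
{
    uint32_t features = emulated;

    if (opts->stall) {
        features |= MODEL_FEAT_STALL;
    }
    if (opts->pasid) {
        features |= MODEL_FEAT_PASID;
    }
    return features;
}
```

With no options set, the result is just the emulated set, so in this model
nothing would need to be read from the host before the guest boots.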

Thanks,
Shameer
