Hi Zhenzhong, On 8/22/25 8:40 AM, Zhenzhong Duan wrote: > Call pci_device_get_viommu_cap() to get if vIOMMU supports > VIOMMU_CAP_HW_NESTED, > if yes, create nested parent domain which could be reused by vIOMMU to create > nested domain. > > Introduce helper vfio_device_viommu_get_nested to facilitate this > implementation. > > It is safe because even if VIOMMU_CAP_HW_NESTED is returned, s->flts is > forbidden and VFIO device fails in set_iommu_device() call, until we support > passthrough device with x-flts=on. > > Suggested-by: Nicolin Chen <nicol...@nvidia.com> > Suggested-by: Yi Liu <yi.l....@intel.com> > Signed-off-by: Zhenzhong Duan <zhenzhong.d...@intel.com> > --- > include/hw/vfio/vfio-device.h | 2 ++ > hw/vfio/device.c | 12 ++++++++++++ > hw/vfio/iommufd.c | 8 ++++++++ > 3 files changed, 22 insertions(+) > > diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h > index 6e4d5ccdac..ecd82c16c7 100644 > --- a/include/hw/vfio/vfio-device.h > +++ b/include/hw/vfio/vfio-device.h > @@ -257,6 +257,8 @@ void vfio_device_prepare(VFIODevice *vbasedev, > VFIOContainerBase *bcontainer, > > void vfio_device_unprepare(VFIODevice *vbasedev); > > +bool vfio_device_viommu_get_nested(VFIODevice *vbasedev); I would suggest vfio_device_viommu_has_feature_hw_nested or something alike get usually means tou take a ref count associated with a put > + > int vfio_device_get_region_info(VFIODevice *vbasedev, int index, > struct vfio_region_info **info); > int vfio_device_get_region_info_type(VFIODevice *vbasedev, uint32_t type, > diff --git a/hw/vfio/device.c b/hw/vfio/device.c > index 08f12ac31f..3eeb71bd51 100644 > --- a/hw/vfio/device.c > +++ b/hw/vfio/device.c > @@ -23,6 +23,7 @@ > > #include "hw/vfio/vfio-device.h" > #include "hw/vfio/pci.h" > +#include "hw/iommu.h" > #include "hw/hw.h" > #include "trace.h" > #include "qapi/error.h" > @@ -504,6 +505,17 @@ void vfio_device_unprepare(VFIODevice *vbasedev) > vbasedev->bcontainer = NULL; > } > > +bool vfio_device_viommu_get_nested(VFIODevice *vbasedev) > +{ > + VFIOPCIDevice *vdev = vfio_pci_from_vfio_device(vbasedev); > + > + if (vdev) { > + return !!(pci_device_get_viommu_cap(&vdev->pdev) & > + VIOMMU_CAP_HW_NESTED); > + } > + return false; > +} > + > /* > * Traditional ioctl() based io > */ > diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c > index 8c27222f75..e503c232e1 100644 > --- a/hw/vfio/iommufd.c > +++ b/hw/vfio/iommufd.c > @@ -379,6 +379,14 @@ static bool iommufd_cdev_autodomains_get(VFIODevice > *vbasedev, > flags = IOMMU_HWPT_ALLOC_DIRTY_TRACKING; > } > > + /* > + * If vIOMMU supports stage-1 translation, force to create nested parent I would rather not use another terminology here. You previously used hw_nested, I think that's better. Also bear in mind that smmu supports S1, S2 and S1+S2 in emulated code.
Thanks Eric > + * domain which could be reused by vIOMMU to create nested domain. > + */ > + if (vfio_device_viommu_get_nested(vbasedev)) { > + flags |= IOMMU_HWPT_ALLOC_NEST_PARENT; > + } > + > if (cpr_is_incoming()) { > hwpt_id = vbasedev->cpr.hwpt_id; > goto skip_alloc;