Hi Cédric, >-----Original Message----- >From: Cédric Le Goater <c...@redhat.com> >Subject: Re: [PATCH v4 02/20] hw/pci: Introduce >pci_device_get_viommu_cap() > >Hello Zhenzhong > >On 7/29/25 11:20, Zhenzhong Duan wrote: >> Introduce a new PCIIOMMUOps optional callback, get_viommu_cap() which >> allows to retrieve capabilities exposed by a vIOMMU. The first planned > vIOMMU device > >"device" is implied, but I find it easier to understand if it is stated >openly.
Will do. > >> capability is VIOMMU_CAP_HW_NESTED that advertises the support of HW >> nested stage translation scheme. pci_device_get_viommu_cap is a wrapper >> that can be called on a PCI device potentially protected by a vIOMMU. >> >> get_viommu_cap() is designed to return 64bit bitmap of purely emulated >> capabilities which are only derermined by user's configuration, no host > >typo: determined Will fix. > >> capabilities involved. Reasons are: >> >> 1. there can be more than one host IOMMUs with different capabilities > >"host IOMMU devices" Will do. > >Such as iommufd and VFIO IOMMU Type1 ? Not VFIO iommu backend, I mean host iommu hardware units. > >> 2. there can also be more than one vIOMMUs with different user >> configuration, e.g., arm smmuv3. >> 3. This is migration friendly, return value is consistent between source >> and target. >> 4. It's too late for VFIO to call get_viommu_cap() after set_iommu_device() >> because we need get_viommu_cap() to determine if creating nested >parent >> hwpt or not at attaching stage, meanwhile hiod realize needs >iommufd, > >hiod -> "host IOMMU device" Will do. > >> devid and hwpt_id which are ready after attach_device(). > >I find the above sentence difficult to understand. This is trying to explain the reason of order between attach_device(), get_viommu_cap() and hiod realizing. What about: 4. host IOMMU capabilities are passed to vIOMMU through set_iommu_device() interface which have to be after attach_device(), when get_viommu_cap() is called in attach_device(), there is no way for vIOMMU to get host IOMMU capabilities yet, so only emulated capabilities can be returned. See below sequence: attach_device() get_viommu_cap() create hwpt ... vfio_device_hiod_create_and_realize() set_iommu_device(hiod) > > >> See below sequence: >> >> attach_device() >> get_viommu_cap() >> create hwpt >> ... >> create hiod >> set_iommu_device(hiod) >> >> Suggested-by: Yi Liu <yi.l....@intel.com> >> Signed-off-by: Zhenzhong Duan <zhenzhong.d...@intel.com> >> --- >> MAINTAINERS | 1 + >> include/hw/iommu.h | 17 +++++++++++++++++ >> include/hw/pci/pci.h | 25 +++++++++++++++++++++++++ >> hw/pci/pci.c | 11 +++++++++++ >> 4 files changed, 54 insertions(+) >> create mode 100644 include/hw/iommu.h >> >> diff --git a/MAINTAINERS b/MAINTAINERS >> index 37879ab64e..840cb1e604 100644 >> --- a/MAINTAINERS >> +++ b/MAINTAINERS >> @@ -2304,6 +2304,7 @@ F: include/system/iommufd.h >> F: backends/host_iommu_device.c >> F: include/system/host_iommu_device.h >> F: include/qemu/chardev_open.h >> +F: include/hw/iommu.h >> F: util/chardev_open.c >> F: docs/devel/vfio-iommufd.rst >> >> diff --git a/include/hw/iommu.h b/include/hw/iommu.h >> new file mode 100644 >> index 0000000000..021db50db5 >> --- /dev/null >> +++ b/include/hw/iommu.h >> @@ -0,0 +1,17 @@ >> +/* >> + * General vIOMMU capabilities, flags, etc >> + * >> + * Copyright (C) 2025 Intel Corporation. >> + * >> + * SPDX-License-Identifier: GPL-2.0-or-later >> + */ >> + >> +#ifndef HW_IOMMU_H >> +#define HW_IOMMU_H >> + >> +enum { >> + /* hardware nested stage-1 page table support */ >> + VIOMMU_CAP_HW_NESTED = BIT_ULL(0), >> +}; >> + >> +#endif /* HW_IOMMU_H */ >> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h >> index 6b7d3ac8a3..d89aefc030 100644 >> --- a/include/hw/pci/pci.h >> +++ b/include/hw/pci/pci.h >> @@ -462,6 +462,21 @@ typedef struct PCIIOMMUOps { >> * @devfn: device and function number of the PCI device. >> */ >> void (*unset_iommu_device)(PCIBus *bus, void *opaque, int devfn); >> + /** >> + * @get_viommu_cap: get vIOMMU capabilities >> + * >> + * Optional callback, if not implemented, then vIOMMU doesn't >> + * support exposing capabilities to other subsystem, e.g., VFIO. >> + * vIOMMU can choose which capabilities to expose. >> + * >> + * @opaque: the data passed to pci_setup_iommu(). >> + * >> + * Returns: 64bit bitmap with each bit represents a capability >emulated by >> + * VIOMMU_CAP_* in include/hw/iommu.h, these capabilities are >theoretical >> + * which are only determined by user's configuration and >independent on the > >What is meant by "user's configuration" ? the vIOMMU device properties ? Yes, I mean user's qemu cmdline configuration. > >> + * actual host capabilities they may depend on. >> + */ >> + uint64_t (*get_viommu_cap)(void *opaque); >> /** >> * @get_iotlb_info: get properties required to initialize a device >IOTLB. >> * >> @@ -642,6 +657,16 @@ bool pci_device_set_iommu_device(PCIDevice >*dev, HostIOMMUDevice *hiod, >> Error **errp); >> void pci_device_unset_iommu_device(PCIDevice *dev); >> >> +/** >> + * pci_device_get_viommu_cap: get vIOMMU capabilities. >> + * >> + * Returns a 64bit bitmap with each bit represents a vIOMMU exposed >> + * capability, 0 if vIOMMU doesn't support esposing capabilities. > >exposing Will fix. Thanks Zhenzhong