Hi On 8/27/25 1:13 PM, Yi Liu wrote: > On 2025/8/22 14:40, Zhenzhong Duan wrote: >> Introduce a new PCIIOMMUOps optional callback, get_viommu_cap() which >> allows to retrieve capabilities exposed by a vIOMMU. The first planned >> vIOMMU device capability is VIOMMU_CAP_HW_NESTED that advertises the >> support of HW nested stage translation scheme. pci_device_get_viommu_cap >> is a wrapper that can be called on a PCI device potentially protected by >> a vIOMMU. >> >> get_viommu_cap() is designed to return 64bit bitmap of purely emulated >> capabilities which are only determined by user's configuration, no host >> capabilities involved. Reasons are: >> >> 1. host may has heterogeneous IOMMUs, each with different capabilities >> 2. this is migration friendly, return value is consistent between source >> and target. >> 3. host IOMMU capabilities are passed to vIOMMU through >> set_iommu_device() >> interface which have to be after attach_device(), when >> get_viommu_cap() >> is called in attach_device(), there is no way for vIOMMU to get host >> IOMMU capabilities yet, so only emulated capabilities can be >> returned. >> See below sequence: >> >> vfio_device_attach(): >> iommufd_cdev_attach(): >> pci_device_get_viommu_cap() for HW nesting cap >> create a nesting parent hwpt >> attach device to the hwpt >> vfio_device_hiod_create_and_realize() creating hiod >> ... >> pci_device_set_iommu_device(hiod) > >> Suggested-by: Yi Liu <yi.l....@intel.com> >> Signed-off-by: Zhenzhong Duan <zhenzhong.d...@intel.com> >> --- >> MAINTAINERS | 1 + >> include/hw/iommu.h | 19 +++++++++++++++++++ >> include/hw/pci/pci.h | 25 +++++++++++++++++++++++++ >> hw/pci/pci.c | 11 +++++++++++ >> 4 files changed, 56 insertions(+) >> create mode 100644 include/hw/iommu.h >> >> diff --git a/MAINTAINERS b/MAINTAINERS >> index a07086ed76..54fb878128 100644 >> --- a/MAINTAINERS >> +++ b/MAINTAINERS >> @@ -2305,6 +2305,7 @@ F: include/system/iommufd.h >> F: backends/host_iommu_device.c >> F: include/system/host_iommu_device.h >> F: include/qemu/chardev_open.h >> +F: include/hw/iommu.h >> F: util/chardev_open.c >> F: docs/devel/vfio-iommufd.rst >> diff --git a/include/hw/iommu.h b/include/hw/iommu.h >> new file mode 100644 >> index 0000000000..7dd0c11b16 >> --- /dev/null >> +++ b/include/hw/iommu.h >> @@ -0,0 +1,19 @@ >> +/* >> + * General vIOMMU capabilities, flags, etc >> + * >> + * Copyright (C) 2025 Intel Corporation. >> + * >> + * SPDX-License-Identifier: GPL-2.0-or-later >> + */ >> + >> +#ifndef HW_IOMMU_H >> +#define HW_IOMMU_H >> + >> +#include "qemu/bitops.h" >> + >> +enum { >> + /* hardware nested stage-1 page table support */ >> + VIOMMU_CAP_HW_NESTED = BIT_ULL(0), > > This naming is a bit confusing. get_viommu_cap indicates it will return > the viommu's capability while this naming is HW_NESTED. It's conflict > with the commit message which claims only emulated capability will be > returned.
it actually means the viommu has the code to handle HW nested case, independently on the actual HW support. maybe remove the "emulation" wording. Otherwise we may also use the virtio has_feature naming? > > TBH. I'm hesitating to name it as get_viommu_cap. The scope is a little > larger than what we want so far. So I'm wondering if it can be done in a > more straightforward way. e.g. just a bool op named > iommu_nested_wanted(). Just an example, maybe better naming. We can > extend the op to be returning a u64 value in the future when we see > another request on VFIO from vIOMMU. personnally I am fine with the bitmask which looks more future proof. besides, Reviewed-by: Eric Auger <eric.au...@redhat.com> > > Regards, > Yi Liu >