Hi Eric,

On 2025/8/27 19:22, Eric Auger wrote:
Hi

On 8/27/25 1:13 PM, Yi Liu wrote:
On 2025/8/22 14:40, Zhenzhong Duan wrote:
Introduce a new PCIIOMMUOps optional callback, get_viommu_cap() which
allows to retrieve capabilities exposed by a vIOMMU. The first planned
vIOMMU device capability is VIOMMU_CAP_HW_NESTED that advertises the
support of HW nested stage translation scheme. pci_device_get_viommu_cap
is a wrapper that can be called on a PCI device potentially protected by
a vIOMMU.

get_viommu_cap() is designed to return 64bit bitmap of purely emulated
capabilities which are only determined by user's configuration, no host
capabilities involved. Reasons are:

1. host may has heterogeneous IOMMUs, each with different capabilities
2. this is migration friendly, return value is consistent between source
     and target.
3. host IOMMU capabilities are passed to vIOMMU through
set_iommu_device()
     interface which have to be after attach_device(), when
get_viommu_cap()
     is called in attach_device(), there is no way for vIOMMU to get host
     IOMMU capabilities yet, so only emulated capabilities can be
returned.
     See below sequence:

       vfio_device_attach():
           iommufd_cdev_attach():
               pci_device_get_viommu_cap() for HW nesting cap
               create a nesting parent hwpt
               attach device to the hwpt
               vfio_device_hiod_create_and_realize() creating hiod
       ...
       pci_device_set_iommu_device(hiod)

Suggested-by: Yi Liu <yi.l....@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.d...@intel.com>
---
   MAINTAINERS          |  1 +
   include/hw/iommu.h   | 19 +++++++++++++++++++
   include/hw/pci/pci.h | 25 +++++++++++++++++++++++++
   hw/pci/pci.c         | 11 +++++++++++
   4 files changed, 56 insertions(+)
   create mode 100644 include/hw/iommu.h

diff --git a/MAINTAINERS b/MAINTAINERS
index a07086ed76..54fb878128 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2305,6 +2305,7 @@ F: include/system/iommufd.h
   F: backends/host_iommu_device.c
   F: include/system/host_iommu_device.h
   F: include/qemu/chardev_open.h
+F: include/hw/iommu.h
   F: util/chardev_open.c
   F: docs/devel/vfio-iommufd.rst
   diff --git a/include/hw/iommu.h b/include/hw/iommu.h
new file mode 100644
index 0000000000..7dd0c11b16
--- /dev/null
+++ b/include/hw/iommu.h
@@ -0,0 +1,19 @@
+/*
+ * General vIOMMU capabilities, flags, etc
+ *
+ * Copyright (C) 2025 Intel Corporation.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef HW_IOMMU_H
+#define HW_IOMMU_H
+
+#include "qemu/bitops.h"
+
+enum {
+    /* hardware nested stage-1 page table support */
+    VIOMMU_CAP_HW_NESTED = BIT_ULL(0),

This naming is a bit confusing. get_viommu_cap indicates it will return
the viommu's capability while this naming is HW_NESTED. It's conflict
with the commit message which claims only emulated capability will be
returned.

it actually means the viommu has the code to handle HW nested case,
independently on the actual HW support.
maybe remove the "emulation" wording.

yeah, I know the meaning and the purpose here. Just not quite satisfied
with the naming.


Otherwise we may also use the virtio has_feature naming?

has_feature seems better. Looks to ask if vIOMMU has something and then
do something.



TBH. I'm hesitating to name it as get_viommu_cap. The scope is a little
larger than what we want so far. So I'm wondering if it can be done in a
more straightforward way. e.g. just a bool op named
iommu_nested_wanted(). Just an example, maybe better naming. We can
extend the op to be returning a u64 value in the future when we see
another request on VFIO from vIOMMU.
personnally I am fine with the bitmask which looks more future proof.

not quite sure if there is another info that needs to be checked in
this "VFIO asks vIOMMU" manner. Have you seen one beside this
nested hwpt requirement by vIOMMU?

Regards,
Yi Liu

Reply via email to