Hi Michael,

>-----Original Message-----
>From: Michael S. Tsirkin <m...@redhat.com>
>Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>compatibility check with host IOMMU cap/ecap
>
>On Fri, Apr 26, 2024 at 03:10:14AM +0000, Duan, Zhenzhong wrote:
>>
>>
>> >-----Original Message-----
>> >From: Cédric Le Goater <c...@redhat.com>
>> >Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>> >compatibility check with host IOMMU cap/ecap
>> >
>> >On 4/25/24 10:46, Duan, Zhenzhong wrote:
>> >> Hi Cédric,
>> >>
>> >>> -----Original Message-----
>> >>> From: Cédric Le Goater <c...@redhat.com>
>> >>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>> >>> compatibility check with host IOMMU cap/ecap
>> >>>
>> >>> Hello Zhenzhong,
>> >>>
>> >>> On 4/18/24 10:42, Duan, Zhenzhong wrote:
>> >>>> Hi Cédric,
>> >>>>
>> >>>>> -----Original Message-----
>> >>>>> From: Cédric Le Goater <c...@redhat.com>
>> >>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>> >>>>> compatibility check with host IOMMU cap/ecap
>> >>>>>
>> >>>>> Hello Zhenzhong
>> >>>>>
>> >>>>> On 4/17/24 11:24, Duan, Zhenzhong wrote:
>> >>>>>>
>> >>>>>>
>> >>>>>>> -----Original Message-----
>> >>>>>>> From: Cédric Le Goater <c...@redhat.com>
>> >>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to
>do
>> >>>>>>> compatibility check with host IOMMU cap/ecap
>> >>>>>>>
>> >>>>>>> On 4/17/24 06:21, Duan, Zhenzhong wrote:
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>> -----Original Message-----
>> >>>>>>>>> From: Cédric Le Goater <c...@redhat.com>
>> >>>>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to
>do
>> >>>>>>>>> compatibility check with host IOMMU cap/ecap
>> >>>>>>>>>
>> >>>>>>>>> Hello,
>> >>>>>>>>>
>> >>>>>>>>> On 4/16/24 09:09, Duan, Zhenzhong wrote:
>> >>>>>>>>>> Hi Cédric,
>> >>>>>>>>>>
>> >>>>>>>>>>> -----Original Message-----
>> >>>>>>>>>>> From: Cédric Le Goater <c...@redhat.com>
>> >>>>>>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework
>to
>> >do
>> >>>>>>>>>>> compatibility check with host IOMMU cap/ecap
>> >>>>>>>>>>>
>> >>>>>>>>>>> On 4/8/24 10:44, Zhenzhong Duan wrote:
>> >>>>>>>>>>>> From: Yi Liu <yi.l....@intel.com>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> If check fails, the host side device(either vfio or vdpa device)
>> >>> should
>> >>>>>>> not
>> >>>>>>>>>>>> be passed to guest.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Implementation details for different backends will be in
>> >>> following
>> >>>>>>>>> patches.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Signed-off-by: Yi Liu <yi.l....@intel.com>
>> >>>>>>>>>>>> Signed-off-by: Yi Sun <yi.y....@linux.intel.com>
>> >>>>>>>>>>>> Signed-off-by: Zhenzhong Duan
><zhenzhong.d...@intel.com>
>> >>>>>>>>>>>> ---
>> >>>>>>>>>>>>        hw/i386/intel_iommu.c | 35
>> >>>>>>>>>>> +++++++++++++++++++++++++++++++++++
>> >>>>>>>>>>>>        1 file changed, 35 insertions(+)
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> diff --git a/hw/i386/intel_iommu.c
>b/hw/i386/intel_iommu.c
>> >>>>>>>>>>>> index 4f84e2e801..a49b587c73 100644
>> >>>>>>>>>>>> --- a/hw/i386/intel_iommu.c
>> >>>>>>>>>>>> +++ b/hw/i386/intel_iommu.c
>> >>>>>>>>>>>> @@ -35,6 +35,7 @@
>> >>>>>>>>>>>>        #include "sysemu/kvm.h"
>> >>>>>>>>>>>>        #include "sysemu/dma.h"
>> >>>>>>>>>>>>        #include "sysemu/sysemu.h"
>> >>>>>>>>>>>> +#include "sysemu/iommufd.h"
>> >>>>>>>>>>>>        #include "hw/i386/apic_internal.h"
>> >>>>>>>>>>>>        #include "kvm/kvm_i386.h"
>> >>>>>>>>>>>>        #include "migration/vmstate.h"
>> >>>>>>>>>>>> @@ -3819,6 +3820,32 @@ VTDAddressSpace
>> >>>>>>>>>>> *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>> >>>>>>>>>>>>            return vtd_dev_as;
>> >>>>>>>>>>>>        }
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> +static int vtd_check_legacy_hdev(IntelIOMMUState *s,
>> >>>>>>>>>>>> +                                 HostIOMMUDevice *hiod,
>> >>>>>>>>>>>> +                                 Error **errp)
>> >>>>>>>>>>>> +{
>> >>>>>>>>>>>> +    return 0;
>> >>>>>>>>>>>> +}
>> >>>>>>>>>>>> +
>> >>>>>>>>>>>> +static int vtd_check_iommufd_hdev(IntelIOMMUState *s,
>> >>>>>>>>>>>> +                                  HostIOMMUDevice *hiod,
>> >>>>>>>>>>>> +                                  Error **errp)
>> >>>>>>>>>>>> +{
>> >>>>>>>>>>>> +    return 0;
>> >>>>>>>>>>>> +}
>> >>>>>>>>>>>> +
>> >>>>>>>>>>>> +static int vtd_check_hdev(IntelIOMMUState *s,
>> >>>>>>>>> VTDHostIOMMUDevice
>> >>>>>>>>>>> *vtd_hdev,
>> >>>>>>>>>>>> +                          Error **errp)
>> >>>>>>>>>>>> +{
>> >>>>>>>>>>>> +    HostIOMMUDevice *hiod = vtd_hdev->dev;
>> >>>>>>>>>>>> +
>> >>>>>>>>>>>> +    if (object_dynamic_cast(OBJECT(hiod),
>> >>> TYPE_HIOD_IOMMUFD))
>> >>>>> {
>> >>>>>>>>>>>> +        return vtd_check_iommufd_hdev(s, hiod, errp);
>> >>>>>>>>>>>> +    }
>> >>>>>>>>>>>> +
>> >>>>>>>>>>>> +    return vtd_check_legacy_hdev(s, hiod, errp);
>> >>>>>>>>>>>> +}
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> I think we should be using the .get_host_iommu_info() class
>> >>> handler
>> >>>>>>>>>>> instead. Can we refactor the code slightly to avoid this check
>on
>> >>>>>>>>>>> the type ?
>> >>>>>>>>>>
>> >>>>>>>>>> There is some difficulty ini avoiding this check, the behavior
>of
>> >>>>>>>>> vtd_check_legacy_hdev
>> >>>>>>>>>> and vtd_check_iommufd_hdev are different especially after
>> >>> nesting
>> >>>>>>>>> support introduced.
>> >>>>>>>>>> vtd_check_iommufd_hdev() has much wider check over
>> >cap/ecap
>> >>> bits
>> >>>>>>>>> besides aw_bits.
>> >>>>>>>>>
>> >>>>>>>>> I think it is important to fully separate the vIOMMU model
>from
>> >the
>> >>>>>>>>> host IOMMU backing device.
>> >>>>>
>> >>>>> This comment is true for the structures also.
>> >>>>>
>> >>>>>>>>> Could we introduce a new HostIOMMUDeviceClass
>> >>>>>>>>> handler .check_hdev() handler, which would
>> >>>>> call .get_host_iommu_info() ?
>> >>>>>
>> >>>>> This means that HIOD_LEGACY_INFO and HIOD_IOMMUFD_INFO
>> >should
>> >>> be
>> >>>>> a common structure 'HostIOMMUDeviceInfo' holding all attributes
>> >>>>> for the different backends. Each .get_host_iommu_info()
>> >implementation
>> >>>>> would translate the specific host iommu device data presentation
>> >>>>> into the common 'HostIOMMUDeviceInfo', this is true for
>> >host_aw_bits.
>> >>>>
>> >>>> I see, it's just not easy to define the unified elements in
>> >>> HostIOMMUDeviceInfo
>> >>>> so that they maps to bits or fields in host return IOMMU info.
>> >>>
>> >>> The proposal is adding a vIOMMU <-> HostIOMMUDevice interface
>and a
>> >>> new
>> >>> API needs to be completely defined for it. The IOMMU backend
>> >>> implementation
>> >>> could be anything, legacy, iommufd, iommufd v2, some other
>framework
>> >>> and
>> >>> the vIOMMU shouldn't be aware of its implementation.
>> >>>
>> >>> Exposing the kernel structures as done below should be avoided
>because
>> >>> they are part of the QEMU <-> kernel IOMMUFD interface.
>> >>>
>> >>>
>> >>>> Different platform returned host IOMMU info is platform specific.
>> >>>> For vtd and siommu:
>> >>>>
>> >>>> struct iommu_hw_info_vtd {
>> >>>>           __u32 flags;
>> >>>>           __u32 __reserved;
>> >>>>           __aligned_u64 cap_reg;
>> >>>>           __aligned_u64 ecap_reg;
>> >>>> };
>> >>>>
>> >>>> struct iommu_hw_info_arm_smmuv3 {
>> >>>>          __u32 flags;
>> >>>>          __u32 __reserved;
>> >>>>          __u32 idr[6];
>> >>>>          __u32 iidr;
>> >>>>          __u32 aidr;
>> >>>> };
>> >>>>
>> >>>> I can think of two kinds of declaration of HostIOMMUDeviceInfo:
>> >>>>
>> >>>> struct HostIOMMUDeviceInfo {
>> >>>>       uint8_t aw_bits;
>> >>>>       enum iommu_hw_info_type type;
>> >>>>       union {
>> >>>>           struct iommu_hw_info_vtd vtd;
>> >>>>           struct iommu_hw_info_arm_smmuv3;
>> >>>>           ......
>> >>>>       } data;
>> >>>> }
>> >>>>
>> >>>> or
>> >>>>
>> >>>> struct HostIOMMUDeviceInfo {
>> >>>>       uint8_t aw_bits;
>> >>>>       enum iommu_hw_info_type type;
>> >>>>       __u32 flags;
>> >>>>       __aligned_u64 cap_reg;
>> >>>>       __aligned_u64 ecap_reg;
>> >>>>       __u32 idr[6];
>> >>>>       __u32 iidr;
>> >>>>       __u32 aidr;
>> >>>>      ......
>> >>>> }
>> >>>>
>> >>>> Not clear if any is your expected format.
>> >>>>
>> >>>>> 'type' could be handled the same way, with a
>'HostIOMMUDeviceInfo'
>> >>>>> type attribute and host iommu device type definitions, or as you
>> >>>>> suggested with a QOM interface. This is more complex however. In
>> >>>>> this case, I would suggest to implement a .compatible() handler to
>> >>>>> compare the host iommu device type with the vIOMMU type.
>> >>>>>
>> >>>>> The resulting check_hdev routine would look something like :
>> >>>>>
>> >>>>> static int vtd_check_hdev(IntelIOMMUState *s,
>> >VTDHostIOMMUDevice
>> >>>>> *vtd_hdev,
>> >>>>>                             Error **errp)
>> >>>>> {
>> >>>>>       HostIOMMUDevice *hiod = vtd_hdev->dev;
>> >>>>>       HostIOMMUDeviceClass *hiodc =
>> >>>>> HOST_IOMMU_DEVICE_GET_CLASS(hiod);
>> >>>>>       HostIOMMUDevice info;
>> >>>>>       int host_aw_bits, ret;
>> >>>>>
>> >>>>>       ret = hiodc->get_host_iommu_info(hiod, &info, sizeof(info),
>errp);
>> >>>>>       if (ret) {
>> >>>>>           return ret;
>> >>>>>       }
>> >>>>>
>> >>>>>       ret = hiodc->is_compatible(hiod, VIOMMU_INTERFACE(s));
>> >>>>>       if (ret) {
>> >>>>>           return ret;
>> >>>>>       }
>> >>>>>
>> >>>>>       if (s->aw_bits > info.aw_bits) {
>> >>>>>           error_setg(errp, "aw-bits %d > host aw-bits %d",
>> >>>>>                      s->aw_bits, info.aw_bits);
>> >>>>>           return -EINVAL;
>> >>>>>       }
>> >>>>> }
>> >>>>>
>> >>>>> and the HostIOMMUDeviceClass::is_compatible() handler would
>call a
>> >>>>> vIOMMUInterface::compatible() handler simply returning
>> >>>>> IOMMU_HW_INFO_TYPE_INTEL_VTD. How does that sound ?
>> >>>>
>> >>>> Not quite get what HostIOMMUDeviceClass::is_compatible() does.
>> >>>
>> >>> HostIOMMUDeviceClass::is_compatible() calls in the host IOMMU
>> >backend
>> >>> to determine which IOMMU types are exposed by the host, then calls
>the
>> >>> vIOMMUInterface::compatible() handler to do the compare. API is to
>be
>> >>> defined.
>> >>>
>> >>> As a refinement, we could introduce in the vIOMMU <->
>> >HostIOMMUDevice
>> >>> interface capabilities, or features, to check more precisely the level
>> >>> of compatibility between the vIOMMU and the host IOMMU device.
>This
>> >is
>> >>> similar to what is done between QEMU and KVM.
>> >>>
>> >>> If you think this is too complex, include type in HostIOMMUDeviceInfo.
>> >>>
>> >>>> Currently legacy and IOMMUFD host device has different check logic,
>> >how
>> >>> it can help
>> >>>> in merging vtd_check_legacy_hdev() and vtd_check_iommufd_hdev()
>> >into
>> >>> a single vtd_check_hdev()?
>> >>>
>> >>> IMHO, IOMMU shouldn't be aware of the IOMMU backend
>> >implementation,
>> >>> but
>> >>> if you think the Intel vIOMMU should access directly the iommufd
>> >backend
>> >>> when available, then we should drop this proposal and revisit the
>design
>> >>> to take a different approach.
>> >>
>> >> I implemented a draft following your suggestions so we could explore
>> >further.
>> >> See
>> >https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_
>pre
>> >q_v3_tmp
>> >>
>> >> In this draft, it uses .check_cap() to query
>HOST_IOMMU_DEVICE_CAP_xxx
>> >> just like KVM CAPs.
>> >> A common HostIOMMUDeviceCaps structure is introduced to be used
>by
>> >> both legacy and iommufd backend.
>> >>
>> >> It indeed is cleaner. Only problem is I failed to implement .compatible()
>> >> as all the check could go ahead by just calling check_cap().
>> >> Could you help a quick check to see if I misunderstood any of your
>> >suggestion?
>> >
>> >Thanks for the changes. It looks cleaner and simpler ! Some comments,
>> >
>> >* HostIOMMUDeviceIOMMUFDClass seems useless as it is empty. I don't
>> >   remember if you told me already you had plans for future changes.
>> >   Sorry about that if this is the case. I forgot.
>>
>> Never mind😊, reason is:
>>
>> In nesting series
>>
>https://github.com/yiliu1765/qemu/commits/zhenzhong/iommufd_nesting
>_rfcv2/
>> This commit
>>
>https://github.com/yiliu1765/qemu/commit/581fc900aa296988eaa48abee
>6d68d3670faf8c9
>> implement [at|de]tach_hwpt handlers.
>>
>> So I add an extra layer of abstract HostIOMMUDeviceIOMMUFDClass to
>define
>> [at|de]tach_hwpt handlers.
>>
>> >
>> >* I would use the 'host_iommu_device_' prefix for external routines
>> >   which are part of the HostIOMMUDevice API and use 'hiod_' for
>> >   internal routines where it makes sense, to limit the name length for
>> >   instance.
>>
>> Good idea, will do.
>>
>> >
>> >* I would rename HOST_IOMMU_DEVICE_CAP_IOMMUFD_V1 to
>> >   HOST_IOMMU_DEVICE_CAP_IOMMUFD. I mentioned IOMMUFD v2 as
>a
>> >   theoretical example of a different IOMMU interface. I don't think we
>> >   need to anticipate yet :)
>>
>> Will do.
>>
>> >
>> >* HostIOMMUDeviceCaps is using 'enum iommu_hw_info_type' from
>> >   'linux/iommufd.h', that's not my preferred choice but I won't
>> >   object. The result looks good.
>>
>> Ok, will keep it for now. We can change when you want in future.
>>
>> >
>> >* HostIOMMUDevice now has a realize() routine to query the host
>IOMMU
>> >   capability for later usage. This is a good idea. However, you could
>> >   change the return value to bool and avoid the ERRP_GUARD() prologue.
>>
>> Will do.
>>
>> >
>> >* Beware of :
>> >
>> >     struct Range {
>> >         /*
>> >          * Do not access members directly, use the functions!
>> >          * A non-empty range has @lob <= @upb.
>> >          * An empty range has @lob == @upb + 1.
>> >          */
>> >         uint64_t lob;        /* inclusive lower bound */
>> >         uint64_t upb;        /* inclusive upper bound */
>> >     };
>>
>> I remember😊, will add the change in formal version.
>>
>> >
>> >
>> >* I think you could introduce a new VFIOIOMMUClass attribute. Let's
>> >   call it VFIOIOMMUClass::hiod_typename. The creation of
>> >HostIOMMUDevice
>> >   would become generic and you could move :
>> >
>> >     hiod=
>> >HOST_IOMMU_DEVICE(object_new(TYPE_HOST_IOMMU_DEVICE_LEGAC
>Y_V
>> >FIO));
>> >     HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev,
>errp);
>> >     if (*errp) {
>> >         object_unref(hiod);
>> >         return -EINVAL;
>> >     }
>> >     vbasedev->hiod = hiod;
>> >
>> >   at the end of vfio_attach_device().
>>
>> Good suggestion! Will do.
>>
>> Thanks
>> Zhenzhong
>
>
>So I'm expecting v3 of this.

Just posted v6, 
https://lists.gnu.org/archive/html/qemu-devel/2024-06/msg00102.html
Comments appreciated!

Thanks
Zhenzhong

Reply via email to