>-----Original Message-----
>From: Nicolin Chen <nicol...@nvidia.com>
>Subject: Re: [PATCH v5 20/21] Workaround for ERRATA_772415_SPR17
>
>On Mon, Aug 25, 2025 at 09:21:48AM +0000, Duan, Zhenzhong wrote:
>>
>>
>> >-----Original Message-----
>> >From: Nicolin Chen <nicol...@nvidia.com>
>> >Subject: Re: [PATCH v5 20/21] Workaround for ERRATA_772415_SPR17
>> >
>> >On Fri, Aug 22, 2025 at 02:40:58AM -0400, Zhenzhong Duan wrote:
>> >> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>> >> index e503c232e1..59735e878c 100644
>> >> --- a/hw/vfio/iommufd.c
>> >> +++ b/hw/vfio/iommufd.c
>> >> @@ -324,6 +324,7 @@ static bool
>> >iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
>> >>  {
>> >>      ERRP_GUARD();
>> >>      IOMMUFDBackend *iommufd = vbasedev->iommufd;
>> >> +    struct iommu_hw_info_vtd vtd;
>> >
>> >VendorCaps vendor_caps;
>> >
>> >>      uint32_t type, flags = 0;
>> >>      uint64_t hw_caps;
>> >>      VFIOIOASHwpt *hwpt;
>> >> @@ -371,10 +372,15 @@ static bool
>> >iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
>> >>       * instead.
>> >>       */
>> >>      if (!iommufd_backend_get_device_info(vbasedev->iommufd,
>> >vbasedev->devid,
>> >> -                                         &type, NULL, 0,
>> >&hw_caps, errp)) {
>> >> +                                         &type, &vtd,
>sizeof(vtd),
>> >&hw_caps,
>> >
>> >s/vtd/vendor_caps/g
>> >
>> >> +                                         errp)) {
>> >>          return false;
>> >>      }
>> >>
>> >> +    if (vtd.flags & IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17) {
>> >> +        container->bcontainer.bypass_ro = true;
>> >
>> >This circled back to checking a vendor specific flag in the core..
>>
>> I'm not sure if VendorCaps struct wrapper is overprogramming as this
>> ERRARA is only VTD specific. We still need to check VendorCaps.vtd.flags
>bit.
>
>Look, the HW_INFO call is done by the core.
>
>Then, the core needs:
>  1 HW caps for dirty tracking and PASID
>  2 IOMMU_HWPT_ALLOC_NEST_PARENT (vIOMMU cap)
>  3 bcontainer.bypass_ro (vIOMMU workaround)

Why vIOMMU workaround? ERRATA is from host IOMMU. In a heterogeneous 
environment, some host IOMMUs can have this ERRATA while other newer IOMMUs not.

IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17 may only exist on old IOMMUs with nesting 
capability, vIOMMU doesn't support nesting emulation yet, it's also no sense to 
emulate an ERRATA in vIOMMU.

>
>Both 2 and 3 need to get from vIOMMU, while 3 needs VendorCaps.
>Arguably 2 could do a bit validation using the VendorCaps too.
>
>> >Perhaps we could upgrade the get_viommu_cap op and its API:
>> >
>> >enum viommu_flags {
>> >    VIOMMU_FLAG_HW_NESTED = BIT_ULL(0),
>> >    VIOMMU_FLAG_BYPASS_RO = BIT_ULL(1),
>> >};
>> >
>> >bool vfio_device_get_viommu_flags(VFIODevice *vbasedev, VendorCaps
>> >*vendor_caps,
>> >                                  uint64_t *viommu_flags);
>> >
>> >Then:
>> >    if (viommu_flags & VIOMMU_FLAG_BYPASS_RO) {
>> >        container->bcontainer.bypass_ro = true;
>> >    }
>> >...
>> >    if (viommu_flags & VIOMMU_FLAG_HW_NESTED) {
>> >        flags |= IOMMU_HWPT_ALLOC_NEST_PARENT;
>> >    }
>>
>> IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17 is a VTD specific flag bit
>> from host IOMMU, we have defined get_viommu_cap() to return pure
>> vIOMMU capability bits, so no host IOMMU flag bit can be returned
>> here. See patch2 commit log for the reason.
>
>VIOMMU_FLAG_BYPASS_RO is a "pure" vIOMMU flag, not confined to
>VTD. IOW, if some other vIOMMU has a similar issue, they can use
>it as well. Since we define a "bypass_ro" in the core bcontainer
>structure, it makes sense to have a core-level flag for it, v.s.
>checking the vendor flag in the core.

It's not a vIOMMU flag but host IOMMU flag except vIOMMU want to emulate that 
ERRATA.

Due to patch9, there is only one VFIO device under a container, so bypass_ro is 
set based on VFIO device's backend host IOMMU's flag 
IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17.

Thanks
Zhenzhong

>
>My sample code is turning this get_viommu_cap to something like
>get_viommu_flags, which could include both "cap" and "errata".
>
>Nicolin

Reply via email to