>-----Original Message-----
>From: Nicolin Chen <nicol...@nvidia.com>
>Subject: Re: [PATCH v5 20/21] Workaround for ERRATA_772415_SPR17
>
>On Mon, Aug 25, 2025 at 09:21:48AM +0000, Duan, Zhenzhong wrote:
>>
>>
>> >-----Original Message-----
>> >From: Nicolin Chen <nicol...@nvidia.com>
>> >Subject: Re: [PATCH v5 20/21] Workaround for ERRATA_772415_SPR17
>> >
>> >On Fri, Aug 22, 2025 at 02:40:58AM -0400, Zhenzhong Duan wrote:
>> >> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>> >> index e503c232e1..59735e878c 100644
>> >> --- a/hw/vfio/iommufd.c
>> >> +++ b/hw/vfio/iommufd.c
>> >> @@ -324,6 +324,7 @@ static bool
>> >iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
>> >> {
>> >> ERRP_GUARD();
>> >> IOMMUFDBackend *iommufd = vbasedev->iommufd;
>> >> + struct iommu_hw_info_vtd vtd;
>> >
>> >VendorCaps vendor_caps;
>> >
>> >> uint32_t type, flags = 0;
>> >> uint64_t hw_caps;
>> >> VFIOIOASHwpt *hwpt;
>> >> @@ -371,10 +372,15 @@ static bool
>> >iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
>> >> * instead.
>> >> */
>> >> if (!iommufd_backend_get_device_info(vbasedev->iommufd,
>> >vbasedev->devid,
>> >> - &type, NULL, 0,
>> >&hw_caps, errp)) {
>> >> + &type, &vtd,
>sizeof(vtd),
>> >&hw_caps,
>> >
>> >s/vtd/vendor_caps/g
>> >
>> >> + errp)) {
>> >> return false;
>> >> }
>> >>
>> >> + if (vtd.flags & IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17) {
>> >> + container->bcontainer.bypass_ro = true;
>> >
>> >This circled back to checking a vendor specific flag in the core..
>>
>> I'm not sure if VendorCaps struct wrapper is overprogramming as this
>> ERRARA is only VTD specific. We still need to check VendorCaps.vtd.flags
>bit.
>
>Look, the HW_INFO call is done by the core.
>
>Then, the core needs:
> 1 HW caps for dirty tracking and PASID
> 2 IOMMU_HWPT_ALLOC_NEST_PARENT (vIOMMU cap)
> 3 bcontainer.bypass_ro (vIOMMU workaround)
Why vIOMMU workaround? ERRATA is from host IOMMU. In a heterogeneous
environment, some host IOMMUs can have this ERRATA while other newer IOMMUs not.
IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17 may only exist on old IOMMUs with nesting
capability, vIOMMU doesn't support nesting emulation yet, it's also no sense to
emulate an ERRATA in vIOMMU.
>
>Both 2 and 3 need to get from vIOMMU, while 3 needs VendorCaps.
>Arguably 2 could do a bit validation using the VendorCaps too.
>
>> >Perhaps we could upgrade the get_viommu_cap op and its API:
>> >
>> >enum viommu_flags {
>> > VIOMMU_FLAG_HW_NESTED = BIT_ULL(0),
>> > VIOMMU_FLAG_BYPASS_RO = BIT_ULL(1),
>> >};
>> >
>> >bool vfio_device_get_viommu_flags(VFIODevice *vbasedev, VendorCaps
>> >*vendor_caps,
>> > uint64_t *viommu_flags);
>> >
>> >Then:
>> > if (viommu_flags & VIOMMU_FLAG_BYPASS_RO) {
>> > container->bcontainer.bypass_ro = true;
>> > }
>> >...
>> > if (viommu_flags & VIOMMU_FLAG_HW_NESTED) {
>> > flags |= IOMMU_HWPT_ALLOC_NEST_PARENT;
>> > }
>>
>> IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17 is a VTD specific flag bit
>> from host IOMMU, we have defined get_viommu_cap() to return pure
>> vIOMMU capability bits, so no host IOMMU flag bit can be returned
>> here. See patch2 commit log for the reason.
>
>VIOMMU_FLAG_BYPASS_RO is a "pure" vIOMMU flag, not confined to
>VTD. IOW, if some other vIOMMU has a similar issue, they can use
>it as well. Since we define a "bypass_ro" in the core bcontainer
>structure, it makes sense to have a core-level flag for it, v.s.
>checking the vendor flag in the core.
It's not a vIOMMU flag but host IOMMU flag except vIOMMU want to emulate that
ERRATA.
Due to patch9, there is only one VFIO device under a container, so bypass_ro is
set based on VFIO device's backend host IOMMU's flag
IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17.
Thanks
Zhenzhong
>
>My sample code is turning this get_viommu_cap to something like
>get_viommu_flags, which could include both "cap" and "errata".
>
>Nicolin