On Wed, Aug 27, 2025 at 07:11:54AM +0000, Duan, Zhenzhong wrote: > >> >> + if (vtd.flags & IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17) { > >> >> + container->bcontainer.bypass_ro = true; > >> > > >> >This circled back to checking a vendor specific flag in the core.. > >> > >> I'm not sure if VendorCaps struct wrapper is overprogramming as this > >> ERRARA is only VTD specific. We still need to check VendorCaps.vtd.flags > >bit. > > > >Look, the HW_INFO call is done by the core. > > > >Then, the core needs: > > 1 HW caps for dirty tracking and PASID > > 2 IOMMU_HWPT_ALLOC_NEST_PARENT (vIOMMU cap) > > 3 bcontainer.bypass_ro (vIOMMU workaround) > > Why vIOMMU workaround? ERRATA is from host IOMMU. > In a heterogeneous environment, some host IOMMUs can have > this ERRATA while other newer IOMMUs not.
To be fair, the subject of your patch is "Workaround". Though it might be inaccurate to call it "vIOMMU Workaround", the idea was to let vendor code decode vendor bits and flags. Arguably, when a host IOMMU has an erratum while requiring a HW acceleration like nesting, vIOMMU can be a partner to help apply the workaround. > IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17 may only exist on old IOMMUs > with nesting capability, vIOMMU doesn't support nesting emulation yet, > it's also no sense to emulate an ERRATA in vIOMMU. Certainly, I never suggest to "emulate an ERRATA". > >Both 2 and 3 need to get from vIOMMU, while 3 needs VendorCaps. > >Arguably 2 could do a bit validation using the VendorCaps too. > > > >> >Perhaps we could upgrade the get_viommu_cap op and its API: > >> > > >> >enum viommu_flags { > >> > VIOMMU_FLAG_HW_NESTED = BIT_ULL(0), > >> > VIOMMU_FLAG_BYPASS_RO = BIT_ULL(1), > >> >}; > >> > > >> >bool vfio_device_get_viommu_flags(VFIODevice *vbasedev, VendorCaps > >> >*vendor_caps, > >> > uint64_t *viommu_flags); > >> > > >> >Then: > >> > if (viommu_flags & VIOMMU_FLAG_BYPASS_RO) { > >> > container->bcontainer.bypass_ro = true; > >> > } > >> >... > >> > if (viommu_flags & VIOMMU_FLAG_HW_NESTED) { > >> > flags |= IOMMU_HWPT_ALLOC_NEST_PARENT; > >> > } > >> > >> IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17 is a VTD specific flag bit > >> from host IOMMU, we have defined get_viommu_cap() to return pure > >> vIOMMU capability bits, so no host IOMMU flag bit can be returned > >> here. See patch2 commit log for the reason. > > > >VIOMMU_FLAG_BYPASS_RO is a "pure" vIOMMU flag, not confined to > >VTD. IOW, if some other vIOMMU has a similar issue, they can use > >it as well. Since we define a "bypass_ro" in the core bcontainer > >structure, it makes sense to have a core-level flag for it, v.s. > >checking the vendor flag in the core. > > It's not a vIOMMU flag but host IOMMU flag except vIOMMU want to emulate that > ERRATA. Again, the idea here is not to blame vIOMMU for every flag, nor to emulate the erratum, but to use vIOMMU as a vendor specific place to translate a vendor specific flag (either host IOMMU or vIOMMU) to something that QEMU core code can understand. The rule of thumb is to avoid checking vendor flags in the core, which both Eric and I have noted for a couple of times.. It's okay that you don't like the way of grouping the host flag with vIOMMU cap. Please find some other place in the vendor zone to load the vendor flag? Maybe add another op/API to decode the host IOMMU flags/caps exclusively? Nicolin