On Thu, Jul 14, 2022 at 02:39:32PM +0100, Robin Murphy wrote:
> > In the meantime we do need to restore VFIO support under virtio-iommu,
> > since userspace still expects that to work, and the existing use-cases are
> > coherent devices.
> 
> Yeah, I'm not necessarily against adding this as a horrible bodge for now -
> the reality is that people using VFIO must be doing it on coherent systems
> or it wouldn't be working properly anyway - as long as we all agree that
> that's what it is.
> 
> Next cycle I'll be sending the follow-up patches to bring
> device_iommu_capable() to its final form (hoping the outstanding VDPA patch
> lands in the meantime), at which point we get to sort-of-fix the SMMU
> drivers[1], and can do something similar here too. I guess the main question
> for virtio-iommu is whether it needs to be described/negotiated in the
> protocol itself, or can be reliably described by other standard firmware
> properties (with maybe just a spec note to clarify that coherency must be
> consistent).


What consumers of IOMMU_CAP_CACHE_COHERENCY want to know now is whether
coherency is managed in HW for a particular endpoint, or whether they need
to issue cache maintenance themselves. The latter can't be handled by
VFIO, since cache maintenance is generally a privileged operation.

So let me go through the possible cases of non-coherent access. I don't
think any of them needs a spec change.
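
For illustration, a VFIO-like consumer would query this per endpoint,
roughly as below (a minimal sketch against the device_iommu_capable()
interface; error handling elided):

    /*
     * Sketch: a user that cannot issue cache maintenance refuses
     * endpoints whose DMA isn't kept coherent in HW.
     */
    if (!device_iommu_capable(dev, IOMMU_CAP_CACHE_COHERENCY))
            return -EINVAL; /* userspace can't maintain caches */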

A. Accesses through the physical IOMMU are never coherent
---------------------------------------------------------

In this case, the physical device's accesses, translated by the pIOMMU,
are never coherent. The host would describe this using the existing FW
methods (dma-coherent in DT, _CCA in ACPI), since they are also needed
without a vIOMMU.

No change needed for virtio-iommu, I think: it can already support
non-coherent devices. It can also support mixing coherent and
non-coherent devices in the same domain, because at the moment a domain
just multiplexes map requests. Since we already allow sending the same
map request to two different domains, one with coherent devices and one
with non-coherent ones, we can also allow a single domain to serve both.
If the host cannot handle this, it is free to reject the attach request
for the incompatible device.

In Linux I think capable() should take dev->dma_coherent into account
after your change, or the callers should check dev->dma_coherent
themselves (vfio-platform in particular).
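
Concretely, something like this in the virtio-iommu driver (a sketch
only, assuming the capable() callback gains a struct device argument as
part of the device_iommu_capable() work, and using the
dev_is_dma_coherent() helper from linux/dma-map-ops.h):

    #include <linux/dma-map-ops.h>

    /* Sketch: report per-endpoint coherency from the FW description */
    static bool viommu_capable(struct device *dev, enum iommu_cap cap)
    {
            switch (cap) {
            case IOMMU_CAP_CACHE_COHERENCY:
                    return dev_is_dma_coherent(dev);
            default:
                    return false;
            }
    }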


B. Non-cacheable mappings
-------------------------

Here, accesses are normally coherent, but the pIOMMU mappings may be
configured to be non-coherent (a non-cacheable access type on Arm). If
there is an actual need for this, we could add a feature bit, a probe
request property and a map flag.

In Linux we may want to disallow !IOMMU_CACHE if the device is coherent,
since we don't support this case.
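
For example, with a hypothetical check at map time (a sketch; in
practice map requests target a domain rather than a device, so the
driver would have to track whether a coherent endpoint is attached):

    /*
     * Sketch: virtio-iommu can't express case B, so don't pretend a
     * non-cacheable mapping was honoured for a coherent device.
     */
    if (dev_is_dma_coherent(dev) && !(prot & IOMMU_CACHE))
            return -EINVAL;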


C. PCIe devices performing no-snoop accesses
--------------------------------------------

Accesses are normally coherent, but the device may set a bit in the
transaction (No Snoop) requesting it to be non-coherent.

A guest can't enable and use no-snoop in a PCIe device without knowing
whether the platform supports it. That isn't discoverable on Arm, so a
guest can't use it there. On x86 I think it's always supported, but the
pIOMMU may enforce snooping (and the guest may be unable to perform cache
maintenance? I didn't follow the whole wbinvd discussion for lack of
time).

The problem is the same without a vIOMMU, so I'd defer that to some
firmware method describing no-snoop.
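
For reference, the knob in question is the Enable No Snoop bit in the
PCIe Device Control register. A guest driver that can't tell whether the
platform honours no-snoop could force snooped transactions, along these
lines (sketch, using the standard PCIe capability accessors):

    /*
     * Sketch: clear Enable No Snoop so the device can't request
     * non-coherent transactions behind the guest's back.
     */
    pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL,
                               PCI_EXP_DEVCTL_NOSNOOP_EN);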


D. Non-coherent virtio-iommu
----------------------------

Non-coherent virtqueues. It's not forbidden by the spec, and a transport
driver could support it, but it's a transport problem and virtio-iommu
doesn't need to know about it.


Did I forget anything? Otherwise I don't think we need any spec change
at the moment. But when adding support for page tables, we'll have to
consider each of these cases, since the guest will be able to set memory
attributes and will care about the coherency of page walks. That will be
bundled in a probe request along with the other page table information.
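
To make that concrete, the page-walk coherency flag could sit in the
probe property describing page tables. Purely illustrative layout below,
not a spec proposal; the property type and field names are made up:

    /* Hypothetical probe property: page table description */
    struct virtio_iommu_probe_pgtable {
            struct virtio_iommu_probe_property      head;
    #define VIRTIO_IOMMU_PGTABLE_F_COHERENT         (1 << 0)
            __le32                                  flags;
            /* ... format, number of levels, etc. ... */
    };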

Thanks,
Jean