Hi Jean-Philippe,

On 5/24/2017 2:01 PM, Jean-Philippe Brucker wrote:
PCIe devices can implement their own TLB, named Address Translation Cache
(ATC). In order to support Address Translation Service (ATS), the
following changes are needed in software:

* Enable ATS on endpoints when the system supports it. Both PCI root
   complex and associated SMMU must implement the ATS protocol.

* When unmapping an IOVA, send an ATC invalidate request to the endpoint
   in addition to the usual SMMU IOTLB invalidations.

I previously sent this as part of a lengthy RFC [1] adding SVM (ATS +
PASID + PRI) support to SMMUv3. The next PASID/PRI version is almost
ready, but isn't likely to get merged because it needs hardware testing,
so I will send it later. PRI depends on ATS, but ATS should be useful on
its own.

Without PASID and PRI, ATS is used for accelerating transactions. Instead
of having all memory accesses go through SMMU translation, the endpoint
can translate IOVA->PA once, store the result in its ATC, then issue
subsequent transactions using the PA, partially bypassing the SMMU. So in
theory it should be faster while keeping the advantages of an IOMMU,
namely scatter-gather and access control.

The ATS patches can now be tested on some hardware, even though the lack
of compatible PCI endpoints makes it difficult to assess what performance
optimizations we need. That's why the ATS implementation is a bit rough at
the moment, and we will work on optimizing things like invalidation ranges

Sinan and I have tested this series on a QDF2400 development platform
using a PCIe exerciser card as the ATS capable endpoint. We were able
to verify that ATS requests complete with a valid translated address
and that DMA transactions using the pre-translated address "bypass"
the SMMU. Testing ATC invalidations was a bit more difficult as we
could not figure out how to get the exerciser card to automatically
send the completion message. We ended up having to write a debugger
script that would monitor the CMDQ and tell the exerciser to send
the completion when a hanging CMD_SYNC following a CMD_ATC_INV was
detected. Hopefully we'll get some real ATS capable endpoints to
test with soon.

Since the RFC [1]:
* added DT and ACPI patches,
* added invalidate-all on domain detach,
* removed smmu_group again,
* removed invalidation print from the fast path,
* disabled tagged pointers for good,
* some style changes.

These patches are based on Linux v4.12-rc2

[1] https://www.spinics.net/lists/linux-pci/msg58650.html

Jean-Philippe Brucker (7):
   PCI: Move ATS declarations outside of CONFIG_PCI
   dt-bindings: PCI: Describe ATS property for root complex nodes
   iommu/of: Check ATS capability in root complex nodes
   ACPI/IORT: Check ATS capability in root complex nodes
   iommu/arm-smmu-v3: Link domains and devices
   iommu/arm-smmu-v3: Add support for PCI ATS
   iommu/arm-smmu-v3: Disable tagged pointers

  .../devicetree/bindings/pci/pci-iommu.txt          |   8 +
  drivers/acpi/arm64/iort.c                          |  10 +
  drivers/iommu/arm-smmu-v3.c                        | 258 ++++++++++++++++++++-
  drivers/iommu/of_iommu.c                           |   8 +
  include/linux/iommu.h                              |   4 +
  include/linux/pci.h                                |  26 +--
  6 files changed, 293 insertions(+), 21 deletions(-)

Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.
iommu mailing list

Reply via email to