On 31/05/17 16:27, Nate Watterson wrote:
> Hi Jean-Philippe,
> On 5/24/2017 2:01 PM, Jean-Philippe Brucker wrote:
>> PCIe devices can implement their own TLB, named Address Translation Cache
>> (ATC). In order to support Address Translation Service (ATS), the
>> following changes are needed in software:
>> * Enable ATS on endpoints when the system supports it. Both PCI root
>>    complex and associated SMMU must implement the ATS protocol.
>> * When unmapping an IOVA, send an ATC invalidate request to the endpoint
>>    in addition to the usual SMMU IOTLB invalidations.
>> I previously sent this as part of a lengthy RFC [1] adding SVM (ATS +
>> PASID + PRI) support to SMMUv3. The next PASID/PRI version is almost
>> ready, but isn't likely to get merged because it needs hardware testing,
>> so I will send it later. PRI depends on ATS, but ATS should be useful on
>> its own.
>> Without PASID and PRI, ATS is used for accelerating transactions. Instead
>> of having all memory accesses go through SMMU translation, the endpoint
>> can translate IOVA->PA once, store the result in its ATC, then issue
>> subsequent transactions using the PA, partially bypassing the SMMU. So in
>> theory it should be faster while keeping the advantages of an IOMMU,
>> namely scatter-gather and access control.
>> The ATS patches can now be tested on some hardware, even though the lack
>> of compatible PCI endpoints makes it difficult to assess what performance
>> optimizations we need. That's why the ATS implementation is a bit rough at
>> the moment, and we will work on optimizing things like invalidation ranges
>> later.
> Sinan and I have tested this series on a QDF2400 development platform
> using a PCIe exerciser card as the ATS capable endpoint. We were able
> to verify that ATS requests complete with a valid translated address
> and that DMA transactions using the pre-translated address "bypass"
> the SMMU. Testing ATC invalidations was a bit more difficult as we
> could not figure out how to get the exerciser card to automatically
> send the completion message. We ended up having to write a debugger
> script that would monitor the CMDQ and tell the exerciser to send
> the completion when a hanging CMD_SYNC following a CMD_ATC_INV was
> detected. Hopefully we'll get some real ATS capable endpoints to
> test with soon.

That's still a big step forward from my software tests, thanks a lot for
the report. If you get around to testing a real endpoint, there are a few
data points that would be really useful to compare, if only to see whether
enabling ATS is at all viable, or if we end up getting stuck in
queue_poll_cons in normal conditions:

* ATS enabled/disabled in endpoint
* ATSCHK enabled/disabled in SMMU
* Invalidation duration when ATC entry is present/absent, and the range is

Knowing this would indicate whether more work is needed on invalidation
sizing, batching, or postponing, or whether we can optimize later.
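
To illustrate the sizing question, here is a minimal user-space sketch of
one way to round an unmapped IOVA range up to a single ATC invalidation. It
is not the code from this series. It assumes that an invalidation covers a
naturally aligned power-of-two number of 4KB granules; the names
ATC_GRAIN_SHIFT, struct atc_inv_range and atc_inv_range() are hypothetical
and chosen only for this example.

```c
#include <stdint.h>

/* Assumption: the minimum invalidation granule is 4KB. */
#define ATC_GRAIN_SHIFT 12

/* Hypothetical description of one invalidation, for illustration only. */
struct atc_inv_range {
	uint64_t addr;           /* start address, aligned to the span */
	unsigned int log2_pages; /* span covers 2^log2_pages granules */
};

/* Number of bits needed to represent v (0 when v == 0). */
static unsigned int bits_needed(uint64_t v)
{
	unsigned int n = 0;

	while (v) {
		n++;
		v >>= 1;
	}
	return n;
}

/*
 * Find the smallest naturally aligned power-of-two span of granules that
 * covers [iova, iova + size). The caller must pass size > 0 and a range
 * that does not wrap around the 64-bit address space.
 */
static struct atc_inv_range atc_inv_range(uint64_t iova, uint64_t size)
{
	struct atc_inv_range r;
	uint64_t first = iova >> ATC_GRAIN_SHIFT;
	uint64_t last = (iova + size - 1) >> ATC_GRAIN_SHIFT;

	/*
	 * The highest bit in which first and last differ gives the span.
	 * Granule numbers fit in 52 bits, so the shift below is always < 64.
	 */
	unsigned int log2_span = bits_needed(first ^ last);
	uint64_t span_mask = (1ULL << log2_span) - 1;

	r.addr = (first & ~span_mask) << ATC_GRAIN_SHIFT;
	r.log2_pages = log2_span;
	return r;
}
```

With these assumptions, unmapping 8KB at IOVA 0x1000 straddles a 16KB
alignment boundary, so the sketch returns a 16KB invalidation starting at
0x0, while unmapping 8KB at 0x2000 needs only an 8KB invalidation. That
over-invalidation is one reason measurements across different ranges would
be useful.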
