On 14/08/2019 18:56, Will Deacon wrote:
> Hi everybody,
>
> These are the core IOMMU changes that I have posted previously as part
> of my ongoing effort to reduce the lock contention of the SMMUv3 command
> queue. I thought it would be better to split this out as a separate
> series, since I think it's ready to go and all the driver conversions
> mean that it's quite a pain for me to maintain out of tree!
>
> The idea of the patch series is to allow TLB invalidation to be batched
> up into a new 'struct iommu_iotlb_gather' structure, which tracks the
> properties of the virtual address range being invalidated so that it
> can be deferred until the driver's ->iotlb_sync() function is called.
> This allows for more efficient invalidation on hardware that can submit
> multiple invalidations in one go.
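To make the batching idea concrete, here is a minimal userspace sketch in C.
It is not the code from the series: the names gather_sketch, gather_add_page()
and gather_sync() are hypothetical, and the flushes counter merely stands in
for a hardware invalidation. It assumes the gather tracks a single
[start, end) range at one page granule, and that it must flush early when a
new page is disjoint from that range or uses a different granule.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a gather structure; not the kernel definition. */
struct gather_sketch {
	unsigned long start;   /* lowest address queued so far */
	unsigned long end;     /* one past the highest address queued so far */
	size_t pgsize;         /* granule of the queued pages, 0 when empty */
	unsigned int flushes;  /* invalidations issued (illustration only) */
};

/* Mark the gather as empty without touching the flush counter. */
static void gather_reset(struct gather_sketch *g)
{
	g->start = ULONG_MAX;
	g->end = 0;
	g->pgsize = 0;
}

static void gather_init(struct gather_sketch *g)
{
	gather_reset(g);
	g->flushes = 0;
}

/* Issue one invalidation for the whole queued range, then start over. */
static void gather_sync(struct gather_sketch *g)
{
	if (g->pgsize == 0)
		return; /* nothing queued, nothing to invalidate */

	/* A real driver would invalidate [g->start, g->end) here. */
	g->flushes++;
	gather_reset(g);
}

/*
 * Queue one page for invalidation. Address overflow of iova + size is
 * ignored for brevity.
 */
static void gather_add_page(struct gather_sketch *g,
			    unsigned long iova, size_t size)
{
	unsigned long end = iova + size;

	assert(size != 0);

	/* Disjoint range or different granule: flush what we have first. */
	if (g->pgsize != 0 &&
	    (g->pgsize != size || end < g->start || iova > g->end))
		gather_sync(g);

	g->pgsize = size;
	if (iova < g->start)
		g->start = iova;
	if (end > g->end)
		g->end = end;
}
```

With this sketch, queuing three adjacent 4K pages at 0x1000, 0x2000 and
0x3000 and then calling gather_sync() issues one invalidation covering
[0x1000, 0x4000) instead of three separate ones.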
> The previous series was included in:
>
>   https://lkml.kernel.org/r/[email protected]
>
> The only real change since then is incorporating the newly merged
> virtio-iommu driver.
>
> If you'd like to play with the patches, then I've also pushed them here:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=iommu/unmap
>
> but they should behave as a no-op on their own.
Hi Will,
As anticipated, my storage testing scenarios roughly give parity
throughput and CPU loading before and after this series.
> Patches to convert the
> Arm SMMUv3 driver to the new API are here:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=iommu/cmdq
I quickly tested this again and now I see a performance lift:
                      before (5.3-rc1)    after
D05 8x SAS disks      907K IOPS           970K IOPS
D05 1x NVMe           450K IOPS           466K IOPS
D06 1x NVMe           467K IOPS           466K IOPS
The CPU loading seems to track throughput, so nothing much to say there.
Note: From 5.2 testing, I was seeing >900K IOPS from that NVMe disk for
!IOMMU.
BTW, what were your thoughts on changing
arm_smmu_atc_inv_domain()->arm_smmu_atc_inv_master() to batching? It
seems suitable, but looks untouched. Were you waiting for a resolution
to the performance issue which Leizhen reported?
Thanks,
John
> Cheers,
>
> Will
>
> --->8
>
> Cc: Jean-Philippe Brucker <[email protected]>
> Cc: Robin Murphy <[email protected]>
> Cc: Jayachandran Chandrasekharan Nair <[email protected]>
> Cc: Jan Glauber <[email protected]>
> Cc: Jon Masters <[email protected]>
> Cc: Eric Auger <[email protected]>
> Cc: Zhen Lei <[email protected]>
> Cc: Jonathan Cameron <[email protected]>
> Cc: Vijay Kilary <[email protected]>
> Cc: Joerg Roedel <[email protected]>
> Cc: John Garry <[email protected]>
> Cc: Alex Williamson <[email protected]>
> Cc: Marek Szyprowski <[email protected]>
> Cc: David Woodhouse <[email protected]>
> Will Deacon (13):
>   iommu: Remove empty iommu_tlb_range_add() callback from iommu_ops
>   iommu/io-pgtable-arm: Remove redundant call to io_pgtable_tlb_sync()
>   iommu/io-pgtable: Rename iommu_gather_ops to iommu_flush_ops
>   iommu: Introduce struct iommu_iotlb_gather for batching TLB flushes
>   iommu: Introduce iommu_iotlb_gather_add_page()
>   iommu: Pass struct iommu_iotlb_gather to ->unmap() and ->iotlb_sync()
>   iommu/io-pgtable: Introduce tlb_flush_walk() and tlb_flush_leaf()
>   iommu/io-pgtable: Hook up ->tlb_flush_walk() and ->tlb_flush_leaf() in
>     drivers
>   iommu/io-pgtable-arm: Call ->tlb_flush_walk() and ->tlb_flush_leaf()
>   iommu/io-pgtable: Replace ->tlb_add_flush() with ->tlb_add_page()
>   iommu/io-pgtable: Remove unused ->tlb_sync() callback
>   iommu/io-pgtable: Pass struct iommu_iotlb_gather to ->unmap()
>   iommu/io-pgtable: Pass struct iommu_iotlb_gather to ->tlb_add_page()
>  drivers/gpu/drm/panfrost/panfrost_mmu.c |  24 +++++---
>  drivers/iommu/amd_iommu.c               |  11 ++--
>  drivers/iommu/arm-smmu-v3.c             |  52 +++++++++++-----
>  drivers/iommu/arm-smmu.c                | 103 ++++++++++++++++++++++++--------
>  drivers/iommu/dma-iommu.c               |   9 ++-
>  drivers/iommu/exynos-iommu.c            |   3 +-
>  drivers/iommu/intel-iommu.c             |   3 +-
>  drivers/iommu/io-pgtable-arm-v7s.c      |  57 +++++++++---------
>  drivers/iommu/io-pgtable-arm.c          |  48 ++++++++-------
>  drivers/iommu/iommu.c                   |  24 ++++----
>  drivers/iommu/ipmmu-vmsa.c              |  28 +++++----
>  drivers/iommu/msm_iommu.c               |  42 +++++++++----
>  drivers/iommu/mtk_iommu.c               |  45 +++++++++++---
>  drivers/iommu/mtk_iommu_v1.c            |   3 +-
>  drivers/iommu/omap-iommu.c              |   2 +-
>  drivers/iommu/qcom_iommu.c              |  44 +++++++++++---
>  drivers/iommu/rockchip-iommu.c          |   2 +-
>  drivers/iommu/s390-iommu.c              |   3 +-
>  drivers/iommu/tegra-gart.c              |  12 +++-
>  drivers/iommu/tegra-smmu.c              |   2 +-
>  drivers/iommu/virtio-iommu.c            |   5 +-
>  drivers/vfio/vfio_iommu_type1.c         |  27 +++++----
>  include/linux/io-pgtable.h              |  57 ++++++++++++------
>  include/linux/iommu.h                   |  92 +++++++++++++++++++++-------
>  24 files changed, 483 insertions(+), 215 deletions(-)