On 24/07/2019 13:20, Will Deacon wrote:
On Wed, Jul 24, 2019 at 10:58:26AM +0100, John Garry wrote:
On 11/07/2019 18:19, Will Deacon wrote:
This is a significant rework of the RFC I previously posted here:
https://lkml.kernel.org/r/[email protected]
But this time, it looks like it might actually be worthwhile according
to my perf profiles, where __iommu_unmap() falls a long way down the
profile for a multi-threaded netperf run. I'm still relying on others to
confirm this is useful, however.
Some of the changes since last time are:
* Support for constructing and submitting a list of commands in the
driver
* Numerous changes to the IOMMU and io-pgtable APIs so that we can
submit commands in batches
* Removal of cmpxchg() from cmdq_shared_lock() fast-path
* Code restructuring and cleanups
This currently applies against my iommu/devel branch that Joerg has pulled
for 5.3. If you want to test it out, I've put everything here:
https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=iommu/cmdq
Feedback welcome. I appreciate that we're in the merge window, but I
wanted to get this on the list for people to look at as an RFC.
I tested storage performance on this series, which I think is a better
scenario to test than network performance, since the latter is generally
limited by the network link speed.
Interesting, thanks for sharing. Do you also see a similar drop in CPU time
to the one reported by Ganapat?
Not really, CPU load reported by fio is mostly the same.
Baseline performance (will/iommu/devel, commit 9e6ea59f3)
8x SAS disks   D05   839K IOPS
1x NVMe        D05   454K IOPS
1x NVMe        D06   442K IOPS
Patchset performance (will/iommu/cmdq)
8x SAS disks   D05   835K IOPS
1x NVMe        D05   472K IOPS
1x NVMe        D06   459K IOPS
So we see a bit of an NVMe boost, but about the same for 8x disks. Without
the IOMMU, performance is about 918K IOPS for 8x disks, so the 8x disk result
is not limited by the medium.
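For anyone wanting the relative numbers, here is a rough sketch that turns the IOPS figures quoted above into percentage changes. The figures are copied from the results in this mail; the dictionary keys and the pct_change() helper are just names made up for illustration, not anything from fio or the kernel tree.

```python
# IOPS figures (in thousands) copied from the results quoted above.
baseline = {"8x SAS D05": 839, "1x NVMe D05": 454, "1x NVMe D06": 442}
patched = {"8x SAS D05": 835, "1x NVMe D05": 472, "1x NVMe D06": 459}

# The no-IOMMU figure was only reported for the 8x SAS disk setup.
no_iommu_8x_sas = 918


def pct_change(old, new):
    """Relative change from old to new, as a percentage (hypothetical helper)."""
    return (new - old) / old * 100.0


# Patchset versus baseline, per configuration.
changes = {name: pct_change(baseline[name], patched[name]) for name in baseline}

# How far each 8x SAS result sits below the no-IOMMU figure.
gap_baseline = pct_change(no_iommu_8x_sas, baseline["8x SAS D05"])
gap_patched = pct_change(no_iommu_8x_sas, patched["8x SAS D05"])

for name, delta in changes.items():
    print(f"{name}: {delta:+.1f}% vs baseline")
print(f"8x SAS baseline vs no IOMMU: {gap_baseline:+.1f}%")
print(f"8x SAS patchset vs no IOMMU: {gap_patched:+.1f}%")
```

On these numbers the NVMe gain is roughly 4% on both boards, the 8x SAS change is a drop of under 0.5%, and the 8x SAS results remain about 9% below the no-IOMMU figure with or without the patchset.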
It would be nice to know if this performance gap is because of Linux, or
simply because of the translation overhead in the SMMU hardware. Are you
able to get a perf profile to see where we're spending time?
I'll look to do that, but I'd really expect it to be down to the time
Linux spends on the DMA maps and unmaps.
Cheers,
john
Thanks,
Will