Hello Lu - Many thanks for your prototype.

> On Jan 24, 2021, at 9:38 PM, Lu Baolu <[email protected]> wrote:
>
> This patch series is only for Request-For-Testing purpose. It aims to
> fix the performance regression reported here.
>
> https://lore.kernel.org/linux-iommu/[email protected]/
>
> The first two patches are borrowed from here.
>
> https://lore.kernel.org/linux-iommu/[email protected]/
>
> Please kindly help to verification.
>
> Best regards,
> baolu
>
> Lu Baolu (1):
>   iommu/vt-d: Add iotlb_sync_map callback
>
> Yong Wu (2):
>   iommu: Move iotlb_sync_map out from __iommu_map
>   iommu: Add iova and size as parameters in iotlb_sync_map
>
>  drivers/iommu/intel/iommu.c | 86 +++++++++++++++++++++++++------------
>  drivers/iommu/iommu.c       | 23 +++++++---
>  drivers/iommu/tegra-gart.c  |  7 ++-
>  include/linux/iommu.h       |  3 +-
>  4 files changed, 83 insertions(+), 36 deletions(-)

Here are results with the NFS client at stock v5.11-rc5 and the
NFS server at v5.10, showing the regression I reported earlier.

	Children see throughput for 12 initial writers  = 4534582.00 kB/sec
	Parent sees throughput for 12 initial writers   = 4458145.56 kB/sec
	Min throughput per process                      =  373101.59 kB/sec
	Max throughput per process                      =  382669.50 kB/sec
	Avg throughput per process                      =  377881.83 kB/sec
	Min xfer                                        = 1022720.00 kB
	CPU Utilization: Wall time    2.787    CPU time    1.922    CPU utilization  68.95 %

	Children see throughput for 12 rewriters        = 4542003.12 kB/sec
	Parent sees throughput for 12 rewriters         = 4538024.19 kB/sec
	Min throughput per process                      =  374672.00 kB/sec
	Max throughput per process                      =  383983.78 kB/sec
	Avg throughput per process                      =  378500.26 kB/sec
	Min xfer                                        = 1022976.00 kB
	CPU utilization: Wall time    2.733    CPU time    1.947    CPU utilization  71.25 %

	Children see throughput for 12 readers          = 4568632.03 kB/sec
	Parent sees throughput for 12 readers           = 4563672.02 kB/sec
	Min throughput per process                      =  376727.56 kB/sec
	Max throughput per process                      =  383783.91 kB/sec
	Avg throughput per process                      =  380719.34 kB/sec
	Min xfer                                        = 1029376.00 kB
	CPU utilization: Wall time    2.733    CPU time    1.898    CPU utilization  69.46 %

	Children see throughput for 12 re-readers       = 4610702.78 kB/sec
	Parent sees throughput for 12 re-readers        = 4606135.66 kB/sec
	Min throughput per process                      =  381532.78 kB/sec
	Max throughput per process                      =  387072.53 kB/sec
	Avg throughput per process                      =  384225.23 kB/sec
	Min xfer                                        = 1034496.00 kB
	CPU utilization: Wall time    2.711    CPU time    1.910    CPU utilization  70.45 %

Here's the NFS client at v5.11-rc5 with your series applied.
The NFS server remains at v5.10:

	Children see throughput for 12 initial writers  = 4434778.81 kB/sec
	Parent sees throughput for 12 initial writers   = 4408190.69 kB/sec
	Min throughput per process                      =  367865.28 kB/sec
	Max throughput per process                      =  371134.38 kB/sec
	Avg throughput per process                      =  369564.90 kB/sec
	Min xfer                                        = 1039360.00 kB
	CPU Utilization: Wall time    2.842    CPU time    1.904    CPU utilization  66.99 %

	Children see throughput for 12 rewriters        = 4476870.69 kB/sec
	Parent sees throughput for 12 rewriters         = 4471701.48 kB/sec
	Min throughput per process                      =  370985.34 kB/sec
	Max throughput per process                      =  374752.28 kB/sec
	Avg throughput per process                      =  373072.56 kB/sec
	Min xfer                                        = 1038592.00 kB
	CPU utilization: Wall time    2.801    CPU time    1.902    CPU utilization  67.91 %

	Children see throughput for 12 readers          = 5865268.88 kB/sec
	Parent sees throughput for 12 readers           = 5854519.73 kB/sec
	Min throughput per process                      =  487766.81 kB/sec
	Max throughput per process                      =  489623.88 kB/sec
	Avg throughput per process                      =  488772.41 kB/sec
	Min xfer                                        = 1044736.00 kB
	CPU utilization: Wall time    2.144    CPU time    1.895    CPU utilization  88.41 %

	Children see throughput for 12 re-readers       = 5847438.62 kB/sec
	Parent sees throughput for 12 re-readers        = 5839292.18 kB/sec
	Min throughput per process                      =  485835.03 kB/sec
	Max throughput per process                      =  488702.12 kB/sec
	Avg throughput per process                      =  487286.55 kB/sec
	Min xfer                                        = 1042688.00 kB
	CPU utilization: Wall time    2.148    CPU time    1.909    CPU utilization  88.84 %

NFS READ throughput is almost fully restored.
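As a rough check on the READ numbers, the sketch below compares the aggregate "Children see throughput for 12 readers" figures from this message against the normal-looking figure from the previous thread. The constant and function names are illustrative only, and it assumes the aggregate "Children" line is the right figure to compare. Under that assumption the stock client lands at roughly 77% of the normal result and the patched client at roughly 99%.

```python
# Aggregate "Children see throughput for 12 readers" values in kB/sec,
# copied from the iozone output in this message. Names are illustrative.
STOCK_KB_S = 4568632.03    # client at stock v5.11-rc5
PATCHED_KB_S = 5865268.88  # client at v5.11-rc5 with the series applied
NORMAL_KB_S = 5921370.94   # normal-looking result from the previous thread


def percent_of(value, reference):
    """Return value expressed as a percentage of reference."""
    return 100.0 * value / reference


def percent_change(new, old):
    """Return the relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


if __name__ == "__main__":
    print(f"stock vs normal:   {percent_of(STOCK_KB_S, NORMAL_KB_S):.1f} %")
    print(f"patched vs normal: {percent_of(PATCHED_KB_S, NORMAL_KB_S):.1f} %")
    print(f"patched vs stock:  {percent_change(PATCHED_KB_S, STOCK_KB_S):+.1f} %")
```
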
A normal-looking throughput result, copied from the previous thread, is:

	Children see throughput for 12 readers          = 5921370.94 kB/sec
	Parent sees throughput for 12 readers           = 5914106.69 kB/sec

The NFS WRITE throughput result appears to be unchanged, or slightly
worse than before. I don't have an explanation for this result.

I applied your patches on the NFS server also without noting
improvement.

--
Chuck Lever
