> On Jan 28, 2021, at 8:59 AM, Robin Murphy <[email protected]> wrote:
> 
> On 2021-01-27 20:00, Chuck Lever wrote:
>> Hi-
>> This collection of patches seems to get the best throughput results
>> so far. The NFS WRITE result is fully restored, and the NFS READ
>> result is very close to fully restored.
>>      Children see throughput for 12 initial writers  = 5008474.03 kB/sec
>>      Parent sees throughput for 12 initial writers   = 4996927.80 kB/sec
>>      Min throughput per process                      = 416956.88 kB/sec
>>      Max throughput per process                      = 417910.22 kB/sec
>>      Avg throughput per process                      = 417372.84 kB/sec
>>      Min xfer                                        = 1046272.00 kB
>>      CPU utilization: Wall time    2.515    CPU time    1.996    CPU utilization  79.37 %
>>      Children see throughput for 12 rewriters        = 5020584.59 kB/sec
>>      Parent sees throughput for 12 rewriters         = 5012539.29 kB/sec
>>      Min throughput per process                      = 417799.00 kB/sec
>>      Max throughput per process                      = 419082.22 kB/sec
>>      Avg throughput per process                      = 418382.05 kB/sec
>>      Min xfer                                        = 1046528.00 kB
>>      CPU utilization: Wall time    2.507    CPU time    2.024    CPU utilization  80.73 %
>>      Children see throughput for 12 readers          = 5805484.25 kB/sec
>>      Parent sees throughput for 12 readers           = 5799535.68 kB/sec
>>      Min throughput per process                      = 482888.16 kB/sec
>>      Max throughput per process                      = 484444.16 kB/sec
>>      Avg throughput per process                      = 483790.35 kB/sec
>>      Min xfer                                        = 1045760.00 kB
>>      CPU utilization: Wall time    2.167    CPU time    1.964    CPU utilization  90.63 %
>>      Children see throughput for 12 re-readers       = 5812227.16 kB/sec
>>      Parent sees throughput for 12 re-readers        = 5803793.06 kB/sec
>>      Min throughput per process                      = 483242.97 kB/sec
>>      Max throughput per process                      = 485724.41 kB/sec
>>      Avg throughput per process                      = 484352.26 kB/sec
>>      Min xfer                                        = 1043456.00 kB
>>      CPU utilization: Wall time    2.161    CPU time    1.976    CPU utilization  91.45 %
>> I've included a simple-minded implementation of a map_sg op for
>> the Intel IOMMU. This is nothing more than a copy of the loop in
>> __iommu_map_sg() with the call to __iommu_map() replaced with a
>> call to intel_iommu_map().
> 
> ...which is the main reason I continue to strongly dislike patches #4-#9 (#3 
> definitely seems to make sense either way, now that #1 and #2 are going to 
> land). If a common operation is worth optimising anywhere, then it deserves 
> optimising everywhere, so we end up with a dozen diverging copies of 
> essentially the same code - particularly when the driver-specific 
> functionality *is* already in the drivers, so what gets duplicated is solely 
> the "generic" parts.

I don't disagree with that assessment, but I don't immediately see an
alternative API arrangement that would be more successful in the short
term. If 4/9 - 9/9 are not acceptable, then the responsible thing to
do would be to revert:

 - 58a8bb39490d ("iommu/vt-d: Cleanup after converting to dma-iommu ops")
 - c588072bba6b ("iommu/vt-d: Convert intel iommu driver to the iommu ops")

for v5.11, work out the proper API design, and then try the VT-d conversion
again.

IMHO.
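For concreteness, the "generic" part that each driver-specific map_sg op ends up duplicating is essentially just the scatterlist walk. Below is a standalone sketch of that loop. This is not kernel code: the types and names (`sketch_map_sg`, `demo_map`, the two-field `scatterlist`) are invented userspace stand-ins modeled loosely on __iommu_map_sg(), with the indirect `ops->map` call replaced by a direct callback, the same substitution the intel-iommu patch makes with intel_iommu_map().

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's scatterlist entry, for illustration only. */
struct scatterlist {
	unsigned long phys;	/* physical address of this segment */
	size_t length;		/* segment length in bytes */
};

/* Hypothetical per-driver map callback (analogous in spirit to a direct
 * intel_iommu_map() call); this demo version accepts every mapping. */
static int demo_map(unsigned long iova, unsigned long paddr, size_t size)
{
	(void)iova; (void)paddr; (void)size;
	return 0;
}

/*
 * Sketch of the generic loop in __iommu_map_sg(): walk the scatterlist,
 * mapping each segment at successive IOVAs.  A driver-specific map_sg op
 * is this same loop with the indirect call made direct.  Returns bytes
 * mapped on success, or the callback's error code on failure.
 */
static long sketch_map_sg(unsigned long iova, struct scatterlist *sg,
			  unsigned int nents,
			  int (*map)(unsigned long, unsigned long, size_t))
{
	size_t mapped = 0;
	unsigned int i;

	for (i = 0; i < nents; i++) {
		int ret = map(iova + mapped, sg[i].phys, sg[i].length);

		if (ret)
			return ret;	/* real code would unmap what it mapped */
		mapped += sg[i].length;
	}
	return mapped;
}
```

The point of the sketch is how little is driver-specific: everything except the one call in the loop body is boilerplate that gets copied into each driver.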


> And if there's justification for pushing iommu_map_sg() entirely into 
> drivers, then it's verging on self-contradictory not to do the same for 
> iommu_map() and iommu_unmap(). Some IOMMU drivers - mainly intel-iommu, as it 
> happens - are already implementing hacks around the "one call per page" 
> interface being inherently inefficient, so the logical thing to do here is 
> take a step back and reconsider the fundamental design of the whole map/unmap 
> interface. Implementing hacks on top of hacks to make particular things 
> faster on particular systems that particular people care about is not going 
> to do us any favours in the long run.
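To make the per-page overhead concrete, here is a toy model in plain C (all names invented; `toy_map` stands in for the indirect call, locking, and per-call bookkeeping that a real map op pays on every invocation). Mapping a 2 MB buffer through a "one call per page" interface costs 512 driver entries; a ranged interface costs one.

```c
#include <stddef.h>

#define PAGE_SIZE 4096UL

/* Stub driver map routine; in a real driver each entry here would pay
 * an indirect call, locking, and page-table walk setup. */
static int toy_map(unsigned long iova, unsigned long paddr, size_t size)
{
	(void)iova; (void)paddr; (void)size;
	return 0;
}

/* "One call per page": how callers drive iommu_map() today.
 * Returns the number of driver entries made. */
static unsigned long map_per_page(unsigned long iova, unsigned long paddr,
				  size_t len)
{
	unsigned long driver_calls = 0;
	size_t off;

	for (off = 0; off < len; off += PAGE_SIZE) {
		toy_map(iova + off, paddr + off, PAGE_SIZE);
		driver_calls++;
	}
	return driver_calls;
}

/* Ranged interface: the driver sees the whole extent at once and can
 * batch its own page-table updates.  Returns the number of entries made. */
static unsigned long map_ranged(unsigned long iova, unsigned long paddr,
				size_t len)
{
	toy_map(iova, paddr, len);
	return 1;
}
```

The 512:1 ratio in calls is why per-driver hacks keep appearing: the fixed per-call cost dominates once buffers get large, regardless of how cheap the actual page-table write is.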
> 
> As it stands, I can easily see a weird anti-pattern emerging where people 
> start adding code to fake up scatterlists in random drivers because they see 
> dma_map_sg() performing paradoxically better than dma_map_page().
> 
> Robin.
> 
>> ---
>> Chuck Lever (1):
>>       iommu/vt-d: Introduce map_sg() for Intel IOMMUs
>> Isaac J. Manjarres (5):
>>       iommu/io-pgtable: Introduce map_sg() as a page table op
>>       iommu/io-pgtable-arm: Hook up map_sg()
>>       iommu/io-pgtable-arm-v7s: Hook up map_sg()
>>       iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers
>>       iommu/arm-smmu: Hook up map_sg()
>> Lu Baolu (1):
>>       iommu/vt-d: Add iotlb_sync_map callback
>> Yong Wu (2):
>>       iommu: Move iotlb_sync_map out from __iommu_map
>>       iommu: Add iova and size as parameters in iotlb_sync_map
>>  drivers/iommu/arm/arm-smmu/arm-smmu.c |  19 ++++
>>  drivers/iommu/intel/iommu.c           | 131 ++++++++++++++++++++------
>>  drivers/iommu/io-pgtable-arm-v7s.c    |  90 ++++++++++++++++++
>>  drivers/iommu/io-pgtable-arm.c        |  86 +++++++++++++++++
>>  drivers/iommu/iommu.c                 |  47 +++++++--
>>  drivers/iommu/tegra-gart.c            |   7 +-
>>  include/linux/iommu.h                 |  16 +++-
>>  7 files changed, 353 insertions(+), 43 deletions(-)
>> --
>> Chuck Lever
>> _______________________________________________
>> iommu mailing list
>> [email protected]
>> https://lists.linuxfoundation.org/mailman/listinfo/iommu

--
Chuck Lever


