> On Jan 28, 2021, at 9:52 AM, Chuck Lever <[email protected]> wrote:
>
>
>
>> On Jan 28, 2021, at 8:59 AM, Robin Murphy <[email protected]> wrote:
>>
>> On 2021-01-27 20:00, Chuck Lever wrote:
>>> Hi-
>>> This collection of patches seems to get the best throughput results
>>> so far. The NFS WRITE result is fully restored, and the NFS READ
>>> result is very close to fully restored.
>>> Children see throughput for 12 initial writers = 5008474.03 kB/sec
>>> Parent sees throughput for 12 initial writers  = 4996927.80 kB/sec
>>> Min throughput per process = 416956.88 kB/sec
>>> Max throughput per process = 417910.22 kB/sec
>>> Avg throughput per process = 417372.84 kB/sec
>>> Min xfer = 1046272.00 kB
>>> CPU utilization: Wall time 2.515 CPU time 1.996 CPU utilization 79.37 %
>>>
>>> Children see throughput for 12 rewriters = 5020584.59 kB/sec
>>> Parent sees throughput for 12 rewriters  = 5012539.29 kB/sec
>>> Min throughput per process = 417799.00 kB/sec
>>> Max throughput per process = 419082.22 kB/sec
>>> Avg throughput per process = 418382.05 kB/sec
>>> Min xfer = 1046528.00 kB
>>> CPU utilization: Wall time 2.507 CPU time 2.024 CPU utilization 80.73 %
>>>
>>> Children see throughput for 12 readers = 5805484.25 kB/sec
>>> Parent sees throughput for 12 readers  = 5799535.68 kB/sec
>>> Min throughput per process = 482888.16 kB/sec
>>> Max throughput per process = 484444.16 kB/sec
>>> Avg throughput per process = 483790.35 kB/sec
>>> Min xfer = 1045760.00 kB
>>> CPU utilization: Wall time 2.167 CPU time 1.964 CPU utilization 90.63 %
>>>
>>> Children see throughput for 12 re-readers = 5812227.16 kB/sec
>>> Parent sees throughput for 12 re-readers  = 5803793.06 kB/sec
>>> Min throughput per process = 483242.97 kB/sec
>>> Max throughput per process = 485724.41 kB/sec
>>> Avg throughput per process = 484352.26 kB/sec
>>> Min xfer = 1043456.00 kB
>>> CPU utilization: Wall time 2.161 CPU time 1.976 CPU utilization 91.45 %
>>> I've included a simple-minded implementation of a map_sg op for
>>> the Intel IOMMU. This is nothing more than a copy of the loop in
>>> __iommu_map_sg() with the call to __iommu_map() replaced with a
>>> call to intel_iommu_map().
>>
>> ...which is the main reason I continue to strongly dislike patches #4-#9 (#3
definitely seems to make sense either way, now that #1 and #2 are going to
>> land). If a common operation is worth optimising anywhere, then it deserves
>> optimising everywhere, so we end up with a dozen diverging copies of
>> essentially the same code - particularly when the driver-specific
>> functionality *is* already in the drivers, so what gets duplicated is solely
>> the "generic" parts.
>
> I don't disagree with that assessment, but I don't immediately see an
> alternative API arrangement that would be more successful in the short
> term. If 4/9 - 9/9 are not acceptable, then the responsible thing to
> do would be to revert:
>
> - 58a8bb39490d ("iommu/vt-d: Cleanup after converting to dma-iommu ops")
> - c588072bba6b ("iommu/vt-d: Convert intel iommu driver to the iommu ops")
>
> for v5.11, work out the proper API design, and then try the VT-d conversion
> again.
>
> IMHO.
Are all y'all waiting for me to post such patches? ;-)
>> And if there's justification for pushing iommu_map_sg() entirely into
>> drivers, then it's verging on self-contradictory not to do the same for
>> iommu_map() and iommu_unmap(). Some IOMMU drivers - mainly intel-iommu, as
>> it happens - are already implementing hacks around the "one call per page"
>> interface being inherently inefficient, so the logical thing to do here is
>> take a step back and reconsider the fundamental design of the whole
>> map/unmap interface. Implementing hacks on top of hacks to make particular
>> things faster on particular systems that particular people care about is not
>> going to do us any favours in the long run.
>>
>> As it stands, I can easily see a weird anti-pattern emerging where people
>> start adding code to fake up scatterlists in random drivers because they see
>> dma_map_sg() performing paradoxically better than dma_map_page().
>>
>> Robin.
>>
>>> ---
>>> Chuck Lever (1):
>>> iommu/vt-d: Introduce map_sg() for Intel IOMMUs
>>> Isaac J. Manjarres (5):
>>> iommu/io-pgtable: Introduce map_sg() as a page table op
>>> iommu/io-pgtable-arm: Hook up map_sg()
>>> iommu/io-pgtable-arm-v7s: Hook up map_sg()
>>> iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers
>>> iommu/arm-smmu: Hook up map_sg()
>>> Lu Baolu (1):
>>> iommu/vt-d: Add iotlb_sync_map callback
>>> Yong Wu (2):
>>> iommu: Move iotlb_sync_map out from __iommu_map
>>> iommu: Add iova and size as parameters in iotlb_sync_map
>>> drivers/iommu/arm/arm-smmu/arm-smmu.c | 19 ++++
>>> drivers/iommu/intel/iommu.c | 131 ++++++++++++++++++++------
>>> drivers/iommu/io-pgtable-arm-v7s.c | 90 ++++++++++++++++++
>>> drivers/iommu/io-pgtable-arm.c | 86 +++++++++++++++++
>>> drivers/iommu/iommu.c | 47 +++++++--
>>> drivers/iommu/tegra-gart.c | 7 +-
>>> include/linux/iommu.h | 16 +++-
>>> 7 files changed, 353 insertions(+), 43 deletions(-)
>>> --
>>> Chuck Lever
>>> _______________________________________________
>>> iommu mailing list
>>> [email protected]
>>> https://lists.linuxfoundation.org/mailman/listinfo/iommu
--
Chuck Lever