On Thu, Sep 18, 2025 at 07:05:50AM +0000, Tian, Kevin wrote:
> > iommu_map()
> > pgsz ,avg new,old ns, min new,old ns , min % (+ve is better)
> > 2^12, 65,64 , 62,61 , -1.01
> > 2^13, 70,66 , 67,62 , -8.08
> > 2^14, 73,69 , 71,65 , -9.09
> > 2^15, 78,75 , 75,71 , -5.05
> > 2^16, 89,89 , 86,84 , -2.02
> > 2^17, 128,121 , 124,112 , -10.10
> > 2^18, 175,175 , 170,163 , -4.04
> > 2^19, 264,306 , 261,279 , 6.06
> > 2^20, 444,525 , 438,489 , 10.10
> > 2^21, 60,62 , 58,59 , 1.01
> > 256*2^12, 381,1833 , 367,1795 , 79.79
> > 256*2^21, 375,1623 , 356,1555 , 77.77
> > 256*2^30, 356,1338 , 349,1277 , 72.72
> >
> > iommu_unmap()
> > pgsz ,avg new,old ns, min new,old ns , min % (+ve is better)
> > 2^12, 76,89 , 71,86 , 17.17
> > 2^13, 79,89 , 75,86 , 12.12
> > 2^14, 78,90 , 74,86 , 13.13
> > 2^15, 82,89 , 74,86 , 13.13
> > 2^16, 79,89 , 74,86 , 13.13
> > 2^17, 81,89 , 77,87 , 11.11
> > 2^18, 90,92 , 87,89 , 2.02
> > 2^19, 91,93 , 88,90 , 2.02
> > 2^20, 96,95 , 91,92 , 1.01
> > 2^21, 72,88 , 68,85 , 20.20
> > 256*2^12, 372,6583 , 364,6251 , 94.94
> > 256*2^21, 398,6032 , 392,5758 , 93.93
> > 256*2^30, 396,5665 , 389,5258 , 92.92
>
> data here mismatches those in coverletter, though the difference
> didn't affect the conclusion. 😊
I was looking at fixing this and realized they are different
deliberately. The cover letter has:
* Above numbers include additional patches to remove the iommu_pgsize()
overheads. gcc 13.3.0, i7-12700
Which is why the improvement there is so much higher:
2^12, 53,66 , 51,63 , 19.19 (AMDV1)
2^12, 65,64 , 62,61 , -1.01
The additional patches make the difference.
So this is telling two stories: this patch, at this moment, gets a
slight negative for small sizes and a huge positive for big sizes,
while after some additional optimization of the core code we move to
a significant win everywhere.
Jason