On Thu, Sep 18, 2025 at 07:05:50AM +0000, Tian, Kevin wrote:
> > iommu_map()
> >    pgsz  ,avg new,old ns, min new,old ns  , min % (+ve is better)
> >      2^12,     65,64    ,      62,61      ,  -1.01
> >      2^13,     70,66    ,      67,62      ,  -8.08
> >      2^14,     73,69    ,      71,65      ,  -9.09
> >      2^15,     78,75    ,      75,71      ,  -5.05
> >      2^16,     89,89    ,      86,84      ,  -2.02
> >      2^17,    128,121   ,     124,112     , -10.10
> >      2^18,    175,175   ,     170,163     ,  -4.04
> >      2^19,    264,306   ,     261,279     ,   6.06
> >      2^20,    444,525   ,     438,489     ,  10.10
> >      2^21,     60,62    ,      58,59      ,   1.01
> >  256*2^12,    381,1833  ,     367,1795    ,  79.79
> >  256*2^21,    375,1623  ,     356,1555    ,  77.77
> >  256*2^30,    356,1338  ,     349,1277    ,  72.72
> > 
> > iommu_unmap()
> >    pgsz  ,avg new,old ns, min new,old ns  , min % (+ve is better)
> >      2^12,     76,89    ,      71,86      ,  17.17
> >      2^13,     79,89    ,      75,86      ,  12.12
> >      2^14,     78,90    ,      74,86      ,  13.13
> >      2^15,     82,89    ,      74,86      ,  13.13
> >      2^16,     79,89    ,      74,86      ,  13.13
> >      2^17,     81,89    ,      77,87      ,  11.11
> >      2^18,     90,92    ,      87,89      ,   2.02
> >      2^19,     91,93    ,      88,90      ,   2.02
> >      2^20,     96,95    ,      91,92      ,   1.01
> >      2^21,     72,88    ,      68,85      ,  20.20
> >  256*2^12,    372,6583  ,     364,6251    ,  94.94
> >  256*2^21,    398,6032  ,     392,5758    ,  93.93
> >  256*2^30,    396,5665  ,     389,5258    ,  92.92
> 
> data here mismatches those in coverletter, though the difference
> didn't affect the conclusion. 😊

I was looking at fixing this and realized they are different
deliberately. The cover letter has:

  * Above numbers include additional patches to remove the iommu_pgsize()
    overheads. gcc 13.3.0, i7-12700

Which is why the numbers are so much higher:

     2^12,     53,66    ,      51,63      ,  19.19 (AMDV1)
     2^12,     65,64    ,      62,61      ,  -1.01

The additional patches make the difference.

So this is telling two stories: this patch at this moment gets a
slight negative for small sizes and a huge positive for big sizes,
while after some additional optimization on the core code we move to
a significant win everywhere.

Jason
