On Fri, Sep 26, 2025 at 07:47:31AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <[email protected]>
> > Sent: Thursday, September 4, 2025 1:47 AM
> > 
> > map is slightly complicated because it has to handle a number of special
> > edge cases:
> >  - Overmapping a previously shared table with an OA - requries validating
> >    and freeing the possibly empty tables
> >  - Doing the above across an entire to-be-created contiguous entry
> >  - Installing a new shared table level concurrently with another thread
> >  - Expanding the table by adding more top levels
> 
> what is 'shared table'? Looks this term doesn't appear in previous patches.

"shared table level". It is the actual 4k page. Shared means
more than one iommu_map() calls are using indexes in it to make their
mappings work.

like if you make 4k twice then the PGD/PMD/etc table would be "shared"

> also it's unclear to me why overmapping a previously shared table can
> succeed while overmapping leaf entries cannot (w/ -EADDRINUSE)

It has to be empty, let me clarify

 - Overmapping a previously shared, but now empty, table level with an OA.
   Requries validating and freeing the possibly empty tables

> > +
> > +   /* Calculate target page size and level for the leaves */
> > +   if (pt_has_system_page(common) && pgsize == PAGE_SIZE &&
> > pgcount == 1) {
> > +           PT_WARN_ON(!(pgsize_bitmap & PAGE_SIZE));
> > +           if (log2_mod(iova | paddr, PAGE_SHIFT))
> > +                   return -ENXIO;
> > +           map.leaf_pgsize_lg2 = PAGE_SHIFT;
> > +           map.leaf_level = 0;
> > +           single_page = true;
> > +   } else {
> > +           map.leaf_pgsize_lg2 = pt_compute_best_pgsize(
> > +                   pgsize_bitmap, range.va, range.last_va, paddr);
> > +           if (!map.leaf_pgsize_lg2)
> > +                   return -ENXIO;
> > +           map.leaf_level =
> > +                   pt_pgsz_lg2_to_level(common, map.leaf_pgsize_lg2);
> 
> Existing driver checks alignment on pgsize, e.g. intel-iommu:
> 
>         if (!IS_ALIGNED(iova | paddr, pgsize))
>                 return -EINVAL;

Yes
 
> But pt_compute_best_pgsize() doesn't use 'pgsize' and only have checks
> on calculated pgsz_lg2:

pgsz_lg2 is the same as 'pgsize' in the intel driver..

pt_compute_best_pgsize() takes in a bitmap of all supported page sizes
at all levels and returns a single page size that should be used for
this mapping.

The single page size satisfies the same alignemnt checks vtd had:

>         PT_WARN_ON(log2_mod(va, pgsz_lg2) != 0);
>         PT_WARN_ON(oalog2_mod(oa, pgsz_lg2) != 0);

The above are equivalent to IS_ALIGNED(iova | paddr, pgsize).

If no page sizes match the alignment of va and oa then it returns 0
and we fail:

 +              if (!map.leaf_pgsize_lg2)
 +                      return -ENXIO;

If it doesn't fail then it returns the single pgsize that should be
used for this mapping and then we seek to that table level:

 +              map.leaf_level =
 +                      pt_pgsz_lg2_to_level(common, map.leaf_pgsize_lg2);

Then there is another safety check during install leaf through
pt_check_install_leaf_args()

        if (PT_WARN_ON(oalog2_mod(oa, oasz_lg2)))
                return false;

By the time we get here oasz_lg2 is also pgsize.

> Looks not identical.

It rejects unaligned the same way though.

Further, this is all dead code right now, even the vtd code. Things
were switched over to map_pages() and so the core code has this:

        if (!IS_ALIGNED(iova | paddr | size, min_pagesz)) {
                return -EINVAL;

then iommu_pgsize() is guarenteed to work similarly to
pt_compute_best_pgsize().

Meaning the drivers can't see unaligned inputs anyhow.

Jason

Reply via email to