On Fri, 2021-11-12 at 10:59 +0800, Lu Baolu wrote:
> Hi Alex,
> 
> On 11/11/21 8:32 AM, Alex Williamson wrote:
> > When supporting only the .map and .unmap callbacks of iommu_ops,
> > the IOMMU driver can make assumptions about the size and alignment
> > used for mappings based on the driver provided pgsize_bitmap.  VT-d
> > previously used essentially PAGE_MASK for this bitmap as any power
> > of two mapping was acceptably filled by native page sizes.
> > 
> > However, with the .map_pages and .unmap_pages interface we're now
> > getting page-size and count arguments.  If we simply combine these
> > as (page-size * count) and make use of the previous map/unmap
> > functions internally, any size and alignment assumptions are very
> > different.
> > 
> > As an example, a given vfio device assignment VM will often create
> > a 4MB mapping at IOVA pfn [0x3fe00 - 0x401ff].  On a system that
> > does not support IOMMU super pages, the unmap_pages interface will
> > ask to unmap 1024 4KB pages at the base IOVA.  dma_pte_clear_level()
> > will recurse down to level 2 of the page table, where the first half
> > of the pfn range exactly matches the entire pte level.  We clear the
> > pte, increment the pfn by the level size, but (oops) the next pte is
> > on a new page, so we exit the loop and pop back up a level.  When we
> > then update the pfn based on that higher level, we seem to assume
> > that the previous pfn value was at the start of the level.  In this
> > case the level size is 256K pfns, which we add to the base pfn and
> > get a result of 0x7fe00, which is clearly greater than 0x401ff,
> > so we're done.  Meanwhile we never cleared the ptes for the
> > remainder of the range.  When the VM remaps this range, we're
> > overwriting valid ptes and the VT-d driver complains loudly, as
> > reported in the user report linked below.
> > 
> > The fix for this seems relatively simple: if each iteration of the
> > loop in dma_pte_clear_level() is assumed to clear to the end of the
> > level pte page, then our next pfn should be calculated from
> > level_pfn rather than our working pfn.
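For anyone who wants the arithmetic spelled out, below is a small
stand-alone program (not the driver code itself) that prints the next
pfn under the old and new updates, using the constants from the example
above.  The names level_size and level_pfn mirror the driver's
identifiers, but the rounding via `pfn & ~(level_size - 1)` is only my
approximation of what level_mask() produces at that level:

	#include <stdio.h>

	int main(void)
	{
		/* Example from above: unmap 1024 4KB pages at pfn 0x3fe00. */
		unsigned long pfn = 0x3fe00, last_pfn = 0x401ff;
		/* One pte at the higher level covers 512 * 512 = 256K pfns. */
		unsigned long level_size = 0x40000;
		/* Start of that level's region, i.e. pfn rounded down. */
		unsigned long level_pfn = pfn & ~(level_size - 1);

		/* Old update: advance from the working pfn and overshoot. */
		printf("pfn + level_size       = %#lx (> %#lx, loop stops early)\n",
		       pfn + level_size, last_pfn);

		/* New update: advance from the start of the level's region. */
		printf("level_pfn + level_size = %#lx (<= %#lx, loop continues)\n",
		       level_pfn + level_size, last_pfn);
		return 0;
	}

With the old update the higher-level loop never visits the pte covering
pfns 0x40000 - 0x401ff, which is exactly the stale range the VM later
trips over when it remaps.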
> > 
> > Fixes: 3f34f1259776 ("iommu/vt-d: Implement map/unmap_pages() iommu_ops callback")
> > Reported-by: Ajay Garg <ajaygargn...@gmail.com>
> > Link: https://lore.kernel.org/all/20211002124012.18186-1-ajaygargn...@gmail.com/
> > Signed-off-by: Alex Williamson <alex.william...@redhat.com>
> 
> Thank you for fixing this! I will queue it for v5.16.
> 
> Best regards,
> baolu
> 

Hi Baolu,

Do you have an estimate of when this will be submitted?

Regards,
Jerry


> > ---
> >   drivers/iommu/intel/iommu.c |    2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index d75f59ae28e6..f6395f5425f0 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -1249,7 +1249,7 @@ static struct page *dma_pte_clear_level(struct dmar_domain *domain, int level,
> >                                                        freelist);
> >                 }
> >   next:
> > -               pfn += level_size(level);
> > +               pfn = level_pfn + level_size(level);
> >         } while (!first_pte_in_page(++pte) && pfn <= last_pfn);
> >   
> >         if (first_pte)
> > 
> > 
