On Fri, 2021-11-12 at 10:59 +0800, Lu Baolu wrote:
> Hi Alex,
>
> On 11/11/21 8:32 AM, Alex Williamson wrote:
> > When supporting only the .map and .unmap callbacks of iommu_ops,
> > the IOMMU driver can make assumptions about the size and alignment
> > used for mappings based on the driver provided pgsize_bitmap. VT-d
> > previously used essentially PAGE_MASK for this bitmap as any power
> > of two mapping was acceptably filled by native page sizes.
> >
> > However, with the .map_pages and .unmap_pages interface we're now
> > getting page-size and count arguments. If we simply combine these
> > as (page-size * count) and make use of the previous map/unmap
> > functions internally, any size and alignment assumptions are very
> > different.
> >
> > As an example, a given vfio device assignment VM will often create
> > a 4MB mapping at IOVA pfn [0x3fe00 - 0x401ff]. On a system that
> > does not support IOMMU super pages, the unmap_pages interface will
> > ask to unmap 1024 4KB pages at the base IOVA. dma_pte_clear_level()
> > will recurse down to level 2 of the page table where the first half
> > of the pfn range exactly matches the entire pte level. We clear the
> > pte, increment the pfn by the level size, but (oops) the next pte is
> > on a new page, so we exit the loop and pop back up a level. When we
> > then update the pfn based on that higher level, we seem to assume
> > that the previous pfn value was at the start of the level. In this
> > case the level size is 256K pfns, which we add to the base pfn and
> > get a result of 0x7fe00, which is clearly greater than 0x401ff,
> > so we're done. Meanwhile we never cleared the ptes for the
> > remainder of the range. When the VM remaps this range, we're
> > overwriting valid ptes and the VT-d driver complains loudly, as
> > reported in the user report linked below.
> >
> > The fix for this seems relatively simple: if each iteration of the
> > loop in dma_pte_clear_level() is assumed to clear to the end of the
> > level pte page, then our next pfn should be calculated from
> > level_pfn rather than our working pfn.
> >
> > Fixes: 3f34f1259776 ("iommu/vt-d: Implement map/unmap_pages()
> > iommu_ops callback")
> > Reported-by: Ajay Garg <ajaygargn...@gmail.com>
> > Link: https://lore.kernel.org/all/20211002124012.18186-1-ajaygargn...@gmail.com/
> > Signed-off-by: Alex Williamson <alex.william...@redhat.com>
>
> Thank you for fixing this! I will queue it for v5.16.
>
> Best regards,
> baolu
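The arithmetic in the commit message can be reproduced outside the kernel. Below is a stand-alone sketch, not the driver source: level_size() and level_mask() are re-derived here from the 9-bit VT-d page-table stride (STRIDE_SHIFT is this sketch's own constant), and only the pfn-advance step of dma_pte_clear_level() is modeled.

/*
 * Stand-alone sketch (not the kernel source): models only the pfn
 * advance in dma_pte_clear_level() for the 4MB example above.
 * level_size()/level_mask() are re-derived from the 9-bit stride
 * of the VT-d page tables.
 */
#include <stdio.h>

#define STRIDE_SHIFT 9				/* 512 ptes per table page */

static unsigned long level_size(int level)	/* pfns per pte at this level */
{
	return 1UL << ((level - 1) * STRIDE_SHIFT);
}

static unsigned long level_mask(int level)	/* align a pfn down to the level */
{
	return ~(level_size(level) - 1);
}

int main(void)
{
	unsigned long start_pfn = 0x3fe00, last_pfn = 0x401ff;
	int level = 3;				/* one pte covers 256K pfns */
	unsigned long pfn = start_pfn;
	unsigned long level_pfn = pfn & level_mask(level);	/* 0x00000 */

	/* Buggy advance: assumes pfn started at the level boundary. */
	unsigned long buggy = pfn + level_size(level);		/* 0x7fe00 */
	/* Fixed advance: step from the start of the current entry. */
	unsigned long fixed = level_pfn + level_size(level);	/* 0x40000 */

	printf("buggy: next pfn 0x%lx > last_pfn 0x%lx, loop exits early\n",
	       buggy, last_pfn);
	printf("fixed: next pfn 0x%lx <= last_pfn 0x%lx, loop continues\n",
	       fixed, last_pfn);
	return 0;
}

With the old formula the level-3 loop believes it has advanced to 0x7fe00 and exits, leaving the ptes for 0x40000-0x401ff uncleared; with the fix it advances only to 0x40000 and continues into the next level-3 entry, which covers the second half of the range.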
Hi Baolu,

Do you have an estimate of when this will be submitted?

Regards,
Jerry

> > ---
> >  drivers/iommu/intel/iommu.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index d75f59ae28e6..f6395f5425f0 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -1249,7 +1249,7 @@ static struct page *dma_pte_clear_level(struct dmar_domain *domain, int level,
> >  					       freelist);
> >  		}
> >  next:
> > -		pfn += level_size(level);
> > +		pfn = level_pfn + level_size(level);
> >  	} while (!first_pte_in_page(++pte) && pfn <= last_pfn);
> >
> >  	if (first_pte)
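As a sanity check on the one-line fix: the two formulas can only differ when pfn enters an iteration unaligned to the level, which happens at most on the first pass, because every advance produces a level-aligned pfn. Another stand-alone sketch under the same assumptions (helpers re-derived here, not the kernel's own definitions):

/*
 * Sketch: after the first advance, pfn is always level-aligned, so
 * "level_pfn + level_size(level)" and the old "pfn + level_size(level)"
 * agree; the fix only changes the unaligned first iteration.
 */
#include <assert.h>

static unsigned long level_size(int level)
{
	return 1UL << ((level - 1) * 9);
}

static unsigned long level_mask(int level)
{
	return ~(level_size(level) - 1);
}

int main(void)
{
	int level = 3;
	unsigned long pfn = 0x3fe00;	/* unaligned entry into the loop */
	unsigned long level_pfn = pfn & level_mask(level);

	pfn = level_pfn + level_size(level);	/* fixed first advance */

	for (int i = 0; i < 8; i++) {
		level_pfn = pfn & level_mask(level);
		assert(pfn == level_pfn);		/* now level-aligned... */
		assert(level_pfn + level_size(level) ==
		       pfn + level_size(level));	/* ...formulas agree */
		pfn = level_pfn + level_size(level);
	}
	return 0;
}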