On Wed, Jun 19, 2019 at 03:17:41PM +0200, Auger Eric wrote: > Hi Yan, > > [+ Peter] > > On 6/19/19 10:49 AM, Yan Zhao wrote: > > even if an entry overlaps with notifier's range, should not map/unmap > > out of bound part in the entry. > > I don't think the patch was based on the master as the trace at the very > end if not part of the upstream code. > > > > This would cause problem in below case: > > 1. initially there are two notifiers with ranges > > 0-0xfedfffff, 0xfef00000-0xffffffffffffffff, > > IOVAs from 0x3c000000 - 0x3c1fffff is in shadow page table. > > > > 2. in vfio, memory_region_register_iommu_notifier() is followed by > > memory_region_iommu_replay(), which will first call address space unmap, > > and walk and add back all entries in vtd shadow page table. e.g. > > (1) for notifier 0-0xfedfffff, > > IOVAs from 0 - 0xffffffff get unmapped, > > and IOVAs from 0x3c000000 - 0x3c1fffff get mapped > > While the patch looks sensible, the issue is the notifier scope used in > vtd_address_space_unmap is not a valid mask (ctpop64(size) != 1). Then > the size is recomputed (either using n = 64 - clz64(size) for the 1st > notifier or n = s->aw_bits for the 2d) and also the entry (especially > for the 2d notifier where it becomes 0) to get a proper alignment. > > vtd_page_walk sends notifications per block or page (with valid > addr_mask) so stays within the notifier. > > Modifying the entry->iova/addr_mask again in memory_region_notify_one > leads to unaligned start address / addr_mask. I don't think we want that. > > Can't we modity the vtd_address_space_unmap() implementation to split > the invalidation in smaller chunks instead?
Seems workable, to be explicit - we can even cut it into chunks with different size to be efficient. Like, this range: 0x0e00_0000 - 0x1_0000_0000 (size 0xf200_0000) can be one of this: 0x0e000000 - 0x1000_0000 (size 0x0200_0000) plus one of this: 0x1000_0000 - 0x1_0000_0000 (size 0xf000_0000) Yan, could you help explain the issue better on how to reproduce and what's the error when the problem occurs? For example, is that happened when a device hot-plugged into an existing VFIO container (with some mapped IOVAs)? Did you get host DMA errors later on? Thanks, -- Peter Xu