On Fri, 16 Oct 2020 13:42:29 +0200
Paolo Bonzini <pbonz...@redhat.com> wrote:

> On 16/10/20 13:29, FelixCuioc wrote:
> > The issue here is that an assigned EHCI device accesses
> > an adjacent mapping between the delete and add phases
> > of the VFIO MemoryListener.
> > We want to skip flatview_simplify() to prevent EHCI
> > device IOVA mappings from being unmapped.  
> 
> Hi,
> 
> there is indeed a bug, but I have already explained last month
> (https://mail.gnu.org/archive/html/qemu-devel/2020-09/msg01279.html)
> that this patch is conceptually wrong:
> 
> 1) you're adding host_get_vendor conditioned on compiling the x86
> emulator, so you are breaking compilation on non-x86 machines.
> 
> 2) you're adding a check for the host, but the bug applies to all hosts.
> If there is a bug in x86 hardware emulation, it should be fixed even
> when emulating x86 from ARM.  It should also apply to all CPU vendors.
> 
> Alex, the issue here is that the delete+add passes are racing against an
> assigned device's DMA. For KVM we were thinking of changing the whole
> memory map with a single ioctl, but that's much easier because KVM
> builds its page tables lazily. It would be possible for the IOMMU too
> but it would require a relatively complicated comparison of the old and
> new memory maps in the kernel.

We can only build IOMMU page tables lazily if we get faults, which we
generally don't.  We also cannot atomically update IOMMU page tables
relative to a device, so "housekeeping" updates of mappings to (I
assume) consolidate KVM memory slots don't work so well when the
device is still running.  Stopping the device via something like the
bus-master enable bit also sounds like a whole set of problems itself.
I assume these simplified mappings also reduce our resolution for later
unmaps, which isn't necessarily a win for an assigned device either if
it exposes the race again each boot.
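
To make the exposure concrete, here is a toy model in Python. It is not
QEMU code: simplify(), update_steps() and exposed_during_update() are
hypothetical helpers that only imitate the idea of merging adjacent
ranges and then applying a change as a delete pass followed by an add
pass. The addresses are made up for illustration.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (start, end), end exclusive


def simplify(ranges: List[Range]) -> List[Range]:
    """Merge ranges that touch, loosely imitating the effect of flatview_simplify()."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if merged and merged[-1][1] == start:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


def update_steps(old: List[Range], new: List[Range]) -> Tuple[List[Range], List[Range]]:
    """Return (unmaps, maps) as a delete-then-add listener would issue them."""
    old_set, new_set = set(old), set(new)
    return sorted(old_set - new_set), sorted(new_set - old_set)


def mapped(ranges: List[Range], addr: int) -> bool:
    return any(start <= addr < end for start, end in ranges)


def exposed_during_update(old: List[Range], new: List[Range], addr: int) -> bool:
    """True if addr is mapped before and after the update, but not in between."""
    unmaps, _ = update_steps(old, new)
    between = [r for r in old if r not in unmaps]  # state after the delete pass
    return mapped(old, addr) and mapped(new, addr) and not mapped(between, addr)


# Two adjacent regions; the guest removes the second one while the device
# keeps doing DMA to an address inside the first.
region_a, region_b = (0x0000, 0x1000), (0x1000, 0x2000)
dma_addr = 0x0800
old, new = [region_a, region_b], [region_a]

print(exposed_during_update(old, new, dma_addr))                      # False
print(exposed_during_update(simplify(old), simplify(new), dma_addr))  # True
```

Without merging, only the second region is unmapped and the DMA target
stays mapped throughout. With merging, the whole coalesced range is
unmapped and the smaller range mapped afterwards, so the untouched
address is briefly unmapped in between.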

Maybe the question is why we don't see these errors more regularly.  Is
there something unique about the memory layout of this platform versus
others that causes larger memory regions to be coalesced together only
to be later unmapped and provide more exposure to this issue?  Thanks,

Alex

