On Thu, Dec 01, 2016 at 08:42:04AM -0700, Alex Williamson wrote:
> On Wed, 30 Nov 2016 17:23:59 +0800
> Peter Xu <pet...@redhat.com> wrote:
> > On Mon, Nov 28, 2016 at 05:51:50PM +0200, Aviv B.D wrote:
> > > * intel_iommu's replay op is not implemented yet (May come in different
> > >   patch set).
> > >   The replay function is required for hotplug vfio device and to move
> > >   devices between existing domains.
> > 
> > I have been thinking about this replay thing recently, and now I am
> > starting to doubt whether the whole VT-d vIOMMU framework suits this...
> > 
> > Generally speaking, the current work is throwing away the IOMMU "domain"
> > layer here. We maintain the mapping only per device, and we don't care
> > too much about which domain it belongs to. This seems problematic.
> > 
> > The simplest failing case is this (let's assume cache-mode is
> > enabled): if we have two assigned devices A and B, both belong to the
> > same domain 1. Meanwhile, in domain 1 assume we have one mapping which
> > is the first page (iova range 0-0xfff). Then, if guest wants to
> > invalidate the page, it'll notify VT-d vIOMMU with an invalidation
> > message. If we do this invalidation per-device, we'll need to UNMAP
> > the region twice - once for A, once for B (if we have more devices, we
> > will unmap more times), and we can never know we have done duplicated
> > work since we don't keep domain info, so we don't know they are using
> > the same address space. The first unmap will work, and then we'll
> > possibly get errors from the remaining DMA unmaps.
> > 
> > It looks like we just cannot live without this domain layer, because
> > the address space is bound to the domain. If we want to sync the
> > address space (here, to set up a correct shadow page table), we need
> > to do it per-domain.
> > 
> > What I can think of as a solution is that we introduce this "domain"
> > layer - like a memory region per domain. When invalidation happens,
> > it's per-domain, not per-device any more (actually I guess that's what
> > the current VT-d IOMMU driver in the kernel is doing, we just ignored it - we
> > fetch the devices that match the domain ID). We can/need to maintain
> > something different, like sid <-> domain mappings (we can do this as
> > long as we are notified when context entries change), per-domain
> > mappings (just like the per-device mappings that we are trying to build
> > in this series, but what we really need is IMHO a per-domain one), etc.
> > When a device switches domains, we switch the IOMMU memory region
> > accordingly.
> > 
> > Does this make any sense? Comments are very welcome (especially
> > from AlexW and DavidG).
>
> It's been a bit since I've looked at VT-d emulation, but I certainly
> remember that it's way more convoluted than I expected.  It seems like
> a domain should create an AddressSpace and any devices assigned to that
> domain should make use of that single address space, but IIRC VT-d
> creates an address space per device, i.e. per context entry.

Yes, I think this idea (one address space per domain) came from one of
your replies in the past, and I just found it more essential than I
thought before.
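
To make the "one address space per domain" idea concrete, here is a
rough sketch. It is a toy model in Python, not QEMU code, and every name
in it (DomainAddressSpace, DomainTable, set_context_entry,
address_space_for) is hypothetical. It only shows the bookkeeping:
a sid <-> domain mapping that is updated when a context entry changes,
and a single shared object per domain ID.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DomainAddressSpace:
    """Hypothetical per-domain address space shared by all member devices."""
    domain_id: int
    # IOVA page -> host address; one table for the whole domain.
    mappings: Dict[int, int] = field(default_factory=dict)


class DomainTable:
    """Toy sid <-> domain bookkeeping (names are assumptions, not QEMU APIs)."""

    def __init__(self) -> None:
        self.domains: Dict[int, DomainAddressSpace] = {}
        self.sid_to_domain: Dict[int, int] = {}

    def set_context_entry(self, sid: int, domain_id: int) -> DomainAddressSpace:
        """Called when the guest (re)programs the context entry for `sid`."""
        if domain_id not in self.domains:
            self.domains[domain_id] = DomainAddressSpace(domain_id)
        # Switching domains just repoints the device at another shared object.
        self.sid_to_domain[sid] = domain_id
        return self.domains[domain_id]

    def address_space_for(self, sid: int) -> DomainAddressSpace:
        """Look up the address space a device currently uses."""
        return self.domains[self.sid_to_domain[sid]]
```

With this layout, two devices whose context entries carry the same
domain ID resolve to the same object, so there is only one set of
mappings to keep in sync.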

I'll see whether I can clear the way before moving on to the replay
implementation, because IIUC the replay will depend on this
(introducing the domain layer in VT-d IOMMU emulation).
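
As a rough illustration of why invalidation and replay both want the
domain layer, below is a toy Python model (not QEMU code). HostMapper
stands in for whatever performs the real DMA map/unmap on the host side,
and it assumes both devices share one host-side mapping table. All class
and function names are hypothetical. The per-device path unmaps the same
page once per device, so every unmap after the first fails. The
per-domain path unmaps once, and replay simply walks the domain's
current mappings for a newly attached device.

```python
from typing import Callable, Dict, Iterable


class HostMapper:
    """Stand-in for the host-side DMA mapper (an assumption for this sketch)."""

    def __init__(self) -> None:
        self.mapped: Dict[int, int] = {}
        self.failed_unmaps = 0

    def map(self, iova: int, host_addr: int) -> None:
        self.mapped[iova] = host_addr

    def unmap(self, iova: int) -> None:
        if iova in self.mapped:
            del self.mapped[iova]
        else:
            # Unmapping something that is already gone counts as an error.
            self.failed_unmaps += 1


def per_device_invalidate(host: HostMapper, devices: Iterable[str], iova: int) -> None:
    """No domain info: one unmap per device, even if they share the mapping."""
    for _dev in devices:
        host.unmap(iova)


class Domain:
    """Hypothetical domain object that owns the shadow mappings."""

    def __init__(self, host: HostMapper) -> None:
        self.host = host
        self.pages: Dict[int, int] = {}

    def map_page(self, iova: int, host_addr: int) -> None:
        self.pages[iova] = host_addr
        self.host.map(iova, host_addr)

    def invalidate_page(self, iova: int) -> None:
        """Per-domain invalidation: unmap exactly once, only if mapped."""
        if iova in self.pages:
            del self.pages[iova]
            self.host.unmap(iova)

    def replay(self, notify: Callable[[int, int], None]) -> None:
        """Report every existing mapping, e.g. to a hotplugged device."""
        for iova, host_addr in sorted(self.pages.items()):
            notify(iova, host_addr)
```

The replay method only works because the domain keeps its own record of
what is mapped, which is the dependency mentioned above.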


-- peterx
