On Wed, 2013-09-25 at 13:44 -0600, Alex Williamson wrote:
> On Wed, 2013-09-25 at 19:52 +0100, David Woodhouse wrote:
> > On Wed, 2013-09-25 at 11:36 -0600, Alex Williamson wrote:
> > > On Wed, 2013-09-25 at 17:05 +0100, David Woodhouse wrote:
> > > > Why would it ever care? If it *happens* to map something that can use
> > > > large pages, yay! If it subsequently breaks apart those large pages by
> > > > unmapping 4KiB in the middle, let the IOMMU driver break that apart.
> > >
> > > Can this be done atomically? I thought part of the reason for this
> > > interface was that iommu drivers typically couldn't replace a huge page
> > > with multiple smaller pages in the presence of DMA.
> >
> > For the Intel IOMMU it can. You can atomically change from a large page
> > entry to a pointer to a full set of smaller page tables. Do the IOTLB
> > flush, and at no time is there an interruption in service.
>
> Cool
>
> > Not sure if this is true for *all* IOMMU hardware; I'd be perfectly
> > happy to accept a variant of Jörg's proposal that we should only ever
> > unmap exactly the same range that we mapped. Except we should allow the
> > unmapping of adjacent regions together; just not a partial unmap of
> > something that was mapped in one go.
>
> Well, except if we've just trusted the IOMMU driver to add a device
> behind a non-SP-capable IOMMU to our domain and convert the page tables,
> that partial unmap is no longer partial, and now we get different
> behavior than before, so we can't depend on that adjacent unmapping.
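
[Editor's note: to illustrate the kind of atomic conversion described above (a superpage entry swung over to a full set of 4KiB page-table entries covering the same range, followed by an IOTLB flush), here is a minimal sketch. The helper name, constants and PTE bit definitions are made up for illustration; this is not the actual intel-iommu code.]

    #include <linux/gfp.h>
    #include <linux/io.h>
    #include <linux/types.h>

    #define SP_PTES_PER_TABLE  512
    #define SP_PAGE_SIZE       (2UL << 20)   /* 2MiB superpage */
    #define SMALL_PAGE_SIZE    4096UL
    #define PTE_LARGE_PAGE     (1ULL << 7)   /* illustrative "superpage" bit */

    /*
     * Replace one 2MiB superpage PTE with a pointer to a freshly built
     * table of 4KiB PTEs covering the same physical range.  The new
     * table translates identically to the old superpage, so DMA racing
     * with the switch sees a valid translation either way; the caller
     * flushes the IOTLB for the range afterwards.
     */
    static int split_superpage(u64 *sp_pte, u64 prot_bits)
    {
            u64 old = *sp_pte;
            phys_addr_t phys = old & ~((u64)SP_PAGE_SIZE - 1);
            u64 *table;
            int i;

            if (!(old & PTE_LARGE_PAGE))
                    return 0;               /* already small pages */

            table = (u64 *)get_zeroed_page(GFP_ATOMIC);
            if (!table)
                    return -ENOMEM;

            /* Build the complete 4KiB table before it becomes visible. */
            for (i = 0; i < SP_PTES_PER_TABLE; i++)
                    table[i] = (phys + i * SMALL_PAGE_SIZE) | prot_bits;

            /*
             * One aligned 64-bit store swings the entry from "superpage"
             * to "pointer to page table", so the hardware walker never
             * sees a missing mapping.
             */
            *sp_pte = virt_to_phys(table) | (prot_bits & ~PTE_LARGE_PAGE);

            /* Caller: flush the IOTLB for this 2MiB range. */
            return 0;
    }
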
Que? Jörg's proposal was that if you add a mapping at a given
address+size, you should always remove *exactly* that address+size.
Which will always work exactly the same, regardless of superpages.

My slight change to that was that if you also added an *adjacent*
mapping at address2+size2, you should be able to unmap both at the
same time. Which will *also* always work the same regardless of
superpages. Even if your two mappings were also *physically*
contiguous, and *could* have used superpages, they probably won't
anyway because you mapped them in two parts.

> > > What happens if my IOMMU domain makes use of super pages and
> > > then I add a new device behind a new IOMMU without hardware super
> > > page support?
> >
> > Currently, you end up with the domain happily including superpages,
> > and the less capable IOMMU that you added later won't cope.
>
> This is the trouble with trusting the iommu driver. ;)

Sorry, I should have made it clearer that this is a *bug*. It's not by
design. The IOMMU driver ought to get this right, and will do.

> > What we probably *ought* to do is walk the page tables and convert
> > any pre-existing superpages to small pages, at the time we add the
> > non-SP-capable IOMMU.
>
> And then we need to figure out how to handle that in the proposed
> interface changes above, since it changes the unmap behavior for the
> naive user.

Isn't that what you'd *expect*? Surely you don't *expect* the breakage
you currently get?

> There's also the question of whether the IOMMU driver should
> re-evaluate super pages when the less capable IOMMU is removed from
> the domain.

I wouldn't bother to go looking for opportunities to use superpages if
we remove the last non-SP-capable IOMMU from the domain.

> > FWIW we currently screw up the handling of cache-coherent vs.
> > non-coherent page tables too. That one wants a wbinvd somewhere when
> > we add a non-coherent IOMMU to the domain.
>
> You're not selling the "trust the IOMMU driver" story very well here.
> Can we assume that the IOMMU_CACHE flag (SNP) is ignored appropriately
> by non-coherent IOMMUs? Is there any downside to ignoring it and
> always setting SNP in the IOMMU page tables? AMD IOMMU ignores it,
> but it's also always cache coherent. Thanks,

SNP is a separate issue. I'm speaking of cache coherency of the
hardware page table walk, i.e. the feature bit that all the horrid
clflush calls are predicated on. Again, this is just a bug. We *should*
be getting this right, but don't yet.

--
dwmw2
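
[Editor's note: to make the proposed map/unmap contract concrete, here is a sketch against the existing iommu_map()/iommu_unmap() API from <linux/iommu.h>. The IOVA, the sizes and the helper function are invented for illustration; the point is only which unmap calls the rule discussed above would and would not permit.]

    #include <linux/iommu.h>

    static void map_unmap_contract(struct iommu_domain *domain,
                                   phys_addr_t pa1, phys_addr_t pa2)
    {
            const size_t sz = 2UL << 20;         /* two 2MiB regions */
            unsigned long iova = 0x40000000;

            /* Two adjacent mappings, created separately.
             * (Error handling omitted for brevity.) */
            iommu_map(domain, iova,      pa1, sz, IOMMU_READ | IOMMU_WRITE);
            iommu_map(domain, iova + sz, pa2, sz, IOMMU_READ | IOMMU_WRITE);

            /*
             * Allowed under Jörg's rule: unmap exactly what was mapped.
             *   iommu_unmap(domain, iova, sz);
             *   iommu_unmap(domain, iova + sz, sz);
             *
             * Allowed under David's relaxation: unmap both adjacent
             * mappings in a single call, as below.
             *
             * Not allowed: a partial unmap of a single mapping, e.g.
             * 4KiB out of the middle of the first region:
             *   iommu_unmap(domain, iova + 0x1000, 0x1000);
             */
            iommu_unmap(domain, iova, 2 * sz);
    }
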
