On Fri, Feb 28, 2020 at 11:13:40AM -0400, Jason Gunthorpe wrote:
> On Fri, Feb 28, 2020 at 04:04:27PM +0100, Jean-Philippe Brucker wrote:
> > On Fri, Feb 28, 2020 at 10:48:44AM -0400, Jason Gunthorpe wrote:
> > > On Fri, Feb 28, 2020 at 03:39:35PM +0100, Jean-Philippe Brucker wrote:
> > > > > > +   list_for_each_entry_rcu(bond, &io_mm->devices, mm_head) {
> > > > > > +           /*
> > > > > > +            * To ensure that we observe the initialization of 
> > > > > > io_mm fields
> > > > > > +            * by io_mm_finalize() before the registration of this 
> > > > > > bond to
> > > > > > +            * the list by io_mm_attach(), introduce an address 
> > > > > > dependency
> > > > > > +            * between bond and io_mm. It pairs with the 
> > > > > > smp_store_release()
> > > > > > +            * from list_add_rcu().
> > > > > > +            */
> > > > > > +           io_mm = rcu_dereference(bond->io_mm);
> > > > > 
> > > > > A rcu_dereference isn't need here, just a normal derference is fine.
> > > > 
> > > > bond->io_mm is annotated with __rcu (for iommu_sva_get_pasid_generic(),
> > > > which does bond->io_mm under rcu_read_lock())
> > > 
> > > I'm surprised the bond->io_mm can change over the lifetime of the
> > > bond memory..
> > 
> > The normal lifetime of the bond is between device driver calls to bind()
> > and unbind(). If the mm exits early, though, we clear bond->io_mm. The
> > bond is then stale but can only be freed when the device driver releases
> > it with unbind().
> 
> I usually advocate for simple use of these APIs. The mm_notifier_get()
> should happen in bind() and the matching put should happen in the
> call_rcu callbcak that does the kfree.

I tried to keep it simple like that: normally mmu_notifier_get() is called
in bind(), and mmu_notifier_put() is called in unbind(). 

Multiple device drivers may call bind() with the same mm. Each bind()
calls mmu_notifier_get(), obtains the same io_mm, and returns a new bond
(a device<->mm link). Each bond is freed by calling unbind(), which calls
mmu_notifier_put().

That's the most common case. Now if the process is killed and the mm
disappears, we do need to avoid use-after-free caused by DMA of the
mappings and the page tables. So the release() callback, before doing
invalidate_all, stops DMA and clears the page table pointer on the IOMMU
side. It detaches all bonds from the io_mm, calling mmu_notifier_put() for
each of them. After release(), bond objects still exists and device
drivers still need to free them with unbind(), but they don't point to an
io_mm anymore.

> Then you can never get a stale
> pointer. Don't worry about exit_mmap().
> 
> release() is an unusual callback and I see alot of places using it
> wrong. The purpose of release is to invalidate_all, that is it.
> 
> Also, confusingly release may be called multiple times in some
> situations, so it shouldn't disturb anything that might impact a 2nd
> call.

I hadn't realized that. The current implementation should be safe against
it, as release() is a nop if the io_mm doesn't have bonds anymore. Do you
have an example of such a situation?  I'm trying to write tests for this
kind of corner cases.

Thanks,
Jean
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Reply via email to