On Fri, Feb 28, 2020 at 11:13:40AM -0400, Jason Gunthorpe wrote:
> On Fri, Feb 28, 2020 at 04:04:27PM +0100, Jean-Philippe Brucker wrote:
> > On Fri, Feb 28, 2020 at 10:48:44AM -0400, Jason Gunthorpe wrote:
> > > On Fri, Feb 28, 2020 at 03:39:35PM +0100, Jean-Philippe Brucker wrote:
> > > > > > + list_for_each_entry_rcu(bond, &io_mm->devices, mm_head) {
> > > > > > + /*
> > > > > > + * To ensure that we observe the initialization of
> > > > > > io_mm fields
> > > > > > + * by io_mm_finalize() before the registration of this
> > > > > > bond to
> > > > > > + * the list by io_mm_attach(), introduce an address
> > > > > > dependency
> > > > > > + * between bond and io_mm. It pairs with the
> > > > > > smp_store_release()
> > > > > > + * from list_add_rcu().
> > > > > > + */
> > > > > > + io_mm = rcu_dereference(bond->io_mm);
> > > > >
> > > > > A rcu_dereference isn't need here, just a normal derference is fine.
> > > >
> > > > bond->io_mm is annotated with __rcu (for iommu_sva_get_pasid_generic(),
> > > > which does bond->io_mm under rcu_read_lock())
> > >
> > > I'm surprised the bond->io_mm can change over the lifetime of the
> > > bond memory..
> >
> > The normal lifetime of the bond is between device driver calls to bind()
> > and unbind(). If the mm exits early, though, we clear bond->io_mm. The
> > bond is then stale but can only be freed when the device driver releases
> > it with unbind().
>
> I usually advocate for simple use of these APIs. The mm_notifier_get()
> should happen in bind() and the matching put should happen in the
> call_rcu callbcak that does the kfree.
I tried to keep it simple like that: normally mmu_notifier_get() is called
in bind(), and mmu_notifier_put() is called in unbind().
Multiple device drivers may call bind() with the same mm. Each bind()
calls mmu_notifier_get(), obtains the same io_mm, and returns a new bond
(a device<->mm link). Each bond is freed by calling unbind(), which calls
mmu_notifier_put().
That's the most common case. Now if the process is killed and the mm
disappears, we do need to avoid use-after-free caused by DMA of the
mappings and the page tables. So the release() callback, before doing
invalidate_all, stops DMA and clears the page table pointer on the IOMMU
side. It detaches all bonds from the io_mm, calling mmu_notifier_put() for
each of them. After release(), bond objects still exists and device
drivers still need to free them with unbind(), but they don't point to an
io_mm anymore.
> Then you can never get a stale
> pointer. Don't worry about exit_mmap().
>
> release() is an unusual callback and I see alot of places using it
> wrong. The purpose of release is to invalidate_all, that is it.
>
> Also, confusingly release may be called multiple times in some
> situations, so it shouldn't disturb anything that might impact a 2nd
> call.
I hadn't realized that. The current implementation should be safe against
it, as release() is a nop if the io_mm doesn't have bonds anymore. Do you
have an example of such a situation? I'm trying to write tests for this
kind of corner cases.
Thanks,
Jean
_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu