On Wed, Feb 23, 2022 at 10:30:11AM -0400, Jason Gunthorpe wrote:
> On Wed, Feb 23, 2022 at 10:09:01AM -0400, Jason Gunthorpe wrote:
> > On Wed, Feb 23, 2022 at 03:06:35PM +0100, Greg Kroah-Hartman wrote:
> > > On Wed, Feb 23, 2022 at 09:46:27AM -0400, Jason Gunthorpe wrote:
> > > > On Wed, Feb 23, 2022 at 01:04:00PM +0000, Robin Murphy wrote:
> > > > 
> > > > > 1 - tmp->driver is non-NULL because tmp is already bound.
> > > > >   1.a - If tmp->driver->driver_managed_dma == 0, the group must 
> > > > > currently be
> > > > > DMA-API-owned as a whole. Regardless of what driver dev has unbound 
> > > > > from,
> > > > > its removal does not release someone else's DMA API (co-)ownership.
> > > > 
> > > > This is an uncommon locking pattern, but it does work. It relies on
> > > > the mutex being an effective synchronization barrier for an unlocked
> > > > store:
> > > > 
> > > >                                       WRITE_ONCE(dev->driver, NULL)
> > > 
> > > Only the driver core should be messing with the dev->driver pointer as
> > > when it does so, it already has the proper locks held.  Do I need to
> > > move that to a "private" location so that nothing outside of the driver
> > > core can mess with it?
> > 
> > It would be nice, I've seen a abuse and mislocking of it in drivers
> 
> Though to be clear, what Robin is describing is still keeping the
> dev->driver stores in dd.c, just reading it in a lockless way from
> other modules.

"other modules" should never care if a device has a driver bound to it
because instantly after the check happens, it can change so what ever
logic it wanted to do with that knowledge is gone.

Unless the bus lock is held that the device is on, but that should be
only accessable from within the driver core as it controls that type of
stuff, not any random other part of the kernel.

And in looking at this, ick, there are loads of places in the kernel
that are thinking that this pointer being set to something actually
means something.  Sometimes it does, but lots of places, it doesn't as
it can change.

In a semi-related incident right now, we currently have a syzbot failure
in the usb gadget code where it was manipulating the ->driver pointer
directly and other parts of the kernel are crashing.  See
https://lore.kernel.org/r/ph0pr11mb58805e3c4cf7d4c41d49bfcfda...@ph0pr11mb5880.namprd11.prod.outlook.com
for the thread.

I'll poke at this as a background task to try to clean up over time.

thanks,

greg k-h
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Reply via email to