On Mon, Jul 13, 2015 at 08:56:36PM +0200, Luigi Rizzo wrote:
> On Mon, Jul 13, 2015 at 06:29:12PM +0300, Konstantin Belousov wrote:
> > On Mon, Jul 13, 2015 at 05:00:30PM +0200, Luigi Rizzo wrote:
> ...
> > > thanks a lot for the clarification on the intent.
> > > I clearly need to understand more on the architecture of the module 
> > > unload.
> > > 
> > > In any case: the global contention on devmtx for I/O syscalls is
> > > really a showstopper for making effective use of modular drivers
> > > so we should really find a way to remove it.
> > What contention do you see ?  Is it on the device node for a single
> > device ?  If yes, then any modification of the below proposal would
> > not help.  I explained this below.
> 
> It was adrian that pointed it out to me the huge devmtx contention
> with multiple threads doing I/O on netmap file descriptor
> (4-8 threads each of them issuing around 200K syscalls/s)
> 
> Now i see how even if my idea of per-dev lock was correct
> it would not remove contention at all.
> 
> One final thing:
> 
> > > Is there any other way to protect access to dev->si_threadcount ?
> > > 
> > > Eg how about the following:
> > > - use a (leaf) lock into struct cdev to protect dev->si_threadcount, so 
> > > that
> > >   we could replace dev_lock() with mtx_lock(&dev->foo) in 
> > > dev_refthread(dev)
> > >   dev_relthread(dev) and other places that access si_threadcount
> > This would not work, you cannot protect a lifetime of the object by a lock
> > contained in the object.
> 
> i thought so but then the current dev_refthread() is already unsafe,
> accessing dev->si_flags unprotected
> 
>     sys/kern/kern_conf.c:
> 
>       struct cdevsw *
>       dev_refthread(struct cdev *dev, int *ref)
>       {
>               struct cdevsw *csw;
>               struct cdev_priv *cdp;
> 
>               mtx_assert(&devmtx, MA_NOTOWNED);
>               if ((dev->si_flags & SI_ETERNAL) != 0) {
>                       *ref = 0;
>                       return (dev->si_devsw);
>               }
>               dev_lock();
>               csw = dev->si_devsw;
>               if (csw != NULL) {
>                       cdp = cdev2priv(dev);
>                       if ((cdp->cdp_flags & CDP_SCHED_DTR) == 0)
>                               atomic_add_long(&dev->si_threadcount, 1);
>                       else
>                               csw = NULL;
>               }
>               dev_unlock();
>               *ref = 1;
>               return (csw);
>       }
> 
> that is particularly bad though, because it prevents from
> checking the SI_ETERNAL flag without holding the lock (short
> of encoding the flag in the pointer!)
> 

I believe this check is fine, although for the complicated reasons.

E.g., for devfs_open(), the vnode is locked, so it cannot be reclaimed.
Since the vnode is not doomed, it referenced the v_rdev and struct cdev
memory cannot be freed and reused. So the access to the so_flags is
fine.

Note that devfs_fp_check(), which is called with the vnode unlocked,
still has a guarantee that the vnode is not going away, because the
vnode is referenced by the file descriptor. Then, the devvn_refthread()
checks VV_ETERNALDEV, without accessing the struct cdev at all. If the
vnode claims that the backing device is eternal, we have a guarantee
that v_rdev is either valid cdev, or NULL.

For the non-managed device pager, the vm object owns a reference on
the cdev, so again the device memory cannot go away while in the
dev_refthread().

I believe I considered all cases when I added eternal flag.

But note that what I explained above does not allow to put the per-device
lock (like dev_mtx) into the struct cdev, e.g. due to the vnode reference
trick.  If the vnode is not VV_ETERNAL, v_rdev might be invalidated at
any time, if you do not own vnode lock or interlock.


_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Reply via email to