On Fri, May 29, 2020 at 08:48:14AM +0200, Martin Pieuchot wrote:
> On 28/05/20(Thu) 14:59, Vitaliy Makkoveev wrote:
> > socket(2) layer is already protected by solock(). It grabs NET_LOCK()
> > for inet{,6}(4) sockets, but all other sockets are still under
> > KERNEL_LOCK().
> > 
> > I guess solock is already placed everythere it's required. Also `struct
> > file' is already mp-safe. 
> > 
> > Diff below introduces rwlock `unp_lock'. It's like NET_LOCK()'s
> > `netlock' but for PF_UNIX sockets. This lock protects the whole PF_UNIX
> > layer. Vnode layer accessed in "connect" "bind" and "desconnect" cases
> > is still under KERNEL_LOCK().
> > 
> > I'am experimenting with his diff since 19.04.2020 and my test systems,
> > include Gnome workstaion are stable, without any issue.
> 
> Looks great, more tests are required :)
> 
> Your diff has many trailing spaces, could you remove them?  Commonly
> used editors or diff programs have a way to highlight them :)

Ok I will fix it.

> 
> One comment:
> 
> > Index: sys/kern/uipc_usrreq.c
> > ===================================================================
> > RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v
> > retrieving revision 1.142
> > diff -u -p -r1.142 uipc_usrreq.c
> > --- sys/kern/uipc_usrreq.c  16 Jul 2019 21:41:37 -0000      1.142
> > +++ sys/kern/uipc_usrreq.c  1 May 2020 13:47:11 -0000
> > @@ -409,12 +418,17 @@ unp_detach(struct unpcb *unp)
> >  {
> >     struct vnode *vp;
> >  
> > +   rw_assert_wrlock(&unp_lock);
> > +
> >     LIST_REMOVE(unp, unp_link);
> >     if (unp->unp_vnode) {
> > +           /* this is an unlocked cleaning of `v_socket', but it's safe */
> >             unp->unp_vnode->v_socket = NULL;
> >             vp = unp->unp_vnode;
> >             unp->unp_vnode = NULL;
> > +           KERNEL_LOCK();
> >             vrele(vp);
> > +           KERNEL_UNLOCK();
> 
> Why is it safe?  That's what the comment should explain :)  If the code
> doesn't take the lock that implies it is not required.  Why it is not
> required is not obvious.
>

The only place where `v_socket' is accessed outside PF_UNIX layer is
kern/kern_sysctl.c:1143. This is just pointer to integer cast. Pointer
assignment is atomic.

Also we are accessing this `unp' in unp_detach(), unp_bind() and
unp_connect() but they are serialized by `unp_lock'.

This is the last reference to this `unp' and also there is the only
reference in socket layer to `unp_vnode'. Since `unpcb' has no reference
counter we can't add assertion to be sure this is the last reference to
`unp'. We can check reference count of vnode. But since sys_open() and
unp_detach() are not serialized by KERNEL_LOCK(), vn_open() called by
doopenat() will raise `v_usecount' before `v_type' check
(kern/vfs_vnops.c:112). And `vn->v_usecount' in this unp_detach() call
can be "2".

In fact, only vrele(9) should be called under KERNEL_LOCK() to protect
vnode. If we want to put "KASSERT(vn->v_usecount==1)" we should move it
under KERNEL_LOCK() before `v_socket' cleaning to be sure there is no
concurency with sys_open().

Something like this:

        if (unp->unp_vnode) {
                KERNEL_LOCK();
                KASSERT(unp->unp_vnode->v_usecount == 1);
                unp->unp_vnode->v_socket = NULL;
                vp = unp->unp_vnode;
                unp->unp_vnode = NULL;
                vrele(vp);
                KERNEL_UNLOCK();
        }

I will update diff if this is required.

Reply via email to