The open(2) system call uses a problematic locking pattern that can
cause problems when the process has more than one thread. The system
call keeps the file descriptor table locked when calling vn_open(9).
If vn_open(9) blocks, the other threads get blocked too if they try
to modify the descriptor table. The issue is explained by art@ on
the following page (mpi@ brought this to my attention):

https://stackoverflow.com/questions/23440216/race-condition-when-using-dup2/24012015#24012015

The following patch makes open(2) use a locking pattern similar to
accept(2). The descriptor table lock is released immediately after
allocating a file descriptor, so other threads are not directly
affected if vn_open(9) blocks. Once vn_open(9) has finished and the
code is done with the vnode, the descriptor table lock is reacquired
for inserting the file descriptor.

The change of the locking pattern has an effect on dup2(2).
If vn_open(9) has blocked and another thread tries to dup2() to the
already allocated but not yet inserted file descriptor, dup2()
fails with error EBUSY. accept(2) already has that behaviour.

OK?  Tests are welcome too.

Index: lib/libc/sys/dup.2
===================================================================
RCS file: src/lib/libc/sys/dup.2,v
retrieving revision 1.19
diff -u -p -r1.19 dup.2
--- lib/libc/sys/dup.2  28 May 2018 08:55:11 -0000      1.19
+++ lib/libc/sys/dup.2  2 Jun 2018 10:51:48 -0000
@@ -160,6 +160,8 @@ limit.
 .It Bq Er EBUSY
 A race condition with
 .Xr accept 2
+or
+.Xr open 2
 has been detected.
 .It Bq Er EINTR
 An interrupt was received.
Index: sys/kern/vfs_syscalls.c
===================================================================
RCS file: src/sys/kern/vfs_syscalls.c,v
retrieving revision 1.284
diff -u -p -r1.284 vfs_syscalls.c
--- sys/kern/vfs_syscalls.c     2 Jun 2018 10:27:43 -0000       1.284
+++ sys/kern/vfs_syscalls.c     2 Jun 2018 10:51:48 -0000
@@ -916,6 +916,8 @@ doopenat(struct proc *p, int fd, const c
        fdplock(fdp);
        if ((error = falloc(p, &fp, &indx)) != 0)
                goto out;
+       fdpunlock(fdp);
+
        flags = FFLAGS(oflags);
        if (flags & FREAD)
                ni_pledge |= PLEDGE_RPATH;
@@ -935,6 +937,7 @@ doopenat(struct proc *p, int fd, const c
                flags &= ~O_TRUNC;      /* Must do truncate ourselves */
        }
        if ((error = vn_open(&nd, flags, cmode)) != 0) {
+               fdplock(fdp);
                if (error == ENODEV &&
                    p->p_dupfd >= 0 &&                  /* XXX from fdopen */
                    (error =
@@ -969,6 +972,7 @@ doopenat(struct proc *p, int fd, const c
                VOP_UNLOCK(vp);
                error = VOP_ADVLOCK(vp, (caddr_t)fp, F_SETLK, &lf, type);
                if (error) {
+                       fdplock(fdp);
                        /* closef will vn_close the file for us. */
                        fdremove(fdp, indx);
                        closef(fp, p);
@@ -991,6 +995,7 @@ doopenat(struct proc *p, int fd, const c
                }
                if (error) {
                        VOP_UNLOCK(vp);
+                       fdplock(fdp);
                        /* closef will close the file for us. */
                        fdremove(fdp, indx);
                        closef(fp, p);
@@ -999,6 +1004,7 @@ doopenat(struct proc *p, int fd, const c
        }
        VOP_UNLOCK(vp);
        *retval = indx;
+       fdplock(fdp);
        fdinsert(fdp, indx, cloexec, fp);
        FRELE(fp, p);
 out:

Reply via email to