Module Name:	src
Committed By:	kre
Date:		Sun Nov 10 00:11:43 UTC 2024

Modified Files:
	src/sys/kern: kern_descrip.c

Log Message:
Make O_CLOEXEC always close specified files on exec

It turns out that close-on-exec doesn't always close on exec.

If all close-on-exec fd's were made close-on-exec via dup3() or
fcntl(F_DUPFD_CLOEXEC) or use of the internal fd_clone() (whose uses
I did not fully investigate but I think is used to create a fd for
the open of a cloner device, and perhaps other things) then none of
the close-on-exec file descriptors will be closed when an exec
happens - but will be passed through to the new process (still
marked, apparently, as close-on-exec - but still won't be closed if
another exec happens) - that is unless...

If at least one fd in the process has close-on-exec set some other
way (fcntl(F_SETFD), open(O_CLOEXEC) (and the similar functions for
sockets, and epoll) and perhaps others) then all close-on-exec file
descriptors in the process will be correctly closed when an exec
happens (however they obtained the close-on-exec status).

There are two steps that need to be taken (in the kernel) when
turning on close-on-exec. The first is the obvious one of setting the
ff_exclose field in the struct fdfile for the fd. The second is
marking the file descriptor table (which holds the fdfile's for one
or more processes) as containing file descriptors with close-on-exec
set (it is a simple yes/no, and once set is never cleared until an
actual exec happens).

If that table flag is set when an exec happens, all the file
descriptors are examined, and those marked close-on-exec are closed.
If the file descriptor table doesn't indicate that close-on-exec fds
exist in the table, none of that happens.

Several places were setting ff_exclose in the struct fdfile but not
bothering to set the fd_exclose field in the file descriptor table.

There's even a function (fd_set_exclose()) whose whole purpose is to
do this properly - but it wasn't being used. Now it is, everywhere
(I hope).
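
The two-step rule described in the log message can be sketched with a
small userland model. This is not the kernel code: every model_* name,
the fixed table size, and the detail of clearing the table flag inside
the exec pass are assumptions made for illustration. Only the field
names ff_exclose and fd_exclose come from the log message.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NFILES 8	/* hypothetical fixed table size for this sketch */

/* Simplified stand-in for the kernel's per-descriptor struct fdfile. */
struct model_fdfile {
	bool ff_used;		/* slot holds an open file */
	bool ff_exclose;	/* step 1: per-descriptor close-on-exec mark */
};

/* Simplified stand-in for the file descriptor table. */
struct model_filedesc {
	bool fd_exclose;	/* step 2: "some fd here is close-on-exec" */
	struct model_fdfile fd_ofiles[MODEL_NFILES];
};

/* The buggy pattern: performs step 1 only, never marks the table. */
static void
model_set_flag_only(struct model_filedesc *fdp, int fd, bool exclose)
{
	assert(fd >= 0 && fd < MODEL_NFILES);
	fdp->fd_ofiles[fd].ff_exclose = exclose;
}

/* The fixed pattern: performs both steps, as a helper should. */
static void
model_set_exclose(struct model_filedesc *fdp, int fd, bool exclose)
{
	assert(fd >= 0 && fd < MODEL_NFILES);
	fdp->fd_ofiles[fd].ff_exclose = exclose;
	if (exclose)
		fdp->fd_exclose = true;	/* set only; cleared at exec */
}

/*
 * Model of the exec-time pass.  Returns how many descriptors were
 * closed.  If the table flag is clear the scan is skipped entirely,
 * which is why a descriptor marked by step 1 alone survives.
 */
static int
model_exec(struct model_filedesc *fdp)
{
	int closed = 0;

	if (!fdp->fd_exclose)
		return 0;
	fdp->fd_exclose = false;
	for (int fd = 0; fd < MODEL_NFILES; fd++) {
		struct model_fdfile *ff = &fdp->fd_ofiles[fd];

		if (ff->ff_used && ff->ff_exclose) {
			ff->ff_used = false;
			ff->ff_exclose = false;
			closed++;
		}
	}
	return closed;
}
```

In this model, a descriptor marked with model_set_flag_only() stays
open across model_exec() until some other descriptor in the same table
is marked with model_set_exclose(); after that, the next model_exec()
closes both, which mirrors the behaviour the log message reports.
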

To generate a diff of this commit:
cvs rdiff -u -r1.263 -r1.264 src/sys/kern/kern_descrip.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/sys/kern/kern_descrip.c
diff -u src/sys/kern/kern_descrip.c:1.263 src/sys/kern/kern_descrip.c:1.264
--- src/sys/kern/kern_descrip.c:1.263	Sun Jul 14 05:10:40 2024
+++ src/sys/kern/kern_descrip.c	Sun Nov 10 00:11:43 2024
@@ -1,4 +1,4 @@
-/* $NetBSD: kern_descrip.c,v 1.263 2024/07/14 05:10:40 kre Exp $ */
+/* $NetBSD: kern_descrip.c,v 1.264 2024/11/10 00:11:43 kre Exp $ */
 
 /*-
  * Copyright (c) 2008, 2009, 2023 The NetBSD Foundation, Inc.
@@ -70,7 +70,7 @@
  */
 
 #include <sys/cdefs.h>
-__KERNEL_RCSID(0, "$NetBSD: kern_descrip.c,v 1.263 2024/07/14 05:10:40 kre Exp $");
+__KERNEL_RCSID(0, "$NetBSD: kern_descrip.c,v 1.264 2024/11/10 00:11:43 kre Exp $");
 
 #include <sys/param.h>
 #include <sys/systm.h>
@@ -747,7 +747,6 @@ int
 fd_dup(file_t *fp, int minfd, int *newp, bool exclose)
 {
 	proc_t *p = curproc;
-	fdtab_t *dt;
 	int error;
 
 	while ((error = fd_alloc(p, minfd, newp)) != 0) {
@@ -757,8 +756,7 @@ fd_dup(file_t *fp, int minfd, int *newp,
 		fd_tryexpand(p);
 	}
 
-	dt = atomic_load_consume(&curlwp->l_fd->fd_dt);
-	dt->dt_ff[*newp]->ff_exclose = exclose;
+	fd_set_exclose(curlwp, *newp, exclose);
 	fd_affix(p, fp, *newp);
 	return 0;
 }
@@ -814,7 +812,7 @@ fd_dup2(file_t *fp, unsigned newfd, int
 	fd_used(fdp, newfd);
 	mutex_exit(&fdp->fd_lock);
 
-	dt->dt_ff[newfd]->ff_exclose = (flags & O_CLOEXEC) != 0;
+	fd_set_exclose(curlwp, newfd, (flags & O_CLOEXEC) != 0);
 	fp->f_flag |= flags & (FNONBLOCK|FNOSIGPIPE);
 	/* Slot is now allocated. Insert copy of the file. */
 	fd_affix(curproc, fp, newfd);
@@ -1910,14 +1908,9 @@ int
 fd_clone(file_t *fp, unsigned fd, int flag, const struct fileops *fops,
     void *data)
 {
-	fdfile_t *ff;
-	filedesc_t *fdp;
 
 	fp->f_flag = flag & FMASK;
-	fdp = curproc->p_fd;
-	ff = atomic_load_consume(&fdp->fd_dt)->dt_ff[fd];
-	KASSERT(ff != NULL);
-	ff->ff_exclose = (flag & O_CLOEXEC) != 0;
+	fd_set_exclose(curlwp, fd, (flag & O_CLOEXEC) != 0);
 	fp->f_type = DTYPE_MISC;
 	fp->f_ops = fops;
 	fp->f_data = data;