The select(2) family of syscalls was designed to work this way (block when file descriptors are not available, even when they cannot become available). For example:
select(0, NULL, NULL, NULL, NULL); blocks forever. In fact in the very old days (before usleep, nanosleep, clock_nanosleep), select(2) was the only way to get sub-second sleeps from the kernel, and the idiom was similar to the above: void usleep(int usec) { struct timeval tv; tv.tv_sec = 0; tv.tv_usec = usec; select(0, NULL, NULL, NULL, &tv); } Best, christos > On Apr 1, 2020, at 8:19 AM, Theo de Raadt <dera...@openbsd.org> wrote: > > The regress test is wrong. > > The first parameter to pselect has to tell it which bits to look at. > It should be something like getdtablesize(), or fd+1, or FD_SETSIZE. > > And then there's all the handwaving about fd_set overflow, meaning what > if fd >= FD_SETSIZE. I've never heard of an exploitable one though. > > Martin Pieuchot <m...@openbsd.org> wrote: > >> While working on moving pselect(2) outside of the KERNEL_LOCK() I found >> this surprising {p,}select(2) behavior exposed by the libc/sys/t_select.c >> regression test. Should it be considered as a bug or feature? >> >> The above mentioned test does the following (simplified): >> >> fd = open("/dev/null", O_RDONLY); >> FD_ZERO(&rset); >> FD_SET(fd, &rset); >> pselect(1, &rset, NULL, NULL, NULL, &set); >> >> Note that in most environments, open(2) will return a value higher than >> 1 which means the value written by FD_SET() will never be checked by the >> current implementation of {p,}select(2). In this case, even if it isn't >> waiting for any condition, the current implementation will block. >> >> To put it differently, the code above is an obfuscated way to write: >> >> sigsuspend(&set); >> >> If we agree that this is not what the user wants to do, I'm suggesting >> the diff below which makes {p,}select(2) return EINVAL if no bit has >> been found in the given set up to `nfds'. >> >> Independently from our decision, I'd suggest to rename the variable >> name `nfds' in the syscall documentation because it can lead to this >> kind of confusion. >> >> Finally if we accept this change the regression test will have to be >> adapted because polling "/dev/null" for reading always return true so >> pselect(2) wont block waiting for a signal. Any suggestion? >> >> Index: sys/kern/sys_generic.c >> =================================================================== >> RCS file: /cvs/src/sys/kern/sys_generic.c,v >> retrieving revision 1.131 >> diff -u -p -r1.131 sys_generic.c >> --- sys/kern/sys_generic.c 20 Mar 2020 04:11:05 -0000 1.131 >> +++ sys/kern/sys_generic.c 1 Apr 2020 10:27:51 -0000 >> @@ -703,7 +703,7 @@ selscan(struct proc *p, fd_set *ibits, f >> { >> caddr_t cibits = (caddr_t)ibits, cobits = (caddr_t)obits; >> struct filedesc *fdp = p->p_fd; >> - int msk, i, j, fd; >> + int msk, i, j, fd, foundfds = 0; >> fd_mask bits; >> struct file *fp; >> int n = 0; >> @@ -719,6 +719,7 @@ selscan(struct proc *p, fd_set *ibits, f >> bits &= ~(1 << j); >> if ((fp = fd_getfile(fdp, fd)) == NULL) >> return (EBADF); >> + foundfds++; >> if ((*fp->f_ops->fo_poll)(fp, flag[msk], p)) { >> FD_SET(fd, pobits); >> n++; >> @@ -727,6 +728,8 @@ selscan(struct proc *p, fd_set *ibits, f >> } >> } >> } >> + if (foundfds == 0) >> + return (EINVAL); >> *retval = n; >> return (0); >> } >> Index: lib/libc/sys/select.2 >> =================================================================== >> RCS file: /cvs/src/lib/libc/sys/select.2,v >> retrieving revision 1.42 >> diff -u -p -r1.42 select.2 >> --- lib/libc/sys/select.2 17 Sep 2016 01:01:42 -0000 1.42 >> +++ lib/libc/sys/select.2 1 Apr 2020 10:35:29 -0000 >> @@ -188,7 +188,7 @@ The specified time limit is invalid. >> One of its components is negative or too large. >> .It Bq Er EINVAL >> .Fa nfds >> -was less than 0. >> +was smaller than the smallest descriptor specified in the sets. >> .El >> .Sh SEE ALSO >> .Xr accept 2 , >>
signature.asc
Description: Message signed with OpenPGP