The select(2) family of syscalls was designed to work this way (block when
file descriptors are not available, even when they cannot become available).
For example:
select(0, NULL, NULL, NULL, NULL);
blocks forever.
In fact in the very old days (before usleep, nanosleep, clock_nanosleep),
select(2) was the only way to get sub-second sleeps from the kernel, and
the idiom was similar to the above:
void
usleep(int usec) {
struct timeval tv;
tv.tv_sec = 0;
tv.tv_usec = usec;
select(0, NULL, NULL, NULL, );
}
Best,
christos
> On Apr 1, 2020, at 8:19 AM, Theo de Raadt wrote:
>
> The regress test is wrong.
>
> The first parameter to pselect has to tell it which bits to look at.
> It should be something like getdtablesize(), or fd+1, or FD_SETSIZE.
>
> And then there's all the handwaving about fd_set overflow, meaning what
> if fd >= FD_SETSIZE. I've never heard of an exploitable one though.
>
> Martin Pieuchot wrote:
>
>> While working on moving pselect(2) outside of the KERNEL_LOCK() I found
>> this surprising {p,}select(2) behavior exposed by the libc/sys/t_select.c
>> regression test. Should it be considered as a bug or feature?
>>
>> The above mentioned test does the following (simplified):
>>
>> fd = open("/dev/null", O_RDONLY);
>> FD_ZERO();
>> FD_SET(fd, );
>> pselect(1, , NULL, NULL, NULL, );
>>
>> Note that in most environments, open(2) will return a value higher than
>> 1 which means the value written by FD_SET() will never be checked by the
>> current implementation of {p,}select(2). In this case, even if it isn't
>> waiting for any condition, the current implementation will block.
>>
>> To put it differently, the code above is an obfuscated way to write:
>>
>> sigsuspend();
>>
>> If we agree that this is not what the user wants to do, I'm suggesting
>> the diff below which makes {p,}select(2) return EINVAL if no bit has
>> been found in the given set up to `nfds'.
>>
>> Independently from our decision, I'd suggest to rename the variable
>> name `nfds' in the syscall documentation because it can lead to this
>> kind of confusion.
>>
>> Finally if we accept this change the regression test will have to be
>> adapted because polling "/dev/null" for reading always return true so
>> pselect(2) wont block waiting for a signal. Any suggestion?
>>
>> Index: sys/kern/sys_generic.c
>> ===
>> RCS file: /cvs/src/sys/kern/sys_generic.c,v
>> retrieving revision 1.131
>> diff -u -p -r1.131 sys_generic.c
>> --- sys/kern/sys_generic.c 20 Mar 2020 04:11:05 - 1.131
>> +++ sys/kern/sys_generic.c 1 Apr 2020 10:27:51 -
>> @@ -703,7 +703,7 @@ selscan(struct proc *p, fd_set *ibits, f
>> {
>> caddr_t cibits = (caddr_t)ibits, cobits = (caddr_t)obits;
>> struct filedesc *fdp = p->p_fd;
>> -int msk, i, j, fd;
>> +int msk, i, j, fd, foundfds = 0;
>> fd_mask bits;
>> struct file *fp;
>> int n = 0;
>> @@ -719,6 +719,7 @@ selscan(struct proc *p, fd_set *ibits, f
>> bits &= ~(1 << j);
>> if ((fp = fd_getfile(fdp, fd)) == NULL)
>> return (EBADF);
>> +foundfds++;
>> if ((*fp->f_ops->fo_poll)(fp, flag[msk], p)) {
>> FD_SET(fd, pobits);
>> n++;
>> @@ -727,6 +728,8 @@ selscan(struct proc *p, fd_set *ibits, f
>> }
>> }
>> }
>> +if (foundfds == 0)
>> +return (EINVAL);
>> *retval = n;
>> return (0);
>> }
>> Index: lib/libc/sys/select.2
>> ===
>> RCS file: /cvs/src/lib/libc/sys/select.2,v
>> retrieving revision 1.42
>> diff -u -p -r1.42 select.2
>> --- lib/libc/sys/select.217 Sep 2016 01:01:42 - 1.42
>> +++ lib/libc/sys/select.21 Apr 2020 10:35:29 -
>> @@ -188,7 +188,7 @@ The specified time limit is invalid.
>> One of its components is negative or too large.
>> .It Bq Er EINVAL
>> .Fa nfds
>> -was less than 0.
>> +was smaller than the smallest descriptor specified in the sets.
>> .El
>> .Sh SEE ALSO
>> .Xr accept 2 ,
>>
signature.asc
Description: Message signed with OpenPGP