Re: BUFSIZ-related pessimization in fvwrite.c

2024-03-29 Thread Christos Zoulas
In article ,
enh   wrote:
>(if anyone knows the equivalent freebsd list, please add them --- this
>code is the same between all three bsds.)
>
>Apple patched their copy of the FreeBSD fvwrite.c to change BUFSIZ to
>INT_MAX to massively improve the performance of large unbuffered
>writes[1], and patched the buffered case to use the largest multiple
>of the buffer size to massively improve the performance of large
>buffered writes:
>
>https://github.com/apple-oss-distributions/Libc/commit/c5a3293354e22262702a3add5b2dfc9bb0b93b85#diff-3b844a19cfb0aab1a23f7fbc457d3bce7453513730c489a72f66ff89d6557ff3
>
>i've tested similar changes to openbsd's fvwrite.c with great results.
>
>is it a _requirement_ that buffered writes _only_ happen in multiples
>of the buffer size? that seems unlikely to me (a) because of short
>writes and (b) musl just does a writev() of what's in the buffer and
>what it was just given in case of a larger-than-buffer write and (c)
>we already changed the openbsd fread() to read directly into the
>caller's buffer regardless of size, so the equivalent behavior on the
>_write_ side seems reasonable to me?

Makes sense, committed.

christos



Re: __hldtoa broken for ld128

2020-04-11 Thread Christos Zoulas
In article ,
enh   wrote:
>this was found by fuzzing the LLVM __cxa_demangle on an ld128 Android
>system using hwasan, but it turns out no to simply be a buffer
>overflow --- the results are just wrong. (which shows how much anyone
>uses ld128 in conjunction with %a!)

Thanks a lot for the heads up. NetBSD uses 2x64 bit words instead
of 4x32 bit words to represent long double so we avoided this bug
by chance (we don't have EXT_FRAC{H,L}MBITS. Nevertheless I will
commit something like it for completeness.

https://nxr.netbsd.org/xref/src/sys/sys/ieee754.h#126

Best,

christos



Re: pselect(2) bug or feature?

2020-04-01 Thread Christos Zoulas
The select(2) family of syscalls was designed to work this way (block when
file descriptors are not available, even when they cannot become available).
For example:

select(0, NULL, NULL, NULL, NULL);

blocks forever.

In fact in the very old days (before usleep, nanosleep, clock_nanosleep),
select(2) was the only way to get sub-second sleeps from the kernel, and
the idiom was similar to the above:

void
usleep(int usec) {
struct timeval tv;

tv.tv_sec = 0;
tv.tv_usec = usec;

select(0, NULL, NULL, NULL, );
}

Best,

christos

> On Apr 1, 2020, at 8:19 AM, Theo de Raadt  wrote:
> 
> The regress test is wrong.
> 
> The first parameter to pselect has to tell it which bits to look at.
> It should be something like getdtablesize(), or fd+1, or FD_SETSIZE.
> 
> And then there's all the handwaving about fd_set overflow, meaning what
> if fd >= FD_SETSIZE.  I've never heard of an exploitable one  though.
> 
> Martin Pieuchot  wrote:
> 
>> While working on moving pselect(2) outside of the KERNEL_LOCK() I found
>> this surprising {p,}select(2) behavior exposed by the libc/sys/t_select.c
>> regression test.  Should it be considered as a bug or feature?
>> 
>> The above mentioned test does the following (simplified):
>> 
>>  fd = open("/dev/null", O_RDONLY);
>>  FD_ZERO();
>>  FD_SET(fd, );
>>  pselect(1, , NULL, NULL, NULL, );
>> 
>> Note that in most environments, open(2) will return a value higher than
>> 1 which means the value written by FD_SET() will never be checked by the
>> current implementation of {p,}select(2).  In this case, even if it isn't
>> waiting for any condition, the current implementation will block.
>> 
>> To put it differently, the code above is an obfuscated way to write:
>> 
>>  sigsuspend();
>> 
>> If we agree that this is not what the user wants to do, I'm suggesting
>> the diff below which makes {p,}select(2) return EINVAL if no bit has
>> been found in the given set up to `nfds'.
>> 
>> Independently from our decision, I'd suggest to rename the variable
>> name `nfds' in the syscall documentation because it can lead to this
>> kind of confusion.
>> 
>> Finally if we accept this change the regression test will have to be
>> adapted because polling "/dev/null" for reading always return true so
>> pselect(2) wont block waiting for a signal.  Any suggestion?
>> 
>> Index: sys/kern/sys_generic.c
>> ===
>> RCS file: /cvs/src/sys/kern/sys_generic.c,v
>> retrieving revision 1.131
>> diff -u -p -r1.131 sys_generic.c
>> --- sys/kern/sys_generic.c   20 Mar 2020 04:11:05 -  1.131
>> +++ sys/kern/sys_generic.c   1 Apr 2020 10:27:51 -
>> @@ -703,7 +703,7 @@ selscan(struct proc *p, fd_set *ibits, f
>> {
>>  caddr_t cibits = (caddr_t)ibits, cobits = (caddr_t)obits;
>>  struct filedesc *fdp = p->p_fd;
>> -int msk, i, j, fd;
>> +int msk, i, j, fd, foundfds = 0;
>>  fd_mask bits;
>>  struct file *fp;
>>  int n = 0;
>> @@ -719,6 +719,7 @@ selscan(struct proc *p, fd_set *ibits, f
>>  bits &= ~(1 << j);
>>  if ((fp = fd_getfile(fdp, fd)) == NULL)
>>  return (EBADF);
>> +foundfds++;
>>  if ((*fp->f_ops->fo_poll)(fp, flag[msk], p)) {
>>  FD_SET(fd, pobits);
>>  n++;
>> @@ -727,6 +728,8 @@ selscan(struct proc *p, fd_set *ibits, f
>>  }
>>  }
>>  }
>> +if (foundfds == 0)
>> +return (EINVAL);
>>  *retval = n;
>>  return (0);
>> }
>> Index: lib/libc/sys/select.2
>> ===
>> RCS file: /cvs/src/lib/libc/sys/select.2,v
>> retrieving revision 1.42
>> diff -u -p -r1.42 select.2
>> --- lib/libc/sys/select.217 Sep 2016 01:01:42 -  1.42
>> +++ lib/libc/sys/select.21 Apr 2020 10:35:29 -
>> @@ -188,7 +188,7 @@ The specified time limit is invalid.
>> One of its components is negative or too large.
>> .It Bq Er EINVAL
>> .Fa nfds
>> -was less than 0.
>> +was smaller than the smallest descriptor specified in the sets.
>> .El
>> .Sh SEE ALSO
>> .Xr accept 2 ,
>> 



signature.asc
Description: Message signed with OpenPGP