Re: random(6): undefined cast and error checking

2022-08-20 Thread Theo Buehler
On Fri, Aug 05, 2022 at 10:55:23PM +0200, Theo Buehler wrote:
> On Fri, Aug 05, 2022 at 02:12:35PM -0500, luci...@bronze.ctrl-c.club wrote:
> > So this is the final verison of the patch solving the following
> > problems:
> > 
> > >The program is broken in multiple ways: return value clamping, casting
> > >from double to uint32_t, wrong error checking for putchar, lack of
> > >warnings when compiling.
> > 
> > Now the program is more pedantic and complicated, but at least it does
> > what is says in the man page.
> > 
> > Ok, deraadt, tb?
> 
> Pretty much. Below is my suggestion, based on yours.
> 
> > +CFLAGS=-Wall -Wconversion
> 
> I dropped these again. I don't think -Wconversion is helpful here. It
> doesn't understand the semantics of arc4random_buf(), so it whines for
> no good reason. I can be convinced to add -Wall.
> 
> I pulled Campbell's license and code up to the top, in particular, we
> can avoid prototypes. I switched a few variables from unsigned to int
> since that makes more sense to me. I dropped the error check to
> fprintf().
> 
> I dealt with -e differently than in your diff: reject denominators < 1
> since those makes no sense. Document that. If -e is given, make sure
> denom is at most 256, this way arc4random_uniform() will return a value
> between 0 and 255, which exit() will not truncate. The nature of -e is
> that we can't signal an error via return value, so we had better
> succeed.
> 
> Other than that, I think this is good to go in. Surely the manual could
> be improved...

Anyone willing to ok this?

Index: random.6
===
RCS file: /cvs/src/games/random/random.6,v
retrieving revision 1.7
diff -u -p -r1.7 random.6
--- random.631 May 2007 19:19:18 -  1.7
+++ random.65 Aug 2022 20:24:45 -
@@ -43,9 +43,10 @@
 .Nm
 reads lines from the standard input and copies them to the standard
 output with a probability of 1/denominator.
-The default value for
+The
 .Ar denominator
-is 2.
+must be at least 1,
+its default value is 2.
 .Pp
 The options are as follows:
 .Bl -tag -width Ds
@@ -55,7 +56,7 @@ If the
 option is specified,
 .Nm
 does not read or write anything, and simply exits with a random
-exit value of 0 to
+exit value of 0 to the minimum of 255 and
 .Ar denominator
 \&- 1, inclusive.
 .It Fl r
Index: random.c
===
RCS file: /cvs/src/games/random/random.c,v
retrieving revision 1.20
diff -u -p -r1.20 random.c
--- random.c7 Mar 2016 12:07:56 -   1.20
+++ random.c5 Aug 2022 20:30:53 -
@@ -33,8 +33,35 @@
  * SUCH DAMAGE.
  */
 
+/*-
+ * Copyright (c) 2014 Taylor R. Campbell
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -42,6 +69,101 @@
 __dead void usage(void);
 
 int
+clz64(uint64_t x)
+{
+   static const uint64_t mask[] = {
+   0x, 0x, 0xff00, 0xf0, 0xc, 0x2,
+   };
+   static const int zeroes[] = {
+   32, 16, 8, 4, 2, 1,
+   };
+   int clz = 0;
+   int i;
+
+   if (x == 0)
+   return 64;
+
+   for (i = 0; i < 6; i++) {
+   if ((x & mask[i]) == 0)
+   clz += zeroes[i];
+   else
+   x >>= zeroes[i];
+   }
+
+   return clz;
+}
+
+uint64_t
+random64(void)
+{
+   uint64_t r;
+
+   arc4random_buf(, sizeof(uint64_t));
+
+   return r;
+}
+
+/*
+ * random_real: Generate a stream of bits uniformly at random and
+ * interpret it as the fractional part of the binary expansion of a
+ * number in [0, 1], 

Re: random(6): undefined cast and error checking

2022-08-05 Thread Theo Buehler
On Fri, Aug 05, 2022 at 02:12:35PM -0500, luci...@bronze.ctrl-c.club wrote:
> So this is the final verison of the patch solving the following
> problems:
> 
> >The program is broken in multiple ways: return value clamping, casting
> >from double to uint32_t, wrong error checking for putchar, lack of
> >warnings when compiling.
> 
> Now the program is more pedantic and complicated, but at least it does
> what is says in the man page.
> 
> Ok, deraadt, tb?

Pretty much. Below is my suggestion, based on yours.

> +CFLAGS=-Wall -Wconversion

I dropped these again. I don't think -Wconversion is helpful here. It
doesn't understand the semantics of arc4random_buf(), so it whines for
no good reason. I can be convinced to add -Wall.

I pulled Campbell's license and code up to the top, in particular, we
can avoid prototypes. I switched a few variables from unsigned to int
since that makes more sense to me. I dropped the error check to
fprintf().

I dealt with -e differently than in your diff: reject denominators < 1
since those makes no sense. Document that. If -e is given, make sure
denom is at most 256, this way arc4random_uniform() will return a value
between 0 and 255, which exit() will not truncate. The nature of -e is
that we can't signal an error via return value, so we had better
succeed.

Other than that, I think this is good to go in. Surely the manual could
be improved...

Index: random.6
===
RCS file: /cvs/src/games/random/random.6,v
retrieving revision 1.7
diff -u -p -r1.7 random.6
--- random.631 May 2007 19:19:18 -  1.7
+++ random.65 Aug 2022 20:24:45 -
@@ -43,9 +43,10 @@
 .Nm
 reads lines from the standard input and copies them to the standard
 output with a probability of 1/denominator.
-The default value for
+The
 .Ar denominator
-is 2.
+must be at least 1,
+its default value is 2.
 .Pp
 The options are as follows:
 .Bl -tag -width Ds
@@ -55,7 +56,7 @@ If the
 option is specified,
 .Nm
 does not read or write anything, and simply exits with a random
-exit value of 0 to
+exit value of 0 to the minimum of 255 and
 .Ar denominator
 \&- 1, inclusive.
 .It Fl r
Index: random.c
===
RCS file: /cvs/src/games/random/random.c,v
retrieving revision 1.20
diff -u -p -r1.20 random.c
--- random.c7 Mar 2016 12:07:56 -   1.20
+++ random.c5 Aug 2022 20:30:53 -
@@ -33,8 +33,35 @@
  * SUCH DAMAGE.
  */
 
+/*-
+ * Copyright (c) 2014 Taylor R. Campbell
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -42,6 +69,101 @@
 __dead void usage(void);
 
 int
+clz64(uint64_t x)
+{
+   static const uint64_t mask[] = {
+   0x, 0x, 0xff00, 0xf0, 0xc, 0x2,
+   };
+   static const int zeroes[] = {
+   32, 16, 8, 4, 2, 1,
+   };
+   int clz = 0;
+   int i;
+
+   if (x == 0)
+   return 64;
+
+   for (i = 0; i < 6; i++) {
+   if ((x & mask[i]) == 0)
+   clz += zeroes[i];
+   else
+   x >>= zeroes[i];
+   }
+
+   return clz;
+}
+
+uint64_t
+random64(void)
+{
+   uint64_t r;
+
+   arc4random_buf(, sizeof(uint64_t));
+
+   return r;
+}
+
+/*
+ * random_real: Generate a stream of bits uniformly at random and
+ * interpret it as the fractional part of the binary expansion of a
+ * number in [0, 1], 0.101001010100...; then round it.
+ */
+double
+random_real(void)
+{
+   int exponent = -64;
+   uint64_t significand;
+   int shift;
+
+   /*
+

Re: random(6): undefined cast and error checking

2022-08-05 Thread lucic71
So this is the final verison of the patch solving the following
problems:

>The program is broken in multiple ways: return value clamping, casting
>from double to uint32_t, wrong error checking for putchar, lack of
>warnings when compiling.

Now the program is more pedantic and complicated, but at least it does
what is says in the man page.

Ok, deraadt, tb?

Index: Makefile
===
RCS file: /cvs/src/games/random/Makefile,v
retrieving revision 1.2
diff -u -p -r1.2 Makefile
--- Makefile21 Sep 1997 11:36:51 -  1.2
+++ Makefile5 Aug 2022 18:59:11 -
@@ -2,5 +2,6 @@
 
 PROG=  random
 MAN=   random.6
+CFLAGS=-Wall -Wconversion
 
 .include 
Index: random.6
===
RCS file: /cvs/src/games/random/random.6,v
retrieving revision 1.7
diff -u -p -r1.7 random.6
--- random.631 May 2007 19:19:18 -  1.7
+++ random.65 Aug 2022 18:59:11 -
@@ -55,9 +55,7 @@ If the
 option is specified,
 .Nm
 does not read or write anything, and simply exits with a random
-exit value of 0 to
-.Ar denominator
-\&- 1, inclusive.
+exit value of 0 to 255, inclusive.
 .It Fl r
 The
 .Fl r
Index: random.c
===
RCS file: /cvs/src/games/random/random.c,v
retrieving revision 1.20
diff -u -p -r1.20 random.c
--- random.c7 Mar 2016 12:07:56 -   1.20
+++ random.c5 Aug 2022 18:59:11 -
@@ -35,11 +35,15 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 
 __dead void usage(void);
+unsigned clz64(uint64_t x);
+uint64_t random64(void);
+double random_real(void);
 
 int
 main(int argc, char *argv[])
@@ -84,9 +88,12 @@ main(int argc, char *argv[])
usage(); 
}
 
-   /* Compute a random exit status between 0 and denom - 1. */
+   /* 
+* Compute a random exit status between 0 and 255.
+* Shells only consider the first 8 bits of the return code.
+*/
if (random_exit)
-   return (arc4random_uniform(denom));
+   return (int)arc4random_uniform(256);
 
/*
 * Act as a filter, randomly choosing lines of the standard input
@@ -101,18 +108,19 @@ main(int argc, char *argv[])
 * 0 (which has a 1 / denom chance of being true), we select the
 * line.
 */
-   selected = arc4random_uniform(denom) == 0;
+   selected = random_real() < 1 / denom;
while ((ch = getchar()) != EOF) {
-   if (selected)
-   (void)putchar(ch);
-   if (ch == '\n') {
-   /* End of that line.  See if we got an error. */
-   if (ferror(stdout))
-   err(2, "stdout");
+   int retch;
 
-   /* Now see if the next line is to be printed. */
-   selected = arc4random_uniform(denom) == 0;
+   if (selected) {
+   errno = 0;
+   retch = putchar(ch);
+   if (retch == EOF && errno)
+   err(2, "putchar");
}
+   if (ch == '\n')
+   /* Now see if the next line is to be printed. */
+   selected = random_real() < 1 / denom;
}
if (ferror(stdin))
err(2, "stdin");
@@ -123,6 +131,128 @@ void
 usage(void)
 {
 
-   (void)fprintf(stderr, "usage: %s [-er] [denominator]\n", getprogname());
+   if (fprintf(stderr, "usage: %s [-er] [denominator]\n", getprogname()) < 
0)
+   err(2, "fprintf");
exit(1);
 }
+
+unsigned
+clz64(uint64_t x)
+{
+   static const uint64_t mask[] = {
+   0x, 0x, 0xff00, 0xf0, 0xc, 0x2,
+   };
+   static unsigned zeroes[] = {
+   32, 16, 8, 4, 2, 1,
+   };
+   unsigned clz = 0;
+   int i;
+
+   if (x == 0)
+   return 64;
+
+   for (i = 0; i < 6; i++) {
+   if ((x & mask[i]) == 0)
+   clz += zeroes[i];
+   else
+   x >>= zeroes[i];
+   }
+
+   return clz;
+}
+
+uint64_t
+random64(void)
+{
+   uint64_t r;
+
+   arc4random_buf(, sizeof(uint64_t));
+
+   return r;
+}
+
+/*
+ * random_real: Generate a stream of bits uniformly at random and
+ * interpret it as the fractional part of the binary expansion of a
+ * number in [0, 1], 0.101001010100...; then round it.
+ */
+double
+random_real(void)
+{
+   int exponent = -64;
+   uint64_t significand;
+   unsigned shift;
+
+   /*
+* Read zeros into the exponent until we hit a one; the rest
+* will go into the significand.
+*/
+   while (__predict_false((significand = random64()) == 0)) {
+   exponent -= 64;
+   /*
+* If the 

Re: random(6): undefined cast and error checking

2022-08-04 Thread Theo de Raadt
Theo Buehler  wrote:

> On Thu, Aug 04, 2022 at 07:11:40PM -0600, Theo de Raadt wrote:
> > And anyways, this directory is not in $PATH by default, so there is no
> > risk.
> 
> Unless you create a user during install in which case /etc/skel will
> give you a $PATH containing /usr/games via dot.profile (I assume for
> csh users via dot.cshrc).

Then fix that...



Re: random(6): undefined cast and error checking

2022-08-04 Thread Theo Buehler
On Thu, Aug 04, 2022 at 07:11:40PM -0600, Theo de Raadt wrote:
> And anyways, this directory is not in $PATH by default, so there is no
> risk.

Unless you create a user during install in which case /etc/skel will
give you a $PATH containing /usr/games via dot.profile (I assume for
csh users via dot.cshrc).



Re: random(6): undefined cast and error checking

2022-08-04 Thread Theo de Raadt
luci...@bronze.ctrl-c.club wrote:

> t...@theobuehler.org wrote:
> >I have no strong opinion. I'm fine with either approach. It's such a
> >silly program...
> >
> >As an aside, random -e has been completely broken (it's non-uniform)
> >since forever.  To fix -e, we should clamp denom to an integer between
> >1 and 256, otherwise the truncation of the exit exit code to an 8-bit
> >int introduces bias for numbers larger than 256 (that aren't powers of
> >2).
> 
> The program is broken in multiple ways: return value clamping, casting
> from double to uint32_t, wrong error checking for putchar, lack of
> warnings when compiling.
> 
> What I don't understand is why such a wrong program has its place in
> OpenBSD. Maybe it's historical reasons, who knows. But the fact that it
> exists and it's badly written bothers me.

the games directory is a playground.

It is low-risk place in our tree where someone can find some very old
code and try to raise it to the standard of non-games code, and learn
lessons.  If they make a mistake, that's OK.  If they learn from the
mistakes and find some great idea, that's even better.

A bunch of pledge and unveil experimentation happened in games, and some
privdrop and privsep has also happened there.  All which are incomplete.
There is more which can be done there.  It kind of demonstrates that very
old software CAN adopt the security-focus changes we've been making.
The ideas are pretty universal, and can even be applied to terribly old
software.  If all the software in our ecosystem was modern, there would
be no place to practice and learn.

And anyways, this directory is not in $PATH by default, so there is no
risk.



Re: random(6): undefined cast and error checking

2022-08-04 Thread lucic71
t...@theobuehler.org wrote:
>I have no strong opinion. I'm fine with either approach. It's such a
>silly program...
>
>As an aside, random -e has been completely broken (it's non-uniform)
>since forever.  To fix -e, we should clamp denom to an integer between
>1 and 256, otherwise the truncation of the exit exit code to an 8-bit
>int introduces bias for numbers larger than 256 (that aren't powers of
>2).

The program is broken in multiple ways: return value clamping, casting
from double to uint32_t, wrong error checking for putchar, lack of
warnings when compiling.

What I don't understand is why such a wrong program has its place in
OpenBSD. Maybe it's historical reasons, who knows. But the fact that it
exists and it's badly written bothers me.



Re: random(6): undefined cast and error checking

2022-08-04 Thread Theo Buehler
On Wed, Aug 03, 2022 at 04:21:43PM -0500, luci...@bronze.ctrl-c.club wrote:
> >On Wed, Aug 03, 2022 at 08:26:20AM -0600, Theo de Raadt wrote:
> >> luci...@bronze.ctrl-c.club wrote:
> >> 
> >> > Another way to solve this problem would be to trim the numbers with
> >> > something like this: if (denom > UINT32_MAX) denom = UINT32_MAX.
> >> 
> >> And then document that the program returns incorrect results?
> 
> t...@theobuehler.org wrote:
> 
> >See this. There's also linked BSD-licensed code that should not be hard
> >to adapt.
> >
> >https://mumble.net/~campbell/2014/04/28/uniform-random-float?
> >
> >Then pick a line if the random number drawn is < 1/denom.
> 
> So now we have two options. (1) We use the trim trick and document
> the range of values that the denominator can be in. (2) We use the
> code linked by tb@ to generate uniform doubles. (see patch below)
> 
> (2) seems to work but I need to spend some more time to understand
> the algorithm behind it.
> 
> (1) seems more robust cosidering that the man page will also contain
> information about the range of denom.

Thanks.

I have no strong opinion. I'm fine with either approach. It's such a
silly program...

Since you've already done 90% of the work for (2), we can push it over
the line. This will need adding Taylor R. Campbell's license since a
significant chunk of code is included verbatim. Add the license below
the existing one.

As an aside, random -e has been completely broken (it's non-uniform)
since forever.  To fix -e, we should clamp denom to an integer between 1
and 256, otherwise the truncation of the exit exit code to an 8-bit int
introduces bias for numbers larger than 256 (that aren't powers of 2).

Restricting denom >= 1 would seem necessary for all modes of this
program given that 1/denominator is documented to be a probability.


The patch doesn't compile due to missing clz64. Instead of using a
__builtin (which I'm unsure if it's available on all our ancient gccs),
I think we can implement this naively. Something along these lines
should be good enough (only very lightly tested):

int
clz64(uint64_t x)
{
static const uint64_t mask[] = {
0x, 0x, 0xff00, 0xf0, 0xc, 0x2,
};
static int zeroes[] = {
32, 16, 8, 4, 2, 1,
};
int clz = 0;
int i;

if (x == 0)
return 64;

for (i = 0; i < 6; i++) {
if ((x & mask[i]) == 0)
clz += zeroes[i];
else
x >>= zeroes[i];
}

return clz;
}

Some more comments inline.

> 
> Index: random.c
> ===
> RCS file: /cvs/src/games/random/random.c,v
> retrieving revision 1.20
> diff -u -p -r1.20 random.c
> --- random.c  7 Mar 2016 12:07:56 -   1.20
> +++ random.c  3 Aug 2022 20:28:08 -
> @@ -38,6 +38,75 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 

style: keep these in alphabetical order

> +
> +uint64_t random64(void)

style: tab should be a newline:

uint64_t
random64(void)

> +{
> + uint64_t r;

style: add empty line

> + arc4random_buf(, sizeof(uint64_t));

and here

> + return r;
> +}
> +
> +/*
> + * random_real: Generate a stream of bits uniformly at random and
> + * interpret it as the fractional part of the binary expansion of a
> + * number in [0, 1], 0.101001010100...; then round it.
> + */
> +double
> +random_real(void)
> +{
> + int exponent = -64;
> + uint64_t significand;
> + unsigned shift;
> +
> + /*
> +  * Read zeros into the exponent until we hit a one; the rest
> +  * will go into the significand.
> +  */
> + while (__predict_false((significand = random64()) == 0)) {
> + exponent -= 64;
> + /*
> +  * If the exponent falls below -1074 = emin + 1 - p,
> +  * the exponent of the smallest subnormal, we are
> +  * guaranteed the result will be rounded to zero.  This
> +  * case is so unlikely it will happen in realistic
> +  * terms only if random64 is broken.
> +  */
> + if (__predict_false(exponent < -1074))
> + return 0;
> + }
> +
> + /*
> +  * There is a 1 somewhere in significand, not necessarily in
> +  * the most significant position.  If there are leading zeros,
> +  * shift them into the exponent and refill the less-significant
> +  * bits of the significand.  Can't predict one way or another
> +  * whether there are leading zeros: there's a fifty-fifty
> +  * chance, if random64 is uniformly distributed.
> +  */
> + shift = clz64(significand);
> + if (shift != 0) {
> + exponent -= shift;
> + significand <<= shift;
> + significand |= (random64() >> (64 - shift));
> + }
> +
> + /*
> +  * Set the 

Re: random(6): undefined cast and error checking

2022-08-03 Thread lucic71
>On Wed, Aug 03, 2022 at 08:26:20AM -0600, Theo de Raadt wrote:
>> luci...@bronze.ctrl-c.club wrote:
>> 
>> > Another way to solve this problem would be to trim the numbers with
>> > something like this: if (denom > UINT32_MAX) denom = UINT32_MAX.
>> 
>> And then document that the program returns incorrect results?

t...@theobuehler.org wrote:

>See this. There's also linked BSD-licensed code that should not be hard
>to adapt.
>
>https://mumble.net/~campbell/2014/04/28/uniform-random-float?
>
>Then pick a line if the random number drawn is < 1/denom.

So now we have two options. (1) We use the trim trick and document
the range of values that the denominator can be in. (2) We use the
code linked by tb@ to generate uniform doubles. (see patch below)

(2) seems to work but I need to spend some more time to understand
the algorithm behind it.

(1) seems more robust cosidering that the man page will also contain
information about the range of denom.

Index: random.c
===
RCS file: /cvs/src/games/random/random.c,v
retrieving revision 1.20
diff -u -p -r1.20 random.c
--- random.c7 Mar 2016 12:07:56 -   1.20
+++ random.c3 Aug 2022 20:28:08 -
@@ -38,6 +38,75 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+
+uint64_t   random64(void)
+{
+   uint64_t r;
+   arc4random_buf(, sizeof(uint64_t));
+   return r;
+}
+
+/*
+ * random_real: Generate a stream of bits uniformly at random and
+ * interpret it as the fractional part of the binary expansion of a
+ * number in [0, 1], 0.101001010100...; then round it.
+ */
+double
+random_real(void)
+{
+   int exponent = -64;
+   uint64_t significand;
+   unsigned shift;
+
+   /*
+* Read zeros into the exponent until we hit a one; the rest
+* will go into the significand.
+*/
+   while (__predict_false((significand = random64()) == 0)) {
+   exponent -= 64;
+   /*
+* If the exponent falls below -1074 = emin + 1 - p,
+* the exponent of the smallest subnormal, we are
+* guaranteed the result will be rounded to zero.  This
+* case is so unlikely it will happen in realistic
+* terms only if random64 is broken.
+*/
+   if (__predict_false(exponent < -1074))
+   return 0;
+   }
+
+   /*
+* There is a 1 somewhere in significand, not necessarily in
+* the most significant position.  If there are leading zeros,
+* shift them into the exponent and refill the less-significant
+* bits of the significand.  Can't predict one way or another
+* whether there are leading zeros: there's a fifty-fifty
+* chance, if random64 is uniformly distributed.
+*/
+   shift = clz64(significand);
+   if (shift != 0) {
+   exponent -= shift;
+   significand <<= shift;
+   significand |= (random64() >> (64 - shift));
+   }
+
+   /*
+* Set the sticky bit, since there is almost surely another 1
+* in the bit stream.  Otherwise, we might round what looks
+* like a tie to even when, almost surely, were we to look
+* further in the bit stream, there would be a 1 breaking the
+* tie.
+*/
+   significand |= 1;
+
+   /*
+* Finally, convert to double (rounding) and scale by
+* 2^exponent.
+*/
+   return ldexp((double)significand, exponent);
+}
 
 __dead void usage(void);
 
@@ -86,7 +155,7 @@ main(int argc, char *argv[])
 
/* Compute a random exit status between 0 and denom - 1. */
if (random_exit)
-   return (arc4random_uniform(denom));
+   return random_real();
 
/*
 * Act as a filter, randomly choosing lines of the standard input
@@ -101,7 +170,7 @@ main(int argc, char *argv[])
 * 0 (which has a 1 / denom chance of being true), we select the
 * line.
 */
-   selected = arc4random_uniform(denom) == 0;
+   selected = random_real() < (1 / denom);
while ((ch = getchar()) != EOF) {
if (selected)
(void)putchar(ch);
@@ -111,7 +180,7 @@ main(int argc, char *argv[])
err(2, "stdout");
 
/* Now see if the next line is to be printed. */
-   selected = arc4random_uniform(denom) == 0;
+   selected = random_real() < (1 / denom);
}
}
if (ferror(stdin))



Re: random(6): undefined cast and error checking

2022-08-03 Thread Theo Buehler
On Wed, Aug 03, 2022 at 08:26:20AM -0600, Theo de Raadt wrote:
> luci...@bronze.ctrl-c.club wrote:
> 
> > Another way to solve this problem would be to trim the numbers with
> > something like this: if (denom > UINT32_MAX) denom = UINT32_MAX.
> 
> And then document that the program returns incorrect results?
> 
> > >However, using drand48() will mean using a floating point modulus.
> > >We lose the uniform aspect.  I'm not the non-uniform aspects are as
> > >visible in the floating point range.  Succeeding for the full floating
> > >point range is more important than what arc4random_uniform() is trying
> > >to do.  But maybe a uniform version of the double code can grow out of
> > >using the drand48.c code?
> > 
> > Theo, I'm not sure I understand your reasoning. Are you trying to say
> > that we should see if it's possible to create a drand48_uniform? 
> 
> Yes, inside the program.
> 

See this. There's also linked BSD-licensed code that should not be hard
to adapt.

https://mumble.net/~campbell/2014/04/28/uniform-random-float?

Then pick a line if the random number drawn is < 1/denom.



Re: random(6): undefined cast and error checking

2022-08-03 Thread Theo de Raadt
luci...@bronze.ctrl-c.club wrote:

> Another way to solve this problem would be to trim the numbers with
> something like this: if (denom > UINT32_MAX) denom = UINT32_MAX.

And then document that the program returns incorrect results?

> >However, using drand48() will mean using a floating point modulus.
> >We lose the uniform aspect.  I'm not the non-uniform aspects are as
> >visible in the floating point range.  Succeeding for the full floating
> >point range is more important than what arc4random_uniform() is trying
> >to do.  But maybe a uniform version of the double code can grow out of
> >using the drand48.c code?
> 
> Theo, I'm not sure I understand your reasoning. Are you trying to say
> that we should see if it's possible to create a drand48_uniform? 

Yes, inside the program.



Re: random(6): undefined cast and error checking

2022-08-03 Thread lucic71
>luci...@bronze.ctrl-c.club wrote:
>
>> This patch solves two problems.
>> 
>> First, abort if denom is greater than UINT32_MAX. arc4random_uniform
>> expects an uint32_t. If floor(denom) is greater than UINT32_MAX then
>> the cast is undefined behaviour.
>
>This isn't a very important program, but the points are valid because
>we may learn something which applies in other places.
>
>So your change makes fail with error + stderr output for large
>floating point numbers, rather than producing limited range values
>(the range is incorrect, too small, but it still succeeds).  You
>make it more correct, but scripts using this would fail badly.  That
>bothers me.

Another way to solve this problem would be to trim the numbers with
something like this: if (denom > UINT32_MAX) denom = UINT32_MAX.

>I can understand why code was moved to arc4random(3) in 1997, and to
>arc4random_uniform(3) in 2008.
>
>In 2014, I changed the rand/random/drand48 functions to be
>non-deterministic by default.  drand48(3) is a double-sized random
>producer.  Maybe we should move back to drand48, to gain the full
>range?
>Alternatively, copy the drand48.c code locally, in case it needs
>some tweaks.
>
>However, using drand48() will mean using a floating point modulus.
>We lose the uniform aspect.  I'm not the non-uniform aspects are as
>visible in the floating point range.  Succeeding for the full floating
>point range is more important than what arc4random_uniform() is trying
>to do.  But maybe a uniform version of the double code can grow out of
>using the drand48.c code?

Theo, I'm not sure I understand your reasoning. Are you trying to say
that we should see if it's possible to create a drand48_uniform? 

-lucic71



Re: random(6): undefined cast and error checking

2022-08-01 Thread Theo de Raadt
luci...@bronze.ctrl-c.club wrote:

> This patch solves two problems.
> 
> First, abort if denom is greater than UINT32_MAX. arc4random_uniform
> expects an uint32_t. If floor(denom) is greater than UINT32_MAX then
> the cast is undefined behaviour.

This isn't a very important program, but the points are valid because
we may learn something which applies in other places.

So your change makes fail with error + stderr output for large floating
point numbers, rather than producing limited range values (the range is
incorrect, too small, but it still succeeds).  You make it more correct,
but scripts using this would fail badly.  That bothers me.

I can understand why code was moved to arc4random(3) in 1997, and to
arc4random_uniform(3) in 2008.

In 2014, I changed the rand/random/drand48 functions to be
non-deterministic by default.  drand48(3) is a double-sized random
producer.  Maybe we should move back to drand48, to gain the full range?

Alternatively, copy the drand48.c code locally, in case it needs
some tweaks.

However, using drand48() will mean using a floating point modulus.
We lose the uniform aspect.  I'm not the non-uniform aspects are as
visible in the floating point range.  Succeeding for the full floating
point range is more important than what arc4random_uniform() is trying
to do.  But maybe a uniform version of the double code can grow out of
using the drand48.c code?



random(6): undefined cast and error checking

2022-08-01 Thread lucic71
Hi,

This patch solves two problems.

First, abort if denom is greater than UINT32_MAX. arc4random_uniform
expects an uint32_t. If floor(denom) is greater than UINT32_MAX then
the cast is undefined behaviour.

Second, use consistent error checking for printing the lines. putchar
sets errno while ferrror tests the internal flags of the stream.

-lucic71

Index: random.c
===
RCS file: /cvs/src/games/random/random.c,v
retrieving revision 1.20
diff -u -p -r1.20 random.c
--- random.c7 Mar 2016 12:07:56 -   1.20
+++ random.c1 Aug 2022 19:14:11 -
@@ -79,6 +79,8 @@ main(int argc, char *argv[])
err(1, "%s", *argv);
if (denom == 0 || *ep != '\0')
errx(1, "denominator is not valid.");
+   if ((uint64_t) denom > 0x)
+   errx(1, "denominator out of bounds.");
break;
default:
usage(); 
@@ -103,16 +105,17 @@ main(int argc, char *argv[])
 */
selected = arc4random_uniform(denom) == 0;
while ((ch = getchar()) != EOF) {
-   if (selected)
-   (void)putchar(ch);
-   if (ch == '\n') {
-   /* End of that line.  See if we got an error. */
-   if (ferror(stdout))
-   err(2, "stdout");
+   int retch;
 
+   if (selected) {
+   errno = 0;
+   retch = putchar(ch);
+   if (retch == EOF && errno)
+   err(2, "putchar");
+   }
+   if (ch == '\n')
/* Now see if the next line is to be printed. */
selected = arc4random_uniform(denom) == 0;
-   }
}
if (ferror(stdin))
err(2, "stdin");