On Wed, Aug 03, 2022 at 04:21:43PM -0500, [email protected] wrote:
> >On Wed, Aug 03, 2022 at 08:26:20AM -0600, Theo de Raadt wrote:
> >> [email protected] wrote:
> >> 
> >> > Another way to solve this problem would be to trim the numbers with
> >> > something like this: if (denom > UINT32_MAX) denom = UINT32_MAX.
> >> 
> >> And then document that the program returns incorrect results?
> 
> [email protected] wrote:
> 
> >See this. There's also linked BSD-licensed code that should not be hard
> >to adapt.
> >
> >https://mumble.net/~campbell/2014/04/28/uniform-random-float?
> >
> >Then pick a line if the random number drawn is < 1/denom.
> 
> So now we have two options. (1) We use the trim trick and document
> the range of values that the denominator can be in. (2) We use the
> code linked by tb@ to generate uniform doubles. (see patch below)
> 
> (2) seems to work but I need to spend some more time to understand
> the algorithm behind it.
> 
> (1) seems more robust cosidering that the man page will also contain
> information about the range of denom.

Thanks.

I have no strong opinion. I'm fine with either approach. It's such a
silly program...

Since you've already done 90% of the work for (2), we can push it over
the line. This will need adding Taylor R. Campbell's license since a
significant chunk of code is included verbatim. Add the license below
the existing one.

As an aside, random -e has been completely broken (it's non-uniform)
since forever.  To fix -e, we should clamp denom to an integer between 1
and 256, otherwise the truncation of the exit exit code to an 8-bit int
introduces bias for numbers larger than 256 (that aren't powers of 2).

Restricting denom >= 1 would seem necessary for all modes of this
program given that 1/denominator is documented to be a probability.


The patch doesn't compile due to missing clz64. Instead of using a
__builtin (which I'm unsure if it's available on all our ancient gccs),
I think we can implement this naively. Something along these lines
should be good enough (only very lightly tested):

int
clz64(uint64_t x)
{
        static const uint64_t mask[] = {
                0xffffffff00000000, 0xffff0000, 0xff00, 0xf0, 0xc, 0x2,
        };
        static int zeroes[] = {
                32, 16, 8, 4, 2, 1,
        };
        int clz = 0;
        int i;

        if (x == 0)
                return 64;

        for (i = 0; i < 6; i++) {
                if ((x & mask[i]) == 0)
                        clz += zeroes[i];
                else
                        x >>= zeroes[i];
        }

        return clz;
}

Some more comments inline.

> 
> Index: random.c
> ===================================================================
> RCS file: /cvs/src/games/random/random.c,v
> retrieving revision 1.20
> diff -u -p -r1.20 random.c
> --- random.c  7 Mar 2016 12:07:56 -0000       1.20
> +++ random.c  3 Aug 2022 20:28:08 -0000
> @@ -38,6 +38,75 @@
>  #include <stdio.h>
>  #include <stdlib.h>
>  #include <unistd.h>
> +#include <stdint.h>
> +#include <math.h>

style: keep these in alphabetical order

> +
> +uint64_t     random64(void)

style: tab should be a newline:

uint64_t
random64(void)

> +{
> +     uint64_t r;

style: add empty line

> +     arc4random_buf(&r, sizeof(uint64_t));

and here

> +     return r;
> +}
> +
> +/*
> + * random_real: Generate a stream of bits uniformly at random and
> + * interpret it as the fractional part of the binary expansion of a
> + * number in [0, 1], 0.00001010011111010100...; then round it.
> + */
> +double
> +random_real(void)
> +{
> +     int exponent = -64;
> +     uint64_t significand;
> +     unsigned shift;
> +
> +     /*
> +      * Read zeros into the exponent until we hit a one; the rest
> +      * will go into the significand.
> +      */
> +     while (__predict_false((significand = random64()) == 0)) {
> +             exponent -= 64;
> +             /*
> +              * If the exponent falls below -1074 = emin + 1 - p,
> +              * the exponent of the smallest subnormal, we are
> +              * guaranteed the result will be rounded to zero.  This
> +              * case is so unlikely it will happen in realistic
> +              * terms only if random64 is broken.
> +              */
> +             if (__predict_false(exponent < -1074))
> +                     return 0;
> +     }
> +
> +     /*
> +      * There is a 1 somewhere in significand, not necessarily in
> +      * the most significant position.  If there are leading zeros,
> +      * shift them into the exponent and refill the less-significant
> +      * bits of the significand.  Can't predict one way or another
> +      * whether there are leading zeros: there's a fifty-fifty
> +      * chance, if random64 is uniformly distributed.
> +      */
> +     shift = clz64(significand);
> +     if (shift != 0) {
> +             exponent -= shift;
> +             significand <<= shift;
> +             significand |= (random64() >> (64 - shift));
> +     }
> +
> +     /*
> +      * Set the sticky bit, since there is almost surely another 1
> +      * in the bit stream.  Otherwise, we might round what looks
> +      * like a tie to even when, almost surely, were we to look
> +      * further in the bit stream, there would be a 1 breaking the
> +      * tie.
> +      */
> +     significand |= 1;
> +
> +     /*
> +      * Finally, convert to double (rounding) and scale by
> +      * 2^exponent.
> +      */
> +     return ldexp((double)significand, exponent);
> +}
>  
>  __dead void usage(void);
>  
> @@ -86,7 +155,7 @@ main(int argc, char *argv[])
>  
>       /* Compute a random exit status between 0 and denom - 1. */
>       if (random_exit)
> -             return (arc4random_uniform(denom));
> +             return random_real();

This isn't right. A double between 0 and 1 will result in a return
value of 0. For now you could do 'return random_real() * denom;'

>  
>       /*
>        * Act as a filter, randomly choosing lines of the standard input
> @@ -101,7 +170,7 @@ main(int argc, char *argv[])
>        * 0 (which has a 1 / denom chance of being true), we select the
>        * line.
>        */
> -     selected = arc4random_uniform(denom) == 0;
> +     selected = random_real() < (1 / denom);

I'd omit these parentheses.

>       while ((ch = getchar()) != EOF) {
>               if (selected)
>                       (void)putchar(ch);
> @@ -111,7 +180,7 @@ main(int argc, char *argv[])
>                               err(2, "stdout");
>  
>                       /* Now see if the next line is to be printed. */
> -                     selected = arc4random_uniform(denom) == 0;
> +                     selected = random_real() < (1 / denom);
>               }
>       }
>       if (ferror(stdin))

Reply via email to