On Sat, Jun 13, 2020 at 08:46:13AM -0400, David Higgs wrote:
> On Fri, Jun 12, 2020 at 9:41 AM Theo Buehler <[email protected]> wrote:
> 
> > I finally found the time to think about the mathematics of this some
> > more and I'm now convinced that it's a sound construction. I hope that
> > one or the other observation below will be useful for you.
> >
> > The hash as it is now can be proved to produce values in the full range
> > of uint16_t, so that's good.
> >
> > As we discussed already, you can simplify the construction further.
> >
> 
> [snip]
> 
> 
> > Next, I don't particularly like this magic STOEPLITZ_KEYSEED number, but
> > I guess we can live with it.
> >
> > Another option would be to generate the key seed randomly. You will get
> > a "good" hash that spreads out over all 16 bit values if and only if the
> > random value has an odd number of binary digits.
> >
> > I haven't thought hard about this, but I don't immediately see a better
> > way of generating such numbers than:
> >
> > int
> > stoeplitz_good_seed(uint16_t seed)
> > {
> >         int             ones = 0;
> >
> >         while (seed > 0) {
> >                 ones += seed % 2;
> >                 seed /= 2;
> >         }
> >
> >         return ones % 2;
> > }
> >
> > uint16_t
> > stoeplitz_seed_init(void)
> > {
> >         uint16_t        seed;
> >
> >         do {
> >                 seed = arc4random() & UINT16_MAX;
> >         } while (!stoeplitz_good_seed(seed));
> >
> >         return seed;
> > }
> >
> > This will loop as long as it needs to get a good toeplitz key seed.
> > Each time there is a 50% chance that it will find one, so this will
> > need to loop n times with probability 1 / 2**n. This is basically the
> > same situation as for arc4random_uniform() with an upper bound that is
> > not a power of two.
> >
> 
> I neither read nor understand the math but assuming you've described it
> correctly, you can do this more simply by realizing that a random bad seed
> can be made into a good seed by toggling one (random) bit.

Yes. The thing is that you need to convince yourself that this is still
uniformly distributed over the wanted numbers. But it's correct. In
fact, it's enough to flip a fixed bit, so you can get away with one call
to arc4random().

> You also
> replace stoeplitz_good_seed() with __builtin_parity() and perhaps leverage
> some intrinsics?

Others have pointed out off-list that one can use __builtin_popcount(),
but __builtin_parity() is exactly what I want. Is it available on all
architectures?

> 
> uint16_t
> stoeplitz_seed_init(void) {
>     uint16_t seed;
>     seed = arc4random() & UINT16_MAX;
>     if (!stoeplitz_good_seed(seed))
>         seed ^= 1 << (arc4random() % 16);
> }
> 
> --david

Reply via email to