Re: symmetric toeplitz hashing

David Gwynne Fri, 12 Jun 2020 18:36:33 -0700

On Fri, Jun 12, 2020 at 03:37:59PM +0200, Theo Buehler wrote:
> I finally found the time to think about the mathematics of this some
> more and I'm now convinced that it's a sound construction. I hope that
> one or the other observation below will be useful for you.


Yes, I read everything below and it sounds great and useful. My only
issue is that I'd like to show the application of those changes as
commits in the tree. I already feel the diff I originally posted has
diverged too far from the dfly code and a lot of why I made the changes
have been lost.

> The hash as it is now can be proved to produce values in the full range
> of uint16_t, so that's good.
> 
> As we discussed already, you can simplify the construction further.

Yep, I'm keen.

> One trick is that stoeplitz_cache_entry() is linear in the second
> argument -- you can think of the xor operation ^ as addition of vectors
> (uint8_t and uint16_t) over the field with two elements {0, 1}.  That's
> just the mathematician's way of expressing this relation:
> 
>       stoeplitz_cache_entry(scache, a ^ b)
>           == stoeplitz_cache_entry(scache, a) ^ stoeplitz_cache_entry(scache, 
> b);
> 
> Using this, the stoeplitz hash functions can be rewritten to this.
> I would expect it to be a bit cheaper.

Yes, that's very cool.

> uint16_t
> stoeplitz_hash_ip4(const struct stoeplitz_cache *scache,
>     in_addr_t faddr, in_addr_t laddr)
> {
>       uint16_t lo, hi;
> 
>       lo  = faddr >> 0;
>       lo ^= faddr >> 16;
>       lo ^= laddr >> 0;
>       lo ^= laddr >> 16;
> 
>       hi  = faddr >> 8;
>       hi ^= faddr >> 24;
>       hi ^= laddr >> 8;
>       hi ^= laddr >> 24;
> 
>       return swap16(stoeplitz_cache_entry(scache, lo))
>           ^ stoeplitz_cache_entry(scache, hi);
> }
> 
> or another example:
> 
> uint16_t
> stoeplitz_hash_ip6port(const struct stoeplitz_cache *scache,
>     const struct in6_addr *faddr6, const struct in6_addr * laddr6,
>     in_port_t fport, in_port_t lport)
> {
>       uint16_t hi = 0, lo = 0;
>       size_t i;
> 
>       for (i = 0; i < nitems(faddr6->s6_addr32); i++) {
>               uint32_t faddr = faddr6->s6_addr32[i];
>               uint32_t laddr = laddr6->s6_addr32[i];
> 
>               lo ^= faddr >> 0;
>               lo ^= faddr >> 16;
>               lo ^= laddr >> 0;
>               lo ^= laddr >> 16;
> 
>               hi ^= faddr >> 8;
>               hi ^= faddr >> 24;
>               hi ^= laddr >> 8;
>               hi ^= laddr >> 24;
>       }
> 
>       lo ^= fport >> 0;
>       lo ^= lport >> 0;
> 
>       hi ^= fport >> 8;
>       hi ^= lport >> 8;
> 
>       return swap16(stoeplitz_cache_entry(scache, lo))
>           ^ stoeplitz_cache_entry(scache, hi);
> }
> 
> and so on.

Very cool.

> Next, I don't particularly like this magic STOEPLITZ_KEYSEED number, but
> I guess we can live with it.

My understanding is that it comes from the vector MS provided or used. I
can't easily find their old or original documentation, but you can see
it's the first two bytes in the following:

https://docs.microsoft.com/en-us/windows-hardware/drivers/network/verifying-the-rss-hash-calculation

> Another option would be to generate the key seed randomly. You will get
> a "good" hash that spreads out over all 16 bit values if and only if the
> random value has an odd number of binary digits.

sephe and I have talked about replacing it with a random number, but I
was going to wait until such a change could have an interesting commit
message in a tree. However, my idea of a good seed was "making sure it's
not zero", so I'm super happy there's a much more rigorous definition
we can use.

> I haven't thought hard about this, but I don't immediately see a better
> way of generating such numbers than:
> 
> int
> stoeplitz_good_seed(uint16_t seed)
> {
>       int             ones = 0;
> 
>       while (seed > 0) {
>               ones += seed % 2;
>               seed /= 2;
>       }
> 
>       return ones % 2;
> }
> 
> uint16_t
> stoeplitz_seed_init(void)
> {
>       uint16_t        seed;
> 
>       do {
>               seed = arc4random() & UINT16_MAX;
>       } while (!stoeplitz_good_seed(seed));
> 
>       return seed;
> }
> 
> This will loop as long as it needs to get a good toeplitz key seed.
> Each time there is a 50% chance that it will find one, so this will
> need to loop n times with probability 1 / 2**n. This is basically the
> same situation as for arc4random_uniform() with an upper bound that is
> not a power of two.
> 
> I don't know if something like that is acceptable early in init_main.c.
> If it is, you can use a seed generated by this init function in place
> of STOEPLITZ_KEYSEED.

I think deraadt@ would be fine with us permuting the random number
generator by taking some bytes out of it early.

> Finally, I don't think it would be all bad to eliminate the cache
> altogether and simply do this (and similarly for all the other
> stoeplitz_hash_*() variants). This would be another way of eliminating
> the magic number:
> 
> uint16_t
> simple_hash_ip4(const struct stoeplitz_cache *scache,
>     in_addr_t faddr, in_addr_t laddr)
> {
>       uint16_t lo, hi;
> 
>       lo  = faddr >> 0;
>       lo ^= faddr >> 16;
>       lo ^= laddr >> 0;
>       lo ^= laddr >> 16;
> 
>       hi  = faddr >> 8;
>       hi ^= faddr >> 24;
>       hi ^= laddr >> 8;
>       hi ^= laddr >> 24;
> 
>       return (swap16(lo) ^ hi);
> }
> 
> I don't see how this is fundamentally different from sephe's
> construction although it's not quite a special case.
> 
> If there's a problem with this, I'd be really curious to know why.

I want the network stack to agree with hardware, and all the hardware
that can do multiple rings seems to implement toeplitz as a hashing
algorithm. If I want the stack to arrive at the same values as the
hardware, I need to do toeplitz too, right?

Doing this means that both the incoming and outgoing processing of
connections (eg, tcp) should be steered to the same the same contexts
(cpus, nettqs, lacp ports, nic rings, etc) for processing.

Re: symmetric toeplitz hashing

Reply via email to