Hi Anthony,

On Wed, Mar 13, 2024 at 12:33:54PM -0400, Anthony Deschamps wrote:
> Hi Willy,
> 
> My original concern was that if two servers had values of lb_server_key
> that are close to each other, then there could be some overlap in their
> values of lb_server_key + node_index;, which is why I was looking for a
> reasonable hash function to combine them.

I know but you can't really avoid it anyway, and that's why we place
many nodes. I tried an attempt in the past where nodes were evenly
spaced by dividing by the number of entries etc. It was a disaster.
As soon as a server went down, the next one would take its extra load
for example. So anything you try to do to avoid some corner cases create
other ones, while overall, a random distribution (like a hash does)
gives great results.

> I based it on boost::hash_combine,
> but since writing that I came across https://stackoverflow.com/a/50978188
> explaining why boost::hash_combine isn't necessarily a good hash function.

Hehe. We've all written our own hash functions for a given purpose at one
point in time, and they're usually fine inside their context and bad once
used outside. That's also where XXH and such stuff shines, it brings
generic functions that are good all over the space. In our case, the
full_hash() is just an avalanche non-linear 1:1 mapping between the input
and the output which suffices to avoid resonance effects that come from
repeated additions, it's perfectly suited here I think.

> I agree that it's not too critical to have a few collisions, but I'm reaching
> the limits of my knowledge here. Does this change address your concerns?
> I've attached an updated patch, and here's the diff between them:
> 
> diff --git a/src/lb_chash.c b/src/lb_chash.c
> index c77222854..d7a1b2539 100644
> --- a/src/lb_chash.c
> +++ b/src/lb_chash.c
> @@ -66,9 +66,7 @@ static inline void chash_dequeue_srv(struct server *s)
>   */
>  static inline u32 chash_compute_node_key(struct server *s, unsigned 
> node_index)
>  {
> -    u32 server_key = s->lb_server_key;
> -    u32 index_key = full_hash(node_index);
> -    return server_key ^= index_key + 0x9e3779b9 + (server_key << 6) +
> (server_key >> 2);
> +    return full_hash(s->lb_server_key + node_index);
>  }
> 
>  /* Compute the base server key that will be used to compute node keys. 
> Servers
> @@ -112,8 +110,7 @@ static inline u32 chash_compute_server_key(struct server 
> *s)
>      case SRV_HASH_KEY_ADDR:
>          switch (srv_addr.family) {
>          case AF_INET:
> -            u32 addr_key = full_hash(srv_addr.addr.v4.s_addr);
> -            key ^= addr_key + 0x9e3779b9 + (key << 6) + (key >> 2);
> +            key = full_hash(key + srv_addr.addr.v4.s_addr);
>              break;
>          case AF_INET6:
>              key = XXH32(srv_addr.addr.v6.s6_addr, 16, key);

Yeay I think it addresses everything. I'll re-test your updated patch
tomorrow hoping that this time I'll merge it :-)

Thanks for your patience!
Willy

Reply via email to