Hi all,

I've got a report that consistent hashing has been delivering different
hashes since 3.0, following commit faa8c3e02 ("MEDIUM: lb-chash:
Deterministic node hashes based on server address").

The cause is a mistake in the ID-based key calculation (the hash is applied
twice and the ID range scaling was dropped). The fix is trivial:

  --- a/src/lb_chash.c
  +++ b/src/lb_chash.c
  @@ -123,7 +123,7 @@ static inline u32 chash_compute_server_key(struct server *s)
   
          case SRV_HASH_KEY_ID:
          default:
  -               key = full_hash(s->puid);
  +               key = s->puid * SRV_EWGHT_RANGE;
                  break;
          }

but I'm having a problem now: anyone who deployed haproxy with consistent
hashing before 3.0 notices the problem when upgrading (much higher miss
rate on caches) and would want the fix to be applied, while those who
first enabled it in 3.0+ with hash-key id don't know that something broke,
and will be surprised by a fix that changes the whole distribution for
them.
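
To make the incompatibility concrete, here is a minimal standalone sketch
of the two key calculations. It is not the haproxy code: demo_hash() is a
hypothetical stand-in for full_hash(), DEMO_EWGHT_RANGE is an assumed value
for SRV_EWGHT_RANGE, and node_key() assumes that each node's position on
the ring is the hash of the server key plus the node index.

```c
#include <stdint.h>

/* Assumed value, for illustration only; the real SRV_EWGHT_RANGE comes
 * from haproxy's headers.
 */
#define DEMO_EWGHT_RANGE 4096u

/* Hypothetical stand-in for full_hash(): a generic 32-bit integer mixer.
 * It does not produce the same values as haproxy's hash.
 */
static uint32_t demo_hash(uint32_t a)
{
	a ^= a >> 16;
	a *= 0x7feb352du;
	a ^= a >> 15;
	a *= 0x846ca68bu;
	a ^= a >> 16;
	return a;
}

/* Server key as in the regressed code path: the ID itself is hashed and
 * the range scaling is gone.
 */
static uint32_t server_key_regressed(uint32_t puid)
{
	return demo_hash(puid);
}

/* Server key with the fix: the ID is only scaled, so each server owns a
 * contiguous range of DEMO_EWGHT_RANGE pre-hash values.
 */
static uint32_t server_key_fixed(uint32_t puid)
{
	return puid * DEMO_EWGHT_RANGE;
}

/* Assumption: a node's ring position is hash(server key + node index).
 * With the regressed server key this hashes the ID twice.
 */
static uint32_t node_key(uint32_t server_key, uint32_t node_idx)
{
	return demo_hash(server_key + node_idx);
}
```

For the same server ID, node_key(server_key_regressed(id), n) and
node_key(server_key_fixed(id), n) land on different ring positions, which
is why instances running the two variants side by side disagree on where
a given request should go.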

To be honest, I really doubt that anyone just started to use consistent
hash recently with 3.0 or 3.2 using server IDs while addresses are
available and more robust. So I'm tempted to apply the fix in order to
fix the situation for all those who are progressively upgrading their
fleet from pre-3.0 to 3.0+.

Another possibility would be to add a 4th hash-key setting to support
pre-3.0 compatibility, but that would remain a mess for those upgrading
anyway.

Hence my question to our users: has anyone only recently started to use
consistent hashing with the default hash-key (id), and would rightfully
want a way to keep their keys distributed as they are now (i.e. with
the incompatible algo), in which case we'd need to add a new setting to
support this? Or should we consider that a regression is a regression
and should be fixed?

Thanks for sharing your insights,
Willy

