Hi all,

I've got a report of consistent hashing delivering different hashes since 3.0, caused by commit faa8c3e02 ("MEDIUM: lb-chash: Deterministic node hashes based on server address").
The cause is a mistake in the ID-based key calculation (the hash is applied twice and the ID range scaling was dropped). The fix is trivial:

--- a/src/lb_chash.c
+++ b/src/lb_chash.c
@@ -123,7 +123,7 @@ static inline u32 chash_compute_server_key(struct server *s)
 	case SRV_HASH_KEY_ID:
 	default:
-		key = full_hash(s->puid);
+		key = s->puid * SRV_EWGHT_RANGE;
 		break;
 	}

But I'm having a problem now: anyone who deployed haproxy with consistent hashing before 3.0 notices the problem (much higher miss rate on caches) and would want the fix to be applied, but those who first enabled it in 3.0+ with hash-key id don't know that something broke, and will be surprised by the fix, which will change everything for them.

To be honest, I really doubt that anyone just started to use consistent hashing recently with 3.0 or 3.2 using server IDs while addresses are available and more robust. So I'm tempted to apply the fix in order to fix the situation for all those who are progressively upgrading their fleet from pre-3.0 to 3.0+.

Another possibility would be to add a 4th hash-key setting to support pre-3.0 compatibility, but that would remain a mess for those upgrading anyway.

Hence my question to our users: did anyone just start to use consistent hashing recently with the default hash-key (id), and would rightfully want to have a way to keep their keys distributed like this (i.e. with an incompatible algo), in which case we'd need to add a new setting to support this? Or should we consider that a regression is a regression and should be fixed?

Thanks for sharing your insights,
Willy
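
For readers who want to see why the two key calculations cannot agree, here is a minimal, self-contained sketch in C. It is not haproxy code and rests on assumptions: stand_in_hash() is a generic 32-bit mixer used in place of full_hash(), EWGHT_RANGE_ASSUMED is a placeholder for SRV_EWGHT_RANGE, and the per-node step (hashing the server key plus a node index) is assumed from the remark that the hash ends up applied twice.

```c
#include <stdint.h>

/* Assumption: placeholder for SRV_EWGHT_RANGE; the real value is defined
 * in haproxy's headers and may differ. */
#define EWGHT_RANGE_ASSUMED 4096u

/* Stand-in 32-bit mixer (the MurmurHash3 finalizer). This is NOT
 * haproxy's full_hash(); it only plays the same role in this sketch. */
static uint32_t stand_in_hash(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Scheme restored by the fix: the server key is the ID scaled by the
 * weight range, and it is hashed once per node (assumed per-node step). */
static uint32_t node_key_scaled(uint32_t puid, uint32_t node_idx)
{
    uint32_t server_key = puid * EWGHT_RANGE_ASSUMED;
    return stand_in_hash(server_key + node_idx);
}

/* Regressed scheme: the server key is already a hash of the ID, so the
 * per-node step hashes a hash and the range scaling is gone. */
static uint32_t node_key_double_hashed(uint32_t puid, uint32_t node_idx)
{
    uint32_t server_key = stand_in_hash(puid);
    return stand_in_hash(server_key + node_idx);
}
```

With this stand-in mixer, which is a bijection on 32-bit values, node_key_scaled(id, j) and node_key_double_hashed(id, j) can only match when stand_in_hash(id) happens to equal id * EWGHT_RANGE_ASSUMED. In practice every node lands at a different point on the ring, so a fleet mixing both calculations would disagree on where keys map.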