Hello Dan,

On Fri, Jul 04, 2014 at 10:24:44PM -0700, Dan Dubovik wrote:
> Hello,
>
> Recently, we were trying to segment our account provisioning using HAProxy.
> We are having HAProxy to port NAT traffic to a backend, using the djb2
> hash to select the backend based on the Host header. When attempting to
> predict the backend that HAProxy would select, we were unable to come to
> the same results as HAProxy did.
>
> We dove into the code that HAProxy used to implement the djb2 hash, and
> discovered a bug in the map_get_server_hash function declaration. Where in
> the rest of the code, it uses an unsigned long for the hash value, the
> map_get_server_hash function uses an unsigned int.

You spotted an interesting thing, but in fact it's not that easy. Despite
"hash" being declared as long in most of these functions (note I said
"most" since it appears that get_server_sh() uses an int), it's mostly
used as a 32-bit quantity. chash_get_server_hash() uses an unsigned int as
well. And full_hash() reduces it to 32 bits. So in practice we should use
unsigned ints everywhere instead.

> The end result is that we have a consistent value chosen for a backend by
> HAProxy, but one that is unpredictable by a standard implementation of djb2.
>
> Attached is the patch we used that resolved this issue.

Unfortunately it will make things even worse, because not only will this
change all hashes for all deployed load balancers, which is hardly
acceptable, but additionally it will make the hash result dependent on the
machine's word size, meaning that people who are currently upgrading their
old 32-bit systems to 64-bit will have inconsistent hashing between the two.

Thus I'd rather fix all this by ensuring we're using unsigned ints
everywhere a hash result is used from backend.c. That will both maintain
compatibility with existing setups and ensure small and large systems
provide the same hash result.

The easiest way to do this would be to modify gen_hash() to return an
unsigned int and to replace all "unsigned long hash" occurrences with
unsigned ints. Is this something you'd be willing to do? (It would save me
an extra hour.)

Additionally, since you're checking your hash results, would you be
interested in working on a utility to run from the stats socket which would
give you the selected server for a given pattern? I've long wanted to do
that but I'm not sure how easy or complex it is now that we can hash many
things. It's basically the same as applying the LB algorithm, but we want to
bypass the data extraction so that we always hash the same thing.

Thus we could do:

    > get-server-hash backend1 10.20.30.40
    server1
    > get-server-hash backend2 /index.html
    server3

And ideally it would report the hash value, the server count (or farm's
weight) and the server's index. I've always thought it could be useful, but
never had the time to work on this. That seems pretty close to what you're
currently doing.

Best regards,
Willy
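
PS: to illustrate the word-size point, here is a minimal sketch of the
textbook djb2 loop with a 32-bit and a 64-bit accumulator. It is not a copy
of the code in the HAProxy tree. The names djb2_32(), djb2_64() and
pick_server() are made up for this example, and the plain modulo in
pick_server() is only a simplified stand-in for the real server selection,
which also takes weights into account.

```c
#include <stddef.h>
#include <stdint.h>

/* Textbook djb2 with a fixed 32-bit accumulator: the result is the same
 * on 32-bit and 64-bit machines because the arithmetic wraps at 2^32.
 */
uint32_t djb2_32(const char *key, size_t len)
{
	uint32_t hash = 5381;
	size_t i;

	for (i = 0; i < len; i++)
		hash = hash * 33 + (unsigned char)key[i];
	return hash;
}

/* Same loop with a 64-bit accumulator, which is what an "unsigned long"
 * gives on a typical 64-bit Unix system. For keys of more than a few
 * bytes the value no longer fits in 32 bits.
 */
uint64_t djb2_64(const char *key, size_t len)
{
	uint64_t hash = 5381;
	size_t i;

	for (i = 0; i < len; i++)
		hash = hash * 33 + (unsigned char)key[i];
	return hash;
}

/* Hypothetical helper: map a 32-bit hash to a server index using a plain
 * modulo. Returns 0 when there is no server, to avoid dividing by zero.
 */
unsigned int pick_server(uint32_t hash, unsigned int nbsrv)
{
	return nbsrv ? hash % nbsrv : 0;
}
```

For a key such as "hello", djb2_64() returns a value above 2^32 while
djb2_32() returns only its low 32 bits, so taking the modulo of each by the
same server count will in general designate different servers. Truncating
the 64-bit result to 32 bits gives back the djb2_32() value, which is why
using unsigned ints everywhere keeps both kinds of machines in agreement.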

