WT6) and optionally apply avalanche

Willy Tarreau Thu, 14 Nov 2013 06:26:10 -0800

Hi Bhaskar,

OK I'm finally done with this. Having reviewed the existing code
allowed me to change my mind on a few points.


1) "hash algorithm" => I realized that this naming is confusing
   because it's used in conjunction with the balance algorithm.
   In practice, both the terms "hash algorithm" or "hash function"
   are used, with the latter being much more common. So I changed
   again the HALG_* flags to HFCN_*.

2) These HFCN_* flags are not contained anymore in the BE_LB_ALGO
   mask, and this significantly simplifies the patches (no more mess
   trying to exclude some masks)

3) In order to make the code more homogenous, there a mask to detect
   the modifier as well, which right now may only be "avalanche".

4) I finally thought that despite "hash-type avalanche" not being used
   anywhere, it was a bit rude to remove it without any information.
   This can happen if someone uses an obsolete doc like those found on
   code.google.com/p/haproxy-docs. So now if "hash-type avalanche" is
   found, the error message explains how to replace it.

5) while running the tests at the end, I found that DJB2 was awful with
   numbers on the input (my test tool simply put a visitor number in a
   URL parameter which I used to balance on). I had 64 servers in the
   farm, and some of them took more than twice the load of the others.
   Worse, when applying consistent hashing on that, only two servers
   got the load for several seconds, then two other ones, then two
   other ones, etc... So in practice the load was spread over all the
   servers over the long term, but the instant load was running only
   on two among 64. Running with 33 servers with map-based resulted
   in only 10 servers taking traffic, as expected. Using sdbm with 64
   servers resulted in only half the servers getting traffic. Adding
   avalanche fixed the issue in both cases, but at the expense of a
   less smooth distribution. So in the end, I changed my mind regarding
   the wt6 function and accepted to reintroduce it because in my tests
   it performed better for such workloads in my tests since I can't
   produce these nasty patterns with it.

Now I have tested all combinations. Everything looks OK and I really
like this new ability to pick the best hashing function for the job
depending on what is being hashed (eg: neither sdbm nor djb2 are good
on IP addresses unless avalanche is used).

I'm attaching the 3 resulting patches here. Please tell me if that's
OK for you, in which case I'm going to merge them.

Thanks!
Willy

>From 07b0c569d23a57b0e5e4c261206fceaf61bf0d4a Mon Sep 17 00:00:00 2001
From: Bhaskar <[email protected]>
Date: Tue, 29 Oct 2013 23:30:51 -0400
Subject: MEDIUM: backend: Enhance hash-type directive with an algorithm
 options

Summary:
In testing at tumblr, we found that using djb2 hashing instead of the
default sdbm hashing resulted is better workload distribution to our backends.

This commit implements a change, that allows the user to specify the hash
function they want to use. It does not limit itself to consistent hashing
scenarios.

The supported hash functions are sdbm (default), and djb2.

For a discussion of the feature and analysis, see mailing list thread
"Consistent hashing alternative to sdbm" :

      http://marc.info/?l=haproxy&m=138213693909219

Note: This change does NOT make changes to new features, for instance,
applying an avalance hashing always being performed before applying
consistent hashing.
---
 Makefile                |  2 +-
 doc/configuration.txt   | 90 ++++++++++++++++++++++++++++++-------------------
 include/common/hash.h   | 28 +++++++++++++++
 include/types/backend.h |  6 ++++
 src/backend.c           | 76 +++++++++++++++++++++++++++--------------
 src/cfgparse.c          | 18 ++++++++--
 src/hash.c              | 61 +++++++++++++++++++++++++++++++++
 7 files changed, 217 insertions(+), 64 deletions(-)
 create mode 100644 include/common/hash.h
 create mode 100644 src/hash.c

diff --git a/Makefile b/Makefile
index 2acaee4..4c30cc0 100644
--- a/Makefile
+++ b/Makefile
@@ -637,7 +637,7 @@ OBJS = src/haproxy.o src/sessionhash.o src/base64.o 
src/protocol.o \
        src/stream_interface.o src/dumpstats.o src/proto_tcp.o \
        src/session.o src/hdr_idx.o src/ev_select.o src/signal.o \
        src/acl.o src/sample.o src/memory.o src/freq_ctr.o src/auth.o \
-       src/compression.o src/payload.o
+       src/compression.o src/payload.o src/hash.o
 
 EBTREE_OBJS = $(EBTREE_DIR)/ebtree.o \
               $(EBTREE_DIR)/eb32tree.o $(EBTREE_DIR)/eb64tree.o \
diff --git a/doc/configuration.txt b/doc/configuration.txt
index ba8057f..643afd9 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -2496,45 +2496,65 @@ grace <time>
   simplify it.
 
 
-hash-type <method>
+hash-type <method> <function>
   Specify a method to use for mapping hashes to servers
   May be used in sections :   defaults | frontend | listen | backend
                                  yes   |    no    |   yes  |   yes
   Arguments :
-    map-based   the hash table is a static array containing all alive servers.
-                The hashes will be very smooth, will consider weights, but will
-                be static in that weight changes while a server is up will be
-                ignored. This means that there will be no slow start. Also,
-                since a server is selected by its position in the array, most
-                mappings are changed when the server count changes. This means
-                that when a server goes up or down, or when a server is added
-                to a farm, most connections will be redistributed to different
-                servers. This can be inconvenient with caches for instance.
-
-    avalanche   this mechanism uses the default map-based hashing described
-                above but applies a full avalanche hash before performing the
-                mapping. The result is a slightly less smooth hash for most
-                situations, but the hash becomes better than pure map-based
-                hashes when the number of servers is a multiple of the size of
-                the input set. When using URI hash with a number of servers
-                multiple of 64, it's desirable to change the hash type to
-                this value.
-
-    consistent  the hash table is a tree filled with many occurrences of each
-                server. The hash key is looked up in the tree and the closest
-                server is chosen. This hash is dynamic, it supports changing
-                weights while the servers are up, so it is compatible with the
-                slow start feature. It has the advantage that when a server
-                goes up or down, only its associations are moved. When a server
-                is added to the farm, only a few part of the mappings are
-                redistributed, making it an ideal algorithm for caches.
-                However, due to its principle, the algorithm will never be very
-                smooth and it may sometimes be necessary to adjust a server's
-                weight or its ID to get a more balanced distribution. In order
-                to get the same distribution on multiple load balancers, it is
-                important that all servers have the same IDs.
-
-  The default hash type is "map-based" and is recommended for most usages.
+    <method> is the method used to select a server from the hash computed by
+             the <function> :
+
+      map-based   the hash table is a static array containing all alive 
servers.
+                  The hashes will be very smooth, will consider weights, but
+                  will be static in that weight changes while a server is up
+                  will be ignored. This means that there will be no slow start.
+                  Also, since a server is selected by its position in the 
array,
+                  most mappings are changed when the server count changes. This
+                  means that when a server goes up or down, or when a server is
+                  added to a farm, most connections will be redistributed to
+                  different servers. This can be inconvenient with caches for
+                  instance.
+
+      avalanche   this mechanism uses the default map-based hashing described
+                  above but applies a full avalanche hash before performing the
+                  mapping. The result is a slightly less smooth hash for most
+                  situations, but the hash becomes better than pure map-based
+                  hashes when the number of servers is a multiple of the size 
of
+                  the input set. When using URI hash with a number of servers
+                  multiple of 64, it's desirable to change the hash type to
+                  this value.
+
+      consistent  the hash table is a tree filled with many occurrences of each
+                  server. The hash key is looked up in the tree and the closest
+                  server is chosen. This hash is dynamic, it supports changing
+                  weights while the servers are up, so it is compatible with 
the
+                  slow start feature. It has the advantage that when a server
+                  goes up or down, only its associations are moved. When a
+                  server is added to the farm, only a few part of the mappings
+                  are redistributed, making it an ideal method for caches.
+                  However, due to its principle, the distribution will never be
+                  very smooth and it may sometimes be necessary to adjust a
+                  server's weight or its ID to get a more balanced 
distribution.
+                  In order to get the same distribution on multiple load
+                  balancers, it is important that all servers have the exact
+                  same IDs. Note: by default, a full avalanche hash is always
+                  performed before applying the consistent hash.
+
+    <function> is the hash function to be used :
+
+       sdbm   this function was created intially for sdbm (a public-domain
+              reimplementation of ndbm) database library. It was found to do
+              well in scrambling bits, causing better distribution of the keys
+              and fewer splits. It also happens to be a good general hashing
+              function with good distribution.
+
+       djb2   this function was first proposed by Dan Bernstein many years ago
+              on comp.lang.c. Studies have shown that for certain workload this
+              function provides a better distribution than sdbm.
+
+  The default hash type is "map-based" and is recommended for most usages. The
+  default function is "sdbm", the selection of a function should be based on
+  the range of the values being hashed.
 
   See also : "balance", "server"
 
diff --git a/include/common/hash.h b/include/common/hash.h
new file mode 100644
index 0000000..f6875c9
--- /dev/null
+++ b/include/common/hash.h
@@ -0,0 +1,28 @@
+/*
+ * include/common/hash.h
+ * Macros for different hashing function.
+ *
+ * Copyright (C) 2000-2011 Willy Tarreau - [email protected]
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation, version 2.1
+ * exclusively.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  
USA
+ */
+
+#ifndef _COMMON_HASH_H_
+#define _COMMON_HASH_H_
+
+unsigned long hash_djb2(const char *key, int len);
+unsigned long hash_sdbm(const char *key, int len);
+
+#endif /* _COMMON_HASH_H_ */
diff --git a/include/types/backend.h b/include/types/backend.h
index 1183b36..1a433cf 100644
--- a/include/types/backend.h
+++ b/include/types/backend.h
@@ -108,6 +108,12 @@
 #define BE_LB_HASH_AVAL   0x200000 /* run an avalanche hash before a map */
 #define BE_LB_HASH_TYPE   0x300000 /* get/clear hash types */
 
+/* BE_LB_HFCN_* is the hash function, to be used with BE_LB_HASH_FUNC */
+#define BE_LB_HFCN_SDBM   0x000000 /* sdbm hash */
+#define BE_LB_HFCN_DJB2   0x400000 /* djb2 hash */
+#define BE_LB_HASH_FUNC   0xC00000 /* get/clear hash function */
+
+
 /* various constants */
 
 /* The scale factor between user weight and effective weight allows smooth
diff --git a/src/backend.c b/src/backend.c
index 48c8761..bf4a81d 100644
--- a/src/backend.c
+++ b/src/backend.c
@@ -23,6 +23,7 @@
 #include <common/compat.h>
 #include <common/config.h>
 #include <common/debug.h>
+#include <common/hash.h>
 #include <common/ticks.h>
 #include <common/time.h>
 
@@ -51,6 +52,25 @@
 #include <proto/stream_interface.h>
 #include <proto/task.h>
 
+/* helper function to invoke the correct hash method */
+static unsigned long gen_hash(const struct proxy* px, const char* key, 
unsigned long len)
+{
+       unsigned long hash;
+
+       switch (px->lbprm.algo & BE_LB_HASH_FUNC) {
+       case BE_LB_HFCN_DJB2:
+               hash = hash_djb2(key, len);
+               break;
+       case BE_LB_HFCN_SDBM:
+               /* this is the default hash function */
+       default:
+               hash = hash_sdbm(key, len);
+               break;
+       }
+
+       return hash;
+}
+
 /*
  * This function recounts the number of usable active and backup servers for
  * proxy <p>. These numbers are returned into the p->srv_act and p->srv_bck.
@@ -153,6 +173,7 @@ struct server *get_server_uh(struct proxy *px, char *uri, 
int uri_len)
        unsigned long hash = 0;
        int c;
        int slashes = 0;
+       const char *start, *end;
 
        if (px->lbprm.tot_weight == 0)
                return NULL;
@@ -164,8 +185,9 @@ struct server *get_server_uh(struct proxy *px, char *uri, 
int uri_len)
        if (px->uri_len_limit)
                uri_len = MIN(uri_len, px->uri_len_limit);
 
+       start = end = uri;
        while (uri_len--) {
-               c = *uri++;
+               c = *end++;
                if (c == '/') {
                        slashes++;
                        if (slashes == px->uri_dirs_depth1) /* depth+1 */
@@ -173,9 +195,10 @@ struct server *get_server_uh(struct proxy *px, char *uri, 
int uri_len)
                }
                else if (c == '?' && !px->uri_whole)
                        break;
-
-               hash = c + (hash << 6) + (hash << 16) - hash;
        }
+
+       hash = gen_hash(px, start, (end - start));
+
        if ((px->lbprm.algo & BE_LB_HASH_TYPE) != BE_LB_HASH_MAP)
                hash = full_hash(hash);
  hash_done:
@@ -197,6 +220,7 @@ struct server *get_server_uh(struct proxy *px, char *uri, 
int uri_len)
 struct server *get_server_ph(struct proxy *px, const char *uri, int uri_len)
 {
        unsigned long hash = 0;
+       const char *start, *end;
        const char *p;
        const char *params;
        int plen;
@@ -223,15 +247,18 @@ struct server *get_server_ph(struct proxy *px, const char 
*uri, int uri_len)
                                 * skip the equal symbol
                                 */
                                p += plen + 1;
+                               start = end = p;
                                uri_len -= plen + 1;
 
-                               while (uri_len && *p != '&') {
-                                       hash = *p + (hash << 6) + (hash << 16) 
- hash;
+                               while (uri_len && *end != '&') {
                                        uri_len--;
-                                       p++;
+                                       end++;
                                }
+                               hash = gen_hash(px, start, (end - start));
+
                                if ((px->lbprm.algo & BE_LB_HASH_TYPE) != 
BE_LB_HASH_MAP)
                                        hash = full_hash(hash);
+
                                if (px->lbprm.algo & BE_LB_LKUP_CHTREE)
                                        return chash_get_server_hash(px, hash);
                                else
@@ -263,6 +290,7 @@ struct server *get_server_ph_post(struct session *s)
        unsigned long    len  = msg->body_len;
        const char      *params = b_ptr(req->buf, (int)(msg->sov - 
req->buf->o));
        const char      *p    = params;
+       const char      *start, *end;
 
        if (len > buffer_len(req->buf) - msg->sov)
                len = buffer_len(req->buf) - msg->sov;
@@ -282,12 +310,13 @@ struct server *get_server_ph_post(struct session *s)
                                 * skip the equal symbol
                                 */
                                p += plen + 1;
+                               start = end = p;
                                len -= plen + 1;
 
-                               while (len && *p != '&') {
+                               while (len && *end != '&') {
                                        if (unlikely(!HTTP_IS_TOKEN(*p))) {
                                                /* if in a POST, body must be 
URI encoded or it's not a URI.
-                                                * Do not interprete any 
possible binary data as a parameter.
+                                                * Do not interpret any 
possible binary data as a parameter.
                                                 */
                                                if (likely(HTTP_IS_LWS(*p))) /* 
eol, uncertain uri len */
                                                        break;
@@ -295,13 +324,15 @@ struct server *get_server_ph_post(struct session *s)
                                                                              * 
This body does not contain parameters.
                                                                              */
                                        }
-                                       hash = *p + (hash << 6) + (hash << 16) 
- hash;
                                        len--;
-                                       p++;
+                                       end++;
                                        /* should we break if vlen exceeds 
limit? */
                                }
+                               hash = gen_hash(px, start, (end - start));
+
                                if ((px->lbprm.algo & BE_LB_HASH_TYPE) != 
BE_LB_HASH_MAP)
                                        hash = full_hash(hash);
+
                                if (px->lbprm.algo & BE_LB_LKUP_CHTREE)
                                        return chash_get_server_hash(px, hash);
                                else
@@ -338,6 +369,7 @@ struct server *get_server_hh(struct session *s)
        unsigned long    len;
        struct hdr_ctx   ctx;
        const char      *p;
+       const char *start, *end;
 
        /* tot_weight appears to mean srv_count */
        if (px->lbprm.tot_weight == 0)
@@ -362,14 +394,11 @@ struct server *get_server_hh(struct session *s)
        len = ctx.vlen;
        p = (char *)ctx.line + ctx.val;
        if (!px->hh_match_domain) {
-               while (len) {
-                       hash = *p + (hash << 6) + (hash << 16) - hash;
-                       len--;
-                       p++;
-               }
+               hash = gen_hash(px, p, len);
        } else {
                int dohash = 0;
                p += len - 1;
+               start = end = p;
                /* special computation, use only main domain name, not tld/host
                 * going back from the end of string, start hashing at first
                 * dot stop at next.
@@ -378,17 +407,20 @@ struct server *get_server_hh(struct session *s)
                 */
                while (len) {
                        if (*p == '.') {
-                               if (!dohash)
+                               if (!dohash) {
                                        dohash = 1;
+                                       start = end = p - 1;
+                               }
                                else
                                        break;
                        } else {
                                if (dohash)
-                                       hash = *p + (hash << 6) + (hash << 16) 
- hash;
+                                       start--;
                        }
                        len--;
                        p--;
                }
+               hash = gen_hash(px, start, (end - start));
        }
        if ((px->lbprm.algo & BE_LB_HASH_TYPE) != BE_LB_HASH_MAP)
                hash = full_hash(hash);
@@ -405,7 +437,6 @@ struct server *get_server_rch(struct session *s)
        unsigned long    hash = 0;
        struct proxy    *px   = s->be;
        unsigned long    len;
-       const char      *p;
        int              ret;
        struct sample    smp;
        int rewind;
@@ -433,12 +464,8 @@ struct server *get_server_rch(struct session *s)
        /* Found a the hh_name in the headers.
         * we will compute the hash based on this value ctx.val.
         */
-       p = smp.data.str.str;
-       while (len) {
-               hash = *p + (hash << 6) + (hash << 16) - hash;
-               len--;
-               p++;
-       }
+       hash = gen_hash(px, smp.data.str.str, len);
+
        if ((px->lbprm.algo & BE_LB_HASH_TYPE) != BE_LB_HASH_MAP)
                hash = full_hash(hash);
  hash_done:
@@ -1318,7 +1345,6 @@ int backend_parse_balance(const char **args, char **err, 
struct proxy *curproxy)
                        }
                        curproxy->hh_match_domain = 1;
                }
-
        }
        else if (!strncmp(args[0], "rdp-cookie", 10)) {
                curproxy->lbprm.algo &= ~BE_LB_ALGO;
diff --git a/src/cfgparse.c b/src/cfgparse.c
index 41c1949..051f600 100644
--- a/src/cfgparse.c
+++ b/src/cfgparse.c
@@ -4085,19 +4085,18 @@ stats_error_parsing:
                }
        }
        else if (!strcmp(args[0], "hash-type")) { /* set hashing method */
+               curproxy->lbprm.algo &= ~(BE_LB_HASH_TYPE | BE_LB_HASH_FUNC);
+
                if (warnifnotcap(curproxy, PR_CAP_BE, file, linenum, args[0], 
NULL))
                        err_code |= ERR_WARN;
 
                if (strcmp(args[1], "consistent") == 0) {       /* use 
consistent hashing */
-                       curproxy->lbprm.algo &= ~BE_LB_HASH_TYPE;
                        curproxy->lbprm.algo |= BE_LB_HASH_CONS;
                }
                else if (strcmp(args[1], "map-based") == 0) {   /* use 
map-based hashing */
-                       curproxy->lbprm.algo &= ~BE_LB_HASH_TYPE;
                        curproxy->lbprm.algo |= BE_LB_HASH_MAP;
                }
                else if (strcmp(args[1], "avalanche") == 0) {   /* use full 
hash before map-based hashing */
-                       curproxy->lbprm.algo &= ~BE_LB_HASH_TYPE;
                        curproxy->lbprm.algo |= BE_LB_HASH_AVAL;
                }
                else {
@@ -4105,6 +4104,19 @@ stats_error_parsing:
                        err_code |= ERR_ALERT | ERR_FATAL;
                        goto out;
                }
+
+               /* set the hash function to use */
+               if (!*args[2]) {
+                       curproxy->lbprm.algo |= BE_LB_HFCN_SDBM;
+               } else if (!strcmp(args[2], "sdbm")) {
+                       curproxy->lbprm.algo |= BE_LB_HFCN_SDBM;
+               } else if (!strcmp(args[2], "djb2")) {
+                       curproxy->lbprm.algo |= BE_LB_HFCN_DJB2;
+               } else {
+                       Alert("parsing [%s:%d] : '%s' only supports 'sdbm' and 
'djb2' hash functions.\n", file, linenum, args[0]);
+                       err_code |= ERR_ALERT | ERR_FATAL;
+                       goto out;
+               }
        }
        else if (!strcmp(args[0], "server") || !strcmp(args[0], 
"default-server")) {  /* server address */
                int cur_arg;
diff --git a/src/hash.c b/src/hash.c
new file mode 100644
index 0000000..8da3003
--- /dev/null
+++ b/src/hash.c
@@ -0,0 +1,61 @@
+/*
+ * Hash function implementation
+ *
+ * See mailing list thread on "Consistent hashing alternative to sdbm"
+ * http://marc.info/?l=haproxy&m=138213693909219
+ *
+ * Copyright 2000-2010 Willy Tarreau <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ */
+
+
+#include <common/hash.h>
+
+
+unsigned long hash_djb2(const char *key, int len)
+{
+       unsigned long hash = 5381;
+
+       /* the hash unrolled eight times */
+       for (; len >= 8; len -= 8) {
+               hash = ((hash << 5) + hash) + *key++;
+               hash = ((hash << 5) + hash) + *key++;
+               hash = ((hash << 5) + hash) + *key++;
+               hash = ((hash << 5) + hash) + *key++;
+               hash = ((hash << 5) + hash) + *key++;
+               hash = ((hash << 5) + hash) + *key++;
+               hash = ((hash << 5) + hash) + *key++;
+               hash = ((hash << 5) + hash) + *key++;
+       }
+       switch (len) {
+               case 7: hash = ((hash << 5) + hash) + *key++; /* fallthrough... 
*/
+               case 6: hash = ((hash << 5) + hash) + *key++; /* fallthrough... 
*/
+               case 5: hash = ((hash << 5) + hash) + *key++; /* fallthrough... 
*/
+               case 4: hash = ((hash << 5) + hash) + *key++; /* fallthrough... 
*/
+               case 3: hash = ((hash << 5) + hash) + *key++; /* fallthrough... 
*/
+               case 2: hash = ((hash << 5) + hash) + *key++; /* fallthrough... 
*/
+               case 1: hash = ((hash << 5) + hash) + *key++; break;
+               default: /* case 0: */ break;
+       }
+       return hash;
+}
+
+unsigned long hash_sdbm(const char *key, int len)
+{
+       unsigned long hash = 0;
+       int c;
+
+       while (len--) {
+               c = *key++;
+               hash = c + (hash << 6) + (hash << 16) - hash;
+       }
+
+       return hash;
+}
+
+
-- 
1.7.12.1

>From fee609f600ae4e258e39f1fb0a7f4a7826a56543 Mon Sep 17 00:00:00 2001
From: Bhaskar Maddala <[email protected]>
Date: Tue, 5 Nov 2013 11:54:02 -0500
Subject: MEDIUM: backend: Implement avalanche as a modifier of the hashing
 functions.

Summary:
Avalanche is supported not as a native hashing choice, but a modifier
on the hashing function. Note that this means that possible configs
written after 1.5-dev4 using "hash-type avalanche" will get an informative
error instead. But as discussed on the mailing list it seems nobody ever
used it anyway, so let's fix it before the final 1.5 release.

The default values were selected for backward compatibility with previous
releases, as discussed on the mailing list, which means that the consistent
hashing will still apply the avalanche hash by default when no explicit
algorithm is specified.

Examples
  (default) hash-type map-based
        Map based hashing using sdbm without avalanche

  (default) hash-type consistent
        Consistent hashing using sdbm with avalanche

Additional Examples:

  (a) hash-type map-based sdbm
        Same as default for map-based above
  (b) hash-type map-based sdbm avalanche
        Map based hashing using sdbm with avalanche
  (c) hash-type map-based djb2
        Map based hashing using djb2 without avalanche
  (d) hash-type map-based djb2 avalanche
        Map based hashing using djb2 with avalanche
  (e) hash-type consistent sdbm avalanche
        Same as default for consistent above
  (f) hash-type consistent sdbm
        Consistent hashing using sdbm without avalanche
  (g) hash-type consistent djb2
        Consistent hashing using djb2 without avalanche
  (h) hash-type consistent djb2 avalanche
        Consistent hashing using djb2 with avalanche
---
 doc/configuration.txt   | 39 ++++++++++++++++++++++++--------------
 include/types/backend.h |  7 +++++--
 src/backend.c           | 12 ++++++------
 src/cfgparse.c          | 50 ++++++++++++++++++++++++++++++++++++++-----------
 4 files changed, 75 insertions(+), 33 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 643afd9..c430ec3 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -2496,7 +2496,7 @@ grace <time>
   simplify it.
 
 
-hash-type <method> <function>
+hash-type <method> <function> <modifier>
   Specify a method to use for mapping hashes to servers
   May be used in sections :   defaults | frontend | listen | backend
                                  yes   |    no    |   yes  |   yes
@@ -2515,15 +2515,6 @@ hash-type <method> <function>
                   different servers. This can be inconvenient with caches for
                   instance.
 
-      avalanche   this mechanism uses the default map-based hashing described
-                  above but applies a full avalanche hash before performing the
-                  mapping. The result is a slightly less smooth hash for most
-                  situations, but the hash becomes better than pure map-based
-                  hashes when the number of servers is a multiple of the size 
of
-                  the input set. When using URI hash with a number of servers
-                  multiple of 64, it's desirable to change the hash type to
-                  this value.
-
       consistent  the hash table is a tree filled with many occurrences of each
                   server. The hash key is looked up in the tree and the closest
                   server is chosen. This hash is dynamic, it supports changing
@@ -2537,8 +2528,8 @@ hash-type <method> <function>
                   server's weight or its ID to get a more balanced 
distribution.
                   In order to get the same distribution on multiple load
                   balancers, it is important that all servers have the exact
-                  same IDs. Note: by default, a full avalanche hash is always
-                  performed before applying the consistent hash.
+                  same IDs. Note: consistent hash uses sdbm and avalanche if no
+                  hash function is specified.
 
     <function> is the hash function to be used :
 
@@ -2546,11 +2537,31 @@ hash-type <method> <function>
               reimplementation of ndbm) database library. It was found to do
               well in scrambling bits, causing better distribution of the keys
               and fewer splits. It also happens to be a good general hashing
-              function with good distribution.
+              function with good distribution, unless the total server weight
+              is a multiple of 64, in which case applying the avalanche
+              modifier may help.
 
        djb2   this function was first proposed by Dan Bernstein many years ago
               on comp.lang.c. Studies have shown that for certain workload this
-              function provides a better distribution than sdbm.
+              function provides a better distribution than sdbm. It generally
+              works well with text-based inputs though it can perform extremely
+              poorly with numeric-only input or when the total server weight is
+              a multiple of 33, unless the avalanche modifier is also used.
+
+    <modifier> indicates an optional method applied after hashing the key :
+
+       avalanche   This directive indicates that the result from the hash
+                   function above should not be used in its raw form but that
+                   a 4-byte full avalanche hash must be applied first. The
+                   purpose of this step is to mix the resulting bits from the
+                   previous hash in order to avoid any undesired effect when
+                   the input contains some limited values or when the number of
+                   servers is a multiple of one of the hash's components (64
+                   for SDBM, 33 for DJB2). Enabling avalanche tends to make the
+                   result less predictable, but it's also not as smooth as when
+                   using the original function. Some testing might be needed
+                   with some workloads. This hash is one of the many proposed
+                   by Bob Jenkins.
 
   The default hash type is "map-based" and is recommended for most usages. The
   default function is "sdbm", the selection of a function should be based on
diff --git a/include/types/backend.h b/include/types/backend.h
index 1a433cf..57d17bd 100644
--- a/include/types/backend.h
+++ b/include/types/backend.h
@@ -105,8 +105,11 @@
 /* hash types */
 #define BE_LB_HASH_MAP    0x000000 /* map-based hash (default) */
 #define BE_LB_HASH_CONS   0x100000 /* consistent hashbit to indicate a dynamic 
algorithm */
-#define BE_LB_HASH_AVAL   0x200000 /* run an avalanche hash before a map */
-#define BE_LB_HASH_TYPE   0x300000 /* get/clear hash types */
+#define BE_LB_HASH_TYPE   0x100000 /* get/clear hash types */
+
+/* additional modifier on top of the hash function (only avalanche right now) 
*/
+#define BE_LB_HMOD_AVAL   0x200000  /* avalanche modifier */
+#define BE_LB_HASH_MOD    0x200000  /* get/clear hash modifier */
 
 /* BE_LB_HFCN_* is the hash function, to be used with BE_LB_HASH_FUNC */
 #define BE_LB_HFCN_SDBM   0x000000 /* sdbm hash */
diff --git a/src/backend.c b/src/backend.c
index bf4a81d..e044430 100644
--- a/src/backend.c
+++ b/src/backend.c
@@ -147,7 +147,7 @@ struct server *get_server_sh(struct proxy *px, const char 
*addr, int len)
                h ^= ntohl(*(unsigned int *)(&addr[l]));
                l += sizeof (int);
        }
-       if ((px->lbprm.algo & BE_LB_HASH_TYPE) != BE_LB_HASH_MAP)
+       if ((px->lbprm.algo & BE_LB_HASH_MOD) == BE_LB_HMOD_AVAL)
                h = full_hash(h);
  hash_done:
        if (px->lbprm.algo & BE_LB_LKUP_CHTREE)
@@ -199,7 +199,7 @@ struct server *get_server_uh(struct proxy *px, char *uri, 
int uri_len)
 
        hash = gen_hash(px, start, (end - start));
 
-       if ((px->lbprm.algo & BE_LB_HASH_TYPE) != BE_LB_HASH_MAP)
+       if ((px->lbprm.algo & BE_LB_HASH_MOD) == BE_LB_HMOD_AVAL)
                hash = full_hash(hash);
  hash_done:
        if (px->lbprm.algo & BE_LB_LKUP_CHTREE)
@@ -256,7 +256,7 @@ struct server *get_server_ph(struct proxy *px, const char 
*uri, int uri_len)
                                }
                                hash = gen_hash(px, start, (end - start));
 
-                               if ((px->lbprm.algo & BE_LB_HASH_TYPE) != 
BE_LB_HASH_MAP)
+                               if ((px->lbprm.algo & BE_LB_HASH_MOD) == 
BE_LB_HMOD_AVAL)
                                        hash = full_hash(hash);
 
                                if (px->lbprm.algo & BE_LB_LKUP_CHTREE)
@@ -330,7 +330,7 @@ struct server *get_server_ph_post(struct session *s)
                                }
                                hash = gen_hash(px, start, (end - start));
 
-                               if ((px->lbprm.algo & BE_LB_HASH_TYPE) != 
BE_LB_HASH_MAP)
+                               if ((px->lbprm.algo & BE_LB_HASH_MOD) == 
BE_LB_HMOD_AVAL)
                                        hash = full_hash(hash);
 
                                if (px->lbprm.algo & BE_LB_LKUP_CHTREE)
@@ -422,7 +422,7 @@ struct server *get_server_hh(struct session *s)
                }
                hash = gen_hash(px, start, (end - start));
        }
-       if ((px->lbprm.algo & BE_LB_HASH_TYPE) != BE_LB_HASH_MAP)
+       if ((px->lbprm.algo & BE_LB_HASH_MOD) == BE_LB_HMOD_AVAL)
                hash = full_hash(hash);
  hash_done:
        if (px->lbprm.algo & BE_LB_LKUP_CHTREE)
@@ -466,7 +466,7 @@ struct server *get_server_rch(struct session *s)
         */
        hash = gen_hash(px, smp.data.str.str, len);
 
-       if ((px->lbprm.algo & BE_LB_HASH_TYPE) != BE_LB_HASH_MAP)
+       if ((px->lbprm.algo & BE_LB_HASH_MOD) == BE_LB_HMOD_AVAL)
                hash = full_hash(hash);
  hash_done:
        if (px->lbprm.algo & BE_LB_LKUP_CHTREE)
diff --git a/src/cfgparse.c b/src/cfgparse.c
index 051f600..bb79a34 100644
--- a/src/cfgparse.c
+++ b/src/cfgparse.c
@@ -4085,7 +4085,13 @@ stats_error_parsing:
                }
        }
        else if (!strcmp(args[0], "hash-type")) { /* set hashing method */
-               curproxy->lbprm.algo &= ~(BE_LB_HASH_TYPE | BE_LB_HASH_FUNC);
+               /**
+                * The syntax for hash-type config element is
+                * hash-type {map-based|consistent} [[<algo>] avalanche]
+                *
+                * The default hash function is sdbm for map-based and 
sdbm+avalanche for consistent.
+                */
+               curproxy->lbprm.algo &= ~(BE_LB_HASH_TYPE | BE_LB_HASH_FUNC | 
BE_LB_HASH_MOD);
 
                if (warnifnotcap(curproxy, PR_CAP_BE, file, linenum, args[0], 
NULL))
                        err_code |= ERR_WARN;
@@ -4096,26 +4102,48 @@ stats_error_parsing:
                else if (strcmp(args[1], "map-based") == 0) {   /* use 
map-based hashing */
                        curproxy->lbprm.algo |= BE_LB_HASH_MAP;
                }
-               else if (strcmp(args[1], "avalanche") == 0) {   /* use full 
hash before map-based hashing */
-                       curproxy->lbprm.algo |= BE_LB_HASH_AVAL;
+               else if (strcmp(args[1], "avalanche") == 0) {
+                       Alert("parsing [%s:%d] : experimental feature '%s %s' 
is not supported anymore, please use '%s map-based sdbm avalanche' instead.\n", 
file, linenum, args[0], args[1], args[0]);
+                       err_code |= ERR_ALERT | ERR_FATAL;
+                       goto out;
                }
                else {
-                       Alert("parsing [%s:%d] : '%s' only supports 
'avalanche', 'consistent' and 'map-based'.\n", file, linenum, args[0]);
+                       Alert("parsing [%s:%d] : '%s' only supports 
'consistent' and 'map-based'.\n", file, linenum, args[0]);
                        err_code |= ERR_ALERT | ERR_FATAL;
                        goto out;
                }
 
                /* set the hash function to use */
                if (!*args[2]) {
+                       /* the default algo is sdbm */
                        curproxy->lbprm.algo |= BE_LB_HFCN_SDBM;
-               } else if (!strcmp(args[2], "sdbm")) {
-                       curproxy->lbprm.algo |= BE_LB_HFCN_SDBM;
-               } else if (!strcmp(args[2], "djb2")) {
-                       curproxy->lbprm.algo |= BE_LB_HFCN_DJB2;
+
+                       /* if consistent with no argument, then avalanche 
modifier is also applied */
+                       if ((curproxy->lbprm.algo & BE_LB_HASH_TYPE) == 
BE_LB_HASH_CONS)
+                               curproxy->lbprm.algo |= BE_LB_HMOD_AVAL;
                } else {
-                       Alert("parsing [%s:%d] : '%s' only supports 'sdbm' and 
'djb2' hash functions.\n", file, linenum, args[0]);
-                       err_code |= ERR_ALERT | ERR_FATAL;
-                       goto out;
+                       /* set the hash function */
+                       if (!strcmp(args[2], "sdbm")) {
+                               curproxy->lbprm.algo |= BE_LB_HFCN_SDBM;
+                       }
+                       else if (!strcmp(args[2], "djb2")) {
+                               curproxy->lbprm.algo |= BE_LB_HFCN_DJB2;
+                       }
+                       else {
+                               Alert("parsing [%s:%d] : '%s' only supports 
'sdbm' and 'djb2' hash functions.\n", file, linenum, args[0]);
+                               err_code |= ERR_ALERT | ERR_FATAL;
+                               goto out;
+                       }
+
+                       /* set the hash modifier */
+                       if (!strcmp(args[3], "avalanche")) {
+                               curproxy->lbprm.algo |= BE_LB_HMOD_AVAL;
+                       }
+                       else if (*args[3]) {
+                               Alert("parsing [%s:%d] : '%s' only supports 
'avalanche' as a modifier for hash functions.\n", file, linenum, args[0]);
+                               err_code |= ERR_ALERT | ERR_FATAL;
+                               goto out;
+                       }
                }
        }
        else if (!strcmp(args[0], "server") || !strcmp(args[0], 
"default-server")) {  /* server address */
-- 
1.7.12.1

>From 86844eb8526edc6e8895fc2663786ad6488d1a70 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <[email protected]>
Date: Thu, 14 Nov 2013 14:30:35 +0100
Subject: MEDIUM: backend: add support for the wt6 hash

This function was designed for haproxy while testing other functions
in the past. Initially it was not planned to be used given the not
very interesting numbers it showed on real URL data : it is not as
smooth as the other ones. But later tests showed that the other ones
are extremely sensible to the server count and the type of input data,
especially DJB2 which must not be used on numeric input. So in fact
this function is still a generally average performer and it can make
sense to merge it in the end, as it can provide an alternative to
sdbm+avalanche or djb2+avalanche for consistent hashing or when hashing
on numeric data such as a source IP address or a visitor identifier in
a URL parameter.
---
 doc/configuration.txt   |  8 ++++++++
 include/common/hash.h   |  1 +
 include/types/backend.h |  1 +
 src/backend.c           |  3 +++
 src/cfgparse.c          |  4 +++-
 src/hash.c              | 27 +++++++++++++++++++++++++++
 6 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index c430ec3..0b4844b 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -2548,6 +2548,14 @@ hash-type <method> <function> <modifier>
               poorly with numeric-only input or when the total server weight is
               a multiple of 33, unless the avalanche modifier is also used.
 
+       wt6    this function was designed for haproxy while testing other
+              functions in the past. It is not as smooth as the other ones, but
+              is much less sensible to the input data set or to the number of
+              servers. It can make sense as an alternative to sdbm+avalanche or
+              djb2+avalanche for consistent hashing or when hashing on numeric
+              data such as a source IP address or a visitor identifier in a URL
+              parameter.
+
     <modifier> indicates an optional method applied after hashing the key :
 
        avalanche   This directive indicates that the result from the hash
diff --git a/include/common/hash.h b/include/common/hash.h
index f6875c9..379bf89 100644
--- a/include/common/hash.h
+++ b/include/common/hash.h
@@ -23,6 +23,7 @@
 #define _COMMON_HASH_H_
 
 unsigned long hash_djb2(const char *key, int len);
+unsigned long hash_wt6(const char *key, int len);
 unsigned long hash_sdbm(const char *key, int len);
 
 #endif /* _COMMON_HASH_H_ */
diff --git a/include/types/backend.h b/include/types/backend.h
index 57d17bd..6325be2 100644
--- a/include/types/backend.h
+++ b/include/types/backend.h
@@ -114,6 +114,7 @@
 /* BE_LB_HFCN_* is the hash function, to be used with BE_LB_HASH_FUNC */
 #define BE_LB_HFCN_SDBM   0x000000 /* sdbm hash */
 #define BE_LB_HFCN_DJB2   0x400000 /* djb2 hash */
+#define BE_LB_HFCN_WT6    0x800000 /* wt6 hash */
 #define BE_LB_HASH_FUNC   0xC00000 /* get/clear hash function */
 
 
diff --git a/src/backend.c b/src/backend.c
index e044430..7689cea 100644
--- a/src/backend.c
+++ b/src/backend.c
@@ -61,6 +61,9 @@ static unsigned long gen_hash(const struct proxy* px, const 
char* key, unsigned
        case BE_LB_HFCN_DJB2:
                hash = hash_djb2(key, len);
                break;
+       case BE_LB_HFCN_WT6:
+               hash = hash_wt6(key, len);
+               break;
        case BE_LB_HFCN_SDBM:
                /* this is the default hash function */
        default:
diff --git a/src/cfgparse.c b/src/cfgparse.c
index bb79a34..28507dd 100644
--- a/src/cfgparse.c
+++ b/src/cfgparse.c
@@ -4128,9 +4128,11 @@ stats_error_parsing:
                        }
                        else if (!strcmp(args[2], "djb2")) {
                                curproxy->lbprm.algo |= BE_LB_HFCN_DJB2;
+                       } else if (!strcmp(args[2], "wt6")) {
+                               curproxy->lbprm.algo |= BE_LB_HFCN_WT6;
                        }
                        else {
-                               Alert("parsing [%s:%d] : '%s' only supports 
'sdbm' and 'djb2' hash functions.\n", file, linenum, args[0]);
+                               Alert("parsing [%s:%d] : '%s' only supports 
'sdbm', 'djb2' or 'wt6' hash functions.\n", file, linenum, args[0]);
                                err_code |= ERR_ALERT | ERR_FATAL;
                                goto out;
                        }
diff --git a/src/hash.c b/src/hash.c
index 8da3003..034685e 100644
--- a/src/hash.c
+++ b/src/hash.c
@@ -17,6 +17,33 @@
 #include <common/hash.h>
 
 
+unsigned long hash_wt6(const char *key, int len)
+{
+       unsigned h0 = 0xa53c965aUL;
+       unsigned h1 = 0x5ca6953aUL;
+       unsigned step0 = 6;
+       unsigned step1 = 18;
+
+       for (; len > 0; len--) {
+               unsigned int t;
+
+               t = ((unsigned int)*key);
+               key++;
+
+               h0 = ~(h0 ^ t);
+               h1 = ~(h1 + t);
+
+               t  = (h1 << step0) | (h1 >> (32-step0));
+               h1 = (h0 << step1) | (h0 >> (32-step1));
+               h0 = t;
+
+               t = ((h0 >> 16) ^ h1) & 0xffff;
+               step0 = t & 0x1F;
+               step1 = t >> 11;
+       }
+       return h0 ^ h1;
+}
+
 unsigned long hash_djb2(const char *key, int len)
 {
        unsigned long hash = 5381;
-- 
1.7.12.1

Re: [PATCH v2] Add ability to select HASH function amongst (SDBM/DJB2/WT6) and optionally apply avalanche

Reply via email to