Re: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server address

2024-04-01 Thread Anthony Deschamps
Hi Willy,

Those changes are easy enough to make, so I've attached the patch again
with those changes. I had to make a few small adjustments to the commit
message anyway (some things that changed as a result of reviewing this
path). Let me know if there's anything else that I can help with.

Thank you,
Anthony
From 30896b2f6368cd022bfb574d4d5f647483b79a59 Mon Sep 17 00:00:00 2001
From: Anthony Deschamps 
Date: Fri, 16 Feb 2024 16:00:35 -0500
Subject: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server
 address

Motivation: When services are discovered through DNS resolution, the order in
which DNS records get resolved and assigned to servers is arbitrary. Therefore,
even though two HAProxy instances using chash balancing might agree that a
particular request should go to server3, it is likely the case that they have
assigned different IPs and ports to the server in that slot.

This patch adds a server option, "hash-key " which can be set to "id" (the
existing behaviour, default), "addr", or "addr-port". By deriving the keys for
the chash tree nodes from a server's address and port we ensure that independent
HAProxy instances will agree on routing decisions. If an address is not known
then the key is derived from the server's puid as it was previously.

When adjusting a server's weight, we now check whether the server's hash has
changed. If it has, we have to remove all its nodes first, since the node keys
will also have to change.
---
 doc/configuration.txt  | 25 +++
 include/haproxy/server-t.h |  9 
 src/lb_chash.c | 91 +++---
 src/server.c   | 31 +
 4 files changed, 150 insertions(+), 6 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 1065e6098..0d5649d34 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -4813,6 +4813,8 @@ error-log-format  X  X X -
 force-persist -  - X X
 filter-  X X X
 fullconn  X  - X X
+hash-balance-factor   X  - X X
+hash-key  X  - X X
 hash-type X  - X X
 http-after-response   X (!)  X X X
 http-check commentX  - X X
@@ -6606,6 +6608,29 @@ hash-balance-factor 
   See also : "balance" and "hash-type".
 
 
+hash-key 
+  Specify how "hash-type consistent" node keys are computed
+
+  Arguments :
+may be one of the following :
+
+  id The node keys will be derived from the server's numeric
+ identifier as set from "id" or which defaults to its position
+ in the server list.
+
+  addr   The node keys will be derived from the server's address, when
+ available, or else fall back on "id".
+
+  addr-port  The node keys will be derived from the server's address and
+ port, when available, or else fall back on "id".
+
+  The "addr" and "addr-port" options may be useful in scenarios where multiple
+  HAProxy processes are balancing traffic to the same set of servers. If the
+  server order of each process is different (because, for example, DNS records
+  were resolved in different orders) then this will allow each independent
+  HAProxy processes to agree on routing decisions.
+
+
 hash-type   
   Specify a method to use for mapping hashes to servers
 
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index f077ff2f8..811eae712 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -223,6 +223,13 @@ struct pid_list {
 	int exited;
 };
 
+/* srv methods of computing chash keys */
+enum srv_hash_key {
+	SRV_HASH_KEY_ID = 0, /* derived from server puid */
+	SRV_HASH_KEY_ADDR,   /* derived from server address */
+	SRV_HASH_KEY_ADDR_PORT   /* derived from server address and port */
+};
+
 /* A tree occurrence is a descriptor of a place in a tree, with a pointer back
  * to the server itself.
  */
@@ -366,6 +373,8 @@ struct server {
 	struct tree_occ *lb_nodes;  /* lb_nodes_tot * struct tree_occ */
 	unsigned lb_nodes_tot;  /* number of allocated lb_nodes (C-HASH) */
 	unsigned lb_nodes_now;  /* number of lb_nodes placed in the tree (C-HASH) */
+	enum srv_hash_key hash_key; /* method to compute node hash (C-HASH) */
+	unsigned lb_server_key; /* hash of the values indicated by "hash_key" (C-HASH) */
 
 	const struct netns_entr

Re: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server address

2024-03-24 Thread Anthony Deschamps
Hi Willy,

I'm just checking in to see if there's anything left I can help address here.

Thank you,
Anthony

On Wed, Mar 13, 2024 at 4:59 PM Willy Tarreau  wrote:
>
> Hi Anthony,
>
> On Wed, Mar 13, 2024 at 12:33:54PM -0400, Anthony Deschamps wrote:
> > Hi Willy,
> >
> > My original concern was that if two servers had values of lb_server_key
> > that are close to each other, then there could be some overlap in their
> > values of lb_server_key + node_index;, which is why I was looking for a
> > reasonable hash function to combine them.
>
> I know but you can't really avoid it anyway, and that's why we place
> many nodes. I tried an attempt in the past where nodes were evenly
> spaced by dividing by the number of entries etc. It was a disaster.
> As soon as a server went down, the next one would take its extra load
> for example. So anything you try to do to avoid some corner cases create
> other ones, while overall, a random distribution (like a hash does)
> gives great results.
>
> > I based it on boost::hash_combine,
> > but since writing that I came across https://stackoverflow.com/a/50978188
> > explaining why boost::hash_combine isn't necessarily a good hash function.
>
> Hehe. We've all written our own hash functions for a given purpose at one
> point in time, and they're usually fine inside their context and bad once
> used outside. That's also where XXH and such stuff shines, it brings
> generic functions that are good all over the space. In our case, the
> full_hash() is just an avalanche non-linear 1:1 mapping between the input
> and the output which suffices to avoid resonance effects that come from
> repeated additions, it's perfectly suited here I think.
>
> > I agree that it's not too critical to have a few collisions, but I'm 
> > reaching
> > the limits of my knowledge here. Does this change address your concerns?
> > I've attached an updated patch, and here's the diff between them:
> >
> > diff --git a/src/lb_chash.c b/src/lb_chash.c
> > index c77222854..d7a1b2539 100644
> > --- a/src/lb_chash.c
> > +++ b/src/lb_chash.c
> > @@ -66,9 +66,7 @@ static inline void chash_dequeue_srv(struct server *s)
> >   */
> >  static inline u32 chash_compute_node_key(struct server *s, unsigned 
> > node_index)
> >  {
> > -u32 server_key = s->lb_server_key;
> > -u32 index_key = full_hash(node_index);
> > -return server_key ^= index_key + 0x9e3779b9 + (server_key << 6) +
> > (server_key >> 2);
> > +return full_hash(s->lb_server_key + node_index);
> >  }
> >
> >  /* Compute the base server key that will be used to compute node keys. 
> > Servers
> > @@ -112,8 +110,7 @@ static inline u32 chash_compute_server_key(struct 
> > server *s)
> >  case SRV_HASH_KEY_ADDR:
> >  switch (srv_addr.family) {
> >  case AF_INET:
> > -u32 addr_key = full_hash(srv_addr.addr.v4.s_addr);
> > -key ^= addr_key + 0x9e3779b9 + (key << 6) + (key >> 2);
> > +key = full_hash(key + srv_addr.addr.v4.s_addr);
> >  break;
> >  case AF_INET6:
> >  key = XXH32(srv_addr.addr.v6.s6_addr, 16, key);
>
> Yeay I think it addresses everything. I'll re-test your updated patch
> tomorrow hoping that this time I'll merge it :-)
>
> Thanks for your patience!
> Willy



Re: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server address

2024-03-13 Thread Anthony Deschamps
Hi Willy,

My original concern was that if two servers had values of lb_server_key
that are close to each other, then there could be some overlap in their
values of lb_server_key + node_index;, which is why I was looking for a
reasonable hash function to combine them. I based it on boost::hash_combine,
but since writing that I came across https://stackoverflow.com/a/50978188
explaining why boost::hash_combine isn't necessarily a good hash function.

I agree that it's not too critical to have a few collisions, but I'm reaching
the limits of my knowledge here. Does this change address your concerns?
I've attached an updated patch, and here's the diff between them:

diff --git a/src/lb_chash.c b/src/lb_chash.c
index c77222854..d7a1b2539 100644
--- a/src/lb_chash.c
+++ b/src/lb_chash.c
@@ -66,9 +66,7 @@ static inline void chash_dequeue_srv(struct server *s)
  */
 static inline u32 chash_compute_node_key(struct server *s, unsigned node_index)
 {
-u32 server_key = s->lb_server_key;
-u32 index_key = full_hash(node_index);
-return server_key ^= index_key + 0x9e3779b9 + (server_key << 6) +
(server_key >> 2);
+return full_hash(s->lb_server_key + node_index);
 }

 /* Compute the base server key that will be used to compute node keys. Servers
@@ -112,8 +110,7 @@ static inline u32 chash_compute_server_key(struct server *s)
 case SRV_HASH_KEY_ADDR:
 switch (srv_addr.family) {
 case AF_INET:
-u32 addr_key = full_hash(srv_addr.addr.v4.s_addr);
-key ^= addr_key + 0x9e3779b9 + (key << 6) + (key >> 2);
+key = full_hash(key + srv_addr.addr.v4.s_addr);
 break;
 case AF_INET6:
 key = XXH32(srv_addr.addr.v6.s6_addr, 16, key);


Thanks,
Anthony
From 3280a39664bf913807200f21d1b952856966d141 Mon Sep 17 00:00:00 2001
From: Anthony Deschamps 
Date: Fri, 16 Feb 2024 16:00:35 -0500
Subject: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server
 address

Motivation: When services are discovered through DNS resolution, the order in
which DNS records get resolved and assigned to servers is arbitrary. Therefore,
even though two HAProxy instances using chash balancing might agree that a
particular request should go to server3, it is likely the case that they have
assigned different IPs and ports to the server in that slot.

This patch adds a server option, "hash-key " which can be set to "id" (the
existing behaviour, defaut), "addr", or "addr-port". By deriving the keys for
the chash tree nodes from a server's address and port we ensure that independent
HAProxy instances will agree on routing decisions. If an address is not known
then the key is derived from the server's puid as it was previously.

When adjusting a server's weight, we now unconditionally remove all nodes first,
since the server's address (and therefore the desired node keys) may have changed.
---
 doc/configuration.txt  | 24 ++
 include/haproxy/server-t.h |  9 
 src/lb_chash.c | 90 +++---
 src/server.c   | 31 +
 4 files changed, 148 insertions(+), 6 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 1065e6098..052d41f9f 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -4813,6 +4813,8 @@ error-log-format  X  X X -
 force-persist -  - X X
 filter-  X X X
 fullconn  X  - X X
+hash-balance-factor   X  - X X
+hash-key  X  - X X
 hash-type X  - X X
 http-after-response   X (!)  X X X
 http-check commentX  - X X
@@ -6606,6 +6608,28 @@ hash-balance-factor 
   See also : "balance" and "hash-type".
 
 
+hash-key 
+  Specify how "hash-type consistent" node keys are computed
+
+  Arguments :
+may be one of the following :
+
+  id The node keys will be derived from the server's position in
+ the server list.
+
+  addr   The node keys will be derived from the server's address, when
+ available, or else fall back on "id".
+
+  addr-port  The node keys will be derived from the server's address and
+ port, when available, or else fall back on "id".
+
+  The "addr" and "addr-port" options may be useful in scenarios where multiple
+  HAProxy processes are balancing traffic to the same set of servers. If the
+  server order of each process is different (because, for example

Re: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server address

2024-03-12 Thread Anthony Deschamps
Hi Willy,

Thanks for the feedback. I had been testing with smaller numbers of
servers (usually between 4 and 32) so I hadn't noticed the performance
impact. Here's an updated patch (as an attachment, until I figure out how
to make sure things are formatted correctly!) that should address that.

I added a "lb_server_key" to the server struct. I moved most of the hashing
logic that I previously had in "chash_compute_node_key" into a new function,
"chash_compute_server_key", and I store that value; nodes only need to be
removed and reinserted if this value changes. To compute a node key, I just
combine the server key with a hash of node_index. I considered using
XXH32_round to do this, but that looks like it's intended to be an internal
function, so I inlined something similar instead.

Hopefully this all works. Thanks for taking the time to work through this, and
please let me know if there's anything else to address.

Thanks,
Anthony
From 82acc8ff344abde640941d15fa5217b0dfe9701a Mon Sep 17 00:00:00 2001
From: Anthony Deschamps 
Date: Fri, 16 Feb 2024 16:00:35 -0500
Subject: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server
 address

Motivation: When services are discovered through DNS resolution, the order in
which DNS records get resolved and assigned to servers is arbitrary. Therefore,
even though two HAProxy instances using chash balancing might agree that a
particular request should go to server3, it is likely the case that they have
assigned different IPs and ports to the server in that slot.

This patch adds a server option, "hash-key " which can be set to "id" (the
existing behaviour, defaut), "addr", or "addr-port". By deriving the keys for
the chash tree nodes from a server's address and port we ensure that independent
HAProxy instances will agree on routing decisions. If an address is not known
then the key is derived from the server's puid as it was previously.

When adjusting a server's weight, we now unconditionally remove all nodes first,
since the server's address (and therefore the desired node keys) may have changed.
---
 doc/configuration.txt  | 24 ++
 include/haproxy/server-t.h |  9 
 src/lb_chash.c | 93 +++---
 src/server.c   | 31 +
 4 files changed, 151 insertions(+), 6 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 1065e6098..052d41f9f 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -4813,6 +4813,8 @@ error-log-format  X  X X -
 force-persist -  - X X
 filter-  X X X
 fullconn  X  - X X
+hash-balance-factor   X  - X X
+hash-key  X  - X X
 hash-type X  - X X
 http-after-response   X (!)  X X X
 http-check commentX  - X X
@@ -6606,6 +6608,28 @@ hash-balance-factor 
   See also : "balance" and "hash-type".
 
 
+hash-key 
+  Specify how "hash-type consistent" node keys are computed
+
+  Arguments :
+may be one of the following :
+
+  id The node keys will be derived from the server's position in
+ the server list.
+
+  addr   The node keys will be derived from the server's address, when
+ available, or else fall back on "id".
+
+  addr-port  The node keys will be derived from the server's address and
+ port, when available, or else fall back on "id".
+
+  The "addr" and "addr-port" options may be useful in scenarios where multiple
+  HAProxy processes are balancing traffic to the same set of servers. If the
+  server order of each process is different (because, for example, DNS records
+  were resolved in different orders) then this will allow each independent
+  HAProxy processes to agree on routing decisions.
+
+
 hash-type   
   Specify a method to use for mapping hashes to servers
 
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index f077ff2f8..811eae712 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -223,6 +223,13 @@ struct pid_list {
 	int exited;
 };
 
+/* srv methods of computing chash keys */
+enum srv_hash_key {
+	SRV_HASH_KEY_ID = 0, /* derived from server puid */
+	SRV_HASH_KEY_ADDR,   /* derived from server address */
+	SRV_HASH_KEY_ADDR_PORT   /* derived from server address and port */
+};
+
 /* A tree occurrence is a descriptor of a place in a tree, with a pointer back
  * to the serve

Re: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server address

2024-03-11 Thread Anthony Deschamps
Sorry for the trouble! I'll have to sort out what's happening. Here it is
as an attachment.

Thanks!
Anthony


On Mon, Mar 11, 2024 at 11:06 PM Willy Tarreau  wrote:

> Hi Anthony,
>
> On Mon, Mar 11, 2024 at 08:58:17PM -0400, Anthony Deschamps wrote:
> > > I'm not sure the scripts will help me (at least :-)). I was thinking
> that
> > > a test could just be "set server XXX addr YYY" on the CLI to change the
> > > server's address and verify that the hashing follows the address not
> the
> > > server number.
> >
> > Oh, good point -- I wasn't testing what happens when changing the address
> > via the CLI. It turns out I wasn't handling that case. I added a call in
> > _srv_set_inetaddr_port() to update the hashes, and tested that it works.
> >
> > I also moved the docs and added an entry to the table.
> >
> > Here's the updated patch.
>
> Thanks!
>
> However there's still a problem with your mailer mangling the patch
> see below:
>
> > + /* If hash-key is addr or addr-port then we need the address, but if we
> > + * can't determine the address then we fall back on hashing the puid.
> > + */
> > + switch (s->hash_key) {
> > + case SRV_HASH_KEY_ADDR:
> > + case SRV_HASH_KEY_ADDR_PORT:
> > + server_get_inetaddr(s, _addr);
> > + if (srv_addr.family != AF_INET && srv_addr.family != AF_INET6) {
> > +  hash_key = SRV_HASH_KEY_ID;
> > + }
> > + break;
> (...)
>
> Can you please resend it as an attachment instead ?
>
> Thanks!
> Willy
>
From 464504a3c8560a952602d6f28009727f29e7c289 Mon Sep 17 00:00:00 2001
From: Anthony Deschamps 
Date: Fri, 16 Feb 2024 16:00:35 -0500
Subject: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server
 address

Motivation: When services are discovered through DNS resolution, the order in
which DNS records get resolved and assigned to servers is arbitrary. Therefore,
even though two HAProxy instances using chash balancing might agree that a
particular request should go to server3, it is likely the case that they have
assigned different IPs and ports to the server in that slot.

This patch adds a server option, "hash-key " which can be set to "id" (the
existing behaviour, defaut), "addr", or "addr-port". By deriving the keys for
the chash tree nodes from a server's address and port we ensure that independent
HAProxy instances will agree on routing decisions. If an address is not known
then the key is derived from the server's puid as it was previously.

When adjusting a server's weight, we now unconditionally remove all nodes first,
since the server's address (and therefore the desired node keys) may have changed.
---
 doc/configuration.txt  |  24 +
 include/haproxy/server-t.h |   8 +++
 src/lb_chash.c | 101 +++--
 src/server.c   |  31 
 4 files changed, 138 insertions(+), 26 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 1065e6098..052d41f9f 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -4813,6 +4813,8 @@ error-log-format  X  X X -
 force-persist -  - X X
 filter-  X X X
 fullconn  X  - X X
+hash-balance-factor   X  - X X
+hash-key  X  - X X
 hash-type X  - X X
 http-after-response   X (!)  X X X
 http-check commentX  - X X
@@ -6606,6 +6608,28 @@ hash-balance-factor 
   See also : "balance" and "hash-type".
 
 
+hash-key 
+  Specify how "hash-type consistent" node keys are computed
+
+  Arguments :
+may be one of the following :
+
+  id The node keys will be derived from the server's position in
+ the server list.
+
+  addr   The node keys will be derived from the server's address, when
+ available, or else fall back on "id".
+
+  addr-port  The node keys will be derived from the server's address and
+ port, when available, or else fall back on "id".
+
+  The "addr" and "addr-port" options may be useful in scenarios where multiple
+  HAProxy processes are balancing traffic to the same set of servers. If the
+  server order of each process is different (because, for example, DNS records
+  were resolved in different orders) then this will allow each independent
+  HAProxy pro

Re: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server address

2024-03-11 Thread Anthony Deschamps
> I'm not sure the scripts will help me (at least :-)). I was thinking that
> a test could just be "set server XXX addr YYY" on the CLI to change the
> server's address and verify that the hashing follows the address not the
> server number.

Oh, good point -- I wasn't testing what happens when changing the address
via the CLI. It turns out I wasn't handling that case. I added a call in
_srv_set_inetaddr_port() to update the hashes, and tested that it works.

I also moved the docs and added an entry to the table.

Here's the updated patch.

Thanks!
Anthony

>From 3412e86c222ad7b231d687f1402fbe30de312974 Mon Sep 17 00:00:00 2001
From: Anthony Deschamps 
Date: Fri, 16 Feb 2024 16:00:35 -0500
Subject: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server
 address

Motivation: When services are discovered through DNS resolution, the order
in
which DNS records get resolved and assigned to servers is arbitrary.
Therefore,
even though two HAProxy instances using chash balancing might agree that a
particular request should go to server3, it is likely the case that they
have
assigned different IPs and ports to the server in that slot.

This patch adds a server option, "hash-key " which can be set to "id"
(the
existing behaviour, defaut), "addr", or "addr-port". By deriving the keys
for
the chash tree nodes from a server's address and port we ensure that
independent
HAProxy instances will agree on routing decisions. If an address is not
known
then the key is derived from the server's puid as it was previously.

When adjusting a server's weight, we now unconditionally remove all nodes
first,
since the server's address (and therefore the desired node keys) may have
changed.
---
 doc/configuration.txt  |  24 +
 include/haproxy/server-t.h |   8 +++
 src/lb_chash.c | 101 +++--
 src/server.c   |  31 
 4 files changed, 138 insertions(+), 26 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 1065e6098..052d41f9f 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -4813,6 +4813,8 @@ error-log-format  X
 X X -
 force-persist -  - X X
 filter-  X X X
 fullconn  X  - X X
+hash-balance-factor   X  - X X
+hash-key  X  - X X
 hash-type X  - X X
 http-after-response   X (!)  X X X
 http-check commentX  - X X
@@ -6606,6 +6608,28 @@ hash-balance-factor 
   See also : "balance" and "hash-type".


+hash-key 
+  Specify how "hash-type consistent" node keys are computed
+
+  Arguments :
+may be one of the following :
+
+  id The node keys will be derived from the server's position
in
+ the server list.
+
+  addr   The node keys will be derived from the server's address,
when
+ available, or else fall back on "id".
+
+  addr-port  The node keys will be derived from the server's address
and
+ port, when available, or else fall back on "id".
+
+  The "addr" and "addr-port" options may be useful in scenarios where
multiple
+  HAProxy processes are balancing traffic to the same set of servers. If
the
+  server order of each process is different (because, for example, DNS
records
+  were resolved in different orders) then this will allow each independent
+  HAProxy processes to agree on routing decisions.
+
+
 hash-type   
   Specify a method to use for mapping hashes to servers

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index f077ff2f8..1a2735c2a 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -223,6 +223,13 @@ struct pid_list {
  int exited;
 };

+/* srv methods of computing chash keys */
+enum srv_hash_key {
+ SRV_HASH_KEY_ID = 0, /* derived from server puid */
+ SRV_HASH_KEY_ADDR,   /* derived from server address */
+ SRV_HASH_KEY_ADDR_PORT   /* derived from server address and port */
+};
+
 /* A tree occurrence is a descriptor of a place in a tree, with a pointer
back
  * to the server itself.
  */
@@ -366,6 +373,7 @@ struct server {
  struct tree_occ *lb_nodes;  /* lb_nodes_tot * struct tree_occ
*/
  unsigned lb_nodes_tot;  /* number of allocated lb_nodes
(C-HASH) */
  unsigned lb_nodes_now;  /* number of lb_nodes placed in
the tree (C-HASH) */
+ enum srv_hash_key hash_key; /* method to compute node hash
(C-HAS

Re: [PATCH] MINOR: lb-chash: Respect maxconn when selecting a server

2024-02-21 Thread Anthony Deschamps
Hi Willy,

I wonder if I could accomplish what I'm looking to do by changing the
behaviour of "maxqueue" (without making a breaking change, ideally).
Currently, "maxqueue 0" is interpreted as "unlimited". However, the system
I'm working with has long-running WebSocket connections, so a queued
request is likely to sit in the queue for too long.

Is there a way to configure "maxqueue 0" for real? If not, then perhaps
that's a patch I should make first, and then come back to this one. Since
"maxqueue 0" already means "unlimited", maybe a "no-queue" option
would be best in order to avoid making a breaking change.

Thanks,
Anthony


Re: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server address

2024-02-21 Thread Anthony Deschamps
Hi Willy, thanks for the thoughtful feedback.

Here's a new patch that makes this configurable via a "hash-key" server
argument, which defaults to "id" as you suggested.

I'm struggling to test the last case you described. A quick description of
my test setup: I have multiple instances of a simple echo server running
on a range of ports, which I spawn/kill manually. I then have a local consul
agent providing DNS, and multiple proxy instances running, and my actual
test is a short Python script that makes a series of requests to each of the
proxies and checks whether they were all routed to the same echo server
(they identify themselves by setting a header in the response). I can share
all the test scripts/config if it's helpful.

When I configure my proxy with long health checks -- by setting
 "server-template ... check inter 6" -- I still find that when I
kill/spawn an
instance of the mock service, a server will first go down when an entry is
removed from the SRV record, and then get rehashed when a new SRV
entry is assigned to the server. Is there a different sequence of events I
should be testing?

Here is the updated patch.

Thanks,
Anthony

>From 7d8a63803017c9b7630ec5cb3a154778fdff4d86 Mon Sep 17 00:00:00 2001
From: Anthony Deschamps 
Date: Fri, 16 Feb 2024 16:00:35 -0500
Subject: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server
 address

Motivation: When services are discovered through DNS resolution, the order
in
which DNS records get resolved and assigned to servers is arbitrary.
Therefore,
even though two HAProxy instances using chash balancing might agree that a
particular request should go to server3, it is likely the case that they
have
assigned different IPs and ports to the server in that slot.

This patch adds a server option, "hash-key " which can be set to "id"
(the
existing behaviour, defaut), "addr", or "addr-port". By deriving the keys
for
the chash tree nodes from a server's address and port we ensure that
independent
HAProxy instances will agree on routing decisions. If an address is not
known
then the key is derived from the server's puid as it was previously.

When adjusting a server's weight, we now unconditionally remove all nodes
first,
since the server's address (and therefore the desired node keys) may have
changed.
---
 doc/configuration.txt  | 21 
 include/haproxy/server-t.h |  8 
 src/lb_chash.c | 98 +-
 src/server.c   | 28 +++
 4 files changed, 132 insertions(+), 23 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 1065e6098..a0710395a 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -15763,6 +15763,27 @@ group 
   "gid" setting except that the group name is used instead of its gid. This
   setting is ignored by non UNIX sockets.

+hash-key 
+  Specify how "hash-type consistent" node keys are computed
+
+  Arguments :
+may be one of the following :
+
+  id The node keys will be derived from the server's position
in
+ the server list.
+
+  addr   The node keys will be derived from the server's address,
when
+ available, or else fall back on "id".
+
+  addr-port  The node keys will be derived from the server's address
and
+ port, when available, or else fall back on "id".
+
+  The "addr" and "addr-port" options may be useful in scenarios where
multiple
+  HAProxy processes are balancing traffic to the same set of servers. If
the
+  server order of each process is different (because, for example, DNS
records
+  were resolved in different orders) then this will allow each independent
+  HAProxy processes to agree on routing decisions.
+
 id 
   Fixes the socket ID. By default, socket IDs are automatically assigned,
but
   sometimes it is more convenient to fix them to ease monitoring. This
value
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index f077ff2f8..1a2735c2a 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -223,6 +223,13 @@ struct pid_list {
  int exited;
 };

+/* srv methods of computing chash keys */
+enum srv_hash_key {
+ SRV_HASH_KEY_ID = 0, /* derived from server puid */
+ SRV_HASH_KEY_ADDR,   /* derived from server address */
+ SRV_HASH_KEY_ADDR_PORT   /* derived from server address and port */
+};
+
 /* A tree occurrence is a descriptor of a place in a tree, with a pointer
back
  * to the server itself.
  */
@@ -366,6 +373,7 @@ struct server {
  struct tree_occ *lb_nodes;  /* lb_nodes_tot * struct tree_occ
*/
  unsigned lb_nodes_tot;  /* number of allocated lb_nodes
(C-HASH) */
  unsigned lb_nodes_now;  /* number of lb_nodes placed in
the tree (C-HASH) */
+ enum srv_hash_key hash_key;

[PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server address

2024-02-16 Thread Anthony Deschamps
>From a031cf97da759eb2c2f9b6e191065ad503f821ed Mon Sep 17 00:00:00 2001
From: Anthony Deschamps 
Date: Fri, 16 Feb 2024 16:00:35 -0500
Subject: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server
 address

Motivation: When services are discovered through DNS resolution, the order
in
which DNS records get resolved and assigned to servers is arbitrary.
Therefore,
even though two HAProxy instances using chash balancing might agree that a
particular request should go to server3, it is likely the case that they
have
assigned different IPs and ports to the server in that slot.

By deriving the keys for the chash tree nodes from a server's address and
port
we ensure that independent HAProxy instances will agree on routing
decisions. If
an address is not known then the key is derived from the server's puid as
it was
previously.

When adjusting a server's weight, we now unconditionally remove all nodes
first,
since the server's address (and therefore the desired node keys) may have
changed.
---
 src/lb_chash.c | 71 ++
 1 file changed, 48 insertions(+), 23 deletions(-)

diff --git a/src/lb_chash.c b/src/lb_chash.c
index 1c8034d89..bd39a07f8 100644
--- a/src/lb_chash.c
+++ b/src/lb_chash.c
@@ -21,8 +21,9 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
+#include 

 /* Return next tree node after  which must still be in the tree, or
be
  * NULL. Lookup wraps around the end to the beginning. If the next node is
the
@@ -58,6 +59,37 @@ static inline void chash_dequeue_srv(struct server *s)
  }
 }

+/* Compute a key that can be used to insert a node into the CHASH tree. If
the
+ * server has a known address then the key will be determined from its
address
+ * and port. This means that independent HAProxy instances will agree on
routing
+ * decisions. If an address is not known then the key will be determined
from
+ * the server's puid.
+ */
+static inline u32 chash_compute_node_key(struct server *s, unsigned
node_index)
+{
+ u32 key;
+ struct server_inetaddr srv_addr;
+
+ server_get_inetaddr(s, _addr);
+
+ switch (srv_addr.family) {
+ case AF_INET:
+ key = full_hash(srv_addr.addr.v4.s_addr);
+ key ^= full_hash(srv_addr.port.svc);
+ key ^= full_hash(node_index);
+ break;
+ case AF_INET6:
+ unsigned seed = (srv_addr.port.svc << 16) + node_index;
+ key = XXH32(srv_addr.addr.v6.s6_addr, 16, seed);
+ break;
+ default:
+ key = full_hash(s->puid * SRV_EWGHT_RANGE + node_index);
+ break;
+ }
+
+ return key;
+}
+
 /* Adjust the number of entries of a server in its tree. The server must
appear
  * as many times as its weight indicates it. If it's there too often, we
remove
  * the last occurrences. If it's not there enough, we add more
occurrences. To
@@ -67,39 +99,32 @@ static inline void chash_dequeue_srv(struct server *s)
  */
 static inline void chash_queue_dequeue_srv(struct server *s)
 {
- while (s->lb_nodes_now > s->next_eweight) {
- if (s->lb_nodes_now >= s->lb_nodes_tot) // should always be false anyway
- s->lb_nodes_now = s->lb_nodes_tot;
- s->lb_nodes_now--;
- if (s->proxy->lbprm.chash.last == >lb_nodes[s->lb_nodes_now].node)
- s->proxy->lbprm.chash.last = chash_skip_node(s->lb_tree,
s->proxy->lbprm.chash.last);
- eb32_delete(>lb_nodes[s->lb_nodes_now].node);
- }
+ unsigned int j;
+
+ /* First we need to remove all of the server's entries from its tree
+ * because a) the realloc will change all node pointers and b) the keys
+ * may change.
+ */
+ chash_dequeue_srv(s);

  /* Attempt to increase the total number of nodes, if the user
  * increased the weight beyond the original weight
  */
  if (s->lb_nodes_tot < s->next_eweight) {
  struct tree_occ *new_nodes;
-
- /* First we need to remove all server's entries from its tree
- * because the realloc will change all nodes pointers */
- chash_dequeue_srv(s);
-
  new_nodes = realloc(s->lb_nodes, s->next_eweight * sizeof(*new_nodes));
  if (new_nodes) {
- unsigned int j;
-
  s->lb_nodes = new_nodes;
- memset(>lb_nodes[s->lb_nodes_tot], 0,
-(s->next_eweight - s->lb_nodes_tot) * sizeof(*s->lb_nodes));
- for (j = s->lb_nodes_tot; j < s->next_eweight; j++) {
- s->lb_nodes[j].server = s;
- s->lb_nodes[j].node.key = full_hash(s->puid * SRV_EWGHT_RANGE + j);
- }
  s->lb_nodes_tot = s->next_eweight;
  }
  }
+
+ memset(s->lb_nodes, 0, s->next_eweight * sizeof(*s->lb_nodes));
+ for (j = 0; j < s->next_eweight; j++) {
+ s->lb_nodes[j].server = s;
+ s->lb_nodes[j].node.key = chash_compute_node_key(s, j);
+ }
+
  while (s->lb_nodes_now < s->next_eweight) {
  if (s->lb_nodes_now >= s->lb_nodes_tot) // should always be false anyway
  break;
@@ -507,7 +532,7 @@ int chash_init_server_tree(struct proxy *p)
  }
  for (node = 0; node < srv->lb_nodes_tot; node++) {
  srv->lb_nodes[node].server = srv;
- srv->lb_nodes[node].node.key = full_hash(srv->puid * SRV_EWGHT_RANGE +
node);
+ srv->lb_nodes[node].node.key = chash_compute_node_key(srv, node);
  }

  if (srv_currently_usable(srv))
-- 
2.43.2


[PATCH] MINOR: lb-chash: Respect maxconn when selecting a server

2024-02-13 Thread Anthony Deschamps
>From 3fc983b719bd4d8af80037c36e7032e0af383557 Mon Sep 17 00:00:00 2001
From: Anthony Deschamps 
Date: Tue, 13 Feb 2024 18:11:56 -0500
Subject: [PATCH] MINOR: lb-chash: Respect maxconn when selecting a server

This is useful in a situation where hash-balance-factor isn't quite
optimal. For example:

If we have two servers at times of low traffic, if server A has one active
request and server B has
zero, we might prefer that a second request for the same resource also be
routed to server A; with
hash-balance-factor set to anything less than 200, that request would
routed to server B, causing it
to dedicate resources to a resource that is already loaded/warm in server
A.  Meanwhile, at times of
high traffic, we still want multiple servers to be able to share the load
of highly used resources.
---
 src/lb_chash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/lb_chash.c b/src/lb_chash.c
index 4e8fb1536..1c8034d89 100644
--- a/src/lb_chash.c
+++ b/src/lb_chash.c
@@ -371,7 +371,7 @@ struct server *chash_get_server_hash(struct proxy *p,
unsigned int hash, const s
  }

  loop = 0;
- while (nsrv == avoid || (p->lbprm.hash_balance_factor &&
!chash_server_is_eligible(nsrv))) {
+ while (nsrv == avoid || (nsrv->maxconn && nsrv->served >= nsrv->maxconn)
|| (p->lbprm.hash_balance_factor && !chash_server_is_eligible(nsrv))) {
  next = eb32_next(next);
  if (!next) {
  next = eb32_first(root);
-- 
2.43.0