[PATCH] MINOR: cli: add option to modify close-spread-time

2024-04-08 Thread Abhijeet Rastogi
Hi HAproxy community,

Let's assume that HAProxy starts with non-zero values for close-spread-time
and hard-stop-after, and soft-stop is used to initiate the shutdown during
deployments.
There are times when we need to modify these values, e.g., when the
operator needs to restart HAProxy faster but still use the soft-stop
workflow.

So, it would be nice to have the ability to modify these values via the CLI.

RFC:-
- Does this seem like a fair use-case?
- If this is good, I can also work on the patch for hard-stop-after.

Patch questions:-
- I've added reg-tests under a new folder "cli" because I was unable to
find a better match. Let me know if that should be moved.
- There is code duplication between proxy.c [1] and this cli.c, in case
that's a concern. I could not find a file where a shared utility function
would fit. Mostly, the concern is to modify "global.tune.options" whenever
"global.close_spread_time" changes.
- I noticed global struct is accessed without any locks, is that like a
"known" race condition for these simple operations? I don't primarily
develop C code, and this was confusing to me.
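
One way to address the duplication concern would be a single setter that updates the value and the tune flag together, called from both the config parser and the CLI handler. The following is only a minimal, self-contained sketch: `sketch_global`, `SKETCH_DISABLE_ACTIVE_CLOSE`, `SKETCH_ETERNITY` and `sketch_set_close_spread_time` are stand-in names invented for illustration, not HAProxy's actual definitions, and it assumes that "infinite" maps to a flag that disables active closing.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the relevant parts of the global state. */
#define SKETCH_DISABLE_ACTIVE_CLOSE 0x1u
#define SKETCH_ETERNITY 0 /* assumed "no expiry" marker */

struct sketch_global {
	unsigned int tune_options;
	int close_spread_time; /* milliseconds */
};

/* Update the spread time and keep the flag consistent in one place.
 * Returns false and leaves the state untouched on a negative value.
 */
bool sketch_set_close_spread_time(struct sketch_global *g, bool infinite, int ms)
{
	if (infinite) {
		g->tune_options |= SKETCH_DISABLE_ACTIVE_CLOSE;
		g->close_spread_time = SKETCH_ETERNITY;
		return true;
	}
	if (ms < 0)
		return false;
	g->close_spread_time = ms;
	g->tune_options &= ~SKETCH_DISABLE_ACTIVE_CLOSE;
	return true;
}
```

With such a helper, both callers would only parse their argument and report errors, while the flag handling would live in one function.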

Please find the patch attached with this email.

1.
https://github.com/haproxy/haproxy/blob/70251a2aeb5930f3fc25aadf979d5ce5007d0f9d/src/proxy.c#L2072

--
Cheers,
Abhijeet (https://abhi.host)
From 896d3a100faa8369928c080d5aa67f9cb44204c4 Mon Sep 17 00:00:00 2001
From: Abhijeet Rastogi 
Date: Mon, 8 Apr 2024 15:30:44 -0700
Subject: [PATCH] MINOR: cli: add option to modify close-spread-time

close-spread-time is only set during the config parse stage, and there
is no API available today to modify it afterwards.
If there is a requirement to speed up the soft-stop for HAProxy, it is
not possible to dynamically lower this value before initiating
soft-stop. This CLI feature provides the ability to modify the
close-spread-time value dynamically, after the HAProxy process has already
started.

A new `notice` message is also added to showcase the current value of
close-spread-time when it is non-zero.

A new reg-tests directory `cli` is created, as there is currently no
appropriate directory for this test.
---
 reg-tests/cli/cli_set_close_spread_time.vtc | 55 
 src/cli.c   | 56 +
 src/proxy.c |  2 +
 3 files changed, 113 insertions(+)
 create mode 100644 reg-tests/cli/cli_set_close_spread_time.vtc

diff --git a/reg-tests/cli/cli_set_close_spread_time.vtc b/reg-tests/cli/cli_set_close_spread_time.vtc
new file mode 100644
index 000000000..33db4915b
--- /dev/null
+++ b/reg-tests/cli/cli_set_close_spread_time.vtc
@@ -0,0 +1,55 @@
+varnishtest "Set close-spread-time via CLI"
+
+feature ignore_unknown_macro
+
+# for "set close-spread-time "
+# for "get close-spread-time"
+#REGTEST_TYPE=devel
+
+# Do nothing. Is there only to create s1_* macros
+server s1 {
+} -start
+
+haproxy h1 -conf {
+global
+close-spread-time 10s
+
+defaults
+mode http
+timeout connect "${HAPROXY_TEST_TIMEOUT-5s}"
+timeout client  "${HAPROXY_TEST_TIMEOUT-5s}"
+timeout server  "${HAPROXY_TEST_TIMEOUT-5s}"
+
+frontend myfrontend
+bind "fd@${my_fe}"
+default_backend test
+
+backend test
+server www1 ${s1_addr}:${s1_port}
+} -start
+
+haproxy h1 -cli {
+# Starts with 10s
+send "get close-spread-time"
+expect ~ "close-spread-time=10000ms"
+
+send "set close-spread-time 1s"
+expect ~ ""
+send "get close-spread-time"
+expect ~ "close-spread-time=1000ms"
+
+# Disabling close-spread-time is possible
+send "set close-spread-time 0"
+expect ~ ""
+send "get close-spread-time"
+expect ~ "close-spread-time=0"
+
+# Negative value is error
+send "set close-spread-time -1"
+expect ~ "Invalid"
+send "set close-spread-time 25d"
+expect ~ "Timer overflow"
+send "get close-spread-time"
+expect ~ "close-spread-time=0"
+} -wait
+
diff --git a/src/cli.c b/src/cli.c
index 02cb06843..4939efe94 100644
--- a/src/cli.c
+++ b/src/cli.c
@@ -1716,6 +1716,60 @@ static int cli_parse_show_fd(char **args, char *payload, struct appctx *appctx,
 	return 0;
 }
 
+/* parse a "get close-spread-time" CLI request. It always returns 1. */
+static int cli_parse_get_close_spread_time(char **args, char *payload, struct appctx *appctx, void *private)
+{
+	char *output = NULL;
+
+	if (!cli_has_level(appctx, ACCESS_LVL_ADMIN))
+		return 1;
+
+	memprintf(&output, "close-spread-time=%dms\n", global.close_spread_time);
+
+	cli_dynmsg(appctx, LOG_INFO, output);
+
+	return 1;
+}
+
+
+/* parse a "set close-spread-time" CLI request. It always returns 1. */
+static int cli_parse_set_close_spread_time(char **args, char *payload, struct appctx *appctx, void *private)
+{
+	const char *res;
+
+	if (!cli_has_level(appctx, ACCESS_LVL_ADMIN))
+		return 1;
+
+	if (!*args[2])
+		return cli_err(appctx, "Expects an integer value or infinite.\n");
+
+	if (strcmp(args[2], "infinite") == 0) {

Haproxy Technologies - VoIP phone service?

2024-04-08 Thread Polly Dodson
Hello,

Following up since we're assisting business owners who change telephone
companies and reduce their expenses up to thirty percent.

We assisted one company in your State recently reduce expenses up to 400
usd annually.

Ever looked into this?


Look forward to your thoughts,

Polly Dodson
Smungan


Re: haproxy backend server template service discovery questions

2024-04-08 Thread Andrii Ustymenko

I couldn't find this behavior described in the documentation, indeed.

I am not sure to what extent documentation should cover this. For us it 
is important to understand how it works, so we can build more 
predictable/reliable setup of the load-balancing layer.


Also, if the behavior has changed/is going to be changed in new 
versions, maybe, indeed some words would be nice to have in the 
documentation as well.


On 08/04/2024 10:45, Илья Шипицин wrote:

am I right that you consider that as a documentation bug ?

On Mon, 8 Apr 2024 at 10:44, Andrii Ustymenko wrote:

Yes, for the 1) question indeed.
Basically I have tested with local "out of sync" custom
nameserver. And I was observing some inconsistent results of the
backend-servers table. That led to this question.

Most of the time I was seeing only the state from the local
nameserver. However, sometimes I have seen the "merged" state where
all the replies existed together in the table.

It was also observed that the number of requests made by haproxy
to all nameservers is the same even though the local one normally
replies faster.

And sorry, forgot to mention we are running haproxy version 2.8.7

On 08/04/2024 10:31, Илья Шипицин wrote:

and particularly your question is "does HAProxy merge all
responses or pick the first one or use some other approach" ?

On Mon, 8 Apr 2024 at 10:23, Andrii Ustymenko wrote:

I guess indeed it is not a case of consul-template
specifically, but more of rendered templates and how haproxy
maintains it.

On 06/04/2024 20:15, Илья Шипицин wrote:

Consul template is something done by consul itself, after
that haproxy.conf is rendered

Do you mean "how haproxy deals with rendered template"?

On Fri, Apr 5, 2024, 15:02 Andrii Ustymenko
 wrote:

Dear list!

My name is Andrii. I work for Adyen. We are using
haproxy as our main
software loadbalancer at quite large scale.
As of now, our main use case is backend routing based on
server-template and dynamic dns from consul as service
discovery. Below
is the example of simple backend configuration:

```
backend example
   balance roundrobin
   server-template application 10
_application._tcp.consul resolvers
someresolvers init-addr last,libc,none resolve-opts
allow-dup-ip
resolve-prefer ipv4 check ssl verify none
```

and in global configuration

```
resolvers someresolvers
   nameserver ns1 10.10.10.10:53 
   nameserver ns2 10.10.10.11:53 
```

As we see haproxy will create internal table for
backends with some
be_id and be_name=application and allocate 10 records
for each server
with se_id from 1 to 10. Then those records get
populated and updated
with the data from resolvers.
I would like to understand couple of things with regards
to this
structure and how it works, which I could not figure out
myself from the
source code:

1) In tcpdump for dns queries we see that haproxy
asynchronously polls
all the nameservers simultaneously. For instance:

```
11:06:17.587798 eth2  Out ifindex 4 aa:aa:aa:aa:aa:aa
ethertype IPv4
(0x0800), length 108: 10.10.10.50.24050 >
10.10.10.10.53: 34307+ [1au]
SRV? _application._tcp.consul. (60)
11:06:17.587802 eth2  Out ifindex 4 aa:aa:aa:aa:aa:aa
ethertype IPv4
(0x0800), length 108: 10.10.10.50.63155 >
10.10.10.11.53: 34307+ [1au]
SRV? _application._tcp.consul. (60)
11:06:17.588097 eth2  In  ifindex 4 ff:ff:ff:ff:ff:ff
ethertype IPv4
(0x0800), length 205: 10.10.10.10.53 >
10.10.10.50.24050: 2194 2/0/1 SRV
0a5099e5.addr.consul.:25340 1 1, SRV
0a509934.addr.consul.:26010 1 1 (157)
11:06:17.588097 eth2  In  ifindex 4 ff:ff:ff:ff:ff:ff
ethertype IPv4
(0x0800), length 205: 10.10.10.11.53 >
10.10.10.50.63155: 2194 2/0/1 SRV
0a5099e5.addr.consul.:25340 1 1, SRV
0a509934.addr.consul.:26010 1 1 (157)
```

Both nameservers reply with the same response. But what
if they are out
of sync? Let's say one says: server1, server2 and the
second one says
server2, server3? So far testing this locally - I see
sometimes the
reply overrides the table, but sometimes it seems to

Re: haproxy backend server template service discovery questions

2024-04-08 Thread Илья Шипицин
am I right that you consider that as a documentation bug ?

On Mon, 8 Apr 2024 at 10:44, Andrii Ustymenko wrote:

> Yes, for the 1) question indeed.
> Basically I have tested with local "out of sync" custom nameserver. And I
> was observing some inconsistent results of the backend-servers table. That
> led to this question.
>
> Most of the time I was seeing only the state from the local nameserver.
> However, sometimes I have seen the "merged" state where all the replies
> existed together in the table.
>
> It was also observed that the number of requests made by haproxy to all
> nameservers is the same even though the local one normally replies faster.
>
> And sorry, forgot to mention we are running haproxy version 2.8.7
> On 08/04/2024 10:31, Илья Шипицин wrote:
>
> and particularly your question is "does HAProxy merge all responses or
> pick the first one or use some other approach" ?
>
> On Mon, 8 Apr 2024 at 10:23, Andrii Ustymenko wrote:
>
>> I guess indeed it is not a case of consul-template specifically, but more
>> of rendered templates and how haproxy maintains it.
>> On 06/04/2024 20:15, Илья Шипицин wrote:
>>
>> Consul template is something done by consul itself, after that
>> haproxy.conf is rendered
>>
>> Do you mean "how haproxy deals with rendered template"?
>>
>> On Fri, Apr 5, 2024, 15:02 Andrii Ustymenko 
>> wrote:
>>
>>> Dear list!
>>>
>>> My name is Andrii. I work for Adyen. We are using haproxy as our main
>>> software loadbalancer at quite large scale.
>>> As of now, our main use case is backend routing based on
>>> server-template and dynamic dns from consul as service discovery. Below
>>> is the example of simple backend configuration:
>>>
>>> ```
>>> backend example
>>>    balance roundrobin
>>>    server-template application 10 _application._tcp.consul resolvers
>>> someresolvers init-addr last,libc,none resolve-opts allow-dup-ip
>>> resolve-prefer ipv4 check ssl verify none
>>> ```
>>>
>>> and in global configuration
>>>
>>> ```
>>> resolvers someresolvers
>>>    nameserver ns1 10.10.10.10:53
>>>    nameserver ns2 10.10.10.11:53
>>> ```
>>>
>>> As we see haproxy will create internal table for backends with some
>>> be_id and be_name=application and allocate 10 records for each server
>>> with se_id from 1 to 10. Then those records get populated and updated
>>> with the data from resolvers.
>>> I would like to understand couple of things with regards to this
>>> structure and how it works, which I could not figure out myself from the
>>> source code:
>>>
>>> 1) In tcpdump for dns queries we see that haproxy asynchronously polls
>>> all the nameservers simultaneously. For instance:
>>>
>>> ```
>>> 11:06:17.587798 eth2  Out ifindex 4 aa:aa:aa:aa:aa:aa ethertype IPv4
>>> (0x0800), length 108: 10.10.10.50.24050 > 10.10.10.10.53: 34307+ [1au]
>>> SRV? _application._tcp.consul. (60)
>>> 11:06:17.587802 eth2  Out ifindex 4 aa:aa:aa:aa:aa:aa ethertype IPv4
>>> (0x0800), length 108: 10.10.10.50.63155 > 10.10.10.11.53: 34307+ [1au]
>>> SRV? _application._tcp.consul. (60)
>>> 11:06:17.588097 eth2  In  ifindex 4 ff:ff:ff:ff:ff:ff ethertype IPv4
>>> (0x0800), length 205: 10.10.10.10.53 > 10.10.10.50.24050: 2194 2/0/1 SRV
>>> 0a5099e5.addr.consul.:25340 1 1, SRV 0a509934.addr.consul.:26010 1 1
>>> (157)
>>> 11:06:17.588097 eth2  In  ifindex 4 ff:ff:ff:ff:ff:ff ethertype IPv4
>>> (0x0800), length 205: 10.10.10.11.53 > 10.10.10.50.63155: 2194 2/0/1 SRV
>>> 0a5099e5.addr.consul.:25340 1 1, SRV 0a509934.addr.consul.:26010 1 1
>>> (157)
>>> ```
>>>
>>> Both nameservers reply with the same response. But what if they are out
>>> of sync? Let's say one says: server1, server2 and the second one says
>>> server2, server3? So far testing this locally - I see sometimes the
>>> reply overrides the table, but sometimes it seems to just get merged
>>> with the rest.
>>>
>>> 2) Each entry from SRV reply will be placed into the table under
>>> specific se_id. Most of the times that placement won't change. So, for
>>> the example above the most likely 0a5099e5.addr.consul. and
>>> 0a509934.addr.consul. will have se_id 1 and 2 respectively. However
>>> sometimes we have the following scenario:
>>>
>>> 1. We administratively disable the server (drain traffic) with the next
>>> command:
>>>
>>> ```
>>> echo "set server example/application1 state maint" | nc -U
>>> /var/lib/haproxy/stats
>>> ```
>>>
>>> the MAINT flag will be added to the record with se_id 1
>>>
>>> 2. Instance of application goes down and gets de-registered from consul,
>>> so also evicted from srv replies and out of discovery of haproxy.
>>>
>>> 3. Instance of application goes up and gets registered by consul and
>>> discovered by haproxy, but haproxy allocates different se_id for it.
>>> Haproxy healthchecks will control the traffic to it in this case.
>>>
>>> 4. We will still have se_id 1 with MAINT flag and application instance
>>> dns record placed into different se_id.
>>>
>>> The problem comes that any new discovered record which get placed into
>>> 

Re: haproxy backend server template service discovery questions

2024-04-08 Thread Andrii Ustymenko

Yes, for the 1) question indeed.
Basically I have tested with local "out of sync" custom nameserver. And 
I was observing some inconsistent results of the backend-servers table. 
That led to this question.


Most of the time I was seeing only the state from the local
nameserver. However, sometimes I have seen the "merged" state where all
the replies existed together in the table.


It was also observed that the number of requests made by haproxy to all
nameservers is the same even though the local one normally replies faster.


And sorry, forgot to mention we are running haproxy version 2.8.7

On 08/04/2024 10:31, Илья Шипицин wrote:
and particularly your question is "does HAProxy merge all responses or 
pick the first one or use some other approach" ?


On Mon, 8 Apr 2024 at 10:23, Andrii Ustymenko wrote:

I guess indeed it is not a case of consul-template specifically,
but more of rendered templates and how haproxy maintains it.

On 06/04/2024 20:15, Илья Шипицин wrote:

Consul template is something done by consul itself, after that
haproxy.conf is rendered

Do you mean "how haproxy deals with rendered template"?

On Fri, Apr 5, 2024, 15:02 Andrii Ustymenko
 wrote:

Dear list!

My name is Andrii. I work for Adyen. We are using haproxy as
our main
software loadbalancer at quite large scale.
As of now, our main use case is backend routing based on
server-template and dynamic dns from consul as service
discovery. Below
is the example of simple backend configuration:

```
backend example
   balance roundrobin
   server-template application 10 _application._tcp.consul
resolvers
someresolvers init-addr last,libc,none resolve-opts allow-dup-ip
resolve-prefer ipv4 check ssl verify none
```

and in global configuration

```
resolvers someresolvers
   nameserver ns1 10.10.10.10:53 
   nameserver ns2 10.10.10.11:53 
```

As we see haproxy will create internal table for backends
with some
be_id and be_name=application and allocate 10 records for
each server
with se_id from 1 to 10. Then those records get populated and
updated
with the data from resolvers.
I would like to understand couple of things with regards to this
structure and how it works, which I could not figure out
myself from the
source code:

1) In tcpdump for dns queries we see that haproxy
asynchronously polls
all the nameservers simultaneously. For instance:

```
11:06:17.587798 eth2  Out ifindex 4 aa:aa:aa:aa:aa:aa
ethertype IPv4
(0x0800), length 108: 10.10.10.50.24050 > 10.10.10.10.53:
34307+ [1au]
SRV? _application._tcp.consul. (60)
11:06:17.587802 eth2  Out ifindex 4 aa:aa:aa:aa:aa:aa
ethertype IPv4
(0x0800), length 108: 10.10.10.50.63155 > 10.10.10.11.53:
34307+ [1au]
SRV? _application._tcp.consul. (60)
11:06:17.588097 eth2  In  ifindex 4 ff:ff:ff:ff:ff:ff
ethertype IPv4
(0x0800), length 205: 10.10.10.10.53 > 10.10.10.50.24050:
2194 2/0/1 SRV
0a5099e5.addr.consul.:25340 1 1, SRV
0a509934.addr.consul.:26010 1 1 (157)
11:06:17.588097 eth2  In  ifindex 4 ff:ff:ff:ff:ff:ff
ethertype IPv4
(0x0800), length 205: 10.10.10.11.53 > 10.10.10.50.63155:
2194 2/0/1 SRV
0a5099e5.addr.consul.:25340 1 1, SRV
0a509934.addr.consul.:26010 1 1 (157)
```

Both nameservers reply with the same response. But what if
they are out
of sync? Let's say one says: server1, server2 and the second
one says
server2, server3? So far testing this locally - I see
sometimes the
reply overrides the table, but sometimes it seems to just
get merged
with the rest.

2) Each entry from SRV reply will be placed into the table under
specific se_id. Most of the times that placement won't
change. So, for
the example above the most likely 0a5099e5.addr.consul. and
0a509934.addr.consul. will have se_id 1 and 2 respectively.
However
sometimes we have the following scenario:

1. We administratively disable the server (drain traffic) with
the next
command:

```
echo "set server example/application1 state maint" | nc -U
/var/lib/haproxy/stats
```

the MAINT flag will be added to the record with se_id 1

2. Instance of application goes down and gets de-registered
from consul,
so also evicted from srv replies and out of discovery of haproxy.

3. Instance of application goes up and gets registered by
consul and

Re: haproxy backend server template service discovery questions

2024-04-08 Thread Илья Шипицин
and particularly your question is "does HAProxy merge all responses or pick
the first one or use some other approach" ?

On Mon, 8 Apr 2024 at 10:23, Andrii Ustymenko wrote:

> I guess indeed it is not a case of consul-template specifically, but more
> of rendered templates and how haproxy maintains it.
> On 06/04/2024 20:15, Илья Шипицин wrote:
>
> Consul template is something done by consul itself, after that
> haproxy.conf is rendered
>
> Do you mean "how haproxy deals with rendered template"?
>
> On Fri, Apr 5, 2024, 15:02 Andrii Ustymenko 
> wrote:
>
>> Dear list!
>>
>> My name is Andrii. I work for Adyen. We are using haproxy as our main
>> software loadbalancer at quite large scale.
>> As of now, our main use case is backend routing based on
>> server-template and dynamic dns from consul as service discovery. Below
>> is the example of simple backend configuration:
>>
>> ```
>> backend example
>>    balance roundrobin
>>    server-template application 10 _application._tcp.consul resolvers
>> someresolvers init-addr last,libc,none resolve-opts allow-dup-ip
>> resolve-prefer ipv4 check ssl verify none
>> ```
>>
>> and in global configuration
>>
>> ```
>> resolvers someresolvers
>>    nameserver ns1 10.10.10.10:53
>>    nameserver ns2 10.10.10.11:53
>> ```
>>
>> As we see haproxy will create internal table for backends with some
>> be_id and be_name=application and allocate 10 records for each server
>> with se_id from 1 to 10. Then those records get populated and updated
>> with the data from resolvers.
>> I would like to understand couple of things with regards to this
>> structure and how it works, which I could not figure out myself from the
>> source code:
>>
>> 1) In tcpdump for dns queries we see that haproxy asynchronously polls
>> all the nameservers simultaneously. For instance:
>>
>> ```
>> 11:06:17.587798 eth2  Out ifindex 4 aa:aa:aa:aa:aa:aa ethertype IPv4
>> (0x0800), length 108: 10.10.10.50.24050 > 10.10.10.10.53: 34307+ [1au]
>> SRV? _application._tcp.consul. (60)
>> 11:06:17.587802 eth2  Out ifindex 4 aa:aa:aa:aa:aa:aa ethertype IPv4
>> (0x0800), length 108: 10.10.10.50.63155 > 10.10.10.11.53: 34307+ [1au]
>> SRV? _application._tcp.consul. (60)
>> 11:06:17.588097 eth2  In  ifindex 4 ff:ff:ff:ff:ff:ff ethertype IPv4
>> (0x0800), length 205: 10.10.10.10.53 > 10.10.10.50.24050: 2194 2/0/1 SRV
>> 0a5099e5.addr.consul.:25340 1 1, SRV 0a509934.addr.consul.:26010 1 1 (157)
>> 11:06:17.588097 eth2  In  ifindex 4 ff:ff:ff:ff:ff:ff ethertype IPv4
>> (0x0800), length 205: 10.10.10.11.53 > 10.10.10.50.63155: 2194 2/0/1 SRV
>> 0a5099e5.addr.consul.:25340 1 1, SRV 0a509934.addr.consul.:26010 1 1 (157)
>> ```
>>
>> Both nameservers reply with the same response. But what if they are out
>> of sync? Let's say one says: server1, server2 and the second one says
>> server2, server3? So far testing this locally - I see sometimes the
>> reply overrides the table, but sometimes it seems to just get merged
>> with the rest.
>>
>> 2) Each entry from SRV reply will be placed into the table under
>> specific se_id. Most of the times that placement won't change. So, for
>> the example above the most likely 0a5099e5.addr.consul. and
>> 0a509934.addr.consul. will have se_id 1 and 2 respectively. However
>> sometimes we have the following scenario:
>>
>> 1. We administratively disable the server (drain traffic) with the next
>> command:
>>
>> ```
>> echo "set server example/application1 state maint" | nc -U
>> /var/lib/haproxy/stats
>> ```
>>
>> the MAINT flag will be added to the record with se_id 1
>>
>> 2. Instance of application goes down and gets de-registered from consul,
>> so also evicted from srv replies and out of discovery of haproxy.
>>
>> 3. Instance of application goes up and gets registered by consul and
>> discovered by haproxy, but haproxy allocates different se_id for it.
>> Haproxy healthchecks will control the traffic to it in this case.
>>
>> 4. We will still have se_id 1 with MAINT flag and application instance
>> dns record placed into different se_id.
>>
>> The problem is that any newly discovered record which gets placed into
>> se_id 1 will never be active until either command:
>>
>> ```
>> echo "set server example/application1 state ready" | nc -U
>> /var/lib/haproxy/stats
>> ```
>>
>> is executed or haproxy gets reloaded without a state file. With this pattern
>> we basically have persistent "records pollution" due to operations made
>> directly with control socket.
>>
>> I am not sure whether there is anything to do about this. Maybe haproxy
>> could cache the state not only of the se_id but also of the record
>> associated with it, and re-schedule healthchecks if that record changes.
>> Or, instead of integer ids, use hashed ids based on the dns names/ip
>> addresses of discovered records; in that case binding would always happen
>> in the same slot.
>>
>> Thanks in advance!
>>
>> --
>>
>> Best regards,
>>
>> Andrii Ustymenko
>>
>>
>> --
>
> Andrii Ustymenko
> Platform Reliability Engineer
>
> office +31 20 240 

Re: haproxy backend server template service discovery questions

2024-04-08 Thread Andrii Ustymenko
I guess indeed it is not a case of consul-template specifically, but 
more of rendered templates and how haproxy maintains it.


On 06/04/2024 20:15, Илья Шипицин wrote:
Consul template is something done by consul itself, after that 
haproxy.conf is rendered


Do you mean "how haproxy deals with rendered template"?

On Fri, Apr 5, 2024, 15:02 Andrii Ustymenko 
 wrote:


Dear list!

My name is Andrii. I work for Adyen. We are using haproxy as our main
software loadbalancer at quite large scale.
As of now, our main use case is backend routing based on
server-template and dynamic dns from consul as service discovery.
Below
is the example of simple backend configuration:

```
backend example
   balance roundrobin
   server-template application 10 _application._tcp.consul resolvers
someresolvers init-addr last,libc,none resolve-opts allow-dup-ip
resolve-prefer ipv4 check ssl verify none
```

and in global configuration

```
resolvers someresolvers
   nameserver ns1 10.10.10.10:53 
   nameserver ns2 10.10.10.11:53 
```

As we see haproxy will create internal table for backends with some
be_id and be_name=application and allocate 10 records for each server
with se_id from 1 to 10. Then those records get populated and updated
with the data from resolvers.
I would like to understand couple of things with regards to this
structure and how it works, which I could not figure out myself
from the
source code:

1) In tcpdump for dns queries we see that haproxy asynchronously
polls
all the nameservers simultaneously. For instance:

```
11:06:17.587798 eth2  Out ifindex 4 aa:aa:aa:aa:aa:aa ethertype IPv4
(0x0800), length 108: 10.10.10.50.24050 > 10.10.10.10.53: 34307+
[1au]
SRV? _application._tcp.consul. (60)
11:06:17.587802 eth2  Out ifindex 4 aa:aa:aa:aa:aa:aa ethertype IPv4
(0x0800), length 108: 10.10.10.50.63155 > 10.10.10.11.53: 34307+
[1au]
SRV? _application._tcp.consul. (60)
11:06:17.588097 eth2  In  ifindex 4 ff:ff:ff:ff:ff:ff ethertype IPv4
(0x0800), length 205: 10.10.10.10.53 > 10.10.10.50.24050: 2194
2/0/1 SRV
0a5099e5.addr.consul.:25340 1 1, SRV 0a509934.addr.consul.:26010 1
1 (157)
11:06:17.588097 eth2  In  ifindex 4 ff:ff:ff:ff:ff:ff ethertype IPv4
(0x0800), length 205: 10.10.10.11.53 > 10.10.10.50.63155: 2194
2/0/1 SRV
0a5099e5.addr.consul.:25340 1 1, SRV 0a509934.addr.consul.:26010 1
1 (157)
```

Both nameservers reply with the same response. But what if they
are out
of sync? Let's say one says: server1, server2 and the second one says
server2, server3? So far testing this locally - I see sometimes the
reply overrides the table, but sometimes it seems to just get merged
with the rest.
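
To make the two outcomes described in question 1 concrete, here is a small self-contained sketch that applies the same pair of out-of-sync responses to a slot table in two ways. It is purely illustrative: `sketch_table`, `sketch_merge` and `sketch_override` are hypothetical names, and the sketch makes no claim about which strategy haproxy actually uses.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_SLOTS 10
#define SKETCH_NAME_LEN 64

/* Hypothetical slot table: an empty string marks a free slot. */
struct sketch_table {
	char names[SKETCH_SLOTS][SKETCH_NAME_LEN];
};

/* Return the slot holding "name", or -1 if it is not in the table. */
int sketch_find(const struct sketch_table *t, const char *name)
{
	for (int i = 0; i < SKETCH_SLOTS; i++)
		if (t->names[i][0] != '\0' && strcmp(t->names[i], name) == 0)
			return i;
	return -1;
}

/* Count the occupied slots. */
int sketch_count(const struct sketch_table *t)
{
	int n = 0;

	for (int i = 0; i < SKETCH_SLOTS; i++)
		if (t->names[i][0] != '\0')
			n++;
	return n;
}

/* Merge: keep existing entries and add unseen names to free slots. */
void sketch_merge(struct sketch_table *t, const char *const *names, int n)
{
	for (int k = 0; k < n; k++) {
		if (sketch_find(t, names[k]) >= 0)
			continue;
		for (int i = 0; i < SKETCH_SLOTS; i++) {
			if (t->names[i][0] == '\0') {
				strncpy(t->names[i], names[k], SKETCH_NAME_LEN - 1);
				t->names[i][SKETCH_NAME_LEN - 1] = '\0';
				break;
			}
		}
	}
}

/* Override: drop entries missing from the response, then add the new ones. */
void sketch_override(struct sketch_table *t, const char *const *names, int n)
{
	for (int i = 0; i < SKETCH_SLOTS; i++) {
		bool present = false;

		if (t->names[i][0] == '\0')
			continue;
		for (int k = 0; k < n; k++)
			if (strcmp(t->names[i], names[k]) == 0)
				present = true;
		if (!present)
			t->names[i][0] = '\0';
	}
	sketch_merge(t, names, n);
}
```

Feeding "server1, server2" and then "server2, server3" through `sketch_merge` leaves three entries in the table, while `sketch_override` leaves two, which matches the two kinds of table state observed in the test.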

2) Each entry from SRV reply will be placed into the table under
specific se_id. Most of the times that placement won't change. So,
for
the example above the most likely 0a5099e5.addr.consul. and
0a509934.addr.consul. will have se_id 1 and 2 respectively. However
sometimes we have the following scenario:

1. We administratively disable the server (drain traffic) with the next
command:

```
echo "set server example/application1 state maint" | nc -U
/var/lib/haproxy/stats
```

the MAINT flag will be added to the record with se_id 1

2. Instance of application goes down and gets de-registered from
consul,
so also evicted from srv replies and out of discovery of haproxy.

3. Instance of application goes up and gets registered by consul and
discovered by haproxy, but haproxy allocates different se_id for it.
Haproxy healthchecks will control the traffic to it in this case.

4. We will still have se_id 1 with MAINT flag and application
instance
dns record placed into different se_id.

The problem is that any newly discovered record which gets placed
into
se_id 1 will never be active until either command:

```
echo "set server example/application1 state ready" | nc -U
/var/lib/haproxy/stats
```

is executed or haproxy gets reloaded without a state file. With this
pattern
we basically have persistent "records pollution" due to operations
made
directly with control socket.
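
The "records pollution" effect can be reproduced with a toy model in which the administrative flag belongs to the slot rather than to the record bound to it. This is only a sketch of the scenario as described above; `sketch_slot`, `sketch_bind`, `sketch_unbind` and `sketch_usable` are hypothetical names, and haproxy's real data structures are not shown here.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_SLOTS 3
#define SKETCH_NAME_LEN 64

/* Hypothetical model of one se_id slot. */
struct sketch_slot {
	char record[SKETCH_NAME_LEN]; /* empty string: no record bound */
	bool maint;                   /* changed only by an operator command */
};

/* Bind a record to the first free slot; returns its index, or -1 if full. */
int sketch_bind(struct sketch_slot *slots, const char *record)
{
	for (int i = 0; i < SKETCH_SLOTS; i++) {
		if (slots[i].record[0] == '\0') {
			strncpy(slots[i].record, record, SKETCH_NAME_LEN - 1);
			slots[i].record[SKETCH_NAME_LEN - 1] = '\0';
			return i;
		}
	}
	return -1;
}

/* Evict the record but keep the slot's flag, as in step 2 of the scenario. */
void sketch_unbind(struct sketch_slot *slots, int idx)
{
	slots[idx].record[0] = '\0';
}

/* A slot can receive traffic only if it holds a record and is not in MAINT. */
bool sketch_usable(const struct sketch_slot *slots, int idx)
{
	return slots[idx].record[0] != '\0' && !slots[idx].maint;
}
```

In this model, setting `maint` on a slot, unbinding its record and then binding any other record into the freed slot yields a slot that holds a live record yet is not usable until the flag is cleared by hand.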

I am not sure whether there is anything to do about this. Maybe haproxy
could cache the state not only of the se_id but also of the record
associated with it, and re-schedule healthchecks if that record changes.
Or, instead of integer ids, use hashed ids based on the dns names/ip
addresses of discovered records; in that case binding would always happen
in the same slot.
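
The hashed-id idea could look roughly like the sketch below: the preferred slot is derived from a hash of the record name, with linear probing when that slot is taken. This is an assumption-laden illustration, not a proposal for haproxy's code: `sketch_fnv1a` and `sketch_slot_for` are hypothetical names, and FNV-1a was chosen here only because it is short.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_SLOTS 10
#define SKETCH_NAME_LEN 64

/* 32-bit FNV-1a hash of a NUL-terminated string. */
uint32_t sketch_fnv1a(const char *s)
{
	uint32_t h = 2166136261u;

	while (*s) {
		h ^= (unsigned char)*s++;
		h *= 16777619u;
	}
	return h;
}

/* Return the slot where "name" already lives, else the first free slot in
 * probe order starting at hash(name) % SKETCH_SLOTS, else -1 if full.
 * An empty string marks a free slot.
 */
int sketch_slot_for(char table[][SKETCH_NAME_LEN], const char *name)
{
	uint32_t start = sketch_fnv1a(name) % SKETCH_SLOTS;
	int free_slot = -1;

	for (int i = 0; i < SKETCH_SLOTS; i++) {
		int idx = (int)((start + (uint32_t)i) % SKETCH_SLOTS);

		if (strcmp(table[idx], name) == 0)
			return idx;
		if (table[idx][0] == '\0' && free_slot < 0)
			free_slot = idx;
	}
	return free_slot;
}
```

A record that is de-registered and later re-registered would map back to the same slot as long as that slot is still free, so a flag left on the slot would again apply to the same record. Collisions can still push a record to a neighbouring slot, so the binding is stable only while the table occupancy around it does not change.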

Thanks in advance!

--

Best regards,

Andrii Ustymenko



--

Andrii Ustymenko
Platform Reliability Engineer

office +31 20 240 12 40

Adyen Headquarters
Simon 

Please get in touch

2024-04-08 Thread Marcin Gard
Good morning,

Would it be possible to establish cooperation with you?

I would be glad to talk with the person responsible for sales-related activities.

We help to effectively acquire new customers.

Please feel free to get in touch.


Kind regards,
Marcin Gard



Re: [ANNOUNCE] haproxy-3.0-dev7

2024-04-08 Thread Willy Tarreau
Hi Ilya,

On Sun, Apr 07, 2024 at 08:34:18PM +0200, Илья Шипицин wrote:
> On Sat, 6 Apr 2024 at 17:53, Willy Tarreau wrote:
> >   - a new "guid" keyword was added for servers, listeners and proxies.
> > The purpose will be to make it possible for external APIs to assign
> > a globally unique object identifier to each of them in stats dumps
> > or CLI accesses, and to later reliably recognize a server upon
> > reloads. For now the identifier is not exploited.
> >
> 
> I have a question about the UUID version. It is not specified. Is it UUID
> version 6?

It's not a UUID, it's a free string, up to 128 chars long. This way
the API client can use whatever it wants, including a UUID.

Regarding UUIDs, though, I've recently come across UUIDv7 which I found
particularly interesting, and that I think would be nice to implement
in the uuid() sample fetch function before 3.0 is released.
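
For readers unfamiliar with UUIDv7, its layout per RFC 9562 is a 48-bit big-endian Unix timestamp in milliseconds, a 4-bit version field set to 7, a 2-bit variant field, and random bits elsewhere. The sketch below is a standalone illustration of that layout and not the uuid() sample fetch code; `sketch_uuidv7` and `sketch_uuid_format` are hypothetical names, and the caller is assumed to supply the timestamp and the random bytes.

```c
#include <stdint.h>
#include <stdio.h>

/* Fill out[16] with a UUIDv7 built from a Unix timestamp in milliseconds
 * and 10 caller-supplied random bytes.
 */
void sketch_uuidv7(uint8_t out[16], uint64_t unix_ms, const uint8_t rnd[10])
{
	/* bytes 0..5: 48-bit big-endian timestamp */
	for (int i = 0; i < 6; i++)
		out[i] = (uint8_t)(unix_ms >> (40 - 8 * i));
	/* bytes 6..15: random data, then force the version and variant bits */
	for (int i = 0; i < 10; i++)
		out[6 + i] = rnd[i];
	out[6] = (uint8_t)((out[6] & 0x0F) | 0x70); /* version 7 */
	out[8] = (uint8_t)((out[8] & 0x3F) | 0x80); /* variant bits 10 */
}

/* Format as the canonical 8-4-4-4-12 text form; buf must hold 37 bytes. */
void sketch_uuid_format(char buf[37], const uint8_t u[16])
{
	snprintf(buf, 37,
	         "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
	         "%02x%02x%02x%02x%02x%02x",
	         u[0], u[1], u[2], u[3], u[4], u[5], u[6], u[7],
	         u[8], u[9], u[10], u[11], u[12], u[13], u[14], u[15]);
}
```

Because the timestamp occupies the most significant bits, identifiers generated this way sort roughly by creation time when compared as strings or as bytes.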

Cheers,
Willy