Re: [PR] Improve wording of http-request wait-for-body

2023-04-14 Thread William Dauchy
Hi Craig,

Thank you for your patch.
Do you think you could come back with a patch which is only modifying
the section you want? It seems you did a change on the full file
instead which makes the review quite hard.

-- 
William



Re: clarify close behaviour on http-request rules

2023-02-07 Thread William Dauchy
Hi Christopher,

Thanks for your answer.

On Tue, Feb 7, 2023 at 9:09 AM Christopher Faulet  wrote:
> The tarpit action is final. So it cannot be used in addition to a return or a
> deny action. For the wait-for-body action, indeed, it will wait for the body 
> or
> a full buffer for 1 second. Thus in this case, if the whole request can be
> stored in buffer and is received fast enough, this will mitigate your issue.

ah yes good point indeed, I forgot the behaviour of tarpit. I will
give it a try with `wait-for-body`.

> FYI, we are refactoring the way errors, aborts and shutdowns are handled
> internally. And the data draining at the mux level is definitely a subject we
> should address. It is a painful work but we hope to include it in the 2.8, at
> least partially.

thanks!
-- 
William



Re: clarify close behaviour on http-request rules

2023-02-06 Thread William Dauchy
Hi Christopher,

On Fri, Feb 3, 2023 at 7:59 PM William Dauchy  wrote:
> On Tue, Oct 18, 2022 at 4:15 PM Christopher Faulet  
> wrote:
> > On all HTX versions, K/A and close modes are handled in the H1 multiplexer.
> > Thus, on these versions, http_reply_and_close() is only closing the stream. 
> > The
> > multiplexer is responsible to close the client connection or not.
> >
> > On pre-HTX versions, when http_reply_and_close() is used, the client 
> > connection
> > is also closed. It is a limitation of of HAProxy versions using the legacy 
> > HTTP.
> >
> > Note there is a case where the connection client may be closed. If the 
> > HAProxy
> > response is returned before the end of the request, the client connection is
> > closed. There is no (not yet) draining mode at the mux level.
>
> coming back on this very late:
>
> `http-request wait-for-body time` or a `http-request tarpit` mitigate
> the draining issue?
> I am trying to find a workaround on a setup where we are behind
> another L7 LB where we unexpectedly close the connection.

saying another way, what is going the behaviour of `http-request
return` if I have:

http-request wait-for-body time 1s if CONDITION_A
http-request deny if CONDITION_A

Is it going to wait for the request, and so mitigate the mentioned
drain limitation we currently have in the mux when `CONDITION_A`
matches?

-- 
William



Re: clarify close behaviour on http-request rules

2023-02-03 Thread William Dauchy
On Tue, Oct 18, 2022 at 4:15 PM Christopher Faulet  wrote:
> On all HTX versions, K/A and close modes are handled in the H1 multiplexer.
> Thus, on these versions, http_reply_and_close() is only closing the stream. 
> The
> multiplexer is responsible to close the client connection or not.
>
> On pre-HTX versions, when http_reply_and_close() is used, the client 
> connection
> is also closed. It is a limitation of of HAProxy versions using the legacy 
> HTTP.
>
> Note there is a case where the connection client may be closed. If the HAProxy
> response is returned before the end of the request, the client connection is
> closed. There is no (not yet) draining mode at the mux level.

coming back on this very late:

`http-request wait-for-body time` or a `http-request tarpit` mitigate
the draining issue?
I am trying to find a workaround on a setup where we are behind
another L7 LB where we unexpectedly close the connection.

Thanks!
-- 
William



Re: [PATCH 0/2] BUG/MINOR: promex: fix haproxy_backend_agg_server_check_status

2022-12-06 Thread William Dauchy
On Tue, Dec 6, 2022 at 3:27 AM Christopher Faulet  wrote:
> IMHO, we should keep the current metric and its description should be updated 
> to
> mention it is deprecated. This way, we will be able to remove it in the 2.9 or
> 3.0. This means we should find new names for the aggregated servers status and
> the aggregated check status. "ST_F_AGG_SRV_STATUS" is pretty good for the 
> first
> one. I may propose "ST_F_AGG_CHECK_STATUS" for the second one.
>
> I guess the two new metrics may be backported. They are not used by the stats
> applet.
>
> What do you think about it?

I am aligned with this plan of creating two new metrics

Thanks,
-- 
William



Re: [PATCH 0/2] BUG/MINOR: promex: fix haproxy_backend_agg_server_check_status

2022-11-29 Thread William Dauchy
On Tue, Nov 29, 2022 at 11:52 AM William Dauchy  wrote:
> On Mon, Nov 28, 2022 at 8:46 AM Cedric Paillet  wrote:
> > As described in github issue #1312, the first intention of patch 42d7c402d
> > was to aggregate haproxy_server_check_status. But we aggregated
> > haproxy_server_status instead.
> > To fix that, rename haproxy_backend_agg_server_check_status to
> > haproxy_backend_agg_server_status and use right data source for
> > haproxy_backend_agg_server_check_status.
>
> Thank you for looking into this. This has been on my todo list for too long.
> I am fine with those changes as long as we backport them until the
> version where this was introduced (this was introduced in v2.5 and
> then backported in v2.4 if my checks are correct).
> I understand that's a breaking change, some people started to build
> tooling on top of that can't deal with behaviour change between
> versions (I am thinking about dashboards in my case where we deal with
> different versions). In that regard, I believe it can be ok to have to
> do a one off change to fix this past mistake. If we are not aligned
> with a backport, we will need to find another name.

I realised I was not necessarily very clear on my position:
- either we merge this patch, and then backport it until 2.4
- either we find a new name for the new metric and don't introduce a
breaking change

I am mostly worried about metrics collectors who are not able to deal
with different behaviour from one version to another. In my own case,
we have different haproxy versions running, and there is no global way
we can adapt the collector depending on the software version it is
collecting. The collector expects the same behaviour across versions.
Having a breaking change which affects all versions is fine as it will
require a unique change on our side, but having a situation in the
middle is not acceptable for us.

-- 
William



Re: [PATCH 0/2] BUG/MINOR: promex: fix haproxy_backend_agg_server_check_status

2022-11-29 Thread William Dauchy
Hello Cedric,

On Mon, Nov 28, 2022 at 8:46 AM Cedric Paillet  wrote:
> As described in github issue #1312, the first intention of patch 42d7c402d
> was to aggregate haproxy_server_check_status. But we aggregated
> haproxy_server_status instead.
> To fix that, rename haproxy_backend_agg_server_check_status to
> haproxy_backend_agg_server_status and use right data source for
> haproxy_backend_agg_server_check_status.

Thank you for looking into this. This has been on my todo list for too long.
I am fine with those changes as long as we backport them until the
version where this was introduced (this was introduced in v2.5 and
then backported in v2.4 if my checks are correct).
I understand that's a breaking change, some people started to build
tooling on top of that can't deal with behaviour change between
versions (I am thinking about dashboards in my case where we deal with
different versions). In that regard, I believe it can be ok to have to
do a one off change to fix this past mistake. If we are not aligned
with a backport, we will need to find another name.
-- 
William



change accept() behaviour under maxconn

2022-10-19 Thread William Dauchy
Hello,

I was looking at maxconn behaviour on frontend side. I understood we
don't trigger accept() once maxconn is reached
https://git.haproxy.org/?p=haproxy.git;a=blob;f=src/listener.c;h=412af94a160662aa11802dd58512c6410c9b56da;hb=HEAD#l920
I wondered whether it would be an acceptable patch to add an option in
order to accept() and close() the connection right after during such
event. The context would be to "fail fast" on some environment where
it is preferable to let know the client as soon as possible we can't
accept the connection. In an environment where the haproxy instance is
the second level of proxies, we want to fail fast instead of waiting
for the end of tcp timeout to connect.

Thanks,
-- 
William



Re: clarify close behaviour on http-request rules

2022-10-18 Thread William Dauchy
On Tue, Oct 18, 2022 at 4:15 PM Christopher Faulet  wrote:
> On all HTX versions, K/A and close modes are handled in the H1 multiplexer.
> Thus, on these versions, http_reply_and_close() is only closing the stream. 
> The
> multiplexer is responsible to close the client connection or not.

ok, thanks for the details.

> Note there is a case where the connection client may be closed. If the HAProxy
> response is returned before the end of the request, the client connection is
> closed. There is no (not yet) draining mode at the mux level.

That particular point is very interesting to me. Would you have a
pointer so that I can better understand where it is happening in the
code?

-- 
William



Re: clarify close behaviour on http-request rules

2022-10-18 Thread William Dauchy
Hi Christopher,

Thanks for your reply.

On Tue, Oct 18, 2022 at 10:22 AM Christopher Faulet  wrote:
> Since the 2.2, HAProxy responses don't close the client connections in K/A 
> mode,
> except for 400-Bad-Request and 408-Request-Time-out. It is the behavior with
> default error messages. Of course, if a "Connection: close" header is found 
> in a
> custom reply, the client connection is closed.
>
> Thus, in your case, if there is no "Connection: close" header in the response,
> both http-request action (return and deny) should do the same and keep the
> client connection alive.
>
> So if you observe connection closing, it may be for something else. Maybe a
> timeout. Or a bug...

ok I will investigate more.
To come back on the code path, was I wrong for the
`http_reply_and_close` difference?
Would you have a pointer so I may better understand where
`http_reply_and_close` would not close the connection in K/A mode?

-- 
William



clarify close behaviour on http-request rules

2022-10-17 Thread William Dauchy
Hello,

I am trying to clarify in which case a tcp connection might be closed
following those rules:
- http-request return
- http-request deny
unless I missed something I have not been able to see the answer within the doc.

General context being, we are using `option http-keep-alive`; also our
haproxy is behind another L7 proxy; it means the later do not expect
the connection to be loosely closed.

[cloud L7 LB] <--> [haproxy] <--> backend

>From what I read in
https://git.haproxy.org/?p=haproxy.git;a=blob;f=src/http_ana.c;h=2b2cfdc56103f313d766143f9016d91200065092;hb=HEAD#l354
I got:
- `http-request return` gets `HTTP_RULE_RES_ABRT`; this flag leads to
`return_prx_cond`, which is not calling `http_reply_and_close`
- `http-request deny` gets `HTTP_RULE_RES_DENY`; this flag leads to
`deny`, which is calling `http_reply_and_close`

We are using haproxy v2.4.x but I think the behaviour did not change
in an earlier version.

Some followup questions:

- do we confirm `http-request return` does not close the connection
while `http-request deny` does?
meaning:
- http-request return status 403
- http-request deny
don't have the same behaviour?
- from what I understood, using http-keep-alive does not influence the
behaviour of `http_reply_and_close`, is that correct?
- while using `http-request deny` which might close the connection, we
see some requests on the cloud LB side being ended in error 502. We
suspect those are the next requests, where the cloud LB tried to use
the same previous connection. For now it is hard to understand why the
cloud LB would not detect the close. Is there a moment where haproxy
would loosely close the connection which might explain the behavior?

Thanks in advance,
-- 
William



Re: [PATCH] BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl

2022-01-17 Thread William Dauchy
Hello Christopher,

On Wed, Jan 12, 2022 at 12:45 PM William Dauchy  wrote:
> my approach was to say:
> - remove the implicit behavior
> - then work on the missing commands for the health checks

Do you think we can conclude on it?

-- 
William



Re: [PATCH] BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl

2022-01-12 Thread William Dauchy
On Wed, Jan 12, 2022 at 10:02 AM Christopher Faulet  wrote:
> I don't know what is the expected behavior on the stable releases for users.
> Honestly, I've misread you patch and kept in mind you alternative solution...
> But as said, if there is no implicit change (and I'm fine with this 
> solution), a
> new command or an option to current ones must be introduced to be able to 
> change
> SSL setting for health-check too. Otherwise, it will be unusable.

my approach was to say:
- remove the implicit behavior
- then work on the missing commands for the health checks
-- 
William



Re: [PATCH] BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl

2022-01-11 Thread William Dauchy
Hello Christopher,

Thanks for your research,

On Tue, Jan 11, 2022 at 6:55 PM Christopher Faulet  wrote:
> Le 1/10/22 à 23:19, Willy Tarreau a écrit :
> > At this point I'm starting to think that we should probably avoid as
> > much as possible to use implicit settings for whatever is dynamic.
> > Originally a lot of settings were implicit because we don't want to
> > have huge config lines to enforce lots of obvious settings. On the CLI
> > it's less of a problem and for example if "ssl" only deals with the
> > traffic without manipulating the checks, I'm personally not shocked,
> > because these are normally sent by bots and we don't care if they
> > have to set a few more settings to avoid multiple implicit changes
> > that may not always be desired. This is where the CLI (or any future
> > API) might differ a bit from the config, and an agent writing some
> > config might have to explicitly state certain things like "no-check-ssl"
> > for example to stay safe and avoid such implicit dependencies.
> >
>
> I agree with Willy on this point. Especially because, depending the order of
> commands, the result can be different because of implicit changes and it is
> pretty hard to predict how it will behave in all cases.

yup totally agree with both of you. I can't really remember why I did
that in the first place.

> For instance, to fix William's bug about "set server ssl" command, in a way or
> another, we must stop to dynamically update the health-check if its port or 
> its
> address is explicitly specified. With this change, the result of following set
> of commands will be different:
>
>$ set server BK/SRV ssl on
>$ set server BK/SRV check-port XXX
>
> ==> SSL is enabled for the server and the health-check

interesting one :))

>$ set server BK/SRV check-port XXX
>$ set server BK/SRV ssl on
>
> ==> because the check's port was updated first, the SSL is only enabled for 
> the
> server.

> Agree. But, if possible, a warning may be added in the documentation to warn
> about implicit changes.

>From the discussion, I would be tempted to say the opposite, as I feel
like keeping implicit things for this command is worse.

> About the fix, I checked the code, and, at first glance, there is no reason to
> change "srv->check.use_ssl" value when the health-check uses the server
> settings. Thus, we may fix "set server ssl" command this way:
>
> diff --git a/src/check.c b/src/check.c
> index f0ae81504..8cf8a1c5b 100644
> --- a/src/check.c
> +++ b/src/check.c
> @@ -1565,10 +1565,8 @@ int init_srv_check(struct server *srv)
>   * default, unless one is specified.
>   */
>  if (!srv->check.port && !is_addr(>check.addr)) {
> -   if (!srv->check.use_ssl && srv->use_ssl != -1) {
> -   srv->check.use_ssl = srv->use_ssl;
> -   srv->check.xprt= srv->xprt;
> -   }
> +   if (!srv->check.use_ssl && srv->use_ssl != -1)
> +   srv->check.xprt = srv->xprt;
>  else if (srv->check.use_ssl == 1)
>  srv->check.xprt = xprt_get(XPRT_SSL);
>  srv->check.send_proxy |= (srv->pp_opts);

indeed this is simplified that way

> diff --git a/src/server.c b/src/server.c
> index 2061153bc..a18836a71 100644
> --- a/src/server.c
> +++ b/src/server.c
> @@ -2113,10 +2113,11 @@ void srv_set_ssl(struct server *s, int use_ssl)
>  return;
>
>  s->use_ssl = use_ssl;
> -   if (s->use_ssl)
> -   s->xprt = xprt_get(XPRT_SSL);
> -   else
> -   s->xprt = s->check.xprt = s->agent.xprt = xprt_get(XPRT_RAW);
> +   s->xprt = (s->use_ssl == 1) ? xprt_get(XPRT_SSL) : xprt_get(XPRT_RAW);
> +
> +   if ((s->check.state & CHK_ST_CONFIGURED) && !s->check.use_ssl &&
> +   !s->check.port && !is_addr(>check.addr))
> +   s->check.xprt = s->xprt;
>   }

as mentioned above I am not sure I am aligned here; I would rather
remove the side effect of changing s->check.

>   #endif /* USE_OPENSSL */
>
> This may be done with the following 3 steps:
>
>* First, stop to change "srv->check.use_ssl" value
>
>* Then, stop to change the agent settings in srv_set_ssl() because there is
> no ssl support for agent-check.

good point, I did not know

>* Finally, fix the bug identified by William, adding the according 
> documentation.
>
> Note I don't know if all this stuff works properly with the server-state 
> file...

I am always scared to look at the server state functionality...

-- 
William



Re: [PATCH] BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl

2022-01-10 Thread William Dauchy
On Sat, Jan 8, 2022 at 3:03 PM Tim Düsterhus  wrote:
> Causes issues when applying the patch, because git gets confused and
> believes this to be the patch.
> I tend to indent this type of "literal code block" within my commit
> message with 4 spaces for clarity.

indeed, good point, will fix if I have to resend a v2

On Mon, Jan 10, 2022 at 7:51 AM Willy Tarreau  wrote:
> It's important to always keep in mind that checks are not necessarily
> related to the production traffic, and that configuring one part should
> not have any impact on the other one. By default a server running in SSL
> will not be checked using SSL unless "check-ssl" is set.

note it is only true in your example if you use another port.

> You could for
> example have a server forwarding to multiple ports (say 80 and 443) and
> decide to check only one of them, or even check another one.
>
> As such, I think your patch is correct as it only affects what the user
> attempts to modify. I suspect that the reason for your initial choice was
> that it was not yet possible by then to enable SSL checks manually,

sorry what do you mean by manually?
"check-ssl" has been available for a long time, so that's not the
reason behind it, but I guess you were referring to something else. I
suspect I did a dumb copy/paste from the new_server function and
probably never thought was possibly wrong as my previous production
never had any check using tls.

> it
> would be worth rechecking, because if that's the case, maybe we should
> not backport it to 2.4 and only document a behavior change between 2.4
> and 2.5.
> If you could have a double-check at the history behind this, that would
> be nice so that we know how far to backport it. By the way, maybe your
> proposed alternative would be acceptable for older versions which do not
> allow to enable SSL health checks on the CLI.

unless I missed something, for me the current behavior is broken as
you can't come back to a working state if you are using tls on both
traffic and health check path. The only working setup is when you are
using `no-check-ssl` in your default server. In that sense I believe
it should be backported to v2.4.
--
William



[PATCH] BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl

2022-01-06 Thread William Dauchy
While giving a fresh try to `set server ssl` (which I wrote), I realised
the behavior is a bit inconsistent. Indeed when using this command over
a server with ssl enabled for the data path but also for the health
check path we have:

- data and health check done using tls
- emit `set server be_foo/srv0 ssl off`
- data path and health check path becomes plain text
- emit `set server be_foo/srv0 ssl on`
- data path becomes tls and health check path remains plain text

while I thought the end result would be:
- data path and health check path comes back in tls

In the current code we indeed erase all connections while deactivating,
but restore only the data path while activating.  I made this mistake in
the past because I was testing with a case where the health check plain
text by default.

There are several ways to solve this issue. The cleanest one would
probably be to avoid changing the health check connection when we use
`set server ssl` command, and create a new command `set server
ssl-check` to change this. For now I assumed this would be ok to simply
avoid changing the health check path and be more consistent.

This patch tries to address that and also update the documentation. It
should not break the existing usage with health check on plain text, as
in this case they should have `no-check-ssl` in defaults.  Without this
patch, it makes the command unusable in an env where you have a list of
server to add along the way with initial `server-template`, and all
using tls for data and healthcheck path.

For 2.6 we should probably reconsider and add `set server ssl-check`
command for better granularity of cases.

If this solution is accepted, this patch should be backported up to >=
2.4.

The alternative solution was to restore the previous state, but I
believe this will create even more confusion in the future:

--- a/src/server.c
+++ b/src/server.c
@@ -2113,8 +2113,11 @@ void srv_set_ssl(struct server *s, int use_ssl)
return;

s->use_ssl = use_ssl;
-   if (s->use_ssl)
+   if (use_ssl) {
s->xprt = xprt_get(XPRT_SSL);
+   if (s->check.use_ssl)
+   s->check.xprt = s->agent.xprt = xprt_get(XPRT_SSL);
+   }
else
s->xprt = s->check.xprt = s->agent.xprt = xprt_get(XPRT_RAW);
 }

Signed-off-by: William Dauchy 
---
 doc/management.txt | 2 ++
 src/server.c   | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/doc/management.txt b/doc/management.txt
index 147e059f6..b96cbc878 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -2187,6 +2187,8 @@ set server / fqdn 
 
 set server / ssl [ on | off ]
   This option configures SSL ciphering on outgoing connections to the server.
+  When switch off, all traffic becomes plain text; health check path is not
+  changed.
 
 set severity-output [ none | number | string ]
   Change the severity output format of the stats socket connected to for the
diff --git a/src/server.c b/src/server.c
index 2061153bc..d165f50b9 100644
--- a/src/server.c
+++ b/src/server.c
@@ -2116,7 +2116,7 @@ void srv_set_ssl(struct server *s, int use_ssl)
if (s->use_ssl)
s->xprt = xprt_get(XPRT_SSL);
else
-   s->xprt = s->check.xprt = s->agent.xprt = xprt_get(XPRT_RAW);
+   s->xprt = xprt_get(XPRT_RAW);
 }
 
 #endif /* USE_OPENSSL */
-- 
2.34.1




[PATCH] MINOR: proxy: add option idle-close-on-response

2022-01-05 Thread William Dauchy
Avoid closing idle connections if a soft stop is in progress.

By default, idle connections will be closed during a soft stop. In some
environments, a client talking to the proxy may have prepared some idle
connections in order to send requests later. If there is no proper retry
on write errors, this can result in errors while haproxy is reloading.
Even though a proper implementation should retry on connection/write
errors, this option was introduced to support back compat with haproxy <
v2.4. Indeed before v2.4, we were waiting for a last request to be able
to add a "connection: close" header and advice the client to close the
connection.

In a real life example, this behavior was seen in AWS using the ALB in
front of a haproxy. The end result was ALB sending 502 during haproxy
reloads.
This patch was tested on haproxy v2.4, with a regular reload on the
process, and a constant trend of requests coming in. Before the patch,
we see regular 502 returned to the client; when activating the option,
the 502 disappear.

This patch should help fixing github issue #1506.
In order to unblock some v2.3 to v2.4 migraton, this patch should be
backported up to v2.4 branch.

Signed-off-by: William Dauchy 
---
 doc/configuration.txt | 21 +
 include/haproxy/proxy-t.h |  2 +-
 src/mux_h1.c  |  3 ++-
 src/proxy.c   |  1 +
 4 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index cd8ab4b72..66571a566 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -3756,6 +3756,7 @@ option tcp-smart-connect (*)  X  -
 X X
 option tcpka  X  X X X
 option tcplog X  X X X
 option transparent   (*)  X  - X X
+option idle-close-on-response(*)  X  X X -
 external-check commandX  - X X
 external-check path   X  - X X
 persist rdp-cookieX  - X X
@@ -9836,6 +9837,26 @@ no option transparent
 "transparent" option of the "bind" keyword.
 
 
+option idle-close-on-response
+no option idle-close-on-response
+  Avoid closing idle connections if a stop stop is in progress
+  May be used in sections :   defaults | frontend | listen | backend
+ yes   |yes   |   yes  |   no
+  Arguments : none
+
+  By default, idle connections will be closed during a soft stop. In some
+  environments, a client talking to the proxy may have prepared some idle
+  connections in order to send requests later. If there is no proper retry on
+  write errors, this can result in errors while haproxy is reloading. Even
+  though a proper implementation should retry on connection/write errors, this
+  option was introduced to support back compat with haproxy < v2.4. Indeed
+  before v2.4, we were waiting for a last request to be able to add a
+  "connection: close" header and advice the client to close the connection.
+
+  In a real life example, this behavior was seen in AWS using the ALB in front
+  of a haproxy. The end result was ALB sending 502 during haproxy reloads.
+
+
 external-check command 
   Executable to run when performing an external-check
   May be used in sections :   defaults | frontend | listen | backend
diff --git a/include/haproxy/proxy-t.h b/include/haproxy/proxy-t.h
index 8ca4e818d..421f900e2 100644
--- a/include/haproxy/proxy-t.h
+++ b/include/haproxy/proxy-t.h
@@ -82,7 +82,7 @@ enum PR_SRV_STATE_FILE {
 #define PR_O_REUSE_ALWS 0x000C  /* always reuse a shared connection */
 #define PR_O_REUSE_MASK 0x000C  /* mask to retrieve shared connection 
preferences */
 
-/* unused: 0x10 */
+#define PR_O_IDLE_CLOSE_RESP 0x0010 /* avoid closing idle connections 
during a soft stop */
 #define PR_O_PREF_LAST  0x0020  /* prefer last server */
 #define PR_O_DISPATCH   0x0040  /* use dispatch mode */
 #define PR_O_FORCED_ID  0x0080  /* proxy's ID was forced in the 
configuration */
diff --git a/src/mux_h1.c b/src/mux_h1.c
index 7b6ab946d..1ec6cb77c 100644
--- a/src/mux_h1.c
+++ b/src/mux_h1.c
@@ -2999,7 +2999,8 @@ static int h1_process(struct h1c * h1c)
 */
if (!(h1c->flags & H1C_F_IS_BACK)) {
if (unlikely(h1c->px->flags & (PR_FL_DISABLED|PR_FL_STOPPED))) {
-   if (h1c->flags & H1C_F_WAIT_NEXT_REQ)
+   if (!(h1c->px->options & PR_O_IDLE_CLOSE_RESP) &&
+   h1c->flags & H1C_F_WAIT_NEXT_REQ)
goto release;
}
}
diff --git a/src/proxy.c b/src/proxy.c
index ff5e35e2c..245b6

Re: Remaining issues on 2.5 for the release

2021-11-15 Thread William Dauchy
Hi Willy,

On Fri, Nov 12, 2021 at 3:48 PM Willy Tarreau  wrote:
> I intended to emit the final 2.5 this week-end, but a few users having
> upgraded to the latest 2.4, 2.3 or 2.2 reported strange issues that we
> couldn't reproduce and for which we don't have more info yet. Some seem
> related to connections taking longer to vanish, others to possibly
> truncated responses. These include some of the recent 2.5 fixes, and
> it's very possible that a bug has been hiding another one, and that
> fixing one revealed the other one.
> For now I'm failing to reproduce any such problem on any of these
> versions, and the information gathered for now are insufficient to draw
> any conclusion. I'm not thrilled at the idea of releasing 2.5 and having
> to emit yet another a few days one later. We're not late in the cycle so
> I prefer that we focus our efforts on finding that problem and fixing
> the affected versions first.
> For the record, the related issues are #1448, #1451 and #1453 for the
> strange behavior on the connections, and #1450 for the truncation. If
> anyone manages to set up any reliable reproducer, do not hesitate to
> share it!

Do we know whether v2.4.7 is affected by this regression?

Thanks,
-- 
William



Re: [PATCH] MINOR: promex: backend aggregated server check status

2021-11-08 Thread William Dauchy
On Mon, Nov 8, 2021 at 1:52 PM Willy Tarreau  wrote:
> Just to be sure, is this something you want to merge into 2.5 or is it
> to be queued next ? I'm fine with both, but I prefer to ask as it's not
> just a one-liner and I don't know the impact on promex.

I know Christopher was possibly thinking about a more advanced
implementation but I thought it could be ok for a first version.
Probably a good idea to wait for Christopher feedbacks anyway.
I was indeed targeting a late v2.5, despite being a new thing. Sorry
for that and I forgot to add a message about it.
If it works well enough, I will also probably ask for a backport in
2.4 before the end of the year as I know large clusters are waiting
for this patch to lower their memory consumption in prometheus.

Thanks,
-- 
William



[PATCH] MINOR: promex: backend aggregated server check status

2021-11-07 Thread William Dauchy
- add new metric: `haproxy_backend_agg_server_check_status`
  it counts the number of servers matching a specific check status
  this permits to exclude per server check status as the usage is often
  to rely on the total. Indeed in large setup having thousands of
  servers per backend the memory impact is not neglible to store the per
  server metric.
- realign promex_str_metrics array

quite simple implementation - we could improve it later by adding an
internal state to the prometheus exporter, thus to avoid counting at
every dump.

this patch is an attempt to close github issue #1312

Signed-off-by: William Dauchy 
---
 addons/promex/service-prometheus.c | 229 -
 include/haproxy/stats-t.h  |   1 +
 src/stats.c|   4 +
 3 files changed, 131 insertions(+), 103 deletions(-)

diff --git a/addons/promex/service-prometheus.c 
b/addons/promex/service-prometheus.c
index 221e40705..ef217007d 100644
--- a/addons/promex/service-prometheus.c
+++ b/addons/promex/service-prometheus.c
@@ -189,106 +189,107 @@ const struct promex_metric 
promex_global_metrics[INF_TOTAL_FIELDS] = {
 
 /* frontend/backend/server fields */
 const struct promex_metric promex_st_metrics[ST_F_TOTAL_FIELDS] = {
-   //[ST_F_PXNAME] ignored
-   //[ST_F_SVNAME] ignored
-   [ST_F_QCUR]   = { .n = IST("current_queue"),
.type = PROMEX_MT_GAUGE,.flags = (  
 PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_QMAX]   = { .n = IST("max_queue"),
.type = PROMEX_MT_GAUGE,.flags = (  
 PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_SCUR]   = { .n = IST("current_sessions"), 
.type = PROMEX_MT_GAUGE,.flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_SMAX]   = { .n = IST("max_sessions"), 
.type = PROMEX_MT_GAUGE,.flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_SLIM]   = { .n = IST("limit_sessions"),   
.type = PROMEX_MT_GAUGE,.flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_STOT]   = { .n = IST("sessions_total"),   
.type = PROMEX_MT_COUNTER,  .flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_BIN]= { .n = IST("bytes_in_total"),   
.type = PROMEX_MT_COUNTER,  .flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_BOUT]   = { .n = IST("bytes_out_total"),  
.type = PROMEX_MT_COUNTER,  .flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_DREQ]   = { .n = IST("requests_denied_total"),
.type = PROMEX_MT_COUNTER,  .flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC   ) },
-   [ST_F_DRESP]  = { .n = IST("responses_denied_total"),   
.type = PROMEX_MT_COUNTER,  .flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_EREQ]   = { .n = IST("request_errors_total"), 
.type = PROMEX_MT_COUNTER,  .flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC   ) },
-   [ST_F_ECON]   = { .n = IST("connection_errors_total"),  
.type = PROMEX_MT_COUNTER,  .flags = (  
 PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_ERESP]  = { .n = IST("response_errors_total"),
.type = PROMEX_MT_COUNTER,  .flags = (  
 PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_WRETR]  = { .n = IST("retry_warnings_total"), 
.type = PROMEX_MT_COUNTER,  .flags = (  
 PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_WREDIS] = { .n = IST("redispatch_warnings_total"),
.type = PROMEX_MT_COUNTER,  .flags = (  
 PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_STATUS] = { .n = IST("status"),   
.type = PROMEX_MT_GAUGE,.flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_WEIGHT]   

[PATCH] DOC: stats: fix location of the text representation

2021-11-06 Thread William Dauchy
`info_field_names` and `stat_field_names` no longer exist and have been
moved in stats.c
To avoid changing this comment, just mention the name of the new table
`info_fields` and `stat_fields`

Signed-off-by: William Dauchy 
---
 include/haproxy/stats-t.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/haproxy/stats-t.h b/include/haproxy/stats-t.h
index 312013364..9ac875e73 100644
--- a/include/haproxy/stats-t.h
+++ b/include/haproxy/stats-t.h
@@ -253,8 +253,8 @@ enum field_scope {
FS_MASK = 0xFF00,
 };
 
-/* Show Info fields for CLI output. For any field added here, please add the 
text
- * representation in the info_field_names array below. Please only append at 
the end,
+/* Show info fields for CLI output. For any field added here, please add the
+ * text representation in the info_fields array. Please only append at the end,
  * before the INF_TOTAL_FIELDS entry, and never insert anything in the middle
  * nor at the beginning.
  */
@@ -338,7 +338,7 @@ enum info_field {
 
 
 /* Stats fields for CSV output. For any field added here, please add the text
- * representation in the stat_field_names array below. Please only append at 
the end,
+ * representation in the stat_fields array. Please only append at the end,
  * before the ST_F_TOTAL_FIELDS entry, and never insert anything in the middle
  * nor at the beginning.
  */
-- 
2.30.2




Re: http-request return bytes_read from v2.3 to v2.4

2021-07-26 Thread William Dauchy
On Mon, Jul 26, 2021 at 8:35 AM Christopher Faulet  wrote:
> The response size reported in the log messages is related to the internal
> representation of the HTTP message. It is a limitation of the current design. 
> It
> means this size is not equal to the raw size of the message and may change 
> when
> the HTX representation changes. In this case, in 2.4, the start-line
> representation was changed, a 32-bits integer was removed from hx_sl 
> structure.
> In addition, the end-of-message block (EOM) was removed from the HTX
> representation. It counted for 1 byte. This explains the difference of 5 bytes
> between 2.3 and 2.4.

Thanks Christophe, crystal clear explanation!

-- 
William



http-request return bytes_read from v2.3 to v2.4

2021-07-25 Thread William Dauchy
Hello,

While upgrading from v2.3.x to v2.4.x I noticed a difference of bytes
read in requests http logs with this simple frontend config:

frontend foo
bind 127.0.0.1:8000
http-request return

in v2.3, the logs (%B) tells me 54 bytes, in v2.4, I am getting 49. So
a difference of 5 bytes per request.

I was simply curious to understand that behavior change. Is it a
change in the way we count bytes in stats, or is there really a
difference in the payload we write?
When I look at a dump, the TCP payloads are identical with 57 bytes in
both versions. And the overall length of the packet is 123 bytes.
Maybe that's expected, but I wanted some clarification.

Thanks,
-- 
William



Re: [ANNOUNCE] haproxy-2.4.0

2021-05-26 Thread William Dauchy
On Fri, May 14, 2021 at 11:58 AM Willy Tarreau  wrote:
>   - interoperability / protocol support: WebSocket over HTTP/2 (RFC8441)
> is now supported on both sides, regardless of the version on the other
> side. The cache now supports the "Vary" header with a few commonly
> used headers, including "Accept-encoding" which gets normalized for
> optimal cache hit ratio. The Prometheus exporter got a significant
> liftup, requires less tricks on the Prometheus side, and supports
> listing only certain metrics for faster retrieval. Optional native
> support for Opentracing was also integrated (via USE_OT=1). The DNS
> resolvers now support talking to servers over TCP. Basic support for
> extracting information from MQTT and FIX protocol was added. Timeouts
> can now be adjusted on the fly and per-request in order to adapt to
> particuarly slow servers or special protocols.

Regarding prometheus, it should probably noted some major changes
regarding some metrics, such as for health check, where the value is
now located in a label, instead of the value of the metric itself:
see also 
http://git.haproxy.org/?p=haproxy.git;a=commit;h=de3c32638925c2071a5a84cbdafe2f112d2c4261
-- 
William



Re: [ANNOUNCE] haproxy-2.3.9

2021-03-31 Thread William Dauchy
On Tue, Mar 30, 2021 at 6:59 PM Willy Tarreau  wrote:
> HAProxy 2.3.9 was released on 2021/03/30. It added 5 new commits
> after version 2.3.8.
>
> This essentially fixes the rate counters issue that popped up in 2.3.8
> after the previous fix for the rate counters already.
>
> What happened is that the internal time in millisecond wraps every 49.7
> days and that the new global counter used to make sure rate counters are
> now stable across threads starts at zero and is initialized when older
> than the current thread's current date. It just happens that the wrapping
> happened a few hours ago at "Mon Mar 29 23:59:46 CEST 2021" exactly and
> that any process started since this date and for the next 24 days doesn't
> validate this condition anymore, hence doesn't rotate its rate counters
> anymore.

Thanks Willy for the quick update. That's a good example to avoid
pushing stable versions at the same time, so we have opportunities to
find those regressions.

-- 
William



Re: Table sticky counters decrementation problem

2021-03-30 Thread William Dauchy
On Tue, Mar 30, 2021 at 5:57 PM Willy Tarreau  wrote:
> out of curiosity I wanted to check when the overflow happened:
>
> $ date --date=@$$(date +%s) * 1000) & -0x800) / 1000))
> Mon Mar 29 23:59:46 CEST 2021
>
> So it only affects processes started since today. I'm quite tempted not
> to wait further and to emit 2.3.9 urgently to fix this before other
> people get trapped after reloading their process. Any objection ?

I do confirm the timestamp on our side but do not have the necessary
tooling to test the fix.

Thanks,
-- 
William



Re: "[ANNOUNCE] haproxy-2.3.6

2021-03-05 Thread William Dauchy
Hi,

On Wed, Mar 3, 2021 at 4:09 PM Christopher Faulet  wrote:
>- An issue leading to possible infinite loops because of a double locking
>  effect in the mt lists was fixed by Olivier. If MT_LIST_TRY_ADDQ()
>  macro, it was possible to try to lock twice the same element, making the
>  second lock attempt to fail in loop.
> Olivier Houchard (1):
>BUG/MEDIUM: lists: Avoid an infinite loop in MT_LIST_TRY_ADDQ().

not very clear in which conditions it can be triggered. Do you have
more details about it?

Thanks,

-- 
William



[PATCH] BUG/MEDIUM: contrib/prometheus-exporter: fix segfault in listener name dump

2021-02-24 Thread William Dauchy
We need to check whether listener is empty before doing anything; in
that case, we were trying to dump listerner name while name is null. So
simply move the counters check above, which validate all possible cases
when the listener is empty. This is very similar to what is done in
stats.c

see also the trace:

  Thread 1 "haproxy" received signal SIGSEGV, Segmentation fault.
  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:120
  120     ../sysdeps/x86_64/multiarch/../strlen.S: No such file or directory.
  (gdb) bt
  #0  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:120
  #1  0x555b716b in promex_dump_listener_metrics (htx=0x558fadf0, 
appctx=0x55926070) at contrib/prometheus-exporter/service-prometheus.c:722
  #2  promex_dump_metrics (htx=0x558fadf0, si=0x55925920, 
appctx=0x55926070) at contrib/prometheus-exporter/service-prometheus.c:1200
  #3  promex_appctx_handle_io (appctx=0x55926070) at 
contrib/prometheus-exporter/service-prometheus.c:1477
  #4  0x556f0c94 in task_run_applet (t=0x55926180, 
context=0x55926070, state=) at src/applet.c:88
  #5  0x556bc6d8 in run_tasks_from_lists 
(budgets=budgets@entry=0x7fffe374) at src/task.c:548
  #6  0x556bd1a0 in process_runnable_tasks () at src/task.c:750
  #7  0x55696cdd in run_poll_loop () at src/haproxy.c:2870
  #8  0x55697025 in run_thread_poll_loop (data=data@entry=0x0) at 
src/haproxy.c:3035
  #9  0x55596c90 in main (argc=, argv=0x7fffe818) at 
src/haproxy.c:3723
  quit)

this bug was introduced by commit
e3f7bd5ae9e969cbfe87e4130d06bff7a3e814c6 ("MEDIUM:
contrib/prometheus-exporter: add listen stats"), which is present for
2.4 only, so no backport needed.

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 7cf30c1f3..e6023d353 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -718,12 +718,12 @@ static int promex_dump_listener_metrics(struct appctx 
*appctx, struct htx *htx)
li = appctx->ctx.stats.obj2;
list_for_each_entry_from(li, >conf.listeners, 
by_fe) {
 
-   labels[1].name  = ist("listener");
-   labels[1].value = ist2(li->name, 
strlen(li->name));
-
if (!li->counters)
continue;
 
+   labels[1].name  = ist("listener");
+   labels[1].value = ist2(li->name, 
strlen(li->name));
+
if (!stats_fill_li_stats(px, li, 0, stats,
 ST_F_TOTAL_FIELDS, 
&(appctx->st2)))
return -1;
-- 
2.30.0




[PATCH 2/3] REGTESTS: contrib/prometheus-exporter: test NaN values

2021-02-18 Thread William Dauchy
In order to make sure we detect when we change default behaviour for
some metrics, test the NaN value when it is expected.

Those metrics were listed since our last rework as their default value
changed, unless the appropriate config is set.

Signed-off-by: William Dauchy 
---
 reg-tests/contrib/prometheus.vtc | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/reg-tests/contrib/prometheus.vtc b/reg-tests/contrib/prometheus.vtc
index cdd0f0f55..1ebeb29cb 100644
--- a/reg-tests/contrib/prometheus.vtc
+++ b/reg-tests/contrib/prometheus.vtc
@@ -10,6 +10,11 @@ server s1 {
txresp
 } -repeat 2 -start
 
+server s2 {
+   rxreq
+   txresp
+} -repeat 2 -start
+
 haproxy h1 -conf {
 defaults
mode http
@@ -29,11 +34,13 @@ haproxy h1 -conf {
 backend be
stick-table type ip size 1m expire 10s store http_req_rate(10s)
server s1 ${s1_addr}:${s1_port}
+   server s2 ${s2_addr}:${s2_port} check maxqueue 10 maxconn 12 
pool-max-conn 42
 } -start
 
 client c1 -connect ${h1_stats_sock} {
txreq -url "/metrics"
rxresp
+   # test general metrics
expect resp.status == 200
expect resp.body ~ ".*haproxy_process.*"
expect resp.body ~ ".*haproxy_frontend.*"
@@ -42,6 +49,29 @@ client c1 -connect ${h1_stats_sock} {
expect resp.body ~ ".*haproxy_server.*"
expect resp.body ~ ".*haproxy_sticktable.*"
 
+   # test expected NaN values
+   expect resp.body ~ 
".*haproxy_server_check_failures_total{proxy=\"be\",server=\"s1\"} NaN.*"
+   expect resp.body ~ 
".*haproxy_server_check_up_down_total{proxy=\"be\",server=\"s1\"} NaN.*"
+   expect resp.body ~ 
".*haproxy_server_check_failures_total{proxy=\"be\",server=\"s2\"} 0.*"
+   expect resp.body ~ 
".*haproxy_server_check_up_down_total{proxy=\"be\",server=\"s2\"} 0.*"
+
+   expect resp.body ~ 
".*haproxy_server_queue_limit{proxy=\"be\",server=\"s1\"} NaN.*"
+   expect resp.body ~ 
".*haproxy_server_queue_limit{proxy=\"be\",server=\"s2\"} 10.*"
+
+   expect resp.body ~ 
".*haproxy_server_limit_sessions{proxy=\"be\",server=\"s1\"} NaN.*"
+   expect resp.body ~ 
".*haproxy_server_limit_sessions{proxy=\"be\",server=\"s2\"} 12.*"
+
+   expect resp.body ~ 
".*haproxy_backend_downtime_seconds_total{proxy=\"stats\"} NaN.*"
+   expect resp.body ~ 
".*haproxy_backend_downtime_seconds_total{proxy=\"be\"} 0.*"
+   expect resp.body ~ 
".*haproxy_server_downtime_seconds_total{proxy=\"be\",server=\"s1\"} NaN.*"
+   expect resp.body ~ 
".*haproxy_server_downtime_seconds_total{proxy=\"be\",server=\"s2\"} 0.*"
+
+   expect resp.body ~ 
".*haproxy_server_current_throttle{proxy=\"be\",server=\"s1\"} NaN.*"
+
+   expect resp.body ~ 
".*haproxy_server_idle_connections_limit{proxy=\"be\",server=\"s1\"} NaN.*"
+   expect resp.body ~ 
".*haproxy_server_idle_connections_limit{proxy=\"be\",server=\"s2\"} 42.*"
+
+   # test scope
txreq -url "/metrics?scope="
rxresp
expect resp.status == 200
-- 
2.30.0




[PATCH 1/3] DOC: contrib/prometheus-exporter: remove htx reference

2021-02-18 Thread William Dauchy
now that htx is the default everywhere, we can remove the need to put
htx as a mandatory option to setup prometheus.

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/README | 1 -
 1 file changed, 1 deletion(-)

diff --git a/contrib/prometheus-exporter/README 
b/contrib/prometheus-exporter/README
index 30154de95..fdbc50203 100644
--- a/contrib/prometheus-exporter/README
+++ b/contrib/prometheus-exporter/README
@@ -27,7 +27,6 @@ and the corresponding HTTP proxy must enable the HTX support. 
For instance:
 frontend test
 mode http
 ...
-option http-use-htx
 http-request use-service prometheus-exporter if { path /metrics }
 ...
 
-- 
2.30.0




[PATCH 3/3] REGTESTS: contrib/prometheus-exporter: test well known labels

2021-02-18 Thread William Dauchy
as we previously briefly broke labels handling, test them to make sure
we don't introduce regressions in the future.
see also commit 040b1195f70d6a24204ede081451fd1dd71e6a34 ("BUG/MINOR:
contrib/prometheus-exporter: Restart labels dump at the right pos") for
reference

Signed-off-by: William Dauchy 
---
 reg-tests/contrib/prometheus.vtc | 9 +
 1 file changed, 9 insertions(+)

diff --git a/reg-tests/contrib/prometheus.vtc b/reg-tests/contrib/prometheus.vtc
index 1ebeb29cb..ebe0b8753 100644
--- a/reg-tests/contrib/prometheus.vtc
+++ b/reg-tests/contrib/prometheus.vtc
@@ -71,6 +71,15 @@ client c1 -connect ${h1_stats_sock} {
expect resp.body ~ 
".*haproxy_server_idle_connections_limit{proxy=\"be\",server=\"s1\"} NaN.*"
expect resp.body ~ 
".*haproxy_server_idle_connections_limit{proxy=\"be\",server=\"s2\"} 42.*"
 
+   # test well known labels presence
+   expect resp.body ~ ".*haproxy_process_build_info{version=\".*\"} 1.*"
+   expect resp.body ~ 
".*haproxy_frontend_http_responses_total{proxy=\"stats\",code=\"4xx\"} 0.*"
+   expect resp.body ~ 
".*haproxy_frontend_status{proxy=\"fe\",state=\"UP\"} 1.*"
+   expect resp.body ~ 
".*haproxy_listener_status{proxy=\"stats\",listener=\"sock-1\",state=\"WAITING\"}
 0.*"
+   expect resp.body ~ ".*haproxy_backend_status{proxy=\"be\",state=\"UP\"} 
1.*"
+   expect resp.body ~ 
".*haproxy_server_status{proxy=\"be\",server=\"s1\",state=\"DOWN\"} 0.*"
+   expect resp.body ~ 
".*haproxy_server_check_status{proxy=\"be\",server=\"s2\",state=\"HANA\"} 0.*"
+
# test scope
txreq -url "/metrics?scope="
rxresp
-- 
2.30.0




[PATCH] MINOR: cli: add missing agent commands for set server

2021-02-15 Thread William Dauchy
we previously forgot to add `agent-*` commands.
Take this opportunity to rewrite the help string in a simpler way for
readability (mainly removing simple quotes)

Signed-off-by: William Dauchy 
---
 src/server.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/server.c b/src/server.c
index da6ee52ad..1c1eeab73 100644
--- a/src/server.c
+++ b/src/server.c
@@ -4618,9 +4618,10 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
 #endif
} else {
cli_err(appctx,
-   "'set server ' only supports 'agent', 'health', "
-   "'state', 'weight', 'addr', 'fqdn', 'check-addr', "
-   "'check-port' and 'ssl'.\n");
+   "usage: set server / "
+   "addr | agent | agent-addr | agent-port | agent-send | "
+   "check-addr | check-port | fqdn | health | ssl | "
+   "state | weight\n");
}
  out_unlock:
HA_SPIN_UNLOCK(SERVER_LOCK, >lock);
-- 
2.30.0




[PATCH 2/3] MINOR: stats: add helper to get status string

2021-02-14 Thread William Dauchy
move listen status to a helper, defining both status enum and string
definition.
this will be helpful to be reused in prometheus code. It also removes
this hard-to-read nested ternary.

Signed-off-by: William Dauchy 
---
 include/haproxy/listener-t.h |  9 +
 include/haproxy/listener.h   |  3 +++
 src/listener.c   | 18 ++
 src/stats.c  |  2 +-
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/include/haproxy/listener-t.h b/include/haproxy/listener-t.h
index ce5ed408f..7d3998ece 100644
--- a/include/haproxy/listener-t.h
+++ b/include/haproxy/listener-t.h
@@ -84,6 +84,15 @@ enum li_state {
  * not rely on this state.
  */
 
+/* listener status for stats */
+enum li_status {
+   LI_STATUS_WAITING = 0,
+   LI_STATUS_OPEN,
+   LI_STATUS_FULL,
+
+   LI_STATE_COUNT /* must be last */
+};
+
 /* listener socket options */
 #define LI_O_NONE   0x
 #define LI_O_NOLINGER   0x0001  /* disable linger on this socket */
diff --git a/include/haproxy/listener.h b/include/haproxy/listener.h
index 1be8551c3..f229efae4 100644
--- a/include/haproxy/listener.h
+++ b/include/haproxy/listener.h
@@ -218,6 +218,9 @@ struct task *manage_global_listener_queue(struct task *t, 
void *context, unsigne
 
 extern struct accept_queue_ring accept_queue_rings[MAX_THREADS] 
__attribute__((aligned(64)));
 
+extern const char* li_status_st[LI_STATE_COUNT];
+enum li_status get_li_status(struct listener *l);
+
 #endif /* _HAPROXY_LISTENER_H */
 
 /*
diff --git a/src/listener.c b/src/listener.c
index 0b929b906..9ca910b37 100644
--- a/src/listener.c
+++ b/src/listener.c
@@ -46,6 +46,12 @@ static struct bind_kw_list bind_keywords = {
 static struct mt_list global_listener_queue = 
MT_LIST_HEAD_INIT(global_listener_queue);
 static struct task *global_listener_queue_task;
 
+/* listener status for stats */
+const char* li_status_st[LI_STATE_COUNT] = {
+   [LI_STATUS_WAITING] = "WAITING",
+   [LI_STATUS_OPEN]= "OPEN",
+   [LI_STATUS_FULL]= "FULL",
+};
 
 #if defined(USE_THREAD)
 
@@ -183,6 +189,18 @@ REGISTER_CONFIG_POSTPARSER("multi-threaded accept queue", 
accept_queue_init);
 
 #endif // USE_THREAD
 
+/* helper to get listener status for stats */
+enum li_status get_li_status(struct listener *l)
+{
+   if (!l->maxconn || l->nbconn < l->maxconn) {
+   if (l->state == LI_LIMITED)
+   return LI_STATUS_WAITING;
+   else
+   return LI_STATUS_OPEN;
+   }
+   return LI_STATUS_FULL;
+}
+
 /* adjust the listener's state and its proxy's listener counters if needed.
  * It must be called under the listener's lock, but uses atomic ops to change
  * the proxy's counters so that the proxy lock is not needed.
diff --git a/src/stats.c b/src/stats.c
index 1e0db0b24..a63178d5a 100644
--- a/src/stats.c
+++ b/src/stats.c
@@ -1898,7 +1898,7 @@ int stats_fill_li_stats(struct proxy *px, struct listener 
*l, int flags,
metric = mkf_u64(FN_COUNTER, 
l->counters->denied_sess);
break;
case ST_F_STATUS:
-   metric = mkf_str(FO_STATUS, (!l->maxconn || 
l->nbconn < l->maxconn) ? (l->state == LI_LIMITED) ? "WAITING" : "OPEN" : 
"FULL");
+   metric = mkf_str(FO_STATUS, 
li_status_st[get_li_status(l)]);
break;
case ST_F_PID:
metric = mkf_u32(FO_KEY, relative_pid);
-- 
2.30.0




[PATCH 1/3] MEDIUM: stats: allow to select one field in `stats_fill_li_stats`

2021-02-14 Thread William Dauchy
prometheus approach requires to output all values for a given metric
name; meaning we iterate through all metrics, and then iterate in the
inner loop on all objects for this metric.
In order to allow more code reuse, adapt the stats API to be able to
select one field or fill them all otherwise.
>From this patch it should be possible to add support for listen stats in
prometheus.

Signed-off-by: William Dauchy 
---
 include/haproxy/stats.h |   2 +-
 src/hlua_fcn.c  |   3 +-
 src/stats.c | 166 +++-
 3 files changed, 116 insertions(+), 55 deletions(-)

diff --git a/include/haproxy/stats.h b/include/haproxy/stats.h
index 6115dca8f..cbc33f279 100644
--- a/include/haproxy/stats.h
+++ b/include/haproxy/stats.h
@@ -49,7 +49,7 @@ int stats_fill_info(struct field *info, int len);
 int stats_fill_fe_stats(struct proxy *px, struct field *stats, int len,
enum stat_field *selected_field);
 int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags,
-struct field *stats, int len);
+struct field *stats, int len, enum stat_field 
*selected_field);
 int stats_fill_sv_stats(struct proxy *px, struct server *sv, int flags,
 struct field *stats, int len, enum stat_field 
*selected_field);
 int stats_fill_be_stats(struct proxy *px, int flags, struct field *stats, int 
len,
diff --git a/src/hlua_fcn.c b/src/hlua_fcn.c
index 6dd9efb21..38a9cfd21 100644
--- a/src/hlua_fcn.c
+++ b/src/hlua_fcn.c
@@ -866,7 +866,8 @@ int hlua_listener_get_stats(lua_State *L)
return 1;
}
 
-   stats_fill_li_stats(li->bind_conf->frontend, li, STAT_SHLGNDS, stats, 
STATS_LEN);
+   stats_fill_li_stats(li->bind_conf->frontend, li, STAT_SHLGNDS, stats,
+   STATS_LEN, NULL);
 
lua_newtable(L);
for (i=0; i with the listener statistics.  is
- * preallocated array of length . The length of the array
- * must be at least ST_F_TOTAL_FIELDS. If this length is less
- * then this value, the function returns 0, otherwise, it
- * returns 1.  can take the value STAT_SHLGNDS.
+/* Fill  with the listener statistics.  is preallocated array of
+ * length . The length of the array must be at least ST_F_TOTAL_FIELDS. If
+ * this length is less then this value, the function returns 0, otherwise, it
+ * returns 1.  If selected_field is != NULL, only fill this one.  can
+ * take the value STAT_SHLGNDS.
  */
 int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags,
-struct field *stats, int len)
+struct field *stats, int len, enum stat_field 
*selected_field)
 {
+   enum stat_field current_field = (selected_field != NULL ? 
*selected_field : 0);
struct buffer *out = get_trash_chunk();
 
if (len < ST_F_TOTAL_FIELDS)
@@ -1850,54 +1851,112 @@ int stats_fill_li_stats(struct proxy *px, struct 
listener *l, int flags,
 
chunk_reset(out);
 
-   stats[ST_F_PXNAME]   = mkf_str(FO_KEY|FN_NAME|FS_SERVICE, px->id);
-   stats[ST_F_SVNAME]   = mkf_str(FO_KEY|FN_NAME|FS_SERVICE, l->name);
-   stats[ST_F_MODE] = mkf_str(FO_CONFIG|FS_SERVICE, 
proxy_mode_str(px->mode));
-   stats[ST_F_SCUR] = mkf_u32(0, l->nbconn);
-   stats[ST_F_SMAX] = mkf_u32(FN_MAX, l->counters->conn_max);
-   stats[ST_F_SLIM] = mkf_u32(FO_CONFIG|FN_LIMIT, l->maxconn);
-   stats[ST_F_STOT] = mkf_u64(FN_COUNTER, l->counters->cum_conn);
-   stats[ST_F_BIN]  = mkf_u64(FN_COUNTER, l->counters->bytes_in);
-   stats[ST_F_BOUT] = mkf_u64(FN_COUNTER, l->counters->bytes_out);
-   stats[ST_F_DREQ] = mkf_u64(FN_COUNTER, l->counters->denied_req);
-   stats[ST_F_DRESP]= mkf_u64(FN_COUNTER, l->counters->denied_resp);
-   stats[ST_F_EREQ] = mkf_u64(FN_COUNTER, l->counters->failed_req);
-   stats[ST_F_DCON] = mkf_u64(FN_COUNTER, l->counters->denied_conn);
-   stats[ST_F_DSES] = mkf_u64(FN_COUNTER, l->counters->denied_sess);
-   stats[ST_F_STATUS]   = mkf_str(FO_STATUS, (!l->maxconn || l->nbconn < 
l->maxconn) ? (l->state == LI_LIMITED) ? "WAITING" : "OPEN" : "FULL");
-   stats[ST_F_PID]  = mkf_u32(FO_KEY, relative_pid);
-   stats[ST_F_IID]  = mkf_u32(FO_KEY|FS_SERVICE, px->uuid);
-   stats[ST_F_SID]  = mkf_u32(FO_KEY|FS_SERVICE, l->luid);
-   stats[ST_F_TYPE] = mkf_u32(FO_CONFIG|FS_SERVICE, STATS_TYPE_SO);
-   stats[ST_F_WREW] = mkf_u64(FN_COUNTER, 
l->counters->failed_rewrites);
-   stats[ST_F_EINT] = mkf_u64(FN_COUNTER, 
l->counters->internal_errors);
-
-   if (flags & STAT_SHLGNDS) {
-   char str[INET6_ADDRSTRLEN];
-   int port;
-
-   port = get_host_port(>rx.addr);

[PATCH 0/3] prometheus: add listen stats

2021-02-14 Thread William Dauchy
Hello Christopher,

I know I'm a bit late regarding the merge window; this is however the
logical followup of my prometheus work, now adding listen stats.
patch 2/3 is a proposition but I'm ok to revisit.

William Dauchy (3):
  MEDIUM: stats: allow to select one field in `stats_fill_li_stats`
  MINOR: stats: add helper to get status string
  MEDIUM: contrib/prometheus-exporter: add listen stats

 contrib/prometheus-exporter/README|  24 +-
 .../prometheus-exporter/service-prometheus.c  | 274 --
 include/haproxy/listener-t.h  |   9 +
 include/haproxy/listener.h|   3 +
 include/haproxy/stats.h   |   2 +-
 reg-tests/contrib/prometheus.vtc  |   4 +
 src/hlua_fcn.c|   3 +-
 src/listener.c|  18 ++
 src/stats.c   | 166 +++
 9 files changed, 365 insertions(+), 138 deletions(-)

-- 
2.30.0




[PATCH 3/3] MEDIUM: contrib/prometheus-exporter: add listen stats

2021-02-14 Thread William Dauchy
this was a missing piece for a while now even though it was planned. This
patch adds listen stats.
Nothing in particular but we make use of the status helper previously
added.  `promex_st_metrics` diff also looks scary, but I had to realign
all lines.

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/README|  24 +-
 .../prometheus-exporter/service-prometheus.c  | 274 --
 reg-tests/contrib/prometheus.vtc  |   4 +
 3 files changed, 219 insertions(+), 83 deletions(-)

diff --git a/contrib/prometheus-exporter/README 
b/contrib/prometheus-exporter/README
index d882b092f..f08cceba6 100644
--- a/contrib/prometheus-exporter/README
+++ b/contrib/prometheus-exporter/README
@@ -70,6 +70,7 @@ exported. Here are examples:
 
   /metrics?scope=server # ==> server metrics will be exported
   /metrics?scope=frontend=backend # ==> Frontend and backend metrics 
will be exported
+  /metrics?scope=listen # ==> listen metrics will be exported
   /metrics?scope=*=   # ==> no metrics will be exported
   /metrics?scope==global  # ==> global metrics will be exported
   /metrics?scope=sticktable # ==> stick tables metrics will be 
exported
@@ -102,7 +103,7 @@ except the server_check_status, you may configure 
prometheus that way:
 - 
metric_relabel_configs:
- source_labels: ['__name__']
-  regex: 'haproxy_(process_|frontend_|backend_|server_check_status).*'
+  regex: 
'haproxy_(process_|frontend_|listen_|backend_|server_check_status).*'
   action: keep
 
 Exported metrics
@@ -212,6 +213,27 @@ See prometheus export for the description of each field.
 | haproxy_frontend_internal_errors_total  |
 +-+
 
+* Listen metrics
+
++-+
+|Metric name  |
++-+
+| haproxy_listen_current_sessions |
+| haproxy_listen_max_sessions |
+| haproxy_listen_limit_sessions   |
+| haproxy_listen_sessions_total   |
+| haproxy_listen_bytes_in_total   |
+| haproxy_listen_bytes_out_total  |
+| haproxy_listen_requests_denied_total|
+| haproxy_listen_responses_denied_total   |
+| haproxy_listen_request_errors_total |
+| haproxy_listen_status   |
+| haproxy_listen_denied_connections_total |
+| haproxy_listen_denied_sessions_total|
+| haproxy_listen_failed_header_rewriting_total|
+| haproxy_listen_internal_errors_total|
++-+
+
 * Backend metrics
 
 +-+
diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 403f5a619..70ea1b7b4 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -63,17 +63,19 @@ enum {
 #define PROMEX_FL_FRONT_METRIC  0x0004
 #define PROMEX_FL_BACK_METRIC   0x0008
 #define PROMEX_FL_SRV_METRIC0x0010
-#define PROMEX_FL_SCOPE_GLOBAL  0x0020
-#define PROMEX_FL_SCOPE_FRONT   0x0040
-#define PROMEX_FL_SCOPE_BACK0x0080
-#define PROMEX_FL_SCOPE_SERVER  0x0100
-#define PROMEX_FL_NO_MAINT_SRV  0x0200
-#define PROMEX_FL_STICKTABLE_METRIC 0x0400
-#define PROMEX_FL_SCOPE_STICKTABLE  0x0800
+#define PROMEX_FL_LI_METRIC 0x0020
+#define PROMEX_FL_STICKTABLE_METRIC 0x0040
+#define PROMEX_FL_SCOPE_GLOBAL  0x0080
+#define PROMEX_FL_SCOPE_FRONT   0x0100
+#define PROMEX_FL_SCOPE_BACK0x0200
+#define PROMEX_FL_SCOPE_SERVER  0x0400
+#define PROMEX_FL_SCOPE_LI  0x0800
+#define PROMEX_FL_SCOPE_STICKTABLE  0x1000
+#define PROMEX_FL_NO_MAINT_SRV  0x2000
 
 #define PROMEX_FL_SCOPE_ALL (PROMEX_FL_SCOPE_GLOBAL | PROMEX_FL_SCOPE_FRONT | \
-PROMEX_FL_SCOPE_BACK | PROMEX_FL_SCOPE_SERVER | \
-PROMEX_FL_SCOPE_STICKTABLE)
+PROMEX_FL_SCOPE_LI | PROMEX_FL_SCOPE_BACK | \
+PROMEX_FL_SCOPE_SERVER | 
PROMEX_FL_SCOPE_STICKTABLE)
 
 /* Promtheus metric type (gauge or counter) */
 enum promex_mt_type {
@@ -187,66 +189,66 @@ const struct promex_metric 
promex_global_metrics[INF_TOTAL_FIELDS] = {
 const struct promex_metric promex_st_metrics[ST_F_TOTAL_FIELDS] = {
//[ST_F_PXNAME] ignored
//[ST_F_SVNAME] ignored
-   [ST_F_QCUR]   = { .n = IST("current_queue"),
.type = PROMEX_MT_GAUGE,.flags = ( 
PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_QMAX]  

[PATCH 2/2] CLEANUP: contrib/prometheus-exporter: align for with srv status case

2021-02-14 Thread William Dauchy
the for loop was wrongly indented following our recent rework

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 403f5a619..1b08fcfb4 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -861,7 +861,7 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
struct htx *htx)
switch (appctx->st2) {
case ST_F_STATUS:
state = promex_srv_status(sv);
-   for (; appctx->ctx.stats.st_code < 
PROMEX_SRV_STATE_COUNT; appctx->ctx.stats.st_code++) {
+   for (; 
appctx->ctx.stats.st_code < PROMEX_SRV_STATE_COUNT; 
appctx->ctx.stats.st_code++) {
val = 
mkf_u32(FO_STATUS, state == appctx->ctx.stats.st_code);
labels[2].name = 
ist("state");
labels[2].value = 
promex_srv_st[appctx->ctx.stats.st_code];
-- 
2.30.0




[PATCH 1/2] CLEANUP: check: fix get_check_status_info declaration

2021-02-14 Thread William Dauchy
we always put a \n between function name and `{`

Signed-off-by: William Dauchy 
---
 src/check.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/check.c b/src/check.c
index d37a55201..d17f747fe 100644
--- a/src/check.c
+++ b/src/check.c
@@ -178,8 +178,8 @@ const char *get_check_status_description(short 
check_status) {
 }
 
 /* Converts check_status code to short info */
-const char *get_check_status_info(short check_status) {
-
+const char *get_check_status_info(short check_status)
+{
const char *info;
 
if (check_status < HCHK_STATUS_SIZE)
-- 
2.30.0




[PATCH] DOC: tune: explain the origin of block size for ssl.cachesize

2021-02-12 Thread William Dauchy
A user could eventually ask himself where those 200 bytes block size are
coming from. This patch tries to better explain the origin in case
people are curious or want to double check the reality.

Signed-off-by: William Dauchy 
---
 doc/configuration.txt | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 391e074a7..b21c56091 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -2451,16 +2451,17 @@ tune.sndbuf.server 
 
 tune.ssl.cachesize 
   Sets the size of the global SSL session cache, in a number of blocks. A block
-  is large enough to contain an encoded session without peer certificate.
-  An encoded session with peer certificate is stored in multiple blocks
-  depending on the size of the peer certificate. A block uses approximately
-  200 bytes of memory. The default value may be forced at build time, otherwise
-  defaults to 2. When the cache is full, the most idle entries are purged
-  and reassigned. Higher values reduce the occurrence of such a purge, hence
-  the number of CPU-intensive SSL handshakes by ensuring that all users keep
-  their session as long as possible. All entries are pre-allocated upon startup
-  and are shared between all processes if "nbproc" is greater than 1. Setting
-  this value to 0 disables the SSL session cache.
+  is large enough to contain an encoded session without peer certificate.  An
+  encoded session with peer certificate is stored in multiple blocks depending
+  on the size of the peer certificate. A block uses approximately 200 bytes of
+  memory (based on `sizeof(struct sh_ssl_sess_hdr) + SHSESS_BLOCK_MIN_SIZE`
+  calculation used for `shctx_init` function). The default value may be forced
+  at build time, otherwise defaults to 2. When the cache is full, the most
+  idle entries are purged and reassigned. Higher values reduce the occurrence
+  of such a purge, hence the number of CPU-intensive SSL handshakes by ensuring
+  that all users keep their session as long as possible. All entries are
+  pre-allocated upon startup and are shared between all processes if "nbproc"
+  is greater than 1. Setting this value to 0 disables the SSL session cache.
 
 tune.ssl.force-private-cache
   This option disables SSL session cache sharing between all processes. It
-- 
2.30.0




Re: [PATCH v3 0/5] cli commands for checks and agent

2021-02-12 Thread William Dauchy
On Fri, Feb 12, 2021 at 3:06 PM Christopher Faulet  wrote:
> I just slightly amended the 3rd patch to handle the v2 in 
> apply_server_state().
> There is a test on the version when a state-file is local to a proxy. Just a
> minor change.

ok, thanks for that.

> And in the last one, I removed the "chunk_appendf(msg, "\n");" to
> move the LF in ha_warning() calls.

yes ok it is probably clearer that way.

Glad to see the end of this series, now I'm ready to receive related
bug reports ;)

-- 
William



Re: stats / "show servers conn" looses counter after reload

2021-02-12 Thread William Dauchy
Hi Christian,

On Fri, Feb 12, 2021 at 11:59 AM Christian Ruppert  wrote:
> Is this a bug? Can you confirm this behavior? Is there any other way I
> could figure out whether a backend is currently in use?

unfortunately reload does not recover stats values; it is a known
problem; see also https://github.com/haproxy/haproxy/issues/954

-- 
William



[PATCH v3 4/5] MEDIUM: server: support {check,agent}_addr, agent_port in server state

2021-02-11 Thread William Dauchy
logical followup from cli commands addition, so that the state server
file stays compatible with the changes made at runtime; use previously
added helper to load server attributes.

also alloc a specific chunk to avoid mixing with other called functions
using it

Signed-off-by: William Dauchy 
---
 doc/management.txt|  5 +-
 include/haproxy/server-t.h|  9 ++-
 .../checks/1be_40srv_odd_health_checks.vtc|  2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|  2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |  6 +-
 src/proxy.c   | 41 ++-
 src/server.c  | 68 +--
 7 files changed, 87 insertions(+), 46 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index 423c614b2..60e25c7e1 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -2455,7 +2455,10 @@ show servers state []
  srv_port:Server port.
  srvrecord:   DNS SRV record associated to this SRV.
  srv_use_ssl: use ssl for server connections.
- srv_check_port:  Server check port.
+ srv_check_port:  Server health check port.
+ srv_check_addr:  Server health check address.
+ srv_agent_addr:  Server health agent address.
+ srv_agent_port:  Server health agent port.
 
 show sess
   Dump all known sessions. Avoid doing this on slow connections as this can
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index 2ec70dde4..11cb71489 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -126,11 +126,14 @@ enum srv_initaddr {
 "srv_port "   \
 "srvrecord "  \
 "srv_use_ssl "\
-"srv_check_port"
+"srv_check_port " \
+"srv_check_addr " \
+"srv_agent_addr " \
+"srv_agent_port"
 
-#define SRV_STATE_FILE_MAX_FIELDS 22
+#define SRV_STATE_FILE_MAX_FIELDS 25
 #define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 22
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 25
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc 
b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
index f01205295..c279972aa 100644
--- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc
+++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
@@ -112,6 +112,6 @@ syslog S -wait
 
 haproxy h1 -cli {
 send "show servers state"
-expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state 
srv_admin_state srv_uweight srv_iweight srv_time_since_last_change 
srv_check_status srv_check_result srv_check_health srv_check_state 
srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord 
srv_use_ssl srv_check_port\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 
0 0 0 0 - ${s0_port} - 0 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s1_port} - 0 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s2_port} - 0 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0 0\n2 be1 5 srv4 
${s4_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0 0\n2 be1 6 srv5 
${s5_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0 0\n2 
be1 7 srv6 ${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0 0\n2 
be1 8 srv7 ${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - 
${s7_port} - 0 0\n2 be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s8_port} - 0 0\n2 be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ 
){3}0 0 0 - ${s9_port} - 0 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 
0 1 0 0 0 0 - ${s10_port} - 0 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0 0\n2 be1 13 srv12 
${s12_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0 0\n2 be1 14 
srv13 ${s13_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} 
- 0 0\n2 be1 15 srv14 ${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s14_port} - 0 0\n2 be1 16 srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s15_port} - 0 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s16_port} - 0 0\n2 be1 18 srv17 ${s17_addr} 2 0 
1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0 0\n2 be1 19 srv18 
${s18_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0 0\n2 be1 20 
srv19 ${s19_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} 
- 0 0\n2 be1 21 srv20 ${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s20_port} - 0 0\n2 be1 22 srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s21_port} - 0 0\n2 b

[PATCH v3 1/5] MEDIUM: cli: add check-addr command

2021-02-11 Thread William Dauchy
this patch allows to set server health check address at runtime. In
order to align with `addr` command, also allow to set port optionnaly.
This led to a small refactor in order to use the same function for both
`check-addr` and `check-port` commands.
for `check-port`, we however don't permit the change anymore if checks
are not enabled on the server.

This command becomes more and more useful for people having a consul
like architecture:
- the backend server is located on a container with its own IP
- the health checks are done the consul instance located on the host
  with the host IP

Signed-off-by: William Dauchy 
---
 doc/management.txt |  4 ++
 src/server.c   | 95 ++
 2 files changed, 84 insertions(+), 15 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index b74aba769..bff770e4e 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -1842,6 +1842,10 @@ set server / health [ up | stopping | 
down ]
   switch a server's state regardless of some slow health checks for example.
   Note that the change is propagated to tracking servers if any.
 
+set server / check-addr  [port ]
+  Change the IP address used for server health checks.
+  Optionally, change the port used for server health checks.
+
 set server / check-port 
   Change the port used for health checking to 
 
diff --git a/src/server.c b/src/server.c
index 453c59866..74bd972e2 100644
--- a/src/server.c
+++ b/src/server.c
@@ -54,6 +54,8 @@ static int srv_set_fqdn(struct server *srv, const char *fqdn, 
int dns_locked);
 static void srv_state_parse_line(char *buf, const int version, char **params, 
char **srv_params);
 static int srv_state_get_version(FILE *f);
 static void srv_cleanup_connections(struct server *srv);
+static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
+const char *port);
 
 /* List head of all known server keywords */
 static struct srv_kw_list srv_keywords = {
@@ -3574,6 +3576,59 @@ int update_server_addr(struct server *s, void *ip, int 
ip_sin_family, const char
return 0;
 }
 
+/* update server health check address and port
+ * addr must be ip4 or ip6, it won't be resolved
+ * if one error occurs, don't apply anything
+ * must be called with the server lock held.
+ */
+static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
+const char *port)
+{
+   struct sockaddr_storage sk;
+   struct buffer *msg;
+   int new_port;
+
+   msg = get_trash_chunk();
+   chunk_reset(msg);
+
+   if (!(s->check.state & CHK_ST_ENABLED)) {
+   chunk_strcat(msg, "health checks are not enabled on this 
server");
+   goto out;
+   }
+   if (addr) {
+   memset(, 0, sizeof(struct sockaddr_storage));
+   if (str2ip2(addr, , 0) == NULL) {
+   chunk_appendf(msg, "invalid addr '%s'", addr);
+   goto out;
+   }
+   }
+   if (port) {
+   if (strl2irc(port, strlen(port), _port) != 0) {
+   chunk_appendf(msg, "provided port is not an integer");
+   goto out;
+   }
+   if (new_port < 0 || new_port > 65535) {
+   chunk_appendf(msg, "provided port is invalid");
+   goto out;
+   }
+   /* prevent the update of port to 0 if MAPPORTS are in use */
+   if ((s->flags & SRV_F_MAPPORTS) && new_port == 0) {
+   chunk_appendf(msg, "can't unset 'port' since MAPPORTS 
is in use");
+   goto out;
+   }
+   }
+out:
+   if (msg->data)
+   return msg->area;
+   else {
+   if (addr)
+   s->check.addr = sk;
+   if (port)
+   s->check.port = new_port;
+   }
+   return NULL;
+}
+
 /*
  * This function update a server's addr and port only for AF_INET and AF_INET6 
families.
  *
@@ -4406,23 +4461,32 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
cli_err(appctx, "cannot allocate memory for new 
string.\n");
}
}
-   else if (strcmp(args[3], "check-port") == 0) {
-   int i = 0;
-   if (strl2irc(args[4], strlen(args[4]), ) != 0) {
-   cli_err(appctx, "'set server  check-port' expects 
an integer as argument.\n");
-   goto out_unlock;
-   }
-   if ((i < 0) || (i > 65535)) {
-   cli_err(appctx, "provided port is not valid.\n");
+   else if (strcmp(args[

[PATCH v3 5/5] MINOR: server: enhance error precision when applying server state

2021-02-11 Thread William Dauchy
server health checks and agent parameters are written the same way as
others to be able to enahcne code reuse: basically we make use of
parsing and assignment at the same place. It makes it difficult for
error handling to know whether srv object was modified partially or not.
The problem was already present with SRV resolution though.

I was a bit puzzled about the approach to take to be honest, and I did
not wanted to go into a full refactor, so I assumed it was ok to simply
notify whether the line was failed or partially applied.

Signed-off-by: William Dauchy 
---
 src/server.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/server.c b/src/server.c
index 739bcfba6..b0a8f90d4 100644
--- a/src/server.c
+++ b/src/server.c
@@ -2625,6 +2625,7 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
unsigned int port_svc;
char *srvrecord;
char *addr;
+   int partial_apply = 0;
 #ifdef USE_OPENSSL
int use_ssl;
 #endif
@@ -2795,6 +2796,7 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
/* don't apply anything if one error has been detected */
if (msg->data)
goto out;
+   partial_apply = 1;
 
/* recover operational state and apply it to this server
 * and all servers tracking this one */
@@ -3032,8 +3034,12 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
HA_SPIN_UNLOCK(SERVER_LOCK, >lock);
if (msg->data) {
chunk_appendf(msg, "\n");
-   ha_warning("server-state application failed for server 
'%s/%s'%s",
-  srv->proxy->id, srv->id, msg->area);
+   if (partial_apply == 1)
+   ha_warning("server-state partially applied for server 
'%s/%s'%s",
+  srv->proxy->id, srv->id, msg->area);
+   else
+   ha_warning("server-state application failed for server 
'%s/%s'%s",
+  srv->proxy->id, srv->id, msg->area);
}
  end:
free_trash_chunk(msg);
-- 
2.30.0




[PATCH v3 3/5] MEDIUM: server: add server-states version 2

2021-02-11 Thread William Dauchy
Even if it is possibly too much work for the current usage, it makes
sure we don't break states file from v2.3 to v2.4; indeed, since v2.3,
we introduced two new fields, so we put them aside to guarantee we can
easily reload from a version 1.
The diff seems huge but there is no specific change apart from:
- introduce v2 where it is needed (parsing, update)
- move away from switch/case in update to be able to reuse code
- move srv lock to the whole function to make it easier

this patch confirm how painful it is to maintain this functionality.

Signed-off-by: William Dauchy 
---
 include/haproxy/server-t.h |   7 +-
 src/server.c   | 742 ++---
 2 files changed, 367 insertions(+), 382 deletions(-)

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index b326305e8..2ec70dde4 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -101,9 +101,9 @@ enum srv_initaddr {
 } __attribute__((packed));
 
 /* server-state-file version */
-#define SRV_STATE_FILE_VERSION 1
+#define SRV_STATE_FILE_VERSION 2
 #define SRV_STATE_FILE_VERSION_MIN 1
-#define SRV_STATE_FILE_VERSION_MAX 1
+#define SRV_STATE_FILE_VERSION_MAX 2
 #define SRV_STATE_FILE_FIELD_NAMES \
 "be_id "  \
 "be_name "\
@@ -129,7 +129,8 @@ enum srv_initaddr {
 "srv_check_port"
 
 #define SRV_STATE_FILE_MAX_FIELDS 22
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 22
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
diff --git a/src/server.c b/src/server.c
index 9a2a8c520..c2b272068 100644
--- a/src/server.c
+++ b/src/server.c
@@ -2632,391 +2632,383 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
fqdn = NULL;
port_svc = port_check = 0;
msg = get_trash_chunk();
-   switch (version) {
-   case 1:
-   /*
-* now we can proceed with server's state update:
-* srv_addr: params[0]
-* srv_op_state: params[1]
-* srv_admin_state:  params[2]
-* srv_uweight:  params[3]
-* srv_iweight:  params[4]
-* srv_last_time_change: params[5]
-* srv_check_status: params[6]
-* srv_check_result: params[7]
-* srv_check_health: params[8]
-* srv_check_state:  params[9]
-* srv_agent_state:  params[10]
-* bk_f_forced_id:   params[11]
-* srv_f_forced_id:  params[12]
-* srv_fqdn: params[13]
-* srv_port: params[14]
-* srvrecord:params[15]
-* srv_use_ssl:  params[16]
-* srv_check_port:   params[17]
-*/
+   HA_SPIN_LOCK(SERVER_LOCK, >lock);
+
+   if (version >= 1) {
+   /* srv_addr: params[0]
+* srv_op_state: params[1]
+* srv_admin_state:  params[2]
+* srv_uweight:  params[3]
+* srv_iweight:  params[4]
+* srv_last_time_change: params[5]
+* srv_check_status: params[6]
+* srv_check_result: params[7]
+* srv_check_health: params[8]
+* srv_check_state:  params[9]
+* srv_agent_state:  params[10]
+* bk_f_forced_id:   params[11]
+* srv_f_forced_id:  params[12]
+* srv_fqdn: params[13]
+* srv_port: params[14]
+* srvrecord:params[15]
+*/
 
-   /* validating srv_op_state */
-   p = NULL;
-   errno = 0;
-   srv_op_state = strtol(params[1], , 10);
-   if ((p == params[1]) || errno == EINVAL || errno == 
ERANGE ||
-   (srv_op_state != SRV_ST_STOPPED &&
-srv_op_state != SRV_ST_STARTING &&
-srv_op_state != SRV_ST_RUNNING &&
-srv_op_state != SRV_ST_STOPPING)) {
-   chunk_appendf(msg, ", invalid srv_op_state 
value '%s'", params[1]);
-   }
+   /* validating srv_op_state */
+   p = NULL;
+   errno = 0;
+   srv_op_state = strtol(params[1], , 10);
+   

[PATCH v3 2/5] MEDIUM: cli: add agent-port command

2021-02-11 Thread William Dauchy
this patch allows to set agent port at runtime. In order to align with
both `addr` and `check-addr` commands, also add the possibility to
optionnaly set port on `agent-addr` command. This led to a small
refactor in order to use the same function for both `agent-addr` and
`agent-port` commands.

Signed-off-by: William Dauchy 
---
 doc/management.txt |  6 +++-
 src/server.c   | 84 +-
 2 files changed, 80 insertions(+), 10 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index bff770e4e..423c614b2 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -1828,10 +1828,14 @@ set server / agent [ up | down ]
   switch a server's state regardless of some slow agent checks for example.
   Note that the change is propagated to tracking servers if any.
 
-set server / agent-addr 
+set server / agent-addr  [port ]
   Change addr for servers agent checks. Allows to migrate agent-checks to
   another address at runtime. You can specify both IP and hostname, it will be
   resolved.
+  Optionally, change the port agent.
+
+set server / agent-port 
+  Change the port used for agent checks.
 
 set server / agent-send 
   Change agent string sent to agent check target. Allows to update string while
diff --git a/src/server.c b/src/server.c
index 74bd972e2..9a2a8c520 100644
--- a/src/server.c
+++ b/src/server.c
@@ -56,6 +56,8 @@ static int srv_state_get_version(FILE *f);
 static void srv_cleanup_connections(struct server *srv);
 static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
 const char *port);
+static const char *update_server_agent_addr_port(struct server *s, const char 
*addr,
+const char *port);
 
 /* List head of all known server keywords */
 static struct srv_kw_list srv_keywords = {
@@ -3576,6 +3578,54 @@ int update_server_addr(struct server *s, void *ip, int 
ip_sin_family, const char
return 0;
 }
 
+/* update agent health check address and port
+ * addr can be ip4/ip6 or a hostname
+ * if one error occurs, don't apply anything
+ * must be called with the server lock held.
+ */
+static const char *update_server_agent_addr_port(struct server *s, const char 
*addr,
+const char *port)
+{
+   struct sockaddr_storage sk;
+   struct buffer *msg;
+   int new_port;
+
+   msg = get_trash_chunk();
+   chunk_reset(msg);
+
+   if (!(s->agent.state & CHK_ST_ENABLED)) {
+   chunk_strcat(msg, "agent checks are not enabled on this 
server");
+   goto out;
+   }
+   if (addr) {
+   memset(, 0, sizeof(struct sockaddr_storage));
+   if (str2ip(addr, ) == NULL) {
+   chunk_appendf(msg, "invalid addr '%s'", addr);
+   goto out;
+   }
+   }
+   if (port) {
+   if (strl2irc(port, strlen(port), _port) != 0) {
+   chunk_appendf(msg, "provided port is not an integer");
+   goto out;
+   }
+   if (new_port < 0 || new_port > 65535) {
+   chunk_appendf(msg, "provided port is invalid");
+   goto out;
+   }
+   }
+out:
+   if (msg->data)
+   return msg->area;
+   else {
+   if (addr)
+   set_srv_agent_addr(s, );
+   if (port)
+   set_srv_agent_port(s, new_port);
+   }
+   return NULL;
+}
+
 /* update server health check address and port
  * addr must be ip4 or ip6, it won't be resolved
  * if one error occurs, don't apply anything
@@ -4443,15 +4493,31 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
cli_err(appctx, "'set server  agent' expects 'up' 
or 'down'.\n");
}
else if (strcmp(args[3], "agent-addr") == 0) {
-   struct sockaddr_storage sk;
-
-   memset(, 0, sizeof(sk));
-   if (!(sv->agent.state & CHK_ST_ENABLED))
-   cli_err(appctx, "agent checks are not enabled on this 
server.\n");
-   else if (str2ip(args[4], ))
-   set_srv_agent_addr(sv, );
-   else
-   cli_err(appctx, "incorrect addr address given for 
agent.\n");
+   char *addr = NULL;
+   char *port = NULL;
+   if (strlen(args[4]) == 0) {
+   cli_err(appctx, "set server / agent-addr requires"
+   " an address and optionally a port.\n");
+   goto out_unlock;
+   }
+   addr = args[4];
+   if (strcmp(args[5]

[PATCH v3 0/5] cli commands for checks and agent

2021-02-11 Thread William Dauchy
Hello Christopher,

Here is some work to finish you week.
I believe I addressed all the points raised:
- warning are no longer emitted when we have "0" or "-" values
- I enhanced the warning output as well, and understood my mistake for
  my previous CLEANUP patch removing a space... so I removed this patch
  as well.
- Fixed all the chunk_appendf issues.
- I was a bit lazy to address the partial vs complete failure in parsing
  as I was a bit puzzled about the approach to take. I think it would be
  sad to duplicate the code for pre validation, but on the other hand I
  agree it was clear to assume the whole line was not applied at all. I
  however concluded it was acceptable to simply know the line was not
  fully applied. I believe in that case the user should worry. We can
  probably add a way to show where it stopped, but I felt discouraged
  with the srv resolution, in the middle of srv port, where it is harder
  to have a kinda generic way to know where we stopped.

William Dauchy (5):
  MEDIUM: cli: add check-addr command
  MEDIUM: cli: add agent-port command
  MEDIUM: server: add server-states version 2
  MEDIUM: server: support {check,agent}_addr, agent_port in server state
  MINOR: server: enhance error precision when applying server state

 doc/management.txt|  15 +-
 include/haproxy/server-t.h|  16 +-
 .../checks/1be_40srv_odd_health_checks.vtc|   2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|   2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |   6 +-
 src/proxy.c   |  41 +-
 src/server.c  | 971 ++
 7 files changed, 612 insertions(+), 441 deletions(-)

-- 
2.30.0




Re: [PATCH v2 0/6] cli commands for checks and agent

2021-02-10 Thread William Dauchy
On Wed, Feb 10, 2021 at 4:21 PM Christopher Faulet  wrote:
> To conclude, I may merge the first 4 patches if you want, but I guess you
> will have to adapt the first ones to produce better warnings. So I will
> wait your confirmation to proceed. However, I will merge the bug fix.
> There is no reason to not take it.

thanks for the review.
don't worry I will come back with a v3.

-- 
William



Re: [PATCH v2 3/6] BUG/MEDIUM: server: re-align state file fields number

2021-02-08 Thread William Dauchy
Hello Christopher,

On Mon, Feb 8, 2021 at 11:53 PM William Dauchy  wrote:
> Since commit 3169471964fdc49963e63f68c1fd88686821a0c4 ("MINOR: Add
> server port field to server state file.") max_fields was not increased
> on version number 1. So this patch aims to fix it. This should be
> backported as far as v1.8, but the numbering should be adpated depending
> on the version: simply increase the field by 1.

this one is simply a mistake where I changed the subject line from
MEDIUM to MINOR. So ignore this one.
-- 
William



[PATCH v2 4/6] MEDIUM: server: add server-states version 2

2021-02-08 Thread William Dauchy
Even if it is possibly too much work for the current usage, it makes
sure we don't break states file from v2.3 to v2.4; indeed, since v2.3,
we introduced two new fields, so we put them aside to guarantee we can
easily reload from a version 1.
The diff seems huge but there is no specific change apart from:
- introduce v2 where it is needed (parsing, update)
- move away from switch/case in update to be able to reuse code
- move srv lock to the whole function to make it easier

this patch confirm how painful it is to maintain this functionality.

Signed-off-by: William Dauchy 
---
 include/haproxy/server-t.h |   7 +-
 src/server.c   | 742 ++---
 2 files changed, 367 insertions(+), 382 deletions(-)

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index af52b6a25..89d8ecc06 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -101,9 +101,9 @@ enum srv_initaddr {
 } __attribute__((packed));
 
 /* server-state-file version */
-#define SRV_STATE_FILE_VERSION 1
+#define SRV_STATE_FILE_VERSION 2
 #define SRV_STATE_FILE_VERSION_MIN 1
-#define SRV_STATE_FILE_VERSION_MAX 1
+#define SRV_STATE_FILE_VERSION_MAX 2
 #define SRV_STATE_FILE_FIELD_NAMES \
 "be_id "  \
 "be_name "\
@@ -129,7 +129,8 @@ enum srv_initaddr {
 "srv_check_port"
 
 #define SRV_STATE_FILE_MAX_FIELDS 22
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 22
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
diff --git a/src/server.c b/src/server.c
index a865fb4dd..84b283567 100644
--- a/src/server.c
+++ b/src/server.c
@@ -2629,391 +2629,383 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
fqdn = NULL;
port_svc = port_check = 0;
msg = get_trash_chunk();
-   switch (version) {
-   case 1:
-   /*
-* now we can proceed with server's state update:
-* srv_addr: params[0]
-* srv_op_state: params[1]
-* srv_admin_state:  params[2]
-* srv_uweight:  params[3]
-* srv_iweight:  params[4]
-* srv_last_time_change: params[5]
-* srv_check_status: params[6]
-* srv_check_result: params[7]
-* srv_check_health: params[8]
-* srv_check_state:  params[9]
-* srv_agent_state:  params[10]
-* bk_f_forced_id:   params[11]
-* srv_f_forced_id:  params[12]
-* srv_fqdn: params[13]
-* srv_port: params[14]
-* srvrecord:params[15]
-* srv_use_ssl:  params[16]
-* srv_check_port:   params[17]
-*/
+   HA_SPIN_LOCK(SERVER_LOCK, >lock);
+
+   if (version >= 1) {
+   /* srv_addr: params[0]
+* srv_op_state: params[1]
+* srv_admin_state:  params[2]
+* srv_uweight:  params[3]
+* srv_iweight:  params[4]
+* srv_last_time_change: params[5]
+* srv_check_status: params[6]
+* srv_check_result: params[7]
+* srv_check_health: params[8]
+* srv_check_state:  params[9]
+* srv_agent_state:  params[10]
+* bk_f_forced_id:   params[11]
+* srv_f_forced_id:  params[12]
+* srv_fqdn: params[13]
+* srv_port: params[14]
+* srvrecord:params[15]
+*/
 
-   /* validating srv_op_state */
-   p = NULL;
-   errno = 0;
-   srv_op_state = strtol(params[1], , 10);
-   if ((p == params[1]) || errno == EINVAL || errno == 
ERANGE ||
-   (srv_op_state != SRV_ST_STOPPED &&
-srv_op_state != SRV_ST_STARTING &&
-srv_op_state != SRV_ST_RUNNING &&
-srv_op_state != SRV_ST_STOPPING)) {
-   chunk_appendf(msg, ", invalid srv_op_state 
value '%s'", params[1]);
-   }
+   /* validating srv_op_state */
+   p = NULL;
+   errno = 0;
+   srv_op_state = strtol(params[1], , 10);
+   

[PATCH v2 5/6] MEDIUM: server: support {check,agent}_addr, agent_port in server state

2021-02-08 Thread William Dauchy
logical followup from cli commands addition, so that the state server
file stays compatible with the changes made at runtime; use previously
added helper to load server attributes.

also alloc a specific chunk to avoid mixing with other called functions
using it

Signed-off-by: William Dauchy 
---
 doc/management.txt|  5 ++-
 include/haproxy/server-t.h|  9 ++--
 .../checks/1be_40srv_odd_health_checks.vtc|  2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|  2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |  6 +--
 src/proxy.c   | 41 +++
 src/server.c  | 40 +++---
 7 files changed, 66 insertions(+), 39 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index 423c614b2..60e25c7e1 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -2455,7 +2455,10 @@ show servers state []
  srv_port:Server port.
  srvrecord:   DNS SRV record associated to this SRV.
  srv_use_ssl: use ssl for server connections.
- srv_check_port:  Server check port.
+ srv_check_port:  Server health check port.
+ srv_check_addr:  Server health check address.
+ srv_agent_addr:  Server health agent address.
+ srv_agent_port:  Server health agent port.
 
 show sess
   Dump all known sessions. Avoid doing this on slow connections as this can
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index 89d8ecc06..7f1356024 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -126,11 +126,14 @@ enum srv_initaddr {
 "srv_port "   \
 "srvrecord "  \
 "srv_use_ssl "\
-"srv_check_port"
+"srv_check_port " \
+"srv_check_addr " \
+"srv_agent_addr " \
+"srv_agent_port"
 
-#define SRV_STATE_FILE_MAX_FIELDS 22
+#define SRV_STATE_FILE_MAX_FIELDS 25
 #define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 22
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 25
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc 
b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
index f01205295..c279972aa 100644
--- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc
+++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
@@ -112,6 +112,6 @@ syslog S -wait
 
 haproxy h1 -cli {
 send "show servers state"
-expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state 
srv_admin_state srv_uweight srv_iweight srv_time_since_last_change 
srv_check_status srv_check_result srv_check_health srv_check_state 
srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord 
srv_use_ssl srv_check_port\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 
0 0 0 0 - ${s0_port} - 0 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s1_port} - 0 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s2_port} - 0 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0 0\n2 be1 5 srv4 
${s4_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0 0\n2 be1 6 srv5 
${s5_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0 0\n2 
be1 7 srv6 ${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0 0\n2 
be1 8 srv7 ${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - 
${s7_port} - 0 0\n2 be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s8_port} - 0 0\n2 be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ 
){3}0 0 0 - ${s9_port} - 0 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 
0 1 0 0 0 0 - ${s10_port} - 0 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0 0\n2 be1 13 srv12 
${s12_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0 0\n2 be1 14 
srv13 ${s13_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} 
- 0 0\n2 be1 15 srv14 ${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s14_port} - 0 0\n2 be1 16 srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s15_port} - 0 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s16_port} - 0 0\n2 be1 18 srv17 ${s17_addr} 2 0 
1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0 0\n2 be1 19 srv18 
${s18_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0 0\n2 be1 20 
srv19 ${s19_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} 
- 0 0\n2 be1 21 srv20 ${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s20_port} - 0 0\n2 be1 22 srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s21_port} - 0 

[PATCH v2 6/6] CLEANUP: server: add missing space in server-state error output

2021-02-08 Thread William Dauchy
a space was missing in the output to make it more readable.

Signed-off-by: William Dauchy 
---
 src/server.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/server.c b/src/server.c
index 6b360291d..673844dd7 100644
--- a/src/server.c
+++ b/src/server.c
@@ -3015,7 +3015,7 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
HA_SPIN_UNLOCK(SERVER_LOCK, >lock);
if (msg->data) {
chunk_appendf(msg, "\n");
-   ha_warning("server-state application failed for server 
'%s/%s'%s",
+   ha_warning("server-state application failed for server '%s/%s' 
%s",
   srv->proxy->id, srv->id, msg->area);
}
  end:
-- 
2.30.0




[PATCH v2 3/6] BUG/MINOR: server: re-align state file fields number

2021-02-08 Thread William Dauchy
Since commit 3169471964fdc49963e63f68c1fd88686821a0c4 ("MINOR: Add
server port field to server state file.") max_fields was not increased
on version number 1. So this patch aims to fix it. This should be
backported as far as v1.8, but the numbering should be adpated depending
on the version: simply increase the field by 1.

Signed-off-by: William Dauchy 
---
 include/haproxy/server-t.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index 32697a9c4..af52b6a25 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -129,7 +129,7 @@ enum srv_initaddr {
 "srv_check_port"
 
 #define SRV_STATE_FILE_MAX_FIELDS 22
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
-- 
2.30.0




[PATCH v2 2/6] MEDIUM: cli: add agent-port command

2021-02-08 Thread William Dauchy
this patch allows to set agent port at runtime. In order to align with
both `addr` and `check-addr` commands, also add the possibility to
optionnaly set port on `agent-addr` command. This led to a small
refactor in order to use the same function for both `agent-addr` and
`agent-port` commands.

Signed-off-by: William Dauchy 
---
 doc/management.txt |  6 +++-
 src/server.c   | 76 --
 2 files changed, 72 insertions(+), 10 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index bff770e4e..423c614b2 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -1828,10 +1828,14 @@ set server / agent [ up | down ]
   switch a server's state regardless of some slow agent checks for example.
   Note that the change is propagated to tracking servers if any.
 
-set server / agent-addr 
+set server / agent-addr  [port ]
   Change addr for servers agent checks. Allows to migrate agent-checks to
   another address at runtime. You can specify both IP and hostname, it will be
   resolved.
+  Optionally, change the port agent.
+
+set server / agent-port 
+  Change the port used for agent checks.
 
 set server / agent-send 
   Change agent string sent to agent check target. Allows to update string while
diff --git a/src/server.c b/src/server.c
index 58347d215..a865fb4dd 100644
--- a/src/server.c
+++ b/src/server.c
@@ -56,6 +56,8 @@ static int srv_state_get_version(FILE *f);
 static void srv_cleanup_connections(struct server *srv);
 static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
 const char *port);
+static const char *update_server_agent_addr_port(struct server *s, const char 
*addr,
+const char *port);
 
 /* List head of all known server keywords */
 static struct srv_kw_list srv_keywords = {
@@ -3573,6 +3575,46 @@ int update_server_addr(struct server *s, void *ip, int 
ip_sin_family, const char
return 0;
 }
 
+/* update agent health check address and port
+ * addr can be ip4/ip6 or a hostname
+ * must be called with the server lock held.
+ */
+static const char *update_server_agent_addr_port(struct server *s, const char 
*addr,
+const char *port)
+{
+   struct sockaddr_storage sk;
+   struct buffer *msg;
+   int new_port;
+
+   msg = get_trash_chunk();
+
+   if (!(s->agent.state & CHK_ST_ENABLED)) {
+   chunk_appendf(msg, "agent checks are not enabled on this 
server.\n");
+   goto out;
+   }
+   if (addr) {
+   memset(, 0, sizeof(struct sockaddr_storage));
+   if (str2ip(addr, ) == NULL) {
+   chunk_appendf(msg, "invalid addr '%s'\n", addr);
+   goto out;
+   }
+   set_srv_agent_addr(s, );
+   }
+   if (port) {
+   if (strl2irc(port, strlen(port), _port) != 0) {
+   chunk_appendf(msg, "provided port is not an integer\n");
+   goto out;
+   }
+   if (new_port < 0 || new_port > 65535) {
+   chunk_appendf(msg, "provided port is invalid\n");
+   goto out;
+   }
+   set_srv_agent_port(s, new_port);
+   }
+out:
+   return msg->area;
+}
+
 /* update server health check address and port
  * addr must be ip4 or ip6, it won't be resolved
  * must be called with the server lock held.
@@ -4432,15 +4474,31 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
cli_err(appctx, "'set server  agent' expects 'up' 
or 'down'.\n");
}
else if (strcmp(args[3], "agent-addr") == 0) {
-   struct sockaddr_storage sk;
-
-   memset(, 0, sizeof(sk));
-   if (!(sv->agent.state & CHK_ST_ENABLED))
-   cli_err(appctx, "agent checks are not enabled on this 
server.\n");
-   else if (str2ip(args[4], ))
-   set_srv_agent_addr(sv, );
-   else
-   cli_err(appctx, "incorrect addr address given for 
agent.\n");
+   char *addr = NULL;
+   char *port = NULL;
+   if (strlen(args[4]) == 0) {
+   cli_err(appctx, "set server / agent-addr requires"
+   " an address and optionally a port.\n");
+   goto out_unlock;
+   }
+   addr = args[4];
+   if (strcmp(args[5], "port") == 0)
+   port = args[6];
+   warning = update_server_agent_addr_port(sv, addr, port);
+   if (warning)
+   cli_

[PATCH v2 1/6] MEDIUM: cli: add check-addr command

2021-02-08 Thread William Dauchy
this patch allows to set server health check address at runtime. In
order to align with `addr` command, also allow to set port optionnaly.
This led to a small refactor in order to use the same function for both
`check-addr` and `check-port` commands.
for `check-port`, we however don't permit the change anymore if checks
are not enabled on the server.

This command becomes more and more useful for people having a consul
like architecture:
- the backend server is located on a container with its own IP
- the health checks are done the consul instance located on the host
  with the host IP

Signed-off-by: William Dauchy 
---
 doc/management.txt |  4 +++
 src/server.c   | 87 ++
 2 files changed, 76 insertions(+), 15 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index b74aba769..bff770e4e 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -1842,6 +1842,10 @@ set server / health [ up | stopping | 
down ]
   switch a server's state regardless of some slow health checks for example.
   Note that the change is propagated to tracking servers if any.
 
+set server / check-addr  [port ]
+  Change the IP address used for server health checks.
+  Optionally, change the port used for server health checks.
+
 set server / check-port 
   Change the port used for health checking to 
 
diff --git a/src/server.c b/src/server.c
index da2325e9a..58347d215 100644
--- a/src/server.c
+++ b/src/server.c
@@ -54,6 +54,8 @@ static int srv_set_fqdn(struct server *srv, const char *fqdn, 
int dns_locked);
 static void srv_state_parse_line(char *buf, const int version, char **params, 
char **srv_params);
 static int srv_state_get_version(FILE *f);
 static void srv_cleanup_connections(struct server *srv);
+static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
+const char *port);
 
 /* List head of all known server keywords */
 static struct srv_kw_list srv_keywords = {
@@ -3571,6 +3573,51 @@ int update_server_addr(struct server *s, void *ip, int 
ip_sin_family, const char
return 0;
 }
 
+/* update server health check address and port
+ * addr must be ip4 or ip6, it won't be resolved
+ * must be called with the server lock held.
+ */
+static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
+const char *port)
+{
+   struct sockaddr_storage sk;
+   struct buffer *msg;
+   int new_port;
+
+   msg = get_trash_chunk();
+
+   if (!(s->check.state & CHK_ST_ENABLED)) {
+   chunk_appendf(msg, "health checks are not enabled on this 
server.\n");
+   goto out;
+   }
+   if (addr) {
+   memset(, 0, sizeof(struct sockaddr_storage));
+   if (str2ip2(addr, , 0) == NULL) {
+   chunk_appendf(msg, "invalid addr '%s'\n", addr);
+   goto out;
+   }
+   s->check.addr = sk;
+   }
+   if (port) {
+   if (strl2irc(port, strlen(port), _port) != 0) {
+   chunk_appendf(msg, "provided port is not an integer\n");
+   goto out;
+   }
+   if (new_port < 0 || new_port > 65535) {
+   chunk_appendf(msg, "provided port is invalid\n");
+   goto out;
+   }
+   /* prevent the update of port to 0 if MAPPORTS are in use */
+   if ((s->flags & SRV_F_MAPPORTS) && new_port == 0) {
+   chunk_appendf(msg, "can't unset 'port' since MAPPORTS 
is in use\n");
+   goto out;
+   }
+   s->check.port = new_port;
+   }
+out:
+   return msg->area;
+}
+
 /*
  * This function update a server's addr and port only for AF_INET and AF_INET6 
families.
  *
@@ -4403,23 +4450,32 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
cli_err(appctx, "cannot allocate memory for new 
string.\n");
}
}
-   else if (strcmp(args[3], "check-port") == 0) {
-   int i = 0;
-   if (strl2irc(args[4], strlen(args[4]), ) != 0) {
-   cli_err(appctx, "'set server  check-port' expects 
an integer as argument.\n");
-   goto out_unlock;
-   }
-   if ((i < 0) || (i > 65535)) {
-   cli_err(appctx, "provided port is not valid.\n");
+   else if (strcmp(args[3], "check-addr") == 0) {
+   char *addr = NULL;
+   char *port = NULL;
+   if (strlen(args[4]) == 0) {
+   cli_err(appctx, "set se

[PATCH v2 3/6] BUG/MEDIUM: server: re-align state file fields number

2021-02-08 Thread William Dauchy
Since commit 3169471964fdc49963e63f68c1fd88686821a0c4 ("MINOR: Add
server port field to server state file.") max_fields was not increased
on version number 1. So this patch aims to fix it. This should be
backported as far as v1.8, but the numbering should be adpated depending
on the version: simply increase the field by 1.

Signed-off-by: William Dauchy 
---
 include/haproxy/server-t.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index 32697a9c4..af52b6a25 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -129,7 +129,7 @@ enum srv_initaddr {
 "srv_check_port"
 
 #define SRV_STATE_FILE_MAX_FIELDS 22
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
-- 
2.30.0




[PATCH v2 0/6] cli commands for checks and agent

2021-02-08 Thread William Dauchy
Hello Christopher,

Here is the v2 addressing the points raised yesterday.
The patch 4/6 clearly looks scary but I made sure to not change anything
crazy apart from adding support for a version 2. I will probably start
to dream about a server-state-file burning every night.
I hope this will be good enough, but otherwise, don't hesitate to put
more comments, I will come back with a v3.

William Dauchy (6):
  MEDIUM: cli: add check-addr command
  MEDIUM: cli: add agent-port command
  BUG/MEDIUM: server: re-align state file fields number
  MEDIUM: server: add server-states version 2
  MEDIUM: server: support {check,agent}_addr, agent_port in server state
  CLEANUP: server: add missing space in server-state error output

 doc/management.txt|  15 +-
 include/haproxy/server-t.h|  16 +-
 .../checks/1be_40srv_odd_health_checks.vtc|   2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|   2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |   6 +-
 src/proxy.c   |  41 +-
 src/server.c  | 929 ++
 7 files changed, 573 insertions(+), 438 deletions(-)

-- 
2.30.0




Re: [PATCH 0/6] cli commands coherency

2021-02-08 Thread William Dauchy
Hi Christopher,

On Mon, Feb 8, 2021 at 12:21 PM Christopher Faulet  wrote:
> First, there is a test to be sure the agent-check is enabled before updating 
> the
> agent address and/or port. Do you think it should also be done for the
> health-check? Because, for now, it is possible to set an address on a disabled
> (or even not configured) health-check. But it is not possible for an
> agent-check. Given that it is not possible to enable/disable an agent-check or
> an health-check from the CLI, I guess the warning for both makes sense.

yes ok.

> Then, there are some problems when the server-state file is loaded. The trash
> chunk used by srv_update_state() is overwritten when
> update_server_agent_addr_port() is called. It is not obvious, but the
> get_trash_chunk() function returns cyclically one of two trash chunks.
> srv_update_state() uses a trash chunk. When called,
> update_server_check_addr_port() uses the other one. So,
> update_server_agent_addr_port() re-uses the first trash chunk, the same than
> srv_update_state(). A solution may be to allocate a dedicated chunk in
> srv_update_state(), using alloc_trash_chunk().

ah ok, I thought it was ok as long as there was no reset, but I indeed
overlooked the second chunk. I will fix that.

> A more annoying problem happens when an old state file is loaded (prior these
> changes). Last parameters are not defined, params[18] (check-addr), params[19]
> (agent-addr) and params[20] (agent-port) are set to NULL. Thus, HAProxy 
> crashes
> when the health-check address is compared to "-". In fact, this comes from the
> SRV_STATE_FILE_NB_FIELDS_VERSION_1 macro. It should be set to 24. You added 3
> new fields but just incremented it by 1. Hum... No, there is also another
> problem. In srv_state_parse_line(). SRV_STATE_FILE_NB_FIELDS_VERSION_1 must be
> equal to SRV_STATE_FILE_MAX_FIELDS. Or better, it should really reflect the
> argument number of the version 1, so 21. And it must be compared to srv_arg
> instead of arg. It is a bug since the 1.8 (commit 316947196).

ah true I overlooked SRV_STATE_FILE_NB_FIELDS_VERSION_1 meaning.
let me fix that as well.

> However, this means with these changes, a cold restart is required. In the 
> 2.4,
> 5 new fields were added to the server-state file. Is it expected to perform a
> cold restart when upgrading to the 2.4 from a prior major version ? If so, I'm
> ok. But, it may be a good idea to change the state file version then to not
> silently ignore the state file. Otherwise, I guess it is possible to make 
> these
> 5 new fields optional, isn't it ?

good idea. We never did it in the past, but that's probably a good thing to do.

> William, I'm sorry if I'm not really clear but the subject is a bit foggy for 
> me
> :) Do you feel confident to handle all the changes ?

Thanks for the review. I will come back with a v2.

--
William



Re: [ANNOUNCE] haproxy-2.4-dev7

2021-02-07 Thread William Dauchy
On Fri, Feb 5, 2021 at 4:14 PM Willy Tarreau  wrote:
> HAProxy 2.4-dev7 was released on 2021/02/05. It added 153 new commits
> after version 2.4-dev6.
>   - Some significant lifting was done to the Prometheus exporter, including
> new fields, better descriptions and some filtering. I've seen quite a
> bunch pass in front of me but do not well understand what it does, all
> that interests me is that some users are happy with these changes so I
> guess they were long awaited :-)

about that, please note two breaking changes:

- objects' status are no longer a gauge value which you need to
translate manually; instead the state is a label with a proper string
value. The value of the metric simply informs whether the state is
active or not.
so we went from:
  haproxy_server_status{proxy="be_foo",server="srv0"} 2
(which meant MAINT)
to:
  haproxy_server_status{proxy="be_foo",server="srv0",state="DOWN"} 0
  haproxy_server_status{proxy="be_foo",server="srv0",state="UP"} 0
  haproxy_server_status{proxy="be_foo",server="srv0",state="MAINT"} 1
  haproxy_server_status{proxy="be_foo",server="srv0",state="DRAIN"} 0
  haproxy_server_status{proxy="be_foo",server="srv0",state="NOLB"} 0

This change is valid for frontend, backend, server with different label values.

- similar change with health checks where we put a state label:

haproxy_server_check_status{proxy="be_foo",server="srv0",state="HANA"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="SOCKERR"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L4OK"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L4TOUT"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L4CON"} 1
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L6OK"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L6TOUT"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L6RSP"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7TOUT"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7RSP"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7OK"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7OKC"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7STS"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="PROCERR"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="PROCTOUT"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="PROCOK"} 0


It means:
* a lot more metrics for large setup (but you can still filter as
explained in the doc)
* easier use on prometheus side: you will be able to group per state
very easily now.

Generally speaking I'm very interested in feedback regarding this
change. This was motivated by the usage I saw in several companies,
where people were struggling making use of those metrics.

Willy: it is probably a wise idea to keep it for the 2.4 final release
notes, some people might want to know that during their update; a lot
of people have their production alerts on those metrics.

Thanks,
-- 
William



[PATCH 2/2] MEDIUM: contrib/prometheus-exporter: export base stick table stats

2021-02-07 Thread William Dauchy
I saw some people falling back to unix socket to collect some data they
could not find in prometheus exporter. One of them is base info from
stick tables (used/size).
I do not plan to extend it more for now; keys are quite a mess to
handle.

This should resolve github issue #1008.

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/README|  10 ++
 .../prometheus-exporter/service-prometheus.c  | 148 +++---
 reg-tests/contrib/prometheus.vtc  |   4 +
 3 files changed, 142 insertions(+), 20 deletions(-)

diff --git a/contrib/prometheus-exporter/README 
b/contrib/prometheus-exporter/README
index a85981597..d882b092f 100644
--- a/contrib/prometheus-exporter/README
+++ b/contrib/prometheus-exporter/README
@@ -72,6 +72,7 @@ exported. Here are examples:
   /metrics?scope=frontend=backend # ==> Frontend and backend metrics 
will be exported
   /metrics?scope=*=   # ==> no metrics will be exported
   /metrics?scope==global  # ==> global metrics will be exported
+  /metrics?scope=sticktable # ==> stick tables metrics will be 
exported
 
 * How do I prevent my prometheus instance to explode?
 
@@ -320,3 +321,12 @@ See prometheus export for the description of each field.
 | haproxy_server_need_connections_current|
 | haproxy_server_uweight |
 ++
+
+* Stick table metrics
+
+++
+|Metric name |
+++
+| haproxy_sticktable_size|
+| haproxy_sticktable_used|
+++
diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 769389735..521fe1056 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -47,28 +47,33 @@ enum {
 
 /* Prometheus exporter dumper states (appctx->st1) */
 enum {
-PROMEX_DUMPER_INIT = 0, /* initialized */
-PROMEX_DUMPER_GLOBAL,   /* dump metrics of globals */
-PROMEX_DUMPER_FRONT,/* dump metrics of frontend proxies */
-PROMEX_DUMPER_BACK, /* dump metrics of backend proxies */
-PROMEX_DUMPER_LI,   /* dump metrics of listeners */
-PROMEX_DUMPER_SRV,  /* dump metrics of servers */
-   PROMEX_DUMPER_DONE, /* finished */
+   PROMEX_DUMPER_INIT = 0,   /* initialized */
+   PROMEX_DUMPER_GLOBAL, /* dump metrics of globals */
+   PROMEX_DUMPER_FRONT,  /* dump metrics of frontend proxies */
+   PROMEX_DUMPER_BACK,   /* dump metrics of backend proxies */
+   PROMEX_DUMPER_LI, /* dump metrics of listeners */
+   PROMEX_DUMPER_SRV,/* dump metrics of servers */
+   PROMEX_DUMPER_STICKTABLE, /* dump metrics of stick tables */
+   PROMEX_DUMPER_DONE,   /* finished */
 };
 
 /* Prometheus exporter flags (appctx->ctx.stats.flags) */
-#define PROMEX_FL_METRIC_HDR0x0001
-#define PROMEX_FL_INFO_METRIC   0x0002
-#define PROMEX_FL_FRONT_METRIC  0x0004
-#define PROMEX_FL_BACK_METRIC   0x0008
-#define PROMEX_FL_SRV_METRIC0x0010
-#define PROMEX_FL_SCOPE_GLOBAL  0x0020
-#define PROMEX_FL_SCOPE_FRONT   0x0040
-#define PROMEX_FL_SCOPE_BACK0x0080
-#define PROMEX_FL_SCOPE_SERVER  0x0100
-#define PROMEX_FL_NO_MAINT_SRV  0x0200
-
-#define PROMEX_FL_SCOPE_ALL 
(PROMEX_FL_SCOPE_GLOBAL|PROMEX_FL_SCOPE_FRONT|PROMEX_FL_SCOPE_BACK|PROMEX_FL_SCOPE_SERVER)
+#define PROMEX_FL_METRIC_HDR0x0001
+#define PROMEX_FL_INFO_METRIC   0x0002
+#define PROMEX_FL_FRONT_METRIC  0x0004
+#define PROMEX_FL_BACK_METRIC   0x0008
+#define PROMEX_FL_SRV_METRIC0x0010
+#define PROMEX_FL_SCOPE_GLOBAL  0x0020
+#define PROMEX_FL_SCOPE_FRONT   0x0040
+#define PROMEX_FL_SCOPE_BACK0x0080
+#define PROMEX_FL_SCOPE_SERVER  0x0100
+#define PROMEX_FL_NO_MAINT_SRV  0x0200
+#define PROMEX_FL_STICKTABLE_METRIC 0x0400
+#define PROMEX_FL_SCOPE_STICKTABLE  0x0800
+
+#define PROMEX_FL_SCOPE_ALL (PROMEX_FL_SCOPE_GLOBAL | PROMEX_FL_SCOPE_FRONT | \
+PROMEX_FL_SCOPE_BACK | PROMEX_FL_SCOPE_SERVER | \
+PROMEX_FL_SCOPE_STICKTABLE)
 
 /* Promtheus metric type (gauge or counter) */
 enum promex_mt_type {
@@ -298,6 +303,25 @@ const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] 
= {
[ST_F_TT_MAX] = IST("Maximum observed total request+response 
time (request+queue+connect+response+processing)"),
 };
 
+/* stick table base fields */
+enum sticktable_field {
+   STICKTABLE_SIZE = 0,
+   STICKTABLE_USED,
+   /* must always be the last one */
+   STICKTABLE_TOTA

[PATCH 1/2] MINOR: contrib/prometheus-exporter: use stats desc when possible followup

2021-02-07 Thread William Dauchy
Remove remaining descrition which are common to stats.c.

This patch is a followup of commit
82b2ce2f967d967139adb7afab064416fadad615 ("MINOR:
contrib/prometheus-exporter: use stats desc when possible"). I probably
messed up with one of my rebase because I'm pretty sure I removed them
at some point, but who knows what happened.

Signed-off-by: William Dauchy 
---
 .../prometheus-exporter/service-prometheus.c  | 35 ---
 1 file changed, 35 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 126962f5e..769389735 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -284,42 +284,7 @@ const struct promex_metric 
promex_st_metrics[ST_F_TOTAL_FIELDS] = {
 
 /* Description of overriden stats fields */
 const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = {
-   [ST_F_PXNAME] = IST("The proxy name."),
-   [ST_F_SVNAME] = IST("The service name (FRONTEND for frontend, 
BACKEND for backend, any name for server/listener)."),
-   [ST_F_QCUR]   = IST("Current number of queued requests."),
-   [ST_F_QMAX]   = IST("Maximum observed number of queued 
requests."),
-   [ST_F_SCUR]   = IST("Current number of active sessions."),
-   [ST_F_SMAX]   = IST("Maximum observed number of active 
sessions."),
-   [ST_F_SLIM]   = IST("Configured session limit."),
-   [ST_F_STOT]   = IST("Total number of sessions."),
-   [ST_F_BIN]= IST("Current total of incoming bytes."),
-   [ST_F_BOUT]   = IST("Current total of outgoing bytes."),
-   [ST_F_DREQ]   = IST("Total number of denied requests."),
-   [ST_F_DRESP]  = IST("Total number of denied responses."),
-   [ST_F_EREQ]   = IST("Total number of request errors."),
-   [ST_F_ECON]   = IST("Total number of connection errors."),
-   [ST_F_ERESP]  = IST("Total number of response errors."),
-   [ST_F_WRETR]  = IST("Total number of retry warnings."),
-   [ST_F_WREDIS] = IST("Total number of redispatch warnings."),
[ST_F_STATUS] = IST("Current status of the service, per state 
label value."),
-   [ST_F_WEIGHT] = IST("Service weight."),
-   [ST_F_ACT]= IST("Current number of active servers."),
-   [ST_F_BCK]= IST("Current number of backup servers."),
-   [ST_F_CHKFAIL]= IST("Total number of failed check (Only counts 
checks failed when the server is up)."),
-   [ST_F_CHKDOWN]= IST("Total number of UP->DOWN transitions."),
-   [ST_F_LASTCHG]= IST("Number of seconds since the last UP<->DOWN 
transition."),
-   [ST_F_DOWNTIME]   = IST("Total downtime (in seconds) for the 
service."),
-   [ST_F_QLIMIT] = IST("Configured maxqueue for the server (0 
meaning no limit)."),
-   [ST_F_PID]= IST("Process id (0 for first instance, 1 for 
second, ...)"),
-   [ST_F_IID]= IST("Unique proxy id."),
-   [ST_F_SID]= IST("Server id (unique inside a proxy)."),
-   [ST_F_THROTTLE]   = IST("Current throttle percentage for the 
server, when slowstart is active, or no value if not in slowstart."),
-   [ST_F_LBTOT]  = IST("Total number of times a service was 
selected, either for new sessions, or when redispatching."),
-   [ST_F_TRACKED]= IST("Id of proxy/server if tracking is 
enabled."),
-   [ST_F_TYPE]   = IST("Service type (0=frontend, 1=backend, 
2=server, 3=socket/listener)."),
-   [ST_F_RATE]   = IST("Current number of sessions per second over 
last elapsed second."),
-   [ST_F_RATE_LIM]   = IST("Configured limit on new sessions per 
second."),
-   [ST_F_RATE_MAX]   = IST("Maximum observed number of sessions per 
second."),
[ST_F_CHECK_STATUS]   = IST("Status of last health check, per state 
label value."),
[ST_F_CHECK_CODE] = IST("layer5-7 code, if available of the last 
health check."),
[ST_F_CHECK_DURATION] = IST("Total duration of the latest server health 
check, in seconds."),
-- 
2.30.0




[PATCH 2/6] CLEANUP: tools: typo in `strl2irc` mention

2021-02-06 Thread William Dauchy
`str2irc` does not exist

Signed-off-by: William Dauchy 
---
 src/tools.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/tools.c b/src/tools.c
index 8fef15b4d..2d40d8910 100644
--- a/src/tools.c
+++ b/src/tools.c
@@ -2178,7 +2178,7 @@ int strl2irc(const char *s, int len, int *ret)
  * applications designed for hostile environments. It returns zero when the
  * number has successfully been converted, non-zero otherwise. When an error
  * is returned, the  value is left untouched. It is about 3 times slower
- * than str2irc().
+ * than strl2irc().
  */
 
 int strl2llrc(const char *s, int len, long long *ret)
-- 
2.30.0




[PATCH 5/6] MEDIUM: server: support {check,agent}_addr, agent_port in server state

2021-02-06 Thread William Dauchy
logical followup from cli commands addition, so that the state server
file stays compatible with the changes made at runtime; use previously
added helper to load server attributes.

Signed-off-by: William Dauchy 
---
 doc/management.txt|  5 ++-
 include/haproxy/server-t.h|  9 ++--
 .../checks/1be_40srv_odd_health_checks.vtc|  2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|  2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |  6 +--
 src/proxy.c   | 41 +++
 src/server.c  | 30 --
 7 files changed, 57 insertions(+), 38 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index 423c614b2..60e25c7e1 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -2455,7 +2455,10 @@ show servers state []
  srv_port:Server port.
  srvrecord:   DNS SRV record associated to this SRV.
  srv_use_ssl: use ssl for server connections.
- srv_check_port:  Server check port.
+ srv_check_port:  Server health check port.
+ srv_check_addr:  Server health check address.
+ srv_agent_addr:  Server health agent address.
+ srv_agent_port:  Server health agent port.
 
 show sess
   Dump all known sessions. Avoid doing this on slow connections as this can
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index 32697a9c4..102eb4483 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -126,10 +126,13 @@ enum srv_initaddr {
 "srv_port "   \
 "srvrecord "  \
 "srv_use_ssl "\
-"srv_check_port"
+"srv_check_port " \
+"srv_check_addr " \
+"srv_agent_addr " \
+"srv_agent_port"
 
-#define SRV_STATE_FILE_MAX_FIELDS 22
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21
+#define SRV_STATE_FILE_MAX_FIELDS 25
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc 
b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
index f01205295..c279972aa 100644
--- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc
+++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
@@ -112,6 +112,6 @@ syslog S -wait
 
 haproxy h1 -cli {
 send "show servers state"
-expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state 
srv_admin_state srv_uweight srv_iweight srv_time_since_last_change 
srv_check_status srv_check_result srv_check_health srv_check_state 
srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord 
srv_use_ssl srv_check_port\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 
0 0 0 0 - ${s0_port} - 0 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s1_port} - 0 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s2_port} - 0 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0 0\n2 be1 5 srv4 
${s4_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0 0\n2 be1 6 srv5 
${s5_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0 0\n2 
be1 7 srv6 ${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0 0\n2 
be1 8 srv7 ${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - 
${s7_port} - 0 0\n2 be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s8_port} - 0 0\n2 be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ 
){3}0 0 0 - ${s9_port} - 0 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 
0 1 0 0 0 0 - ${s10_port} - 0 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0 0\n2 be1 13 srv12 
${s12_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0 0\n2 be1 14 
srv13 ${s13_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} 
- 0 0\n2 be1 15 srv14 ${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s14_port} - 0 0\n2 be1 16 srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s15_port} - 0 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s16_port} - 0 0\n2 be1 18 srv17 ${s17_addr} 2 0 
1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0 0\n2 be1 19 srv18 
${s18_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0 0\n2 be1 20 
srv19 ${s19_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} 
- 0 0\n2 be1 21 srv20 ${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s20_port} - 0 0\n2 be1 22 srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s21_port} - 0 0\n2 be1 23 srv22 ${s22_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s22_port} - 0 0\n2 be1 24 srv23 ${s23_addr} 2 0 
1 1 [[:digit:

[PATCH 4/6] MEDIUM: cli: add agent-port command

2021-02-06 Thread William Dauchy
this patch allows to set agent port at runtime. In order to align with
both `addr` and `check-addr` commands, also add the possibility to
optionnaly set port on `agent-addr` command. This led to a small
refactor in order to use the same function for both `agent-addr` and
`agent-port` commands.

Signed-off-by: William Dauchy 
---
 doc/management.txt |  6 +++-
 src/server.c   | 77 --
 2 files changed, 73 insertions(+), 10 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index bff770e4e..423c614b2 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -1828,10 +1828,14 @@ set server / agent [ up | down ]
   switch a server's state regardless of some slow agent checks for example.
   Note that the change is propagated to tracking servers if any.
 
-set server / agent-addr 
+set server / agent-addr  [port ]
   Change addr for servers agent checks. Allows to migrate agent-checks to
   another address at runtime. You can specify both IP and hostname, it will be
   resolved.
+  Optionally, change the port agent.
+
+set server / agent-port 
+  Change the port used for agent checks.
 
 set server / agent-send 
   Change agent string sent to agent check target. Allows to update string while
diff --git a/src/server.c b/src/server.c
index 533755f1e..a983d5d68 100644
--- a/src/server.c
+++ b/src/server.c
@@ -56,6 +56,8 @@ static int srv_state_get_version(FILE *f);
 static void srv_cleanup_connections(struct server *srv);
 static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
 const char *port);
+static const char *update_server_agent_addr_port(struct server *s, const char 
*addr,
+const char *port);
 
 /* List head of all known server keywords */
 static struct srv_kw_list srv_keywords = {
@@ -3573,6 +3575,47 @@ int update_server_addr(struct server *s, void *ip, int 
ip_sin_family, const char
return 0;
 }
 
+/* update agent health check address and port
+ * addr can be ip4/ip6 or a hostname
+ * must be called with the server lock held.
+ */
+static const char *update_server_agent_addr_port(struct server *s, const char 
*addr,
+const char *port)
+{
+   struct sockaddr_storage sk;
+   struct buffer *msg;
+   int new_port;
+
+   msg = get_trash_chunk();
+
+   if (!(s->agent.state & CHK_ST_ENABLED)) {
+   chunk_appendf(msg, "agent checks are not enabled on this 
server.\n");
+   goto out;
+   }
+
+   if (addr) {
+   memset(, 0, sizeof(struct sockaddr_storage));
+   if (str2ip(addr, ) == NULL) {
+   chunk_appendf(msg, "invalid addr '%s'\n", addr);
+   goto out;
+   }
+   set_srv_agent_addr(s, );
+   }
+   if (port) {
+   if (strl2irc(port, strlen(port), _port) != 0) {
+   chunk_appendf(msg, "provided port is not an integer\n");
+   goto out;
+   }
+   if (new_port < 0 || new_port > 65535) {
+   chunk_appendf(msg, "provided port is invalid\n");
+   goto out;
+   }
+   set_srv_agent_port(s, new_port);
+   }
+out:
+   return msg->area;
+}
+
 /* update server health check address and port
  * addr must be ip4 or ip6, it won't be resolved
  * must be called with the server lock held.
@@ -4428,15 +4471,31 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
cli_err(appctx, "'set server  agent' expects 'up' 
or 'down'.\n");
}
else if (strcmp(args[3], "agent-addr") == 0) {
-   struct sockaddr_storage sk;
-
-   memset(, 0, sizeof(sk));
-   if (!(sv->agent.state & CHK_ST_ENABLED))
-   cli_err(appctx, "agent checks are not enabled on this 
server.\n");
-   else if (str2ip(args[4], ))
-   set_srv_agent_addr(sv, );
-   else
-   cli_err(appctx, "incorrect addr address given for 
agent.\n");
+   char *addr = NULL;
+   char *port = NULL;
+   if (strlen(args[4]) == 0) {
+   cli_err(appctx, "set server / agent-addr requires"
+   " an address and optionally a port.\n");
+   goto out_unlock;
+   }
+   addr = args[4];
+   if (strcmp(args[5], "port") == 0)
+   port = args[6];
+   warning = update_server_agent_addr_port(sv, addr, port);
+   if (warning)
+   cli_

[PATCH 0/6] cli commands coherency

2021-02-06 Thread William Dauchy
Hello,

This is a followup from last week cleaning regarding check and agent
check. This patch series brings some more coherency on the CLI side. I
also put some minor cleaning.

William Dauchy (6):
  CLEANUP: check: fix some typo in comments
  CLEANUP: tools: typo in `strl2irc` mention
  MEDIUM: cli: add check-addr command
  MEDIUM: cli: add agent-port command
  MEDIUM: server: support {check,agent}_addr, agent_port in server state
  CLEANUP: server: add missing space in server-state error output

 doc/management.txt|  15 +-
 include/haproxy/server-t.h|   9 +-
 .../checks/1be_40srv_odd_health_checks.vtc|   2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|   2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |   6 +-
 src/check.c   |  18 +-
 src/proxy.c   |  41 ++--
 src/server.c  | 192 ++
 src/tools.c   |   2 +-
 9 files changed, 213 insertions(+), 74 deletions(-)

-- 
2.30.0




[PATCH 1/6] CLEANUP: check: fix some typo in comments

2021-02-06 Thread William Dauchy
a few obvious english typo in comments, some of which introduced by
myself quite recently

Signed-off-by: William Dauchy 
---
 src/check.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/check.c b/src/check.c
index edb2ac29f..5de867d7f 100644
--- a/src/check.c
+++ b/src/check.c
@@ -1004,7 +1004,7 @@ int check_buf_available(void *target)
 }
 
 /*
- * Allocate a buffer. If if fails, it adds the check in buffer wait queue.
+ * Allocate a buffer. If it fails, it adds the check in buffer wait queue.
  */
 struct buffer *check_get_buf(struct check *check, struct buffer *bptr)
 {
@@ -1211,10 +1211,10 @@ static int start_checks()
 
srand((unsigned)time(NULL));
 
-   /*
-* 2- start them as far as possible from each others. For this, we will
-* start them after their interval set to the min interval divided by
-* the number of servers, weighted by the server's position in the list.
+   /* 2- start them as far as possible from each other. For this, we will
+* start them after their interval is set to the min interval divided
+* by the number of servers, weighted by the server's position in the
+* list.
 */
for (px = proxies_list; px; px = px->next) {
if ((px->options2 & PR_O2_CHK_ANY) == PR_O2_EXT_CHK) {
@@ -1261,7 +1261,7 @@ static int srv_check_healthcheck_port(struct check *chk)
 
srv = chk->server;
 
-   /* by default, we use the health check port ocnfigured */
+   /* by default, we use the health check port configured */
if (chk->port > 0)
return chk->port;
 
@@ -1734,14 +1734,14 @@ int set_srv_agent_send(struct server *srv, const char 
*send)
return 0;
 }
 
-/* set agent addr and apprropriate flag */
+/* set agent addr and appropriate flag */
 inline void set_srv_agent_addr(struct server *srv, struct sockaddr_storage *sk)
 {
srv->agent.addr = *sk;
srv->flags |= SRV_F_AGENTADDR;
 }
 
-/* set agent port and apprropriate flag */
+/* set agent port and appropriate flag */
 inline void set_srv_agent_port(struct server *srv, int port)
 {
srv->agent.port = port;
@@ -2092,7 +2092,7 @@ static struct srv_kw_list srv_kws = { "CHK", { }, {
{ "check-via-socks4",srv_parse_check_via_socks4,0,  1 }, /* 
Enable socks4 proxy for health checks */
{ "no-agent-check",  srv_parse_no_agent_check,  0,  1 }, /* Do 
not enable any auxiliary agent check */
{ "no-check",srv_parse_no_check,0,  1 }, /* 
Disable health checks */
-   { "no-check-send-proxy", srv_parse_no_check_send_proxy, 0,  1 }, /* 
Disable PROXY protol for health checks */
+   { "no-check-send-proxy", srv_parse_no_check_send_proxy, 0,  1 }, /* 
Disable PROXY protocol for health checks */
{ "rise",srv_parse_check_rise,  1,  1 }, /* Set 
rise value for health checks */
{ "fall",srv_parse_check_fall,  1,  1 }, /* Set 
fall value for health checks */
{ "inter",   srv_parse_check_inter, 1,  1 }, /* Set 
inter value for health checks */
-- 
2.30.0




[PATCH 6/6] CLEANUP: server: add missing space in server-state error output

2021-02-06 Thread William Dauchy
a space was missing in the output to make it more readable.

Signed-off-by: William Dauchy 
---
 src/server.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/server.c b/src/server.c
index 42191eda5..33375e638 100644
--- a/src/server.c
+++ b/src/server.c
@@ -3017,7 +3017,7 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
  out:
if (msg->data) {
chunk_appendf(msg, "\n");
-   ha_warning("server-state application failed for server 
'%s/%s'%s",
+   ha_warning("server-state application failed for server '%s/%s' 
%s",
   srv->proxy->id, srv->id, msg->area);
}
 }
-- 
2.30.0




[PATCH 3/6] MEDIUM: cli: add check-addr command

2021-02-06 Thread William Dauchy
this patch allows to set server health check address at runtime. In
order to align with `addr` command, also allow to set port optionnaly.
This led to a small refactor in order to use the same function for both
`check-addr` and `check-port` commands.

This command becomes more and more useful for people having a consul
like architecture:
- the backend server is located on a container with its own IP
- the health checks are done the consul instance located on the host
  with the host IP

Signed-off-by: William Dauchy 
---
 doc/management.txt |  4 +++
 src/server.c   | 83 +-
 2 files changed, 72 insertions(+), 15 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index b74aba769..bff770e4e 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -1842,6 +1842,10 @@ set server / health [ up | stopping | 
down ]
   switch a server's state regardless of some slow health checks for example.
   Note that the change is propagated to tracking servers if any.
 
+set server / check-addr  [port ]
+  Change the IP address used for server health checks.
+  Optionally, change the port used for server health checks.
+
 set server / check-port 
   Change the port used for health checking to 
 
diff --git a/src/server.c b/src/server.c
index da2325e9a..533755f1e 100644
--- a/src/server.c
+++ b/src/server.c
@@ -54,6 +54,8 @@ static int srv_set_fqdn(struct server *srv, const char *fqdn, 
int dns_locked);
 static void srv_state_parse_line(char *buf, const int version, char **params, 
char **srv_params);
 static int srv_state_get_version(FILE *f);
 static void srv_cleanup_connections(struct server *srv);
+static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
+const char *port);
 
 /* List head of all known server keywords */
 static struct srv_kw_list srv_keywords = {
@@ -3571,6 +3573,47 @@ int update_server_addr(struct server *s, void *ip, int 
ip_sin_family, const char
return 0;
 }
 
+/* update server health check address and port
+ * addr must be ip4 or ip6, it won't be resolved
+ * must be called with the server lock held.
+ */
+static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
+const char *port)
+{
+   struct sockaddr_storage sk;
+   struct buffer *msg;
+   int new_port;
+
+   msg = get_trash_chunk();
+
+   if (addr) {
+   memset(, 0, sizeof(struct sockaddr_storage));
+   if (str2ip2(addr, , 0) == NULL) {
+   chunk_appendf(msg, "invalid addr '%s'\n", addr);
+   goto out;
+   }
+   s->check.addr = sk;
+   }
+   if (port) {
+   if (strl2irc(port, strlen(port), _port) != 0) {
+   chunk_appendf(msg, "provided port is not an integer\n");
+   goto out;
+   }
+   if (new_port < 0 || new_port > 65535) {
+   chunk_appendf(msg, "provided port is invalid\n");
+   goto out;
+   }
+   /* prevent the update of port to 0 if MAPPORTS are in use */
+   if ((s->flags & SRV_F_MAPPORTS) && new_port == 0) {
+   chunk_appendf(msg, "can't unset 'port' since MAPPORTS 
is in use\n");
+   goto out;
+   }
+   s->check.port = new_port;
+   }
+out:
+   return msg->area;
+}
+
 /*
  * This function update a server's addr and port only for AF_INET and AF_INET6 
families.
  *
@@ -4403,23 +4446,32 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
cli_err(appctx, "cannot allocate memory for new 
string.\n");
}
}
-   else if (strcmp(args[3], "check-port") == 0) {
-   int i = 0;
-   if (strl2irc(args[4], strlen(args[4]), ) != 0) {
-   cli_err(appctx, "'set server  check-port' expects 
an integer as argument.\n");
-   goto out_unlock;
-   }
-   if ((i < 0) || (i > 65535)) {
-   cli_err(appctx, "provided port is not valid.\n");
+   else if (strcmp(args[3], "check-addr") == 0) {
+   char *addr = NULL;
+   char *port = NULL;
+   if (strlen(args[4]) == 0) {
+   cli_err(appctx, "set server / check-addr requires"
+   " an address and optionally a port.\n");
goto out_unlock;
}
-   /* prevent the update of port to 0 if MAPPORTS are in use */
-   if ((sv->flags & SRV_F_MAPPORTS) && (

Re: [PATCH v3 0/5] fix check port/addr consistency

2021-02-04 Thread William Dauchy
Hello Christopher,

Thanks for your continuous support on this review :)

On Thu, Feb 4, 2021 at 10:09 AM Christopher Faulet  wrote:
> Thanks William, it seems good. I've just a question (sorry :). When the server
> state file is loaded, why the check port is set only if there is a DNS
> resolution ? I know it was done this way before the support of check_port
> parameter in the server state. But I suppose that now we support it, it should
> always be set, isn't it ?

yes you are totally right.

> And is there any usage to add "agent-addr" in the server state file? Because 
> it
> can also be set on the CLI. And later, same question will appear for
> "check-addr" and "agent-port".

the general rule is indeed, anything which can be changed through the
CLI should be loaded through server states. Truth is server states
becomes a nightmare to manage; I don't remember if you were in the
discussions a few months ago, but we concluded we should change that
long term by https://github.com/haproxy/haproxy/issues/953
So while waiting for a proper solution, we are indeed supposed to keep
server states up to date.

> I don't want to bother you again. So, I propose you to merge the first patch 
> and
> to add a new one to not set the check port when the server state file is 
> loaded.
> Then I can merge the third patch and amend the second one to move the check 
> port
> assignment before merging it. And finally I can merge the fourth and fifth 
> patches.

Don't worry, I can send you a v4.
--
William



[PATCH v3 5/5] MEDIUM: check: align agentaddr and agentport behaviour

2021-02-03 Thread William Dauchy
in the same manner of agentaddr, we now:
- permit to set agentport through `port` keyword, like it is the case
  for agentaddr through `addr`
- set the priority on `agent-port` keyword when used
- add a flag to be able to test when the value is set like for agentaddr

it makes the behaviour between `addr` and `port` more consistent.

Signed-off-by: William Dauchy 
---
 doc/configuration.txt  | 10 +-
 include/haproxy/check.h|  1 +
 include/haproxy/server-t.h |  1 +
 src/check.c| 12 +++-
 4 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 2abd684fa..eb685785d 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -14061,11 +14061,11 @@ pool-purge-delay 
 
 port 
   Using the "port" parameter, it becomes possible to use a different port to
-  send health-checks. On some servers, it may be desirable to dedicate a port
-  to a specific component able to perform complex tests which are more suitable
-  to health-checks than the application. It is common to run a simple script in
-  inetd for instance. This parameter is ignored if the "check" parameter is not
-  set. See also the "addr" parameter.
+  send health-checks or to probe the agent-check. On some servers, it may be
+  desirable to dedicate a port to a specific component able to perform complex
+  tests which are more suitable to health-checks than the application. It is
+  common to run a simple script in inetd for instance. This parameter is
+  ignored if the "check" parameter is not set. See also the "addr" parameter.
 
 proto 
   Forces the multiplexer's protocol to use for the outgoing connections to this
diff --git a/include/haproxy/check.h b/include/haproxy/check.h
index ed8124470..ffeef4e22 100644
--- a/include/haproxy/check.h
+++ b/include/haproxy/check.h
@@ -53,6 +53,7 @@ int spoe_handle_healthcheck_response(char *frame, size_t 
size, char *err, int er
 
 int set_srv_agent_send(struct server *srv, const char *send);
 void set_srv_agent_addr(struct server *srv, struct sockaddr_storage *sk);
+void set_srv_agent_port(struct server *srv, int port);
 
 /* Use this one only. This inline version only ensures that we don't
  * call the function when the observe mode is disabled.
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index 45f41959c..32697a9c4 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -138,6 +138,7 @@ enum srv_initaddr {
 #define SRV_F_NON_STICK0x0004/* never add connections allocated to 
this server to a stick table */
 #define SRV_F_USE_NS_FROM_PP 0x0008  /* use namespace associated with 
connection if present */
 #define SRV_F_FORCED_ID0x0010/* server's ID was forced in the 
configuration */
+#define SRV_F_AGENTPORT0x0040/* this server has a agent port 
configured */
 #define SRV_F_AGENTADDR0x0080/* this server has a agent addr 
configured */
 #define SRV_F_COOKIESET0x0100/* this server has a cookie 
configured, so don't generate dynamic cookies */
 #define SRV_F_FASTOPEN 0x0200/* Use TCP Fast Open to connect to 
server */
diff --git a/src/check.c b/src/check.c
index 31f108bff..194f3b4a6 100644
--- a/src/check.c
+++ b/src/check.c
@@ -1697,7 +1697,7 @@ static int srv_parse_agent_port(char **args, int 
*cur_arg, struct proxy *curpx,
}
 
global.maxsock++;
-   srv->agent.port = atol(args[*cur_arg+1]);
+   set_srv_agent_port(srv, atol(args[*cur_arg + 1]));
 
   out:
return err_code;
@@ -1741,6 +1741,13 @@ inline void set_srv_agent_addr(struct server *srv, 
struct sockaddr_storage *sk)
srv->flags |= SRV_F_AGENTADDR;
 }
 
+/* set agent port and apprropriate flag */
+inline void set_srv_agent_port(struct server *srv, int port)
+{
+   srv->agent.port = port;
+   srv->flags |= SRV_F_AGENTPORT;
+}
+
 /* Parse the "agent-send" server keyword */
 static int srv_parse_agent_send(char **args, int *cur_arg, struct proxy 
*curpx, struct server *srv,
char **errmsg)
@@ -2060,6 +2067,9 @@ static int srv_parse_check_port(char **args, int 
*cur_arg, struct proxy *curpx,
 
global.maxsock++;
srv->check.port = atol(args[*cur_arg+1]);
+   /* if agentport was never set, we can use port */
+   if (!(srv->flags & SRV_F_AGENTPORT))
+   set_srv_agent_port(srv, srv->check.port);
 
   out:
return err_code;
-- 
2.30.0




[PATCH v3 3/5] MEDIUM: check: remove checkport checkaddr flag

2021-02-03 Thread William Dauchy
While trying to fix some consistency problem with the config file/cli
(e.g. check-port cli command does not set the flag), we realised
checkport flag was not necessarily needed. Indeed tcpcheck uses service
port as the last choice if check.port is zero. So we can assume if
check.port is zero, it means it was never set by the user, regardless if
it is by the cli or config file.  In the longterm this will avoid to
introduce a new consistency issue if we forget to set the flag.

in the same manner of checkport flag, we don't really need checkaddr
flag. We can assume if checkaddr is not set, it means it was never set
by the user or config.

Signed-off-by: William Dauchy 
---
 include/haproxy/server-t.h | 2 --
 src/check.c| 2 --
 src/dns.c  | 3 +--
 src/server.c   | 7 ++-
 4 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index e42b1c7ed..45f41959c 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -138,8 +138,6 @@ enum srv_initaddr {
 #define SRV_F_NON_STICK0x0004/* never add connections allocated to 
this server to a stick table */
 #define SRV_F_USE_NS_FROM_PP 0x0008  /* use namespace associated with 
connection if present */
 #define SRV_F_FORCED_ID0x0010/* server's ID was forced in the 
configuration */
-#define SRV_F_CHECKADDR0x0020/* this server has a check addr 
configured */
-#define SRV_F_CHECKPORT0x0040/* this server has a check port 
configured */
 #define SRV_F_AGENTADDR0x0080/* this server has a agent addr 
configured */
 #define SRV_F_COOKIESET0x0100/* this server has a cookie 
configured, so don't generate dynamic cookies */
 #define SRV_F_FASTOPEN 0x0200/* Use TCP Fast Open to connect to 
server */
diff --git a/src/check.c b/src/check.c
index 879fe84ce..e24050ff2 100644
--- a/src/check.c
+++ b/src/check.c
@@ -1527,7 +1527,6 @@ static int srv_parse_addr(char **args, int *cur_arg, 
struct proxy *curpx, struct
}
 
srv->check.addr = srv->agent.addr = *sk;
-   srv->flags |= SRV_F_CHECKADDR;
srv->flags |= SRV_F_AGENTADDR;
 
   out:
@@ -2050,7 +2049,6 @@ static int srv_parse_check_port(char **args, int 
*cur_arg, struct proxy *curpx,
 
global.maxsock++;
srv->check.port = atol(args[*cur_arg+1]);
-   srv->flags |= SRV_F_CHECKPORT;
 
   out:
return err_code;
diff --git a/src/dns.c b/src/dns.c
index 8c97df46b..8fc9089e7 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -712,8 +712,7 @@ static void dns_check_dns_response(struct dns_resolution 
*res)
 
srv->svc_port = item->port;
srv->flags   &= ~SRV_F_MAPPORTS;
-   if ((srv->check.state & CHK_ST_CONFIGURED) &&
-   !(srv->flags & SRV_F_CHECKPORT))
+   if (!srv->check.port)
srv->check.port = item->port;
 
if (!srv->dns_opts.ignore_weight) {
diff --git a/src/server.c b/src/server.c
index d8216058f..1fd71e403 100644
--- a/src/server.c
+++ b/src/server.c
@@ -1660,8 +1660,6 @@ static void srv_settings_cpy(struct server *srv, struct 
server *src, int srv_tmp
srv->flags   |= src->flags;
srv->do_check = src->do_check;
srv->do_agent = src->do_agent;
-   if (srv->check.port)
-   srv->flags |= SRV_F_CHECKPORT;
srv->check.inter  = src->check.inter;
srv->check.fastinter  = src->check.fastinter;
srv->check.downinter  = src->check.downinter;
@@ -2985,8 +2983,7 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
}
 
/* configure check.port accordingly */
-   if ((srv->check.state & CHK_ST_CONFIGURED) &&
-   !(srv->flags & SRV_F_CHECKPORT))
+   if (!srv->check.port)
srv->check.port = port_check;
 
/* Unset SRV_F_MAPPORTS for SRV records.
@@ -3673,7 +3670,7 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
 * we're switching from a fixed port to a 
SRV_F_MAPPORTS (mapped) port
 * prevent PORT change if check doesn't have 
it's dedicated port while switching
 * to port mapping */
-   if ((s->check.state & CHK_ST_CONFIGURED) && 
!(s->flags & SRV_F_CHECKPORT)) {
+  

[PATCH v3 0/5] fix check port/addr consistency

2021-02-03 Thread William Dauchy
Hello Christopher,

Here is my last update on port/addr consistency. I addressed all the
point you mentioned. I hope I did not forgot anything.  I will come back
with `check-addr` and `agent-port` on the cli once those patches are
accepted.

William Dauchy (5):
  BUG/MINOR: cli: fix set server addr/port coherency with health checks
  MEDIUM: server: adding support for check_port in server state
  MEDIUM: check: remove checkport checkaddr flag
  BUG/MINOR: check: consitent way to set agentaddr
  MEDIUM: check: align agentaddr and agentport behaviour

 doc/configuration.txt | 10 +--
 doc/management.txt|  1 +
 include/haproxy/check.h   |  2 +
 include/haproxy/server-t.h| 10 +--
 .../checks/1be_40srv_odd_health_checks.vtc|  2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|  2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |  6 +-
 src/check.c   | 33 --
 src/dns.c |  3 +-
 src/proxy.c   |  4 +-
 src/server.c  | 64 +--
 11 files changed, 78 insertions(+), 59 deletions(-)

-- 
2.30.0




[PATCH v3 1/5] BUG/MINOR: cli: fix set server addr/port coherency with health checks

2021-02-03 Thread William Dauchy
while reading `update_server_addr_port` I found out some things which
can be seen as incoherency. I hope I did not overlooked anything:

- one comment is stating check's address should be updated if it uses
  the server one; however the condition checks if `SRV_F_CHECKADDR` is
  set; this flag is set when a check address is set; result is that we
  override the check address where I was not expecting it. In fact we
  don't need to update anything here as server addr is used when check
  addr is not set.
- same goes for check agent addr
- for port, it is a bit different, we update the check port if it is
  unset. This is harmless because we also use server port if check port
  is unset. However it creates some incoherency before/after using this
  command, as check port should stay unset througout the life of the
  process unless it is is set by `set server check-port` command.

quite hard to locate the origin of this this issue but the function was
introduced in commit d458adcc52b74608e2fe6a2a95f09ce5e94932b7 ("MINOR:
new update_server_addr_port() function to change both server's ADDR and
service PORT"). I was however not able to determine whether this is due
to a change of behavior along the years. So this patch can potentially
be backported up to v1.8 but we must be careful while doing so, as the
code has changed a lot. That being said, the bug being not very
impacting I would be fine keeping it for 2.4 only.

Signed-off-by: William Dauchy 
---
 src/server.c | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/src/server.c b/src/server.c
index 10f528640..99b7e9181 100644
--- a/src/server.c
+++ b/src/server.c
@@ -3625,16 +3625,6 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
ipcpy(, >addr);
changed = 1;
 
-   /* we also need to update check's ADDR only if it uses the 
server's one */
-   if ((s->check.state & CHK_ST_CONFIGURED) && (s->flags & 
SRV_F_CHECKADDR)) {
-   ipcpy(, >check.addr);
-   }
-
-   /* we also need to update agent ADDR only if it use the 
server's one */
-   if ((s->agent.state & CHK_ST_CONFIGURED) && (s->flags & 
SRV_F_AGENTADDR)) {
-   ipcpy(, >agent.addr);
-   }
-
/* update report for caller */
chunk_printf(msg, "IP changed from '%s' to '%s'", current_addr, 
addr);
}
@@ -3714,11 +3704,6 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
}
 
chunk_appendf(msg, "%d'", new_port);
-
-   /* we also need to update health checks port only if it 
uses server's realport */
-   if ((s->check.state & CHK_ST_CONFIGURED) && !(s->flags 
& SRV_F_CHECKPORT)) {
-   s->check.port = new_port;
-   }
}
else {
chunk_appendf(msg, "no need to change the port");
-- 
2.30.0




[PATCH v3 4/5] BUG/MINOR: check: consitent way to set agentaddr

2021-02-03 Thread William Dauchy
small consistency problem with `addr` and `agent-addr` options:
for the both options, the last one parsed is always used to set the
agent-check addr.  Thus these two lines don't have the same behavior:

  server ... addr  agent-addr 
  server ... agent-addr  addr 

After this patch `agent-addr` will always be the priority option over
`addr`. It means we test the flag before setting agentaddr.
We also fix all the places where we did not set the flag to be coherent
everywhere.

I was not really able to determine where this issue is coming from. So
it is probable we may backport it to all stable version where the agent
is supported.

Signed-off-by: William Dauchy 
---
 include/haproxy/check.h |  1 +
 src/check.c | 19 +++
 src/server.c|  7 ++-
 3 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/include/haproxy/check.h b/include/haproxy/check.h
index 8d70040a3..ed8124470 100644
--- a/include/haproxy/check.h
+++ b/include/haproxy/check.h
@@ -52,6 +52,7 @@ int spoe_prepare_healthcheck_request(char **req, int *len);
 int spoe_handle_healthcheck_response(char *frame, size_t size, char *err, int 
errlen);
 
 int set_srv_agent_send(struct server *srv, const char *send);
+void set_srv_agent_addr(struct server *srv, struct sockaddr_storage *sk);
 
 /* Use this one only. This inline version only ensures that we don't
  * call the function when the observe mode is disabled.
diff --git a/src/check.c b/src/check.c
index e24050ff2..31f108bff 100644
--- a/src/check.c
+++ b/src/check.c
@@ -1526,8 +1526,10 @@ static int srv_parse_addr(char **args, int *cur_arg, 
struct proxy *curpx, struct
goto error;
}
 
-   srv->check.addr = srv->agent.addr = *sk;
-   srv->flags |= SRV_F_AGENTADDR;
+   srv->check.addr = *sk;
+   /* if agentaddr was never set, we can use addr */
+   if (!(srv->flags & SRV_F_AGENTADDR))
+   set_srv_agent_addr(srv, sk);
 
   out:
return err_code;
@@ -1537,21 +1539,23 @@ static int srv_parse_addr(char **args, int *cur_arg, 
struct proxy *curpx, struct
goto out;
 }
 
-
 /* Parse the "agent-addr" server keyword */
 static int srv_parse_agent_addr(char **args, int *cur_arg, struct proxy 
*curpx, struct server *srv,
char **errmsg)
 {
+   struct sockaddr_storage sk;
int err_code = 0;
 
if (!*(args[*cur_arg+1])) {
memprintf(errmsg, "'%s' expects an address as argument.", 
args[*cur_arg]);
goto error;
}
-   if(str2ip(args[*cur_arg+1], >agent.addr) == NULL) {
+   memset(, 0, sizeof(sk));
+   if (str2ip(args[*cur_arg + 1], ) == NULL) {
memprintf(errmsg, "parsing agent-addr failed. Check if '%s' is 
correct address.", args[*cur_arg+1]);
goto error;
}
+   set_srv_agent_addr(srv, );
 
   out:
return err_code;
@@ -1730,6 +1734,13 @@ int set_srv_agent_send(struct server *srv, const char 
*send)
return 0;
 }
 
+/* set agent addr and apprropriate flag */
+inline void set_srv_agent_addr(struct server *srv, struct sockaddr_storage *sk)
+{
+   srv->agent.addr = *sk;
+   srv->flags |= SRV_F_AGENTADDR;
+}
+
 /* Parse the "agent-send" server keyword */
 static int srv_parse_agent_send(char **args, int *cur_arg, struct proxy 
*curpx, struct server *srv,
char **errmsg)
diff --git a/src/server.c b/src/server.c
index 1fd71e403..b6b3b0893 100644
--- a/src/server.c
+++ b/src/server.c
@@ -4384,9 +4384,14 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
cli_err(appctx, "'set server  agent' expects 'up' 
or 'down'.\n");
}
else if (strcmp(args[3], "agent-addr") == 0) {
+   struct sockaddr_storage sk;
+
+   memset(, 0, sizeof(sk));
if (!(sv->agent.state & CHK_ST_ENABLED))
cli_err(appctx, "agent checks are not enabled on this 
server.\n");
-   else if (str2ip(args[4], >agent.addr) == NULL)
+   else if (str2ip(args[4], ))
+   set_srv_agent_addr(sv, );
+   else
cli_err(appctx, "incorrect addr address given for 
agent.\n");
}
else if (strcmp(args[3], "agent-send") == 0) {
-- 
2.30.0




[PATCH v3 2/5] MEDIUM: server: adding support for check_port in server state

2021-02-03 Thread William Dauchy
We can currently change the check-port using the cli command `set server
check-port` but there is a consistency issue when using server state.
This patch aims to fix this problem but will be also a good preparation
work to get rid of checkport flag, so we are able to know when checkport
was set by config.

I am fully aware this is not making github #953 moving forward, I
however think this might be acceptable while waiting for a proper
solution and resolve consistency problem faced with port settings.

Signed-off-by: William Dauchy 
---
 doc/management.txt|  1 +
 include/haproxy/server-t.h|  7 ++--
 .../checks/1be_40srv_odd_health_checks.vtc|  2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|  2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |  6 ++--
 src/proxy.c   |  4 +--
 src/server.c  | 35 ---
 7 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index de3fbf607..b74aba769 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -2447,6 +2447,7 @@ show servers state []
  srv_port:Server port.
  srvrecord:   DNS SRV record associated to this SRV.
  srv_use_ssl: use ssl for server connections.
+ srv_check_port:  Server check port.
 
 show sess
   Dump all known sessions. Avoid doing this on slow connections as this can
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index b29c75c0b..e42b1c7ed 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -125,10 +125,11 @@ enum srv_initaddr {
 "srv_fqdn "   \
 "srv_port "   \
 "srvrecord "  \
-"srv_use_ssl"
+"srv_use_ssl "\
+"srv_check_port"
 
-#define SRV_STATE_FILE_MAX_FIELDS 21
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20
+#define SRV_STATE_FILE_MAX_FIELDS 22
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc 
b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
index bd07d8840..f01205295 100644
--- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc
+++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
@@ -112,6 +112,6 @@ syslog S -wait
 
 haproxy h1 -cli {
 send "show servers state"
-expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state 
srv_admin_state srv_uweight srv_iweight srv_time_since_last_change 
srv_check_status srv_check_result srv_check_health srv_check_state 
srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord 
srv_use_ssl\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s0_port} - 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ 
){3}0 0 0 - ${s1_port} - 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 
0 0 0 0 - ${s2_port} - 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0\n2 be1 5 srv4 ${s4_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0\n2 be1 6 srv5 ${s5_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0\n2 be1 7 srv6 
${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0\n2 be1 8 srv7 
${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s7_port} - 0\n2 
be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s8_port} - 0\n2 
be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - 
${s9_port} - 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s10_port} - 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0\n2 be1 13 srv12 ${s12_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0\n2 be1 14 srv13 ${s13_addr} 2 0 1 
1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} - 0\n2 be1 15 srv14 
${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s14_port} - 0\n2 be1 16 
srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s15_port} 
- 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s16_port} - 0\n2 be1 18 srv17 ${s17_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0\n2 be1 19 srv18 ${s18_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0\n2 be1 20 srv19 ${s19_addr} 2 0 1 
1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} - 0\n2 be1 21 srv20 
${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s20_port} - 0\n2 be1 22 
srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s21_port} 
- 0\n2 be1 23 srv22 ${s22_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s22_port} - 0\n2 be1 24 srv23 ${s23_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s23_port} - 0\n2 be1 25 srv24 ${s24_addr} 2 0 1 1 
[[:digit:]]+ 1 

Re: [PATCH v2 0/5] fix check port/addr consistency

2021-02-03 Thread William Dauchy
On Wed, Feb 3, 2021 at 9:59 AM Christopher Faulet  wrote:
> At first glance, I'm just a bit annoyed with the patch 5. In the 
> documentation,
> it is stated that "addr" option will be used for agent-check too. And there is
> no info about interactions between "addr" and "agent-addr" options when both 
> are
> configured. For me, for an agent-check, the "agent-addr" option must be used 
> in
> priority, regardless the options order. If not defined, the "addr" option must
> be used, if defined. And at the end, we use the server address by default if
> none is specified.

ok I missed that part. it is a bit sad honestly, it makes things less explicit.

> There is another problem with "port" and "agent-port" options. It is mentioned
> in the documentation that "agent-port" is required to perform agent-checks. 
> And
> "port" option is not used for agent-check. It is not really consistent. I
> propose to deal with port/agent-port options in the same way than
> addr/agent-addr ones. And we keep the test to be sure to have a dedicated port
> for the agent-check to not use the server's one. This way, we keep backward
> compatibility and improve consistency.
> Thus, if you agree with these changes, I guess we should keep SRV_F_AGENTADDR
> flag and add SRV_F_AGENTPORT.

ok I will come back with a v3. I honestly don't like to have port/addr
used for agent, as stated previously because it is creating some non
explicit things. But indeed as we need to keep backward compat, I will
fix that.

> About the CLI. I agree, the "check-addr" parameter on the "set server" command
> must be added. And the "agent-port" parameter must bee added too. For me, it 
> is
> a consistency matter. But I understand it is also mandatory for dynamic
> environments.

thanks, let me come back with a v3, so we can move forward for the
next improvements.
-- 
William



Re: [PATCH v2 4/5] MEDIUM: get rid of checkaddr and agentaddr flag

2021-02-02 Thread William Dauchy
On Tue, Feb 2, 2021 at 10:57 PM William Dauchy  wrote:
> in the same manner of checkport flag, we found out some consistency
> problem. For instance, SRV_F_AGENTADDR flag is set when the "addr"
> option is set on a server line but not for the "agent-addr" option.
> The issue is similar in the cli where we never set the flag.
> A consequence of this patch is that agentaddr is no longer used, at
> least for now. We might reintroduce it later if we need it.

sorry for that, I forgot "server:" in the subject. will send v3 if
there are other feedbacks
-- 
William



[PATCH v2 4/5] MEDIUM: get rid of checkaddr and agentaddr flag

2021-02-02 Thread William Dauchy
in the same manner of checkport flag, we found out some consistency
problem. For instance, SRV_F_AGENTADDR flag is set when the "addr"
option is set on a server line but not for the "agent-addr" option.
The issue is similar in the cli where we never set the flag.
A consequence of this patch is that agentaddr is no longer used, at
least for now. We might reintroduced it later if we need it.

Signed-off-by: William Dauchy 
---
 include/haproxy/server-t.h | 2 --
 src/check.c| 2 --
 2 files changed, 4 deletions(-)

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index 131b97cb6..382caa218 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -138,8 +138,6 @@ enum srv_initaddr {
 #define SRV_F_NON_STICK0x0004/* never add connections allocated to 
this server to a stick table */
 #define SRV_F_USE_NS_FROM_PP 0x0008  /* use namespace associated with 
connection if present */
 #define SRV_F_FORCED_ID0x0010/* server's ID was forced in the 
configuration */
-#define SRV_F_CHECKADDR0x0020/* this server has a check addr 
configured */
-#define SRV_F_AGENTADDR0x0080/* this server has a agent addr 
configured */
 #define SRV_F_COOKIESET0x0100/* this server has a cookie 
configured, so don't generate dynamic cookies */
 #define SRV_F_FASTOPEN 0x0200/* Use TCP Fast Open to connect to 
server */
 #define SRV_F_SOCKS4_PROXY 0x0400/* this server uses SOCKS4 proxy */
diff --git a/src/check.c b/src/check.c
index f4b11fc46..f4d5bd452 100644
--- a/src/check.c
+++ b/src/check.c
@@ -1527,8 +1527,6 @@ static int srv_parse_addr(char **args, int *cur_arg, 
struct proxy *curpx, struct
}
 
srv->check.addr = srv->agent.addr = *sk;
-   srv->flags |= SRV_F_CHECKADDR;
-   srv->flags |= SRV_F_AGENTADDR;
 
   out:
return err_code;
-- 
2.29.2




[PATCH v2 2/5] MEDIUM: server: adding support for check_port in server state

2021-02-02 Thread William Dauchy
We can currently change the check-port using the cli command `set server
check-port` but there is a consistency issue when using server state.
This patch aims to fix this problem but will be also a good preparation
work to get rid of checkport flag, so we are able to know when checkport
was set by config.

I am fully aware this is not making github #953 moving forward, I
however think this might be acceptable while waiting for a proper
solution and resolve consistency problem faced with port settings.

Signed-off-by: William Dauchy 
---
 doc/management.txt|  1 +
 include/haproxy/server-t.h|  7 ++--
 .../checks/1be_40srv_odd_health_checks.vtc|  2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|  2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |  6 ++--
 src/proxy.c   |  4 +--
 src/server.c  | 35 ---
 7 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index de3fbf607..b74aba769 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -2447,6 +2447,7 @@ show servers state []
  srv_port:Server port.
  srvrecord:   DNS SRV record associated to this SRV.
  srv_use_ssl: use ssl for server connections.
+ srv_check_port:  Server check port.
 
 show sess
   Dump all known sessions. Avoid doing this on slow connections as this can
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index b29c75c0b..e42b1c7ed 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -125,10 +125,11 @@ enum srv_initaddr {
 "srv_fqdn "   \
 "srv_port "   \
 "srvrecord "  \
-"srv_use_ssl"
+"srv_use_ssl "\
+"srv_check_port"
 
-#define SRV_STATE_FILE_MAX_FIELDS 21
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20
+#define SRV_STATE_FILE_MAX_FIELDS 22
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc 
b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
index bd07d8840..f01205295 100644
--- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc
+++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
@@ -112,6 +112,6 @@ syslog S -wait
 
 haproxy h1 -cli {
 send "show servers state"
-expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state 
srv_admin_state srv_uweight srv_iweight srv_time_since_last_change 
srv_check_status srv_check_result srv_check_health srv_check_state 
srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord 
srv_use_ssl\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s0_port} - 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ 
){3}0 0 0 - ${s1_port} - 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 
0 0 0 0 - ${s2_port} - 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0\n2 be1 5 srv4 ${s4_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0\n2 be1 6 srv5 ${s5_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0\n2 be1 7 srv6 
${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0\n2 be1 8 srv7 
${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s7_port} - 0\n2 
be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s8_port} - 0\n2 
be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - 
${s9_port} - 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s10_port} - 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0\n2 be1 13 srv12 ${s12_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0\n2 be1 14 srv13 ${s13_addr} 2 0 1 
1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} - 0\n2 be1 15 srv14 
${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s14_port} - 0\n2 be1 16 
srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s15_port} 
- 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s16_port} - 0\n2 be1 18 srv17 ${s17_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0\n2 be1 19 srv18 ${s18_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0\n2 be1 20 srv19 ${s19_addr} 2 0 1 
1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} - 0\n2 be1 21 srv20 
${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s20_port} - 0\n2 be1 22 
srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s21_port} 
- 0\n2 be1 23 srv22 ${s22_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s22_port} - 0\n2 be1 24 srv23 ${s23_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s23_port} - 0\n2 be1 25 srv24 ${s24_addr} 2 0 1 1 
[[:digit:]]+ 1 

[PATCH v2 3/5] MEDIUM: server: get rid of checkport flag

2021-02-02 Thread William Dauchy
While trying to fix some consistency problem with the config file/cli
(e.g. check-port cli command does not set the flag), we realised
checkport flag was not necessarily needed. Indeed tcpcheck uses service
port as the last choice if check.port is zero. So we can assume if
check.port is zero, it means it was never set by the user, regardless if
it is by the cli or config file.  In the longterm this will avoid to
introduce a new consistency issue if we forget to set the flag.

Signed-off-by: William Dauchy 
---
 include/haproxy/server-t.h | 1 -
 src/check.c| 1 -
 src/dns.c  | 3 +--
 src/server.c   | 7 ++-
 4 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index e42b1c7ed..131b97cb6 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -139,7 +139,6 @@ enum srv_initaddr {
 #define SRV_F_USE_NS_FROM_PP 0x0008  /* use namespace associated with 
connection if present */
 #define SRV_F_FORCED_ID0x0010/* server's ID was forced in the 
configuration */
 #define SRV_F_CHECKADDR0x0020/* this server has a check addr 
configured */
-#define SRV_F_CHECKPORT0x0040/* this server has a check port 
configured */
 #define SRV_F_AGENTADDR0x0080/* this server has a agent addr 
configured */
 #define SRV_F_COOKIESET0x0100/* this server has a cookie 
configured, so don't generate dynamic cookies */
 #define SRV_F_FASTOPEN 0x0200/* Use TCP Fast Open to connect to 
server */
diff --git a/src/check.c b/src/check.c
index 879fe84ce..f4b11fc46 100644
--- a/src/check.c
+++ b/src/check.c
@@ -2050,7 +2050,6 @@ static int srv_parse_check_port(char **args, int 
*cur_arg, struct proxy *curpx,
 
global.maxsock++;
srv->check.port = atol(args[*cur_arg+1]);
-   srv->flags |= SRV_F_CHECKPORT;
 
   out:
return err_code;
diff --git a/src/dns.c b/src/dns.c
index 8c97df46b..8fc9089e7 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -712,8 +712,7 @@ static void dns_check_dns_response(struct dns_resolution 
*res)
 
srv->svc_port = item->port;
srv->flags   &= ~SRV_F_MAPPORTS;
-   if ((srv->check.state & CHK_ST_CONFIGURED) &&
-   !(srv->flags & SRV_F_CHECKPORT))
+   if (!srv->check.port)
srv->check.port = item->port;
 
if (!srv->dns_opts.ignore_weight) {
diff --git a/src/server.c b/src/server.c
index d8216058f..1fd71e403 100644
--- a/src/server.c
+++ b/src/server.c
@@ -1660,8 +1660,6 @@ static void srv_settings_cpy(struct server *srv, struct 
server *src, int srv_tmp
srv->flags   |= src->flags;
srv->do_check = src->do_check;
srv->do_agent = src->do_agent;
-   if (srv->check.port)
-   srv->flags |= SRV_F_CHECKPORT;
srv->check.inter  = src->check.inter;
srv->check.fastinter  = src->check.fastinter;
srv->check.downinter  = src->check.downinter;
@@ -2985,8 +2983,7 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
}
 
/* configure check.port accordingly */
-   if ((srv->check.state & CHK_ST_CONFIGURED) &&
-   !(srv->flags & SRV_F_CHECKPORT))
+   if (!srv->check.port)
srv->check.port = port_check;
 
/* Unset SRV_F_MAPPORTS for SRV records.
@@ -3673,7 +3670,7 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
 * we're switching from a fixed port to a 
SRV_F_MAPPORTS (mapped) port
 * prevent PORT change if check doesn't have 
it's dedicated port while switching
 * to port mapping */
-   if ((s->check.state & CHK_ST_CONFIGURED) && 
!(s->flags & SRV_F_CHECKPORT)) {
+   if (!s->check.port) {
chunk_appendf(msg, "can't change  
to port map because it is incompatible with current health check port 
configuration (use 'port' statement from the 'server' directive.");
goto out;
}
-- 
2.29.2




[PATCH v2 0/5] fix check port/addr consistency

2021-02-02 Thread William Dauchy
Hello Christopher,

As discussed, I revisited my previous series regarding check addr and
port consistency. I don't think I missed anything.

I won't hide my aim here, I would like to add support to set
`check.addr` on the cli like it is possible for the service port. More
and more people have a setup with containers having their own IP, but
with a health check on the host (e.g. on the consul side for example).
So it becomes more and more needed.

That's mostly why I wanted to clarify the situation first with the
different things I've seen while revisting this part of the code.

William Dauchy (5):
  BUG/MINOR: cli: fix set server addr/port coherency with health checks
  MEDIUM: server: adding support for check_port in server state
  MEDIUM: server: get rid of checkport flag
  MEDIUM: get rid of checkaddr and agentaddr flag
  BUG/MINOR: server: don't set agent addr for addr parsing

 doc/management.txt|  1 +
 include/haproxy/server-t.h| 10 ++--
 .../checks/1be_40srv_odd_health_checks.vtc|  2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|  2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |  6 +-
 src/check.c   |  5 +-
 src/dns.c |  3 +-
 src/proxy.c   |  4 +-
 src/server.c  | 57 ---
 9 files changed, 39 insertions(+), 51 deletions(-)

-- 
2.29.2




[PATCH v2 5/5] BUG/MINOR: server: don't set agent addr for addr parsing

2021-02-02 Thread William Dauchy
small consistency problem with `addr` and `agent-addr` options:
for the both options, the last one parsed is always used to set the
agent-check addr.  Thus these two lines don't have the same behavior:

  server ... addr  agent-addr 
  server ... agent-addr  addr 

I was not really able to determine where this issue is coming from. So
it is probable we may backport it to all stable version where the agent
is supported.

Signed-off-by: William Dauchy 
---
 src/check.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/check.c b/src/check.c
index f4d5bd452..aab86dc8c 100644
--- a/src/check.c
+++ b/src/check.c
@@ -1526,7 +1526,7 @@ static int srv_parse_addr(char **args, int *cur_arg, 
struct proxy *curpx, struct
goto error;
}
 
-   srv->check.addr = srv->agent.addr = *sk;
+   srv->check.addr = *sk;
 
   out:
return err_code;
-- 
2.29.2




[PATCH v2 1/5] BUG/MINOR: cli: fix set server addr/port coherency with health checks

2021-02-02 Thread William Dauchy
while reading `update_server_addr_port` I found out some things which
can be seen as incoherency. I hope I did not overlooked anything:

- one comment is stating check's address should be updated if it uses
  the server one; however the condition checks if `SRV_F_CHECKADDR` is
  set; this flag is set when a check address is set; result is that we
  override the check address where I was not expecting it. In fact we
  don't need to update anything here as server addr is used when check
  addr is not set.
- same goes for check agent addr
- for port, it is a bit different, we update the check port if it is
  unset. This is harmless because we also use server port if check port
  is unset. However it creates some incoherency before/after using this
  command, as check port should stay unset througout the life of the
  process unless it is is set by `set server check-port` command.

quite hard to locate the origin of this this issue but the function was
introduced in commit d458adcc52b74608e2fe6a2a95f09ce5e94932b7 ("MINOR:
new update_server_addr_port() function to change both server's ADDR and
service PORT"). I was however not able to determine whether this is due
to a change of behavior along the years. So this patch can potentially
be backported up to v1.8 but we must be careful while doing so, as the
code has changed a lot. That being said, the bug being not very
impacting I would be fine keeping it for 2.4 only.

Signed-off-by: William Dauchy 
---
 src/server.c | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/src/server.c b/src/server.c
index 10f528640..99b7e9181 100644
--- a/src/server.c
+++ b/src/server.c
@@ -3625,16 +3625,6 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
ipcpy(, >addr);
changed = 1;
 
-   /* we also need to update check's ADDR only if it uses the 
server's one */
-   if ((s->check.state & CHK_ST_CONFIGURED) && (s->flags & 
SRV_F_CHECKADDR)) {
-   ipcpy(, >check.addr);
-   }
-
-   /* we also need to update agent ADDR only if it use the 
server's one */
-   if ((s->agent.state & CHK_ST_CONFIGURED) && (s->flags & 
SRV_F_AGENTADDR)) {
-   ipcpy(, >agent.addr);
-   }
-
/* update report for caller */
chunk_printf(msg, "IP changed from '%s' to '%s'", current_addr, 
addr);
}
@@ -3714,11 +3704,6 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
}
 
chunk_appendf(msg, "%d'", new_port);
-
-   /* we also need to update health checks port only if it 
uses server's realport */
-   if ((s->check.state & CHK_ST_CONFIGURED) && !(s->flags 
& SRV_F_CHECKPORT)) {
-   s->check.port = new_port;
-   }
}
else {
chunk_appendf(msg, "no need to change the port");
-- 
2.29.2




Re: [PATCH 1/2] BUG/MINOR: cli: fix set server addr/port coherency with health checks

2021-02-02 Thread William Dauchy
Hi Christopher,

Thanks for the review.

On Tue, Feb 2, 2021 at 10:21 AM Christopher Faulet  wrote:
> So, it may be good to take a global look at this stuff. I may missed 
> something.
> And be carefull for the backports because the health-checks were refactored in
> the 2.2.

ok I will come back with a more global look and see whether I can get
rid of those flags.

Thanks,
-- 
William



Re: [PATCH 1/9] MAJOR: contrib/prometheus-exporter: move health check status to labels

2021-02-01 Thread William Dauchy
On Mon, Feb 1, 2021 at 12:05 PM Pierre Cheynier  wrote:
> In addition to this update, I would add some recommendations about the user
> setup in the README ("how do I prevent my prometheus instance to explode
> when scrapping this endpoint?").
> For server-template users:
> - 
>   params:
> no-maint:
> - empty
>
> Generally speaking, to prevent all server metrics to be saved, except this 
> one:
> - 
>metric_relabel_configs:
>- source_labels: ['__name__']
>   regex: 'haproxy_(process_|frontend_|backend_|server_check_status).*'
>   action: keep

agreed taken into account in v2

> "Status of last health check, per state value" ?

updated in v2.
-- 
William



Re: [PATCH 0/9] prometheus: health check as labels + cleanup

2021-02-01 Thread William Dauchy
On Mon, Feb 01, 2021 at 11:30:18AM +0100, Christopher Faulet wrote:
> Don't be sorry to send patches ! If you think the number of labels for the
> check_status metric is not a problem, even for huge configurations, I trust 
> you.
> And I guess we can reduce this number to 16 status only, removing all status
> with an unknown check result (HCHK_STATUS_UNKNOWN, HCHK_STATUS_INI,
> HCHK_STATUS_START, HCHK_STATUS_L57DATA) and the status regarding the 
> agent-check
> only (HCHK_STATUS_CHECKED).
> 
> For info, HCHK_STATUS_UNKNOWN is only used for uninitialized health-check
> (during configuration parsing). HCHK_STATUS_INI is used for initialized
> health-check and before the first run. HCHK_STATUS_START and 
> HCHK_STATUS_L57DATA
> are dummy status and never assigned to a health-check. And as said,
> HCHK_STATUS_CHECKED is only used for agent-check.
> 
> It is a slight improvement, but still better than nothing. To do so, I propose
> you to add a function in check.c to get the corresponding check result of a
> status (the attached patch). This way, in your first patch, we can filter on
> check result. I may amend your first patch this way if you're ok:

definitely worth it indeed. I let you do the amend

> --- a/contrib/prometheus-exporter/service-prometheus.c
> +++ b/contrib/prometheus-exporter/service-prometheus.c
> @@ -880,6 +880,8 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
> struct htx *htx)
> goto next_sv;
> for (i = 0; i < 
> HCHK_STATUS_SIZE; i++) {
> +   if 
> (get_check_status_result(i) < CHK_RES_FAILED)
> +   continue;
> val = 
> mkf_u32(FO_STATUS, sv->check.status == i);
> check_state = 
> get_check_status_info(i);
> labels[2].name = 
> ist("state");
> 
> No comments about the other patches. Except the README is now outdated and
> does not reflect recent changes. It could be good to keep it up-to-date as far
> as possible.

ah true, I will send an update about it soon.

-- 
William



[PATCH 1/2] BUG/MINOR: cli: fix set server addr/port coherency with health checks

2021-02-01 Thread William Dauchy
while reading `update_server_addr_port` I found out some things which
can be seen as incoherency. I hope I did not overlooked anything:

- one comment is stating check's address should be updated if it uses
  the server one; however the condition checks if `SRV_F_CHECKADDR` is
  set; this flag is set when a check address is set; result is that we
  override the check address where I was not expecting it. In fact we
  don't need to update anything here as server addr is used when check
  addr is not set.
- same goes for check agent addr
- for port, it is a bit different, we update the check port if it is
  unset. This is harmless because we also use server port if check port
  is unset. However it creates some incoherency before/after using this
  command, as check port should stay unset througout the life of the
  process unless it is is set by `set server check-port` command.

quite hard to locate the origin of this this issue but the function was
introduced in commit d458adcc52b74608e2fe6a2a95f09ce5e94932b7 ("MINOR:
new update_server_addr_port() function to change both server's ADDR and
service PORT"). I was however not able to determine whether this is due
to a change of behavior along the years. So this patch can potentially
be backported up to v1.8.

Signed-off-by: William Dauchy 
---
 src/server.c | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/src/server.c b/src/server.c
index 10f528640..99b7e9181 100644
--- a/src/server.c
+++ b/src/server.c
@@ -3625,16 +3625,6 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
ipcpy(, >addr);
changed = 1;
 
-   /* we also need to update check's ADDR only if it uses the 
server's one */
-   if ((s->check.state & CHK_ST_CONFIGURED) && (s->flags & 
SRV_F_CHECKADDR)) {
-   ipcpy(, >check.addr);
-   }
-
-   /* we also need to update agent ADDR only if it use the 
server's one */
-   if ((s->agent.state & CHK_ST_CONFIGURED) && (s->flags & 
SRV_F_AGENTADDR)) {
-   ipcpy(, >agent.addr);
-   }
-
/* update report for caller */
chunk_printf(msg, "IP changed from '%s' to '%s'", current_addr, 
addr);
}
@@ -3714,11 +3704,6 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
}
 
chunk_appendf(msg, "%d'", new_port);
-
-   /* we also need to update health checks port only if it 
uses server's realport */
-   if ((s->check.state & CHK_ST_CONFIGURED) && !(s->flags 
& SRV_F_CHECKPORT)) {
-   s->check.port = new_port;
-   }
}
else {
chunk_appendf(msg, "no need to change the port");
-- 
2.29.2




[PATCH 2/2] BUG/MINOR: cli: set server check-port missing flag

2021-02-01 Thread William Dauchy
when we set check port flag, we need to set a flag on server.

this is missing since the origin of the feature: commit
5094656a67fa1b56f30cd2316adb675ca9d2a79a ("MINOR: cli: change a server
health check port through the stats socket"). So it can potentially be
backport up to v1.8.

Signed-off-by: William Dauchy 
---
 src/server.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/server.c b/src/server.c
index 99b7e9181..ded3d47ff 100644
--- a/src/server.c
+++ b/src/server.c
@@ -4405,6 +4405,7 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
goto out_unlock;
}
sv->check.port = i;
+   sv->flags |= SRV_F_CHECKPORT;
cli_msg(appctx, LOG_NOTICE, "health check port updated.\n");
}
else if (strcmp(args[3], "addr") == 0) {
-- 
2.29.2




[PATCH 9/9] CLEANUP: contrib/prometheus-exporter: align and reorder fields

2021-01-30 Thread William Dauchy
- align safe_idle_connections_current field
  fix minor typo added in commit
  37286a5ac595069a491c3e8a7a7e4faf3d9ea8ad ("MEDIUM:
  contrib/prometheus-exporter: Rework matrices defining Promex metrics")
- reorder info fields to be able to compare them easily
- add missing ignored info fields as comment

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 98f6fca62..29524cd91 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -108,7 +108,6 @@ const struct promex_metric 
promex_global_metrics[INF_TOTAL_FIELDS] = {
//[INF_NAME]   ignored
//[INF_VERSION],   ignored
//[INF_RELEASE_DATE]   ignored
-   [INF_BUILD_INFO] = { .n = IST("build_info"),
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
[INF_NBTHREAD]   = { .n = IST("nbthread"),  
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
[INF_NBPROC] = { .n = IST("nbproc"),
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
[INF_PROCESS_NUM]= { .n = 
IST("relative_process_id"),   .type = PROMEX_MT_GAUGE,   .flags = 
PROMEX_FL_INFO_METRIC },
@@ -116,8 +115,11 @@ const struct promex_metric 
promex_global_metrics[INF_TOTAL_FIELDS] = {
//[INF_UPTIME] ignored
[INF_UPTIME_SEC] = { .n = IST("uptime_seconds"),
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
[INF_START_TIME_SEC] = { .n = 
IST("start_time_seconds"),.type = PROMEX_MT_GAUGE,   .flags = 
PROMEX_FL_INFO_METRIC },
+   //[INF_MEMMAX_MB]  ignored
[INF_MEMMAX_BYTES]   = { .n = IST("max_memory_bytes"),  
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
+   //[INF_POOL_ALLOC_MB]  ignored
[INF_POOL_ALLOC_BYTES]   = { .n = 
IST("pool_allocated_bytes"),  .type = PROMEX_MT_GAUGE,   .flags = 
PROMEX_FL_INFO_METRIC },
+   //[INF_POOL_USED_MB]   ignored
[INF_POOL_USED_BYTES]= { .n = IST("pool_used_bytes"),   
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
[INF_POOL_FAILED]= { .n = 
IST("pool_failures_total"),   .type = PROMEX_MT_COUNTER, .flags = 
PROMEX_FL_INFO_METRIC },
[INF_ULIMIT_N]   = { .n = IST("max_fds"),   
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
@@ -173,6 +175,7 @@ const struct promex_metric 
promex_global_metrics[INF_TOTAL_FIELDS] = {
[INF_BYTES_OUT_RATE] = { .n = IST("bytes_out_rate"),
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
//[INF_DEBUG_COMMANDS_ISSUED]  ignored
[INF_CUM_LOG_MSGS]   = { .n = IST("recv_logs_total"),   
.type = PROMEX_MT_COUNTER, .flags = PROMEX_FL_INFO_METRIC },
+   [INF_BUILD_INFO] = { .n = IST("build_info"),
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
 };
 
 /* frontend/backend/server fields */
@@ -273,7 +276,7 @@ const struct promex_metric 
promex_st_metrics[ST_F_TOTAL_FIELDS] = {
[ST_F_TT_MAX] = { .n = IST("max_total_time_seconds"),   
.type = PROMEX_MT_GAUGE,.flags = ( 
PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
[ST_F_EINT]   = { .n = IST("internal_errors_total"),
.type = PROMEX_MT_COUNTER,  .flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
[ST_F_IDLE_CONN_CUR]  = { .n = IST("unsafe_idle_connections_current"),  
.type = PROMEX_MT_GAUGE,.flags = (  
   PROMEX_FL_SRV_METRIC) },
-   [ST_F_SAFE_CONN_CUR]=   { .n = IST("safe_idle_connections_current"),
.type = PROMEX_MT_GAUGE,.flags = (  
   PROMEX_FL_SRV_METRIC) },
+   [ST_F_SAFE_CONN_CUR]  = { .n = IST("safe_idle_connections_current"),
.type = PROMEX_MT_GAUGE,.flags = (  
   PROMEX_FL_SRV_METRIC) },
[ST_F_USED_CONN_CUR]  = { .n = IST("used_connections_current"), 
.type =

[PATCH 7/9] MINOR: contrib/prometheus-exporter: add recv logs_logs_total field

2021-01-30 Thread William Dauchy
this field was added by commit 45c457a62941a7c4a86ce4327d7755edcd4b230e
("MINOR: log: adds counters on received syslog messages.")

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 0db75b6ae..6c5072b16 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -177,6 +177,7 @@ const struct promex_metric 
promex_global_metrics[INF_TOTAL_FIELDS] = {
[INF_TOTAL_SPLICED_BYTES_OUT]= { .n = 
IST("spliced_bytes_out_total"),   .type = PROMEX_MT_COUNTER, .flags = 
PROMEX_FL_INFO_METRIC },
[INF_BYTES_OUT_RATE] = { .n = IST("bytes_out_rate"),
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
//[INF_DEBUG_COMMANDS_ISSUED]  ignored
+   [INF_CUM_LOG_MSGS]   = { .n = IST("recv_logs_total"),   
.type = PROMEX_MT_COUNTER, .flags = PROMEX_FL_INFO_METRIC },
 };
 
 /* frontend/backend/server fields */
-- 
2.29.2




[PATCH 8/9] CLEANUP: contrib/prometheus-exporter: remove unused includes

2021-01-30 Thread William Dauchy
unless I'm wrong, those includes are no longer needed.  The only recent
one I remember is ssl-sock include since commit
5d9b8f3c9347a1a10b86f81d70b22c3cab0e6925 ("MINOR:
contrib/prometheus-exporter: use fill_info for process dump") where we
make use of the code from stats.c

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 6c5072b16..98f6fca62 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -19,8 +19,6 @@
 #include 
 #include 
 #include 
-#include 
-#include 
 #include 
 #include 
 #include 
@@ -29,12 +27,9 @@
 #include 
 #include 
 #include 
-#include 
-#include 
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
-- 
2.29.2




[PATCH 5/9] MINOR: contrib/prometheus-exporter: use stats desc when possible

2021-01-30 Thread William Dauchy
It is a followup work of commit a191b77e54c26a97064cb42ab4927d4f5c95b896
("MINOR: contrib/prometheus-exporter: merge info description from
stats") but for all other stats fields; we however keep a way to
override them when needed (e.g. units, specific cases)

this is another step which will avoid duplicating work between stats.c
and prometheus.

Signed-off-by: William Dauchy 
---
 .../prometheus-exporter/service-prometheus.c  | 107 ++
 1 file changed, 10 insertions(+), 97 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 8baef82d7..f10e78f49 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -282,107 +282,20 @@ const struct promex_metric 
promex_st_metrics[ST_F_TOTAL_FIELDS] = {
[ST_F_NEED_CONN_EST]  = { .n = IST("need_connections_current"), 
.type = PROMEX_MT_GAUGE,.flags = (  
   PROMEX_FL_SRV_METRIC) },
 };
 
-/* Description of all stats fields */
+/* Description of overriden stats fields */
 const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = {
-   [ST_F_PXNAME] = IST("The proxy name."),
-   [ST_F_SVNAME] = IST("The service name (FRONTEND for frontend, 
BACKEND for backend, any name for server/listener)."),
-   [ST_F_QCUR]   = IST("Current number of queued requests."),
-   [ST_F_QMAX]   = IST("Maximum observed number of queued 
requests."),
-   [ST_F_SCUR]   = IST("Current number of active sessions."),
-   [ST_F_SMAX]   = IST("Maximum observed number of active 
sessions."),
-   [ST_F_SLIM]   = IST("Configured session limit."),
-   [ST_F_STOT]   = IST("Total number of sessions."),
-   [ST_F_BIN]= IST("Current total of incoming bytes."),
-   [ST_F_BOUT]   = IST("Current total of outgoing bytes."),
-   [ST_F_DREQ]   = IST("Total number of denied requests."),
-   [ST_F_DRESP]  = IST("Total number of denied responses."),
-   [ST_F_EREQ]   = IST("Total number of request errors."),
-   [ST_F_ECON]   = IST("Total number of connection errors."),
-   [ST_F_ERESP]  = IST("Total number of response errors."),
-   [ST_F_WRETR]  = IST("Total number of retry warnings."),
-   [ST_F_WREDIS] = IST("Total number of redispatch warnings."),
[ST_F_STATUS] = IST("Current status of the service (0/1 
depending on current `state` label value)."),
-   [ST_F_WEIGHT] = IST("Service weight."),
-   [ST_F_ACT]= IST("Current number of active servers."),
-   [ST_F_BCK]= IST("Current number of backup servers."),
-   [ST_F_CHKFAIL]= IST("Total number of failed check (Only counts 
checks failed when the server is up)."),
-   [ST_F_CHKDOWN]= IST("Total number of UP->DOWN transitions."),
-   [ST_F_LASTCHG]= IST("Number of seconds since the last UP<->DOWN 
transition."),
-   [ST_F_DOWNTIME]   = IST("Total downtime (in seconds) for the 
service."),
-   [ST_F_QLIMIT] = IST("Configured maxqueue for the server (0 
meaning no limit)."),
-   [ST_F_PID]= IST("Process id (0 for first instance, 1 for 
second, ...)"),
-   [ST_F_IID]= IST("Unique proxy id."),
-   [ST_F_SID]= IST("Server id (unique inside a proxy)."),
-   [ST_F_THROTTLE]   = IST("Current throttle percentage for the 
server, when slowstart is active, or no value if not in slowstart."),
-   [ST_F_LBTOT]  = IST("Total number of times a service was 
selected, either for new sessions, or when redispatching."),
-   [ST_F_TRACKED]= IST("Id of proxy/server if tracking is 
enabled."),
-   [ST_F_TYPE]   = IST("Service type (0=frontend, 1=backend, 
2=server, 3=socket/listener)."),
-   [ST_F_RATE]   = IST("Current number of sessions per second over 
last elapsed second."),
-   [ST_F_RATE_LIM]   = IST("Configured limit on new sessions per 
second."),
-   [ST_F_RATE_MAX]   = IST("Maximum observed number of sessions per 
second."),
[ST_F_CHECK_STATUS]   = IST("Status of last health check (0/1 depending 
on current `state` label value)."),
[ST_F_CHECK_CODE] = IST("layer5-7 code, if available of the last 
health check."),
[ST_F_CHECK_DURATION] = IST("Total duration of the latest server health 
che

[PATCH 6/9] MINOR: contrib/prometheus-exporter: add uweight field

2021-01-30 Thread William Dauchy
this field was added in commit bd7151002437af1a034a9fdbb582b3cbef5a78d1
("MINOR: stats: report server's user-configured weight next to effective
weight")

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index f10e78f49..0db75b6ae 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -280,6 +280,7 @@ const struct promex_metric 
promex_st_metrics[ST_F_TOTAL_FIELDS] = {
[ST_F_SAFE_CONN_CUR]=   { .n = IST("safe_idle_connections_current"),
.type = PROMEX_MT_GAUGE,.flags = (  
   PROMEX_FL_SRV_METRIC) },
[ST_F_USED_CONN_CUR]  = { .n = IST("used_connections_current"), 
.type = PROMEX_MT_GAUGE,.flags = (  
   PROMEX_FL_SRV_METRIC) },
[ST_F_NEED_CONN_EST]  = { .n = IST("need_connections_current"), 
.type = PROMEX_MT_GAUGE,.flags = (  
   PROMEX_FL_SRV_METRIC) },
+   [ST_F_UWEIGHT]= { .n = IST("uweight"),  
.type = PROMEX_MT_GAUGE,.flags = ( 
PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
 };
 
 /* Description of overriden stats fields */
-- 
2.29.2




[PATCH 4/9] MINOR: stats: improve max stats descriptions

2021-01-30 Thread William Dauchy
In order to unify prometheus and stats description, we need to remove
some field reference which are specific to stats implementation:
- `scur` in max current sessions (also reword current session)
- `rate` in max sessions
- `req_rate` in max requests
- `conn_rate` in max connections

Signed-off-by: William Dauchy 
---
 src/stats.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/stats.c b/src/stats.c
index fdb8e9adf..6fa59a9ef 100644
--- a/src/stats.c
+++ b/src/stats.c
@@ -160,8 +160,8 @@ const struct name_desc stat_fields[ST_F_TOTAL_FIELDS] = {
[ST_F_SVNAME]= { .name = "svname",  
.desc = "Server name" },
[ST_F_QCUR]  = { .name = "qcur",
.desc = "Number of current queued connections" },
[ST_F_QMAX]  = { .name = "qmax",
.desc = "Highest value of queued connections encountered since process 
started" },
-   [ST_F_SCUR]  = { .name = "scur",
.desc = "Current number of sessions on the frontend, backend or server" 
},
-   [ST_F_SMAX]  = { .name = "smax",
.desc = "Highest value of scur encountered since process started" },
+   [ST_F_SCUR]  = { .name = "scur",
.desc = "Number of current sessions on the frontend, backend or server" 
},
+   [ST_F_SMAX]  = { .name = "smax",
.desc = "Highest value of current sessions encountered since process 
started" },
[ST_F_SLIM]  = { .name = "slim",
.desc = "Frontend/listener/server's maxconn, backend's fullconn" },
[ST_F_STOT]  = { .name = "stot",
.desc = "Total number of sessions since process started" },
[ST_F_BIN]   = { .name = "bin", 
.desc = "Total number of request bytes since process started" },
@@ -191,7 +191,7 @@ const struct name_desc stat_fields[ST_F_TOTAL_FIELDS] = {
[ST_F_TYPE]  = { .name = "type",
.desc = "Type of the object (Listener, Frontend, Backend, Server)" },
[ST_F_RATE]  = { .name = "rate",
.desc = "Total number of sessions processed by this object over the 
last second (sessions for listeners/frontends, requests for backends/servers)" 
},
[ST_F_RATE_LIM]  = { .name = "rate_lim",
.desc = "Limit on the number of sessions accepted in a second (frontend 
only, 'rate-limit sessions' setting)" },
-   [ST_F_RATE_MAX]  = { .name = "rate_max",
.desc = "Highest value of 'rate' observed since the worker process 
started" },
+   [ST_F_RATE_MAX]  = { .name = "rate_max",
.desc = "Highest value of sessions per second observed since the worker 
process started" },
[ST_F_CHECK_STATUS]  = { .name = "check_status",
.desc = "Status report of the server's latest health check, prefixed 
with '*' if a check is currently in progress" },
[ST_F_CHECK_CODE]= { .name = "check_code",  
.desc = "HTTP/SMTP/LDAP status code reported by the latest server 
health check" },
[ST_F_CHECK_DURATION]= { .name = "check_duration",  
.desc = "Total duration of the latest server health check, in 
milliseconds" },
@@ -203,7 +203,7 @@ const struct name_desc stat_fields[ST_F_TOTAL_FIELDS] = {
[ST_F_HRSP_OTHER]= { .name = "hrsp_other",  
.desc = "Total number of HTTP responses with status <100, >599 returned 
by this object since the worker process started (error -1 included)" },
[ST_F_HANAFAIL]  = { .name = "hanafail",
.desc = "Total number of failed checks caused by an 'on-error' 
directive after an 'observe' condition matched" },
[ST_F_REQ_RATE]  = { .name = "req_rate",
.desc = "Number of HTTP requests processed over the last second on this 
object" },
-   [ST_F_REQ_RATE_MAX]  = { .name = "req_rate_max",
.desc = "Highest value of 'req_rate' observed since the worker p

[PATCH 3/9] MINOR: stats: improve pending connections description

2021-01-30 Thread William Dauchy
In order to unify prometheus and stats description, we need to clarify
the description for pending connections.
- remove the BE reference in counters struct, as it is also used in
  servers
- remove reference of `qcur` field in description as it is specific to
  stats implemention
- try to reword cur and max pending connections description

Signed-off-by: William Dauchy 
---
 include/haproxy/counters-t.h | 2 +-
 src/stats.c  | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/haproxy/counters-t.h b/include/haproxy/counters-t.h
index b896c4d36..1ea68dd6c 100644
--- a/include/haproxy/counters-t.h
+++ b/include/haproxy/counters-t.h
@@ -73,7 +73,7 @@ struct be_counters {
 
unsigned int cps_max;   /* maximum of new connections 
received per second */
unsigned int sps_max;   /* maximum of new connections 
accepted per second (sessions) */
-   unsigned int nbpend_max;/* max number of pending 
connections with no server assigned yet (BE only) */
+   unsigned int nbpend_max;/* max number of pending 
connections with no server assigned yet */
unsigned int cur_sess_max;  /* max number of currently 
active sessions */
 
long long bytes_in; /* number of bytes transferred 
from the client to the server */
diff --git a/src/stats.c b/src/stats.c
index 4b8bd89b4..fdb8e9adf 100644
--- a/src/stats.c
+++ b/src/stats.c
@@ -158,8 +158,8 @@ const struct name_desc info_fields[INF_TOTAL_FIELDS] = {
 const struct name_desc stat_fields[ST_F_TOTAL_FIELDS] = {
[ST_F_PXNAME]= { .name = "pxname",  
.desc = "Proxy name" },
[ST_F_SVNAME]= { .name = "svname",  
.desc = "Server name" },
-   [ST_F_QCUR]  = { .name = "qcur",
.desc = "Current number of connections waiting in the server of backend 
queue" },
-   [ST_F_QMAX]  = { .name = "qmax",
.desc = "Highest value of qcur encountered since process started" },
+   [ST_F_QCUR]  = { .name = "qcur",
.desc = "Number of current queued connections" },
+   [ST_F_QMAX]  = { .name = "qmax",
.desc = "Highest value of queued connections encountered since process 
started" },
[ST_F_SCUR]  = { .name = "scur",
.desc = "Current number of sessions on the frontend, backend or server" 
},
[ST_F_SMAX]  = { .name = "smax",
.desc = "Highest value of scur encountered since process started" },
[ST_F_SLIM]  = { .name = "slim",
.desc = "Frontend/listener/server's maxconn, backend's fullconn" },
-- 
2.29.2




[PATCH 1/9] MAJOR: contrib/prometheus-exporter: move health check status to labels

2021-01-30 Thread William Dauchy
this patch is a breaking change between v2.3 and v2.4: we move from
using gauge value for health check states to labels values. The diff is
quite small thanks to the preparation work from Christopher to allow
more flexibility in labels, see commit
5a2f938732126f43bbf4cea5482c01552b0d0314 ("MEDIUM:
contrib/prometheus-exporter: Use dynamic labels instead of static ones")

this is a follow up of commit c6464591a365bfcf509b322bdaa4d608c9395d75
("MAJOR: contrib/prometheus-exporter: move ftd/bkd/srv states to
labels"). The main goal being to be better aligned with prometheus use
cases in terms of queries. More specifically to health checks, Pierre C.
mentioned the possible quirks he had to put in place in order to make
use of those metrics through prometheus:

   by(proxy, check_status) (count_values by(proxy,
  instance) ("check_status", haproxy_server_check_status))

I am perfectly aware this introduces a lot more metrics but I don't see
how we can improve the usability without it. The main issue remains in
the cardinality of the states which are > 20. Prometheus recommends to
stay below a cardinality of 10 for a given metric but I consider our
case very specific, because highly linked to the level of precision
haproxy exposes.

Even before this patch I saw several large production setup (a few
hundreds of MB in output) which are making use of the scope parameter to
simply ignore the server metrics, so that the scrapping can be faster,
and memory consumed on client side not too high. So I believe we should
eventually continue in that direction and offer more granularity of
filtering of the output. That being said it is already possible to
filter out the data on prometheus client side.

this is related to github issue #1029

Signed-off-by: William Dauchy 
---
 .../prometheus-exporter/service-prometheus.c| 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index dbf4c7f39..a3141d39d 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -319,7 +320,7 @@ const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = 
{
[ST_F_RATE]   = IST("Current number of sessions per second over 
last elapsed second."),
[ST_F_RATE_LIM]   = IST("Configured limit on new sessions per 
second."),
[ST_F_RATE_MAX]   = IST("Maximum observed number of sessions per 
second."),
-   [ST_F_CHECK_STATUS]   = IST("Status of last health check (HCHK_STATUS_* 
values)."),
+   [ST_F_CHECK_STATUS]   = IST("Status of last health check (0/1 depending 
on current `state` label value)."),
[ST_F_CHECK_CODE] = IST("layer5-7 code, if available of the last 
health check."),
[ST_F_CHECK_DURATION] = IST("Total duration of the latest server health 
check, in seconds."),
[ST_F_HRSP_1XX]   = IST("Total number of HTTP responses."),
@@ -886,6 +887,7 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
struct htx *htx)
int ret = 1;
double secs;
enum promex_srv_state state;
+   const char *check_state;
int i;
 
for (;appctx->st2 < ST_F_TOTAL_FIELDS; appctx->st2++) {
@@ -963,8 +965,17 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
struct htx *htx)
case ST_F_CHECK_STATUS:
if ((sv->check.state & 
(CHK_ST_ENABLED|CHK_ST_PAUSED)) != CHK_ST_ENABLED)
goto next_sv;
-   val = mkf_u32(FN_OUTPUT, 
sv->check.status);
-   break;
+
+   for (i = 0; i < 
HCHK_STATUS_SIZE; i++) {
+   val = 
mkf_u32(FO_STATUS, sv->check.status == i);
+   check_state = 
get_check_status_info(i);
+   labels[2].name = 
ist("state");
+   labels[2].value = 
ist2(check_state, strlen(check_state));
+   if 
(!promex_dump_metric(appctx, htx, prefix, _st_metrics[appctx->st2],
+   
, labels, , max))
+   goto full;
+   }
+   goto next_sv;

[PATCH 2/9] MINOR: contrib/prometheus-exporter: improve service status description field

2021-01-30 Thread William Dauchy
Since we changed the behaviour of this metric, improve the description
to better explain what is the meaning of the new gauge value; it also
reflects the description we did for health check status.

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index a3141d39d..8baef82d7 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -301,7 +301,7 @@ const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = 
{
[ST_F_ERESP]  = IST("Total number of response errors."),
[ST_F_WRETR]  = IST("Total number of retry warnings."),
[ST_F_WREDIS] = IST("Total number of redispatch warnings."),
-   [ST_F_STATUS] = IST("Current status of the service."),
+   [ST_F_STATUS] = IST("Current status of the service (0/1 
depending on current `state` label value)."),
[ST_F_WEIGHT] = IST("Service weight."),
[ST_F_ACT]= IST("Current number of active servers."),
[ST_F_BCK]= IST("Current number of backup servers."),
-- 
2.29.2




[PATCH 0/9] prometheus: health check as labels + cleanup

2021-01-30 Thread William Dauchy
Hi Christopher,

Sorry for this long series but I believe this one is fairly easy to
review:
- the major change is for health checks on prometheus side, there are
  more details in the commit message; this is something I wanted early
  so people can start testing it, especially on large setup. I however
  think we will need to improve the `scope` filtering. The good point is
  that after this patch, this metric will be way easier to use; the bad
  point is that it is a breaking change between v2.3 and v2.4. But I
  believe this is acceptable following the previous state changes
  already merged.
- next patch is the continuation of the cleaning work:
 * try to improve descriptions so we can use them in both case
 * merge them when possible, override otherwise
- I finished with a bit of minor cleaning which pointed me a few missing
  fields. I know we probably can improve that to avoid forgetting adding
  those fields in the future, but I assume it is ok for now to simply
  come back with a sane result. The postive point is I did not had to
  add description and implementation, as we make use of stats.c

William Dauchy (9):
  MAJOR: contrib/prometheus-exporter: move health check status to labels
  MINOR: contrib/prometheus-exporter: improve service status description
field
  MINOR: stats: improve pending connections description
  MINOR: stats: improve max stats descriptions
  MINOR: contrib/prometheus-exporter: use stats desc when possible
  MINOR: contrib/prometheus-exporter: add uweight field
  MINOR: contrib/prometheus-exporter: add recv logs_logs_total field
  CLEANUP: contrib/prometheus-exporter: remove unused includes
  CLEANUP: contrib/prometheus-exporter: align and reorder fields

 .../prometheus-exporter/service-prometheus.c  | 140 --
 include/haproxy/counters-t.h  |   2 +-
 src/stats.c   |  14 +-
 3 files changed, 40 insertions(+), 116 deletions(-)

-- 
2.29.2




Re: lua function core.get_info() broken in haproxy 2.2.7

2021-01-30 Thread William Dauchy
Hi James,

On Sat, Jan 30, 2021 at 3:07 AM James Brown  wrote:
> Ah, never mind, I see that this was already fixed in master in 
> 3ddec3ee7d344112b4e4fbde317f8886a20d66a0.

yeah, sorry about that. This will be fixed in v2.2.8 and v2.3.4
-- 
William



Re: Makefile, environment variables and REGTESTS_TYPES

2021-01-29 Thread William Dauchy
On Fri, Jan 29, 2021 at 2:46 PM William Lallemand
 wrote:
> According to `make reg-tests-help` the REGTESTS_TYPES parameter must be
> configured as an environment variable and not a make argument.
> However since patch 3bad3d5 ("BUILD: Makefile: exclude broken tests by
> default"), it does not work anymore with an environment variable.
> Looking at the several CI files we have, it is used as a make
> argument everywhere.

wow indeed, I did not realise that; yet another thing I broke ;-)

> I'm going to update the `make reg-tests-help` command with the correct
> syntax if there isn't any complain.

Thanks for looking into this!
-- 
William



  1   2   3   4   5   >