Re: [PR] Improve wording of http-request wait-for-body
Hi Craig, Thank you for your patch. Do you think you could come back with a patch which is only modifying the section you want? It seems you did a change on the full file instead which makes the review quite hard. -- William
Re: clarify close behaviour on http-request rules
Hi Christopher, Thanks for your answer. On Tue, Feb 7, 2023 at 9:09 AM Christopher Faulet wrote: > The tarpit action is final. So it cannot be used in addition to a return or a > deny action. For the wait-for-body action, indeed, it will wait for the body > or > a full buffer for 1 second. Thus in this case, if the whole request can be > stored in buffer and is received fast enough, this will mitigate your issue. ah yes good point indeed, I forgot the behaviour of tarpit. I will give it a try with `wait-for-body`. > FYI, we are refactoring the way errors, aborts and shutdowns are handled > internally. And the data draining at the mux level is definitely a subject we > should address. It is a painful work but we hope to include it in the 2.8, at > least partially. thanks! -- William
Re: clarify close behaviour on http-request rules
Hi Christopher, On Fri, Feb 3, 2023 at 7:59 PM William Dauchy wrote: > On Tue, Oct 18, 2022 at 4:15 PM Christopher Faulet > wrote: > > On all HTX versions, K/A and close modes are handled in the H1 multiplexer. > > Thus, on these versions, http_reply_and_close() is only closing the stream. > > The > > multiplexer is responsible to close the client connection or not. > > > > On pre-HTX versions, when http_reply_and_close() is used, the client > > connection > > is also closed. It is a limitation of of HAProxy versions using the legacy > > HTTP. > > > > Note there is a case where the connection client may be closed. If the > > HAProxy > > response is returned before the end of the request, the client connection is > > closed. There is no (not yet) draining mode at the mux level. > > coming back on this very late: > > `http-request wait-for-body time` or a `http-request tarpit` mitigate > the draining issue? > I am trying to find a workaround on a setup where we are behind > another L7 LB where we unexpectedly close the connection. saying another way, what is going the behaviour of `http-request return` if I have: http-request wait-for-body time 1s if CONDITION_A http-request deny if CONDITION_A Is it going to wait for the request, and so mitigate the mentioned drain limitation we currently have in the mux when `CONDITION_A` matches? -- William
Re: clarify close behaviour on http-request rules
On Tue, Oct 18, 2022 at 4:15 PM Christopher Faulet wrote: > On all HTX versions, K/A and close modes are handled in the H1 multiplexer. > Thus, on these versions, http_reply_and_close() is only closing the stream. > The > multiplexer is responsible to close the client connection or not. > > On pre-HTX versions, when http_reply_and_close() is used, the client > connection > is also closed. It is a limitation of of HAProxy versions using the legacy > HTTP. > > Note there is a case where the connection client may be closed. If the HAProxy > response is returned before the end of the request, the client connection is > closed. There is no (not yet) draining mode at the mux level. coming back on this very late: `http-request wait-for-body time` or a `http-request tarpit` mitigate the draining issue? I am trying to find a workaround on a setup where we are behind another L7 LB where we unexpectedly close the connection. Thanks! -- William
Re: [PATCH 0/2] BUG/MINOR: promex: fix haproxy_backend_agg_server_check_status
On Tue, Dec 6, 2022 at 3:27 AM Christopher Faulet wrote: > IMHO, we should keep the current metric and its description should be updated > to > mention it is deprecated. This way, we will be able to remove it in the 2.9 or > 3.0. This means we should find new names for the aggregated servers status and > the aggregated check status. "ST_F_AGG_SRV_STATUS" is pretty good for the > first > one. I may propose "ST_F_AGG_CHECK_STATUS" for the second one. > > I guess the two new metrics may be backported. They are not used by the stats > applet. > > What do you think about it? I am aligned with this plan of creating two new metrics Thanks, -- William
Re: [PATCH 0/2] BUG/MINOR: promex: fix haproxy_backend_agg_server_check_status
On Tue, Nov 29, 2022 at 11:52 AM William Dauchy wrote: > On Mon, Nov 28, 2022 at 8:46 AM Cedric Paillet wrote: > > As described in github issue #1312, the first intention of patch 42d7c402d > > was to aggregate haproxy_server_check_status. But we aggregated > > haproxy_server_status instead. > > To fix that, rename haproxy_backend_agg_server_check_status to > > haproxy_backend_agg_server_status and use right data source for > > haproxy_backend_agg_server_check_status. > > Thank you for looking into this. This has been on my todo list for too long. > I am fine with those changes as long as we backport them until the > version where this was introduced (this was introduced in v2.5 and > then backported in v2.4 if my checks are correct). > I understand that's a breaking change, some people started to build > tooling on top of that can't deal with behaviour change between > versions (I am thinking about dashboards in my case where we deal with > different versions). In that regard, I believe it can be ok to have to > do a one off change to fix this past mistake. If we are not aligned > with a backport, we will need to find another name. I realised I was not necessarily very clear on my position: - either we merge this patch, and then backport it until 2.4 - either we find a new name for the new metric and don't introduce a breaking change I am mostly worried about metrics collectors who are not able to deal with different behaviour from one version to another. In my own case, we have different haproxy versions running, and there is no global way we can adapt the collector depending on the software version it is collecting. The collector expects the same behaviour across versions. Having a breaking change which affects all versions is fine as it will require a unique change on our side, but having a situation in the middle is not acceptable for us. -- William
Re: [PATCH 0/2] BUG/MINOR: promex: fix haproxy_backend_agg_server_check_status
Hello Cedric, On Mon, Nov 28, 2022 at 8:46 AM Cedric Paillet wrote: > As described in github issue #1312, the first intention of patch 42d7c402d > was to aggregate haproxy_server_check_status. But we aggregated > haproxy_server_status instead. > To fix that, rename haproxy_backend_agg_server_check_status to > haproxy_backend_agg_server_status and use right data source for > haproxy_backend_agg_server_check_status. Thank you for looking into this. This has been on my todo list for too long. I am fine with those changes as long as we backport them until the version where this was introduced (this was introduced in v2.5 and then backported in v2.4 if my checks are correct). I understand that's a breaking change, some people started to build tooling on top of that can't deal with behaviour change between versions (I am thinking about dashboards in my case where we deal with different versions). In that regard, I believe it can be ok to have to do a one off change to fix this past mistake. If we are not aligned with a backport, we will need to find another name. -- William
change accept() behaviour under maxconn
Hello, I was looking at maxconn behaviour on frontend side. I understood we don't trigger accept() once maxconn is reached https://git.haproxy.org/?p=haproxy.git;a=blob;f=src/listener.c;h=412af94a160662aa11802dd58512c6410c9b56da;hb=HEAD#l920 I wondered whether it would be an acceptable patch to add an option in order to accept() and close() the connection right after during such event. The context would be to "fail fast" on some environment where it is preferable to let know the client as soon as possible we can't accept the connection. In an environment where the haproxy instance is the second level of proxies, we want to fail fast instead of waiting for the end of tcp timeout to connect. Thanks, -- William
Re: clarify close behaviour on http-request rules
On Tue, Oct 18, 2022 at 4:15 PM Christopher Faulet wrote: > On all HTX versions, K/A and close modes are handled in the H1 multiplexer. > Thus, on these versions, http_reply_and_close() is only closing the stream. > The > multiplexer is responsible to close the client connection or not. ok, thanks for the details. > Note there is a case where the connection client may be closed. If the HAProxy > response is returned before the end of the request, the client connection is > closed. There is no (not yet) draining mode at the mux level. That particular point is very interesting to me. Would you have a pointer so that I can better understand where it is happening in the code? -- William
Re: clarify close behaviour on http-request rules
Hi Christopher, Thanks for your reply. On Tue, Oct 18, 2022 at 10:22 AM Christopher Faulet wrote: > Since the 2.2, HAProxy responses don't close the client connections in K/A > mode, > except for 400-Bad-Request and 408-Request-Time-out. It is the behavior with > default error messages. Of course, if a "Connection: close" header is found > in a > custom reply, the client connection is closed. > > Thus, in your case, if there is no "Connection: close" header in the response, > both http-request action (return and deny) should do the same and keep the > client connection alive. > > So if you observe connection closing, it may be for something else. Maybe a > timeout. Or a bug... ok I will investigate more. To come back on the code path, was I wrong for the `http_reply_and_close` difference? Would you have a pointer so I may better understand where `http_reply_and_close` would not close the connection in K/A mode? -- William
clarify close behaviour on http-request rules
Hello, I am trying to clarify in which case a tcp connection might be closed following those rules: - http-request return - http-request deny unless I missed something I have not been able to see the answer within the doc. General context being, we are using `option http-keep-alive`; also our haproxy is behind another L7 proxy; it means the later do not expect the connection to be loosely closed. [cloud L7 LB] <--> [haproxy] <--> backend >From what I read in https://git.haproxy.org/?p=haproxy.git;a=blob;f=src/http_ana.c;h=2b2cfdc56103f313d766143f9016d91200065092;hb=HEAD#l354 I got: - `http-request return` gets `HTTP_RULE_RES_ABRT`; this flag leads to `return_prx_cond`, which is not calling `http_reply_and_close` - `http-request deny` gets `HTTP_RULE_RES_DENY`; this flag leads to `deny`, which is calling `http_reply_and_close` We are using haproxy v2.4.x but I think the behaviour did not change in an earlier version. Some followup questions: - do we confirm `http-request return` does not close the connection while `http-request deny` does? meaning: - http-request return status 403 - http-request deny don't have the same behaviour? - from what I understood, using http-keep-alive does not influence the behaviour of `http_reply_and_close`, is that correct? - while using `http-request deny` which might close the connection, we see some requests on the cloud LB side being ended in error 502. We suspect those are the next requests, where the cloud LB tried to use the same previous connection. For now it is hard to understand why the cloud LB would not detect the close. Is there a moment where haproxy would loosely close the connection which might explain the behavior? Thanks in advance, -- William
Re: [PATCH] BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl
Hello Christopher, On Wed, Jan 12, 2022 at 12:45 PM William Dauchy wrote: > my approach was to say: > - remove the implicit behavior > - then work on the missing commands for the health checks Do you think we can conclude on it? -- William
Re: [PATCH] BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl
On Wed, Jan 12, 2022 at 10:02 AM Christopher Faulet wrote: > I don't know what is the expected behavior on the stable releases for users. > Honestly, I've misread you patch and kept in mind you alternative solution... > But as said, if there is no implicit change (and I'm fine with this > solution), a > new command or an option to current ones must be introduced to be able to > change > SSL setting for health-check too. Otherwise, it will be unusable. my approach was to say: - remove the implicit behavior - then work on the missing commands for the health checks -- William
Re: [PATCH] BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl
Hello Christopher, Thanks for your research, On Tue, Jan 11, 2022 at 6:55 PM Christopher Faulet wrote: > Le 1/10/22 à 23:19, Willy Tarreau a écrit : > > At this point I'm starting to think that we should probably avoid as > > much as possible to use implicit settings for whatever is dynamic. > > Originally a lot of settings were implicit because we don't want to > > have huge config lines to enforce lots of obvious settings. On the CLI > > it's less of a problem and for example if "ssl" only deals with the > > traffic without manipulating the checks, I'm personally not shocked, > > because these are normally sent by bots and we don't care if they > > have to set a few more settings to avoid multiple implicit changes > > that may not always be desired. This is where the CLI (or any future > > API) might differ a bit from the config, and an agent writing some > > config might have to explicitly state certain things like "no-check-ssl" > > for example to stay safe and avoid such implicit dependencies. > > > > I agree with Willy on this point. Especially because, depending the order of > commands, the result can be different because of implicit changes and it is > pretty hard to predict how it will behave in all cases. yup totally agree with both of you. I can't really remember why I did that in the first place. > For instance, to fix William's bug about "set server ssl" command, in a way or > another, we must stop to dynamically update the health-check if its port or > its > address is explicitly specified. With this change, the result of following set > of commands will be different: > >$ set server BK/SRV ssl on >$ set server BK/SRV check-port XXX > > ==> SSL is enabled for the server and the health-check interesting one :)) >$ set server BK/SRV check-port XXX >$ set server BK/SRV ssl on > > ==> because the check's port was updated first, the SSL is only enabled for > the > server. > Agree. But, if possible, a warning may be added in the documentation to warn > about implicit changes. >From the discussion, I would be tempted to say the opposite, as I feel like keeping implicit things for this command is worse. > About the fix, I checked the code, and, at first glance, there is no reason to > change "srv->check.use_ssl" value when the health-check uses the server > settings. Thus, we may fix "set server ssl" command this way: > > diff --git a/src/check.c b/src/check.c > index f0ae81504..8cf8a1c5b 100644 > --- a/src/check.c > +++ b/src/check.c > @@ -1565,10 +1565,8 @@ int init_srv_check(struct server *srv) > * default, unless one is specified. > */ > if (!srv->check.port && !is_addr(>check.addr)) { > - if (!srv->check.use_ssl && srv->use_ssl != -1) { > - srv->check.use_ssl = srv->use_ssl; > - srv->check.xprt= srv->xprt; > - } > + if (!srv->check.use_ssl && srv->use_ssl != -1) > + srv->check.xprt = srv->xprt; > else if (srv->check.use_ssl == 1) > srv->check.xprt = xprt_get(XPRT_SSL); > srv->check.send_proxy |= (srv->pp_opts); indeed this is simplified that way > diff --git a/src/server.c b/src/server.c > index 2061153bc..a18836a71 100644 > --- a/src/server.c > +++ b/src/server.c > @@ -2113,10 +2113,11 @@ void srv_set_ssl(struct server *s, int use_ssl) > return; > > s->use_ssl = use_ssl; > - if (s->use_ssl) > - s->xprt = xprt_get(XPRT_SSL); > - else > - s->xprt = s->check.xprt = s->agent.xprt = xprt_get(XPRT_RAW); > + s->xprt = (s->use_ssl == 1) ? xprt_get(XPRT_SSL) : xprt_get(XPRT_RAW); > + > + if ((s->check.state & CHK_ST_CONFIGURED) && !s->check.use_ssl && > + !s->check.port && !is_addr(>check.addr)) > + s->check.xprt = s->xprt; > } as mentioned above I am not sure I am aligned here; I would rather remove the side effect of changing s->check. > #endif /* USE_OPENSSL */ > > This may be done with the following 3 steps: > >* First, stop to change "srv->check.use_ssl" value > >* Then, stop to change the agent settings in srv_set_ssl() because there is > no ssl support for agent-check. good point, I did not know >* Finally, fix the bug identified by William, adding the according > documentation. > > Note I don't know if all this stuff works properly with the server-state > file... I am always scared to look at the server state functionality... -- William
Re: [PATCH] BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl
On Sat, Jan 8, 2022 at 3:03 PM Tim Düsterhus wrote: > Causes issues when applying the patch, because git gets confused and > believes this to be the patch. > I tend to indent this type of "literal code block" within my commit > message with 4 spaces for clarity. indeed, good point, will fix if I have to resend a v2 On Mon, Jan 10, 2022 at 7:51 AM Willy Tarreau wrote: > It's important to always keep in mind that checks are not necessarily > related to the production traffic, and that configuring one part should > not have any impact on the other one. By default a server running in SSL > will not be checked using SSL unless "check-ssl" is set. note it is only true in your example if you use another port. > You could for > example have a server forwarding to multiple ports (say 80 and 443) and > decide to check only one of them, or even check another one. > > As such, I think your patch is correct as it only affects what the user > attempts to modify. I suspect that the reason for your initial choice was > that it was not yet possible by then to enable SSL checks manually, sorry what do you mean by manually? "check-ssl" has been available for a long time, so that's not the reason behind it, but I guess you were referring to something else. I suspect I did a dumb copy/paste from the new_server function and probably never thought was possibly wrong as my previous production never had any check using tls. > it > would be worth rechecking, because if that's the case, maybe we should > not backport it to 2.4 and only document a behavior change between 2.4 > and 2.5. > If you could have a double-check at the history behind this, that would > be nice so that we know how far to backport it. By the way, maybe your > proposed alternative would be acceptable for older versions which do not > allow to enable SSL health checks on the CLI. unless I missed something, for me the current behavior is broken as you can't come back to a working state if you are using tls on both traffic and health check path. The only working setup is when you are using `no-check-ssl` in your default server. In that sense I believe it should be backported to v2.4. -- William
[PATCH] BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl
While giving a fresh try to `set server ssl` (which I wrote), I realised the behavior is a bit inconsistent. Indeed when using this command over a server with ssl enabled for the data path but also for the health check path we have: - data and health check done using tls - emit `set server be_foo/srv0 ssl off` - data path and health check path becomes plain text - emit `set server be_foo/srv0 ssl on` - data path becomes tls and health check path remains plain text while I thought the end result would be: - data path and health check path comes back in tls In the current code we indeed erase all connections while deactivating, but restore only the data path while activating. I made this mistake in the past because I was testing with a case where the health check plain text by default. There are several ways to solve this issue. The cleanest one would probably be to avoid changing the health check connection when we use `set server ssl` command, and create a new command `set server ssl-check` to change this. For now I assumed this would be ok to simply avoid changing the health check path and be more consistent. This patch tries to address that and also update the documentation. It should not break the existing usage with health check on plain text, as in this case they should have `no-check-ssl` in defaults. Without this patch, it makes the command unusable in an env where you have a list of server to add along the way with initial `server-template`, and all using tls for data and healthcheck path. For 2.6 we should probably reconsider and add `set server ssl-check` command for better granularity of cases. If this solution is accepted, this patch should be backported up to >= 2.4. The alternative solution was to restore the previous state, but I believe this will create even more confusion in the future: --- a/src/server.c +++ b/src/server.c @@ -2113,8 +2113,11 @@ void srv_set_ssl(struct server *s, int use_ssl) return; s->use_ssl = use_ssl; - if (s->use_ssl) + if (use_ssl) { s->xprt = xprt_get(XPRT_SSL); + if (s->check.use_ssl) + s->check.xprt = s->agent.xprt = xprt_get(XPRT_SSL); + } else s->xprt = s->check.xprt = s->agent.xprt = xprt_get(XPRT_RAW); } Signed-off-by: William Dauchy --- doc/management.txt | 2 ++ src/server.c | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/doc/management.txt b/doc/management.txt index 147e059f6..b96cbc878 100644 --- a/doc/management.txt +++ b/doc/management.txt @@ -2187,6 +2187,8 @@ set server / fqdn set server / ssl [ on | off ] This option configures SSL ciphering on outgoing connections to the server. + When switch off, all traffic becomes plain text; health check path is not + changed. set severity-output [ none | number | string ] Change the severity output format of the stats socket connected to for the diff --git a/src/server.c b/src/server.c index 2061153bc..d165f50b9 100644 --- a/src/server.c +++ b/src/server.c @@ -2116,7 +2116,7 @@ void srv_set_ssl(struct server *s, int use_ssl) if (s->use_ssl) s->xprt = xprt_get(XPRT_SSL); else - s->xprt = s->check.xprt = s->agent.xprt = xprt_get(XPRT_RAW); + s->xprt = xprt_get(XPRT_RAW); } #endif /* USE_OPENSSL */ -- 2.34.1
[PATCH] MINOR: proxy: add option idle-close-on-response
Avoid closing idle connections if a soft stop is in progress. By default, idle connections will be closed during a soft stop. In some environments, a client talking to the proxy may have prepared some idle connections in order to send requests later. If there is no proper retry on write errors, this can result in errors while haproxy is reloading. Even though a proper implementation should retry on connection/write errors, this option was introduced to support back compat with haproxy < v2.4. Indeed before v2.4, we were waiting for a last request to be able to add a "connection: close" header and advice the client to close the connection. In a real life example, this behavior was seen in AWS using the ALB in front of a haproxy. The end result was ALB sending 502 during haproxy reloads. This patch was tested on haproxy v2.4, with a regular reload on the process, and a constant trend of requests coming in. Before the patch, we see regular 502 returned to the client; when activating the option, the 502 disappear. This patch should help fixing github issue #1506. In order to unblock some v2.3 to v2.4 migraton, this patch should be backported up to v2.4 branch. Signed-off-by: William Dauchy --- doc/configuration.txt | 21 + include/haproxy/proxy-t.h | 2 +- src/mux_h1.c | 3 ++- src/proxy.c | 1 + 4 files changed, 25 insertions(+), 2 deletions(-) diff --git a/doc/configuration.txt b/doc/configuration.txt index cd8ab4b72..66571a566 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -3756,6 +3756,7 @@ option tcp-smart-connect (*) X - X X option tcpka X X X X option tcplog X X X X option transparent (*) X - X X +option idle-close-on-response(*) X X X - external-check commandX - X X external-check path X - X X persist rdp-cookieX - X X @@ -9836,6 +9837,26 @@ no option transparent "transparent" option of the "bind" keyword. +option idle-close-on-response +no option idle-close-on-response + Avoid closing idle connections if a stop stop is in progress + May be used in sections : defaults | frontend | listen | backend + yes |yes | yes | no + Arguments : none + + By default, idle connections will be closed during a soft stop. In some + environments, a client talking to the proxy may have prepared some idle + connections in order to send requests later. If there is no proper retry on + write errors, this can result in errors while haproxy is reloading. Even + though a proper implementation should retry on connection/write errors, this + option was introduced to support back compat with haproxy < v2.4. Indeed + before v2.4, we were waiting for a last request to be able to add a + "connection: close" header and advice the client to close the connection. + + In a real life example, this behavior was seen in AWS using the ALB in front + of a haproxy. The end result was ALB sending 502 during haproxy reloads. + + external-check command Executable to run when performing an external-check May be used in sections : defaults | frontend | listen | backend diff --git a/include/haproxy/proxy-t.h b/include/haproxy/proxy-t.h index 8ca4e818d..421f900e2 100644 --- a/include/haproxy/proxy-t.h +++ b/include/haproxy/proxy-t.h @@ -82,7 +82,7 @@ enum PR_SRV_STATE_FILE { #define PR_O_REUSE_ALWS 0x000C /* always reuse a shared connection */ #define PR_O_REUSE_MASK 0x000C /* mask to retrieve shared connection preferences */ -/* unused: 0x10 */ +#define PR_O_IDLE_CLOSE_RESP 0x0010 /* avoid closing idle connections during a soft stop */ #define PR_O_PREF_LAST 0x0020 /* prefer last server */ #define PR_O_DISPATCH 0x0040 /* use dispatch mode */ #define PR_O_FORCED_ID 0x0080 /* proxy's ID was forced in the configuration */ diff --git a/src/mux_h1.c b/src/mux_h1.c index 7b6ab946d..1ec6cb77c 100644 --- a/src/mux_h1.c +++ b/src/mux_h1.c @@ -2999,7 +2999,8 @@ static int h1_process(struct h1c * h1c) */ if (!(h1c->flags & H1C_F_IS_BACK)) { if (unlikely(h1c->px->flags & (PR_FL_DISABLED|PR_FL_STOPPED))) { - if (h1c->flags & H1C_F_WAIT_NEXT_REQ) + if (!(h1c->px->options & PR_O_IDLE_CLOSE_RESP) && + h1c->flags & H1C_F_WAIT_NEXT_REQ) goto release; } } diff --git a/src/proxy.c b/src/proxy.c index ff5e35e2c..245b6
Re: Remaining issues on 2.5 for the release
Hi Willy, On Fri, Nov 12, 2021 at 3:48 PM Willy Tarreau wrote: > I intended to emit the final 2.5 this week-end, but a few users having > upgraded to the latest 2.4, 2.3 or 2.2 reported strange issues that we > couldn't reproduce and for which we don't have more info yet. Some seem > related to connections taking longer to vanish, others to possibly > truncated responses. These include some of the recent 2.5 fixes, and > it's very possible that a bug has been hiding another one, and that > fixing one revealed the other one. > For now I'm failing to reproduce any such problem on any of these > versions, and the information gathered for now are insufficient to draw > any conclusion. I'm not thrilled at the idea of releasing 2.5 and having > to emit yet another a few days one later. We're not late in the cycle so > I prefer that we focus our efforts on finding that problem and fixing > the affected versions first. > For the record, the related issues are #1448, #1451 and #1453 for the > strange behavior on the connections, and #1450 for the truncation. If > anyone manages to set up any reliable reproducer, do not hesitate to > share it! Do we know whether v2.4.7 is affected by this regression? Thanks, -- William
Re: [PATCH] MINOR: promex: backend aggregated server check status
On Mon, Nov 8, 2021 at 1:52 PM Willy Tarreau wrote: > Just to be sure, is this something you want to merge into 2.5 or is it > to be queued next ? I'm fine with both, but I prefer to ask as it's not > just a one-liner and I don't know the impact on promex. I know Christopher was possibly thinking about a more advanced implementation but I thought it could be ok for a first version. Probably a good idea to wait for Christopher feedbacks anyway. I was indeed targeting a late v2.5, despite being a new thing. Sorry for that and I forgot to add a message about it. If it works well enough, I will also probably ask for a backport in 2.4 before the end of the year as I know large clusters are waiting for this patch to lower their memory consumption in prometheus. Thanks, -- William
[PATCH] MINOR: promex: backend aggregated server check status
- add new metric: `haproxy_backend_agg_server_check_status` it counts the number of servers matching a specific check status this permits to exclude per server check status as the usage is often to rely on the total. Indeed in large setup having thousands of servers per backend the memory impact is not neglible to store the per server metric. - realign promex_str_metrics array quite simple implementation - we could improve it later by adding an internal state to the prometheus exporter, thus to avoid counting at every dump. this patch is an attempt to close github issue #1312 Signed-off-by: William Dauchy --- addons/promex/service-prometheus.c | 229 - include/haproxy/stats-t.h | 1 + src/stats.c| 4 + 3 files changed, 131 insertions(+), 103 deletions(-) diff --git a/addons/promex/service-prometheus.c b/addons/promex/service-prometheus.c index 221e40705..ef217007d 100644 --- a/addons/promex/service-prometheus.c +++ b/addons/promex/service-prometheus.c @@ -189,106 +189,107 @@ const struct promex_metric promex_global_metrics[INF_TOTAL_FIELDS] = { /* frontend/backend/server fields */ const struct promex_metric promex_st_metrics[ST_F_TOTAL_FIELDS] = { - //[ST_F_PXNAME] ignored - //[ST_F_SVNAME] ignored - [ST_F_QCUR] = { .n = IST("current_queue"), .type = PROMEX_MT_GAUGE,.flags = ( PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_QMAX] = { .n = IST("max_queue"), .type = PROMEX_MT_GAUGE,.flags = ( PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_SCUR] = { .n = IST("current_sessions"), .type = PROMEX_MT_GAUGE,.flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_SMAX] = { .n = IST("max_sessions"), .type = PROMEX_MT_GAUGE,.flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_SLIM] = { .n = IST("limit_sessions"), .type = PROMEX_MT_GAUGE,.flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_STOT] = { .n = IST("sessions_total"), .type = PROMEX_MT_COUNTER, .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_BIN]= { .n = IST("bytes_in_total"), .type = PROMEX_MT_COUNTER, .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_BOUT] = { .n = IST("bytes_out_total"), .type = PROMEX_MT_COUNTER, .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_DREQ] = { .n = IST("requests_denied_total"), .type = PROMEX_MT_COUNTER, .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC ) }, - [ST_F_DRESP] = { .n = IST("responses_denied_total"), .type = PROMEX_MT_COUNTER, .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_EREQ] = { .n = IST("request_errors_total"), .type = PROMEX_MT_COUNTER, .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC ) }, - [ST_F_ECON] = { .n = IST("connection_errors_total"), .type = PROMEX_MT_COUNTER, .flags = ( PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_ERESP] = { .n = IST("response_errors_total"), .type = PROMEX_MT_COUNTER, .flags = ( PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_WRETR] = { .n = IST("retry_warnings_total"), .type = PROMEX_MT_COUNTER, .flags = ( PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_WREDIS] = { .n = IST("redispatch_warnings_total"), .type = PROMEX_MT_COUNTER, .flags = ( PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_STATUS] = { .n = IST("status"), .type = PROMEX_MT_GAUGE,.flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_WEIGHT]
[PATCH] DOC: stats: fix location of the text representation
`info_field_names` and `stat_field_names` no longer exist and have been moved in stats.c To avoid changing this comment, just mention the name of the new table `info_fields` and `stat_fields` Signed-off-by: William Dauchy --- include/haproxy/stats-t.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/haproxy/stats-t.h b/include/haproxy/stats-t.h index 312013364..9ac875e73 100644 --- a/include/haproxy/stats-t.h +++ b/include/haproxy/stats-t.h @@ -253,8 +253,8 @@ enum field_scope { FS_MASK = 0xFF00, }; -/* Show Info fields for CLI output. For any field added here, please add the text - * representation in the info_field_names array below. Please only append at the end, +/* Show info fields for CLI output. For any field added here, please add the + * text representation in the info_fields array. Please only append at the end, * before the INF_TOTAL_FIELDS entry, and never insert anything in the middle * nor at the beginning. */ @@ -338,7 +338,7 @@ enum info_field { /* Stats fields for CSV output. For any field added here, please add the text - * representation in the stat_field_names array below. Please only append at the end, + * representation in the stat_fields array. Please only append at the end, * before the ST_F_TOTAL_FIELDS entry, and never insert anything in the middle * nor at the beginning. */ -- 2.30.2
Re: http-request return bytes_read from v2.3 to v2.4
On Mon, Jul 26, 2021 at 8:35 AM Christopher Faulet wrote: > The response size reported in the log messages is related to the internal > representation of the HTTP message. It is a limitation of the current design. > It > means this size is not equal to the raw size of the message and may change > when > the HTX representation changes. In this case, in 2.4, the start-line > representation was changed, a 32-bits integer was removed from hx_sl > structure. > In addition, the end-of-message block (EOM) was removed from the HTX > representation. It counted for 1 byte. This explains the difference of 5 bytes > between 2.3 and 2.4. Thanks Christophe, crystal clear explanation! -- William
http-request return bytes_read from v2.3 to v2.4
Hello, While upgrading from v2.3.x to v2.4.x I noticed a difference of bytes read in requests http logs with this simple frontend config: frontend foo bind 127.0.0.1:8000 http-request return in v2.3, the logs (%B) tells me 54 bytes, in v2.4, I am getting 49. So a difference of 5 bytes per request. I was simply curious to understand that behavior change. Is it a change in the way we count bytes in stats, or is there really a difference in the payload we write? When I look at a dump, the TCP payloads are identical with 57 bytes in both versions. And the overall length of the packet is 123 bytes. Maybe that's expected, but I wanted some clarification. Thanks, -- William
Re: [ANNOUNCE] haproxy-2.4.0
On Fri, May 14, 2021 at 11:58 AM Willy Tarreau wrote: > - interoperability / protocol support: WebSocket over HTTP/2 (RFC8441) > is now supported on both sides, regardless of the version on the other > side. The cache now supports the "Vary" header with a few commonly > used headers, including "Accept-encoding" which gets normalized for > optimal cache hit ratio. The Prometheus exporter got a significant > liftup, requires less tricks on the Prometheus side, and supports > listing only certain metrics for faster retrieval. Optional native > support for Opentracing was also integrated (via USE_OT=1). The DNS > resolvers now support talking to servers over TCP. Basic support for > extracting information from MQTT and FIX protocol was added. Timeouts > can now be adjusted on the fly and per-request in order to adapt to > particuarly slow servers or special protocols. Regarding prometheus, it should probably noted some major changes regarding some metrics, such as for health check, where the value is now located in a label, instead of the value of the metric itself: see also http://git.haproxy.org/?p=haproxy.git;a=commit;h=de3c32638925c2071a5a84cbdafe2f112d2c4261 -- William
Re: [ANNOUNCE] haproxy-2.3.9
On Tue, Mar 30, 2021 at 6:59 PM Willy Tarreau wrote: > HAProxy 2.3.9 was released on 2021/03/30. It added 5 new commits > after version 2.3.8. > > This essentially fixes the rate counters issue that popped up in 2.3.8 > after the previous fix for the rate counters already. > > What happened is that the internal time in millisecond wraps every 49.7 > days and that the new global counter used to make sure rate counters are > now stable across threads starts at zero and is initialized when older > than the current thread's current date. It just happens that the wrapping > happened a few hours ago at "Mon Mar 29 23:59:46 CEST 2021" exactly and > that any process started since this date and for the next 24 days doesn't > validate this condition anymore, hence doesn't rotate its rate counters > anymore. Thanks Willy for the quick update. That's a good example to avoid pushing stable versions at the same time, so we have opportunities to find those regressions. -- William
Re: Table sticky counters decrementation problem
On Tue, Mar 30, 2021 at 5:57 PM Willy Tarreau wrote: > out of curiosity I wanted to check when the overflow happened: > > $ date --date=@$$(date +%s) * 1000) & -0x800) / 1000)) > Mon Mar 29 23:59:46 CEST 2021 > > So it only affects processes started since today. I'm quite tempted not > to wait further and to emit 2.3.9 urgently to fix this before other > people get trapped after reloading their process. Any objection ? I do confirm the timestamp on our side but do not have the necessary tooling to test the fix. Thanks, -- William
Re: "[ANNOUNCE] haproxy-2.3.6
Hi, On Wed, Mar 3, 2021 at 4:09 PM Christopher Faulet wrote: >- An issue leading to possible infinite loops because of a double locking > effect in the mt lists was fixed by Olivier. If MT_LIST_TRY_ADDQ() > macro, it was possible to try to lock twice the same element, making the > second lock attempt to fail in loop. > Olivier Houchard (1): >BUG/MEDIUM: lists: Avoid an infinite loop in MT_LIST_TRY_ADDQ(). not very clear in which conditions it can be triggered. Do you have more details about it? Thanks, -- William
[PATCH] BUG/MEDIUM: contrib/prometheus-exporter: fix segfault in listener name dump
We need to check whether listener is empty before doing anything; in that case, we were trying to dump listerner name while name is null. So simply move the counters check above, which validate all possible cases when the listener is empty. This is very similar to what is done in stats.c see also the trace: Thread 1 "haproxy" received signal SIGSEGV, Segmentation fault. __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:120 120 ../sysdeps/x86_64/multiarch/../strlen.S: No such file or directory. (gdb) bt #0 __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:120 #1 0x555b716b in promex_dump_listener_metrics (htx=0x558fadf0, appctx=0x55926070) at contrib/prometheus-exporter/service-prometheus.c:722 #2 promex_dump_metrics (htx=0x558fadf0, si=0x55925920, appctx=0x55926070) at contrib/prometheus-exporter/service-prometheus.c:1200 #3 promex_appctx_handle_io (appctx=0x55926070) at contrib/prometheus-exporter/service-prometheus.c:1477 #4 0x556f0c94 in task_run_applet (t=0x55926180, context=0x55926070, state=) at src/applet.c:88 #5 0x556bc6d8 in run_tasks_from_lists (budgets=budgets@entry=0x7fffe374) at src/task.c:548 #6 0x556bd1a0 in process_runnable_tasks () at src/task.c:750 #7 0x55696cdd in run_poll_loop () at src/haproxy.c:2870 #8 0x55697025 in run_thread_poll_loop (data=data@entry=0x0) at src/haproxy.c:3035 #9 0x55596c90 in main (argc=, argv=0x7fffe818) at src/haproxy.c:3723 quit) this bug was introduced by commit e3f7bd5ae9e969cbfe87e4130d06bff7a3e814c6 ("MEDIUM: contrib/prometheus-exporter: add listen stats"), which is present for 2.4 only, so no backport needed. Signed-off-by: William Dauchy --- contrib/prometheus-exporter/service-prometheus.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c index 7cf30c1f3..e6023d353 100644 --- a/contrib/prometheus-exporter/service-prometheus.c +++ b/contrib/prometheus-exporter/service-prometheus.c @@ -718,12 +718,12 @@ static int promex_dump_listener_metrics(struct appctx *appctx, struct htx *htx) li = appctx->ctx.stats.obj2; list_for_each_entry_from(li, >conf.listeners, by_fe) { - labels[1].name = ist("listener"); - labels[1].value = ist2(li->name, strlen(li->name)); - if (!li->counters) continue; + labels[1].name = ist("listener"); + labels[1].value = ist2(li->name, strlen(li->name)); + if (!stats_fill_li_stats(px, li, 0, stats, ST_F_TOTAL_FIELDS, &(appctx->st2))) return -1; -- 2.30.0
[PATCH 2/3] REGTESTS: contrib/prometheus-exporter: test NaN values
In order to make sure we detect when we change default behaviour for some metrics, test the NaN value when it is expected. Those metrics were listed since our last rework as their default value changed, unless the appropriate config is set. Signed-off-by: William Dauchy --- reg-tests/contrib/prometheus.vtc | 30 ++ 1 file changed, 30 insertions(+) diff --git a/reg-tests/contrib/prometheus.vtc b/reg-tests/contrib/prometheus.vtc index cdd0f0f55..1ebeb29cb 100644 --- a/reg-tests/contrib/prometheus.vtc +++ b/reg-tests/contrib/prometheus.vtc @@ -10,6 +10,11 @@ server s1 { txresp } -repeat 2 -start +server s2 { + rxreq + txresp +} -repeat 2 -start + haproxy h1 -conf { defaults mode http @@ -29,11 +34,13 @@ haproxy h1 -conf { backend be stick-table type ip size 1m expire 10s store http_req_rate(10s) server s1 ${s1_addr}:${s1_port} + server s2 ${s2_addr}:${s2_port} check maxqueue 10 maxconn 12 pool-max-conn 42 } -start client c1 -connect ${h1_stats_sock} { txreq -url "/metrics" rxresp + # test general metrics expect resp.status == 200 expect resp.body ~ ".*haproxy_process.*" expect resp.body ~ ".*haproxy_frontend.*" @@ -42,6 +49,29 @@ client c1 -connect ${h1_stats_sock} { expect resp.body ~ ".*haproxy_server.*" expect resp.body ~ ".*haproxy_sticktable.*" + # test expected NaN values + expect resp.body ~ ".*haproxy_server_check_failures_total{proxy=\"be\",server=\"s1\"} NaN.*" + expect resp.body ~ ".*haproxy_server_check_up_down_total{proxy=\"be\",server=\"s1\"} NaN.*" + expect resp.body ~ ".*haproxy_server_check_failures_total{proxy=\"be\",server=\"s2\"} 0.*" + expect resp.body ~ ".*haproxy_server_check_up_down_total{proxy=\"be\",server=\"s2\"} 0.*" + + expect resp.body ~ ".*haproxy_server_queue_limit{proxy=\"be\",server=\"s1\"} NaN.*" + expect resp.body ~ ".*haproxy_server_queue_limit{proxy=\"be\",server=\"s2\"} 10.*" + + expect resp.body ~ ".*haproxy_server_limit_sessions{proxy=\"be\",server=\"s1\"} NaN.*" + expect resp.body ~ ".*haproxy_server_limit_sessions{proxy=\"be\",server=\"s2\"} 12.*" + + expect resp.body ~ ".*haproxy_backend_downtime_seconds_total{proxy=\"stats\"} NaN.*" + expect resp.body ~ ".*haproxy_backend_downtime_seconds_total{proxy=\"be\"} 0.*" + expect resp.body ~ ".*haproxy_server_downtime_seconds_total{proxy=\"be\",server=\"s1\"} NaN.*" + expect resp.body ~ ".*haproxy_server_downtime_seconds_total{proxy=\"be\",server=\"s2\"} 0.*" + + expect resp.body ~ ".*haproxy_server_current_throttle{proxy=\"be\",server=\"s1\"} NaN.*" + + expect resp.body ~ ".*haproxy_server_idle_connections_limit{proxy=\"be\",server=\"s1\"} NaN.*" + expect resp.body ~ ".*haproxy_server_idle_connections_limit{proxy=\"be\",server=\"s2\"} 42.*" + + # test scope txreq -url "/metrics?scope=" rxresp expect resp.status == 200 -- 2.30.0
[PATCH 1/3] DOC: contrib/prometheus-exporter: remove htx reference
now that htx is the default everywhere, we can remove the need to put htx as a mandatory option to setup prometheus. Signed-off-by: William Dauchy --- contrib/prometheus-exporter/README | 1 - 1 file changed, 1 deletion(-) diff --git a/contrib/prometheus-exporter/README b/contrib/prometheus-exporter/README index 30154de95..fdbc50203 100644 --- a/contrib/prometheus-exporter/README +++ b/contrib/prometheus-exporter/README @@ -27,7 +27,6 @@ and the corresponding HTTP proxy must enable the HTX support. For instance: frontend test mode http ... -option http-use-htx http-request use-service prometheus-exporter if { path /metrics } ... -- 2.30.0
[PATCH 3/3] REGTESTS: contrib/prometheus-exporter: test well known labels
as we previously briefly broke labels handling, test them to make sure we don't introduce regressions in the future. see also commit 040b1195f70d6a24204ede081451fd1dd71e6a34 ("BUG/MINOR: contrib/prometheus-exporter: Restart labels dump at the right pos") for reference Signed-off-by: William Dauchy --- reg-tests/contrib/prometheus.vtc | 9 + 1 file changed, 9 insertions(+) diff --git a/reg-tests/contrib/prometheus.vtc b/reg-tests/contrib/prometheus.vtc index 1ebeb29cb..ebe0b8753 100644 --- a/reg-tests/contrib/prometheus.vtc +++ b/reg-tests/contrib/prometheus.vtc @@ -71,6 +71,15 @@ client c1 -connect ${h1_stats_sock} { expect resp.body ~ ".*haproxy_server_idle_connections_limit{proxy=\"be\",server=\"s1\"} NaN.*" expect resp.body ~ ".*haproxy_server_idle_connections_limit{proxy=\"be\",server=\"s2\"} 42.*" + # test well known labels presence + expect resp.body ~ ".*haproxy_process_build_info{version=\".*\"} 1.*" + expect resp.body ~ ".*haproxy_frontend_http_responses_total{proxy=\"stats\",code=\"4xx\"} 0.*" + expect resp.body ~ ".*haproxy_frontend_status{proxy=\"fe\",state=\"UP\"} 1.*" + expect resp.body ~ ".*haproxy_listener_status{proxy=\"stats\",listener=\"sock-1\",state=\"WAITING\"} 0.*" + expect resp.body ~ ".*haproxy_backend_status{proxy=\"be\",state=\"UP\"} 1.*" + expect resp.body ~ ".*haproxy_server_status{proxy=\"be\",server=\"s1\",state=\"DOWN\"} 0.*" + expect resp.body ~ ".*haproxy_server_check_status{proxy=\"be\",server=\"s2\",state=\"HANA\"} 0.*" + # test scope txreq -url "/metrics?scope=" rxresp -- 2.30.0
[PATCH] MINOR: cli: add missing agent commands for set server
we previously forgot to add `agent-*` commands. Take this opportunity to rewrite the help string in a simpler way for readability (mainly removing simple quotes) Signed-off-by: William Dauchy --- src/server.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/server.c b/src/server.c index da6ee52ad..1c1eeab73 100644 --- a/src/server.c +++ b/src/server.c @@ -4618,9 +4618,10 @@ static int cli_parse_set_server(char **args, char *payload, struct appctx *appct #endif } else { cli_err(appctx, - "'set server ' only supports 'agent', 'health', " - "'state', 'weight', 'addr', 'fqdn', 'check-addr', " - "'check-port' and 'ssl'.\n"); + "usage: set server / " + "addr | agent | agent-addr | agent-port | agent-send | " + "check-addr | check-port | fqdn | health | ssl | " + "state | weight\n"); } out_unlock: HA_SPIN_UNLOCK(SERVER_LOCK, >lock); -- 2.30.0
[PATCH 2/3] MINOR: stats: add helper to get status string
move listen status to a helper, defining both status enum and string definition. this will be helpful to be reused in prometheus code. It also removes this hard-to-read nested ternary. Signed-off-by: William Dauchy --- include/haproxy/listener-t.h | 9 + include/haproxy/listener.h | 3 +++ src/listener.c | 18 ++ src/stats.c | 2 +- 4 files changed, 31 insertions(+), 1 deletion(-) diff --git a/include/haproxy/listener-t.h b/include/haproxy/listener-t.h index ce5ed408f..7d3998ece 100644 --- a/include/haproxy/listener-t.h +++ b/include/haproxy/listener-t.h @@ -84,6 +84,15 @@ enum li_state { * not rely on this state. */ +/* listener status for stats */ +enum li_status { + LI_STATUS_WAITING = 0, + LI_STATUS_OPEN, + LI_STATUS_FULL, + + LI_STATE_COUNT /* must be last */ +}; + /* listener socket options */ #define LI_O_NONE 0x #define LI_O_NOLINGER 0x0001 /* disable linger on this socket */ diff --git a/include/haproxy/listener.h b/include/haproxy/listener.h index 1be8551c3..f229efae4 100644 --- a/include/haproxy/listener.h +++ b/include/haproxy/listener.h @@ -218,6 +218,9 @@ struct task *manage_global_listener_queue(struct task *t, void *context, unsigne extern struct accept_queue_ring accept_queue_rings[MAX_THREADS] __attribute__((aligned(64))); +extern const char* li_status_st[LI_STATE_COUNT]; +enum li_status get_li_status(struct listener *l); + #endif /* _HAPROXY_LISTENER_H */ /* diff --git a/src/listener.c b/src/listener.c index 0b929b906..9ca910b37 100644 --- a/src/listener.c +++ b/src/listener.c @@ -46,6 +46,12 @@ static struct bind_kw_list bind_keywords = { static struct mt_list global_listener_queue = MT_LIST_HEAD_INIT(global_listener_queue); static struct task *global_listener_queue_task; +/* listener status for stats */ +const char* li_status_st[LI_STATE_COUNT] = { + [LI_STATUS_WAITING] = "WAITING", + [LI_STATUS_OPEN]= "OPEN", + [LI_STATUS_FULL]= "FULL", +}; #if defined(USE_THREAD) @@ -183,6 +189,18 @@ REGISTER_CONFIG_POSTPARSER("multi-threaded accept queue", accept_queue_init); #endif // USE_THREAD +/* helper to get listener status for stats */ +enum li_status get_li_status(struct listener *l) +{ + if (!l->maxconn || l->nbconn < l->maxconn) { + if (l->state == LI_LIMITED) + return LI_STATUS_WAITING; + else + return LI_STATUS_OPEN; + } + return LI_STATUS_FULL; +} + /* adjust the listener's state and its proxy's listener counters if needed. * It must be called under the listener's lock, but uses atomic ops to change * the proxy's counters so that the proxy lock is not needed. diff --git a/src/stats.c b/src/stats.c index 1e0db0b24..a63178d5a 100644 --- a/src/stats.c +++ b/src/stats.c @@ -1898,7 +1898,7 @@ int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags, metric = mkf_u64(FN_COUNTER, l->counters->denied_sess); break; case ST_F_STATUS: - metric = mkf_str(FO_STATUS, (!l->maxconn || l->nbconn < l->maxconn) ? (l->state == LI_LIMITED) ? "WAITING" : "OPEN" : "FULL"); + metric = mkf_str(FO_STATUS, li_status_st[get_li_status(l)]); break; case ST_F_PID: metric = mkf_u32(FO_KEY, relative_pid); -- 2.30.0
[PATCH 1/3] MEDIUM: stats: allow to select one field in `stats_fill_li_stats`
prometheus approach requires to output all values for a given metric name; meaning we iterate through all metrics, and then iterate in the inner loop on all objects for this metric. In order to allow more code reuse, adapt the stats API to be able to select one field or fill them all otherwise. >From this patch it should be possible to add support for listen stats in prometheus. Signed-off-by: William Dauchy --- include/haproxy/stats.h | 2 +- src/hlua_fcn.c | 3 +- src/stats.c | 166 +++- 3 files changed, 116 insertions(+), 55 deletions(-) diff --git a/include/haproxy/stats.h b/include/haproxy/stats.h index 6115dca8f..cbc33f279 100644 --- a/include/haproxy/stats.h +++ b/include/haproxy/stats.h @@ -49,7 +49,7 @@ int stats_fill_info(struct field *info, int len); int stats_fill_fe_stats(struct proxy *px, struct field *stats, int len, enum stat_field *selected_field); int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags, -struct field *stats, int len); +struct field *stats, int len, enum stat_field *selected_field); int stats_fill_sv_stats(struct proxy *px, struct server *sv, int flags, struct field *stats, int len, enum stat_field *selected_field); int stats_fill_be_stats(struct proxy *px, int flags, struct field *stats, int len, diff --git a/src/hlua_fcn.c b/src/hlua_fcn.c index 6dd9efb21..38a9cfd21 100644 --- a/src/hlua_fcn.c +++ b/src/hlua_fcn.c @@ -866,7 +866,8 @@ int hlua_listener_get_stats(lua_State *L) return 1; } - stats_fill_li_stats(li->bind_conf->frontend, li, STAT_SHLGNDS, stats, STATS_LEN); + stats_fill_li_stats(li->bind_conf->frontend, li, STAT_SHLGNDS, stats, + STATS_LEN, NULL); lua_newtable(L); for (i=0; i with the listener statistics. is - * preallocated array of length . The length of the array - * must be at least ST_F_TOTAL_FIELDS. If this length is less - * then this value, the function returns 0, otherwise, it - * returns 1. can take the value STAT_SHLGNDS. +/* Fill with the listener statistics. is preallocated array of + * length . The length of the array must be at least ST_F_TOTAL_FIELDS. If + * this length is less then this value, the function returns 0, otherwise, it + * returns 1. If selected_field is != NULL, only fill this one. can + * take the value STAT_SHLGNDS. */ int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags, -struct field *stats, int len) +struct field *stats, int len, enum stat_field *selected_field) { + enum stat_field current_field = (selected_field != NULL ? *selected_field : 0); struct buffer *out = get_trash_chunk(); if (len < ST_F_TOTAL_FIELDS) @@ -1850,54 +1851,112 @@ int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags, chunk_reset(out); - stats[ST_F_PXNAME] = mkf_str(FO_KEY|FN_NAME|FS_SERVICE, px->id); - stats[ST_F_SVNAME] = mkf_str(FO_KEY|FN_NAME|FS_SERVICE, l->name); - stats[ST_F_MODE] = mkf_str(FO_CONFIG|FS_SERVICE, proxy_mode_str(px->mode)); - stats[ST_F_SCUR] = mkf_u32(0, l->nbconn); - stats[ST_F_SMAX] = mkf_u32(FN_MAX, l->counters->conn_max); - stats[ST_F_SLIM] = mkf_u32(FO_CONFIG|FN_LIMIT, l->maxconn); - stats[ST_F_STOT] = mkf_u64(FN_COUNTER, l->counters->cum_conn); - stats[ST_F_BIN] = mkf_u64(FN_COUNTER, l->counters->bytes_in); - stats[ST_F_BOUT] = mkf_u64(FN_COUNTER, l->counters->bytes_out); - stats[ST_F_DREQ] = mkf_u64(FN_COUNTER, l->counters->denied_req); - stats[ST_F_DRESP]= mkf_u64(FN_COUNTER, l->counters->denied_resp); - stats[ST_F_EREQ] = mkf_u64(FN_COUNTER, l->counters->failed_req); - stats[ST_F_DCON] = mkf_u64(FN_COUNTER, l->counters->denied_conn); - stats[ST_F_DSES] = mkf_u64(FN_COUNTER, l->counters->denied_sess); - stats[ST_F_STATUS] = mkf_str(FO_STATUS, (!l->maxconn || l->nbconn < l->maxconn) ? (l->state == LI_LIMITED) ? "WAITING" : "OPEN" : "FULL"); - stats[ST_F_PID] = mkf_u32(FO_KEY, relative_pid); - stats[ST_F_IID] = mkf_u32(FO_KEY|FS_SERVICE, px->uuid); - stats[ST_F_SID] = mkf_u32(FO_KEY|FS_SERVICE, l->luid); - stats[ST_F_TYPE] = mkf_u32(FO_CONFIG|FS_SERVICE, STATS_TYPE_SO); - stats[ST_F_WREW] = mkf_u64(FN_COUNTER, l->counters->failed_rewrites); - stats[ST_F_EINT] = mkf_u64(FN_COUNTER, l->counters->internal_errors); - - if (flags & STAT_SHLGNDS) { - char str[INET6_ADDRSTRLEN]; - int port; - - port = get_host_port(>rx.addr);
[PATCH 0/3] prometheus: add listen stats
Hello Christopher, I know I'm a bit late regarding the merge window; this is however the logical followup of my prometheus work, now adding listen stats. patch 2/3 is a proposition but I'm ok to revisit. William Dauchy (3): MEDIUM: stats: allow to select one field in `stats_fill_li_stats` MINOR: stats: add helper to get status string MEDIUM: contrib/prometheus-exporter: add listen stats contrib/prometheus-exporter/README| 24 +- .../prometheus-exporter/service-prometheus.c | 274 -- include/haproxy/listener-t.h | 9 + include/haproxy/listener.h| 3 + include/haproxy/stats.h | 2 +- reg-tests/contrib/prometheus.vtc | 4 + src/hlua_fcn.c| 3 +- src/listener.c| 18 ++ src/stats.c | 166 +++ 9 files changed, 365 insertions(+), 138 deletions(-) -- 2.30.0
[PATCH 3/3] MEDIUM: contrib/prometheus-exporter: add listen stats
this was a missing piece for a while now even though it was planned. This patch adds listen stats. Nothing in particular but we make use of the status helper previously added. `promex_st_metrics` diff also looks scary, but I had to realign all lines. Signed-off-by: William Dauchy --- contrib/prometheus-exporter/README| 24 +- .../prometheus-exporter/service-prometheus.c | 274 -- reg-tests/contrib/prometheus.vtc | 4 + 3 files changed, 219 insertions(+), 83 deletions(-) diff --git a/contrib/prometheus-exporter/README b/contrib/prometheus-exporter/README index d882b092f..f08cceba6 100644 --- a/contrib/prometheus-exporter/README +++ b/contrib/prometheus-exporter/README @@ -70,6 +70,7 @@ exported. Here are examples: /metrics?scope=server # ==> server metrics will be exported /metrics?scope=frontend=backend # ==> Frontend and backend metrics will be exported + /metrics?scope=listen # ==> listen metrics will be exported /metrics?scope=*= # ==> no metrics will be exported /metrics?scope==global # ==> global metrics will be exported /metrics?scope=sticktable # ==> stick tables metrics will be exported @@ -102,7 +103,7 @@ except the server_check_status, you may configure prometheus that way: - metric_relabel_configs: - source_labels: ['__name__'] - regex: 'haproxy_(process_|frontend_|backend_|server_check_status).*' + regex: 'haproxy_(process_|frontend_|listen_|backend_|server_check_status).*' action: keep Exported metrics @@ -212,6 +213,27 @@ See prometheus export for the description of each field. | haproxy_frontend_internal_errors_total | +-+ +* Listen metrics + ++-+ +|Metric name | ++-+ +| haproxy_listen_current_sessions | +| haproxy_listen_max_sessions | +| haproxy_listen_limit_sessions | +| haproxy_listen_sessions_total | +| haproxy_listen_bytes_in_total | +| haproxy_listen_bytes_out_total | +| haproxy_listen_requests_denied_total| +| haproxy_listen_responses_denied_total | +| haproxy_listen_request_errors_total | +| haproxy_listen_status | +| haproxy_listen_denied_connections_total | +| haproxy_listen_denied_sessions_total| +| haproxy_listen_failed_header_rewriting_total| +| haproxy_listen_internal_errors_total| ++-+ + * Backend metrics +-+ diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c index 403f5a619..70ea1b7b4 100644 --- a/contrib/prometheus-exporter/service-prometheus.c +++ b/contrib/prometheus-exporter/service-prometheus.c @@ -63,17 +63,19 @@ enum { #define PROMEX_FL_FRONT_METRIC 0x0004 #define PROMEX_FL_BACK_METRIC 0x0008 #define PROMEX_FL_SRV_METRIC0x0010 -#define PROMEX_FL_SCOPE_GLOBAL 0x0020 -#define PROMEX_FL_SCOPE_FRONT 0x0040 -#define PROMEX_FL_SCOPE_BACK0x0080 -#define PROMEX_FL_SCOPE_SERVER 0x0100 -#define PROMEX_FL_NO_MAINT_SRV 0x0200 -#define PROMEX_FL_STICKTABLE_METRIC 0x0400 -#define PROMEX_FL_SCOPE_STICKTABLE 0x0800 +#define PROMEX_FL_LI_METRIC 0x0020 +#define PROMEX_FL_STICKTABLE_METRIC 0x0040 +#define PROMEX_FL_SCOPE_GLOBAL 0x0080 +#define PROMEX_FL_SCOPE_FRONT 0x0100 +#define PROMEX_FL_SCOPE_BACK0x0200 +#define PROMEX_FL_SCOPE_SERVER 0x0400 +#define PROMEX_FL_SCOPE_LI 0x0800 +#define PROMEX_FL_SCOPE_STICKTABLE 0x1000 +#define PROMEX_FL_NO_MAINT_SRV 0x2000 #define PROMEX_FL_SCOPE_ALL (PROMEX_FL_SCOPE_GLOBAL | PROMEX_FL_SCOPE_FRONT | \ -PROMEX_FL_SCOPE_BACK | PROMEX_FL_SCOPE_SERVER | \ -PROMEX_FL_SCOPE_STICKTABLE) +PROMEX_FL_SCOPE_LI | PROMEX_FL_SCOPE_BACK | \ +PROMEX_FL_SCOPE_SERVER | PROMEX_FL_SCOPE_STICKTABLE) /* Promtheus metric type (gauge or counter) */ enum promex_mt_type { @@ -187,66 +189,66 @@ const struct promex_metric promex_global_metrics[INF_TOTAL_FIELDS] = { const struct promex_metric promex_st_metrics[ST_F_TOTAL_FIELDS] = { //[ST_F_PXNAME] ignored //[ST_F_SVNAME] ignored - [ST_F_QCUR] = { .n = IST("current_queue"), .type = PROMEX_MT_GAUGE,.flags = ( PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, - [ST_F_QMAX]
[PATCH 2/2] CLEANUP: contrib/prometheus-exporter: align for with srv status case
the for loop was wrongly indented following our recent rework Signed-off-by: William Dauchy --- contrib/prometheus-exporter/service-prometheus.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c index 403f5a619..1b08fcfb4 100644 --- a/contrib/prometheus-exporter/service-prometheus.c +++ b/contrib/prometheus-exporter/service-prometheus.c @@ -861,7 +861,7 @@ static int promex_dump_srv_metrics(struct appctx *appctx, struct htx *htx) switch (appctx->st2) { case ST_F_STATUS: state = promex_srv_status(sv); - for (; appctx->ctx.stats.st_code < PROMEX_SRV_STATE_COUNT; appctx->ctx.stats.st_code++) { + for (; appctx->ctx.stats.st_code < PROMEX_SRV_STATE_COUNT; appctx->ctx.stats.st_code++) { val = mkf_u32(FO_STATUS, state == appctx->ctx.stats.st_code); labels[2].name = ist("state"); labels[2].value = promex_srv_st[appctx->ctx.stats.st_code]; -- 2.30.0
[PATCH 1/2] CLEANUP: check: fix get_check_status_info declaration
we always put a \n between function name and `{` Signed-off-by: William Dauchy --- src/check.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/check.c b/src/check.c index d37a55201..d17f747fe 100644 --- a/src/check.c +++ b/src/check.c @@ -178,8 +178,8 @@ const char *get_check_status_description(short check_status) { } /* Converts check_status code to short info */ -const char *get_check_status_info(short check_status) { - +const char *get_check_status_info(short check_status) +{ const char *info; if (check_status < HCHK_STATUS_SIZE) -- 2.30.0
[PATCH] DOC: tune: explain the origin of block size for ssl.cachesize
A user could eventually ask himself where those 200 bytes block size are coming from. This patch tries to better explain the origin in case people are curious or want to double check the reality. Signed-off-by: William Dauchy --- doc/configuration.txt | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/doc/configuration.txt b/doc/configuration.txt index 391e074a7..b21c56091 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -2451,16 +2451,17 @@ tune.sndbuf.server tune.ssl.cachesize Sets the size of the global SSL session cache, in a number of blocks. A block - is large enough to contain an encoded session without peer certificate. - An encoded session with peer certificate is stored in multiple blocks - depending on the size of the peer certificate. A block uses approximately - 200 bytes of memory. The default value may be forced at build time, otherwise - defaults to 2. When the cache is full, the most idle entries are purged - and reassigned. Higher values reduce the occurrence of such a purge, hence - the number of CPU-intensive SSL handshakes by ensuring that all users keep - their session as long as possible. All entries are pre-allocated upon startup - and are shared between all processes if "nbproc" is greater than 1. Setting - this value to 0 disables the SSL session cache. + is large enough to contain an encoded session without peer certificate. An + encoded session with peer certificate is stored in multiple blocks depending + on the size of the peer certificate. A block uses approximately 200 bytes of + memory (based on `sizeof(struct sh_ssl_sess_hdr) + SHSESS_BLOCK_MIN_SIZE` + calculation used for `shctx_init` function). The default value may be forced + at build time, otherwise defaults to 2. When the cache is full, the most + idle entries are purged and reassigned. Higher values reduce the occurrence + of such a purge, hence the number of CPU-intensive SSL handshakes by ensuring + that all users keep their session as long as possible. All entries are + pre-allocated upon startup and are shared between all processes if "nbproc" + is greater than 1. Setting this value to 0 disables the SSL session cache. tune.ssl.force-private-cache This option disables SSL session cache sharing between all processes. It -- 2.30.0
Re: [PATCH v3 0/5] cli commands for checks and agent
On Fri, Feb 12, 2021 at 3:06 PM Christopher Faulet wrote: > I just slightly amended the 3rd patch to handle the v2 in > apply_server_state(). > There is a test on the version when a state-file is local to a proxy. Just a > minor change. ok, thanks for that. > And in the last one, I removed the "chunk_appendf(msg, "\n");" to > move the LF in ha_warning() calls. yes ok it is probably clearer that way. Glad to see the end of this series, now I'm ready to receive related bug reports ;) -- William
Re: stats / "show servers conn" looses counter after reload
Hi Christian, On Fri, Feb 12, 2021 at 11:59 AM Christian Ruppert wrote: > Is this a bug? Can you confirm this behavior? Is there any other way I > could figure out whether a backend is currently in use? unfortunately reload does not recover stats values; it is a known problem; see also https://github.com/haproxy/haproxy/issues/954 -- William
[PATCH v3 4/5] MEDIUM: server: support {check,agent}_addr, agent_port in server state
logical followup from cli commands addition, so that the state server file stays compatible with the changes made at runtime; use previously added helper to load server attributes. also alloc a specific chunk to avoid mixing with other called functions using it Signed-off-by: William Dauchy --- doc/management.txt| 5 +- include/haproxy/server-t.h| 9 ++- .../checks/1be_40srv_odd_health_checks.vtc| 2 +- .../checks/40be_2srv_odd_health_checks.vtc| 2 +- reg-tests/checks/4be_1srv_health_checks.vtc | 6 +- src/proxy.c | 41 ++- src/server.c | 68 +-- 7 files changed, 87 insertions(+), 46 deletions(-) diff --git a/doc/management.txt b/doc/management.txt index 423c614b2..60e25c7e1 100644 --- a/doc/management.txt +++ b/doc/management.txt @@ -2455,7 +2455,10 @@ show servers state [] srv_port:Server port. srvrecord: DNS SRV record associated to this SRV. srv_use_ssl: use ssl for server connections. - srv_check_port: Server check port. + srv_check_port: Server health check port. + srv_check_addr: Server health check address. + srv_agent_addr: Server health agent address. + srv_agent_port: Server health agent port. show sess Dump all known sessions. Avoid doing this on slow connections as this can diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h index 2ec70dde4..11cb71489 100644 --- a/include/haproxy/server-t.h +++ b/include/haproxy/server-t.h @@ -126,11 +126,14 @@ enum srv_initaddr { "srv_port " \ "srvrecord " \ "srv_use_ssl "\ -"srv_check_port" +"srv_check_port " \ +"srv_check_addr " \ +"srv_agent_addr " \ +"srv_agent_port" -#define SRV_STATE_FILE_MAX_FIELDS 22 +#define SRV_STATE_FILE_MAX_FIELDS 25 #define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20 -#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 22 +#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 25 #define SRV_STATE_LINE_MAXLEN 512 /* server flags -- 32 bits */ diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc b/reg-tests/checks/1be_40srv_odd_health_checks.vtc index f01205295..c279972aa 100644 --- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc +++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc @@ -112,6 +112,6 @@ syslog S -wait haproxy h1 -cli { send "show servers state" -expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord srv_use_ssl srv_check_port\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s0_port} - 0 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s1_port} - 0 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s2_port} - 0 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0 0\n2 be1 5 srv4 ${s4_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0 0\n2 be1 6 srv5 ${s5_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0 0\n2 be1 7 srv6 ${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0 0\n2 be1 8 srv7 ${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s7_port} - 0 0\n2 be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s8_port} - 0 0\n2 be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s9_port} - 0 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s10_port} - 0 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0 0\n2 be1 13 srv12 ${s12_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0 0\n2 be1 14 srv13 ${s13_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} - 0 0\n2 be1 15 srv14 ${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s14_port} - 0 0\n2 be1 16 srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s15_port} - 0 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s16_port} - 0 0\n2 be1 18 srv17 ${s17_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0 0\n2 be1 19 srv18 ${s18_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0 0\n2 be1 20 srv19 ${s19_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} - 0 0\n2 be1 21 srv20 ${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s20_port} - 0 0\n2 be1 22 srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s21_port} - 0 0\n2 b
[PATCH v3 1/5] MEDIUM: cli: add check-addr command
this patch allows to set server health check address at runtime. In order to align with `addr` command, also allow to set port optionnaly. This led to a small refactor in order to use the same function for both `check-addr` and `check-port` commands. for `check-port`, we however don't permit the change anymore if checks are not enabled on the server. This command becomes more and more useful for people having a consul like architecture: - the backend server is located on a container with its own IP - the health checks are done the consul instance located on the host with the host IP Signed-off-by: William Dauchy --- doc/management.txt | 4 ++ src/server.c | 95 ++ 2 files changed, 84 insertions(+), 15 deletions(-) diff --git a/doc/management.txt b/doc/management.txt index b74aba769..bff770e4e 100644 --- a/doc/management.txt +++ b/doc/management.txt @@ -1842,6 +1842,10 @@ set server / health [ up | stopping | down ] switch a server's state regardless of some slow health checks for example. Note that the change is propagated to tracking servers if any. +set server / check-addr [port ] + Change the IP address used for server health checks. + Optionally, change the port used for server health checks. + set server / check-port Change the port used for health checking to diff --git a/src/server.c b/src/server.c index 453c59866..74bd972e2 100644 --- a/src/server.c +++ b/src/server.c @@ -54,6 +54,8 @@ static int srv_set_fqdn(struct server *srv, const char *fqdn, int dns_locked); static void srv_state_parse_line(char *buf, const int version, char **params, char **srv_params); static int srv_state_get_version(FILE *f); static void srv_cleanup_connections(struct server *srv); +static const char *update_server_check_addr_port(struct server *s, const char *addr, +const char *port); /* List head of all known server keywords */ static struct srv_kw_list srv_keywords = { @@ -3574,6 +3576,59 @@ int update_server_addr(struct server *s, void *ip, int ip_sin_family, const char return 0; } +/* update server health check address and port + * addr must be ip4 or ip6, it won't be resolved + * if one error occurs, don't apply anything + * must be called with the server lock held. + */ +static const char *update_server_check_addr_port(struct server *s, const char *addr, +const char *port) +{ + struct sockaddr_storage sk; + struct buffer *msg; + int new_port; + + msg = get_trash_chunk(); + chunk_reset(msg); + + if (!(s->check.state & CHK_ST_ENABLED)) { + chunk_strcat(msg, "health checks are not enabled on this server"); + goto out; + } + if (addr) { + memset(, 0, sizeof(struct sockaddr_storage)); + if (str2ip2(addr, , 0) == NULL) { + chunk_appendf(msg, "invalid addr '%s'", addr); + goto out; + } + } + if (port) { + if (strl2irc(port, strlen(port), _port) != 0) { + chunk_appendf(msg, "provided port is not an integer"); + goto out; + } + if (new_port < 0 || new_port > 65535) { + chunk_appendf(msg, "provided port is invalid"); + goto out; + } + /* prevent the update of port to 0 if MAPPORTS are in use */ + if ((s->flags & SRV_F_MAPPORTS) && new_port == 0) { + chunk_appendf(msg, "can't unset 'port' since MAPPORTS is in use"); + goto out; + } + } +out: + if (msg->data) + return msg->area; + else { + if (addr) + s->check.addr = sk; + if (port) + s->check.port = new_port; + } + return NULL; +} + /* * This function update a server's addr and port only for AF_INET and AF_INET6 families. * @@ -4406,23 +4461,32 @@ static int cli_parse_set_server(char **args, char *payload, struct appctx *appct cli_err(appctx, "cannot allocate memory for new string.\n"); } } - else if (strcmp(args[3], "check-port") == 0) { - int i = 0; - if (strl2irc(args[4], strlen(args[4]), ) != 0) { - cli_err(appctx, "'set server check-port' expects an integer as argument.\n"); - goto out_unlock; - } - if ((i < 0) || (i > 65535)) { - cli_err(appctx, "provided port is not valid.\n"); + else if (strcmp(args[
[PATCH v3 5/5] MINOR: server: enhance error precision when applying server state
server health checks and agent parameters are written the same way as others to be able to enahcne code reuse: basically we make use of parsing and assignment at the same place. It makes it difficult for error handling to know whether srv object was modified partially or not. The problem was already present with SRV resolution though. I was a bit puzzled about the approach to take to be honest, and I did not wanted to go into a full refactor, so I assumed it was ok to simply notify whether the line was failed or partially applied. Signed-off-by: William Dauchy --- src/server.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/src/server.c b/src/server.c index 739bcfba6..b0a8f90d4 100644 --- a/src/server.c +++ b/src/server.c @@ -2625,6 +2625,7 @@ static void srv_update_state(struct server *srv, int version, char **params) unsigned int port_svc; char *srvrecord; char *addr; + int partial_apply = 0; #ifdef USE_OPENSSL int use_ssl; #endif @@ -2795,6 +2796,7 @@ static void srv_update_state(struct server *srv, int version, char **params) /* don't apply anything if one error has been detected */ if (msg->data) goto out; + partial_apply = 1; /* recover operational state and apply it to this server * and all servers tracking this one */ @@ -3032,8 +3034,12 @@ static void srv_update_state(struct server *srv, int version, char **params) HA_SPIN_UNLOCK(SERVER_LOCK, >lock); if (msg->data) { chunk_appendf(msg, "\n"); - ha_warning("server-state application failed for server '%s/%s'%s", - srv->proxy->id, srv->id, msg->area); + if (partial_apply == 1) + ha_warning("server-state partially applied for server '%s/%s'%s", + srv->proxy->id, srv->id, msg->area); + else + ha_warning("server-state application failed for server '%s/%s'%s", + srv->proxy->id, srv->id, msg->area); } end: free_trash_chunk(msg); -- 2.30.0
[PATCH v3 3/5] MEDIUM: server: add server-states version 2
Even if it is possibly too much work for the current usage, it makes sure we don't break states file from v2.3 to v2.4; indeed, since v2.3, we introduced two new fields, so we put them aside to guarantee we can easily reload from a version 1. The diff seems huge but there is no specific change apart from: - introduce v2 where it is needed (parsing, update) - move away from switch/case in update to be able to reuse code - move srv lock to the whole function to make it easier this patch confirm how painful it is to maintain this functionality. Signed-off-by: William Dauchy --- include/haproxy/server-t.h | 7 +- src/server.c | 742 ++--- 2 files changed, 367 insertions(+), 382 deletions(-) diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h index b326305e8..2ec70dde4 100644 --- a/include/haproxy/server-t.h +++ b/include/haproxy/server-t.h @@ -101,9 +101,9 @@ enum srv_initaddr { } __attribute__((packed)); /* server-state-file version */ -#define SRV_STATE_FILE_VERSION 1 +#define SRV_STATE_FILE_VERSION 2 #define SRV_STATE_FILE_VERSION_MIN 1 -#define SRV_STATE_FILE_VERSION_MAX 1 +#define SRV_STATE_FILE_VERSION_MAX 2 #define SRV_STATE_FILE_FIELD_NAMES \ "be_id " \ "be_name "\ @@ -129,7 +129,8 @@ enum srv_initaddr { "srv_check_port" #define SRV_STATE_FILE_MAX_FIELDS 22 -#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22 +#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20 +#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 22 #define SRV_STATE_LINE_MAXLEN 512 /* server flags -- 32 bits */ diff --git a/src/server.c b/src/server.c index 9a2a8c520..c2b272068 100644 --- a/src/server.c +++ b/src/server.c @@ -2632,391 +2632,383 @@ static void srv_update_state(struct server *srv, int version, char **params) fqdn = NULL; port_svc = port_check = 0; msg = get_trash_chunk(); - switch (version) { - case 1: - /* -* now we can proceed with server's state update: -* srv_addr: params[0] -* srv_op_state: params[1] -* srv_admin_state: params[2] -* srv_uweight: params[3] -* srv_iweight: params[4] -* srv_last_time_change: params[5] -* srv_check_status: params[6] -* srv_check_result: params[7] -* srv_check_health: params[8] -* srv_check_state: params[9] -* srv_agent_state: params[10] -* bk_f_forced_id: params[11] -* srv_f_forced_id: params[12] -* srv_fqdn: params[13] -* srv_port: params[14] -* srvrecord:params[15] -* srv_use_ssl: params[16] -* srv_check_port: params[17] -*/ + HA_SPIN_LOCK(SERVER_LOCK, >lock); + + if (version >= 1) { + /* srv_addr: params[0] +* srv_op_state: params[1] +* srv_admin_state: params[2] +* srv_uweight: params[3] +* srv_iweight: params[4] +* srv_last_time_change: params[5] +* srv_check_status: params[6] +* srv_check_result: params[7] +* srv_check_health: params[8] +* srv_check_state: params[9] +* srv_agent_state: params[10] +* bk_f_forced_id: params[11] +* srv_f_forced_id: params[12] +* srv_fqdn: params[13] +* srv_port: params[14] +* srvrecord:params[15] +*/ - /* validating srv_op_state */ - p = NULL; - errno = 0; - srv_op_state = strtol(params[1], , 10); - if ((p == params[1]) || errno == EINVAL || errno == ERANGE || - (srv_op_state != SRV_ST_STOPPED && -srv_op_state != SRV_ST_STARTING && -srv_op_state != SRV_ST_RUNNING && -srv_op_state != SRV_ST_STOPPING)) { - chunk_appendf(msg, ", invalid srv_op_state value '%s'", params[1]); - } + /* validating srv_op_state */ + p = NULL; + errno = 0; + srv_op_state = strtol(params[1], , 10); +
[PATCH v3 2/5] MEDIUM: cli: add agent-port command
this patch allows to set agent port at runtime. In order to align with both `addr` and `check-addr` commands, also add the possibility to optionnaly set port on `agent-addr` command. This led to a small refactor in order to use the same function for both `agent-addr` and `agent-port` commands. Signed-off-by: William Dauchy --- doc/management.txt | 6 +++- src/server.c | 84 +- 2 files changed, 80 insertions(+), 10 deletions(-) diff --git a/doc/management.txt b/doc/management.txt index bff770e4e..423c614b2 100644 --- a/doc/management.txt +++ b/doc/management.txt @@ -1828,10 +1828,14 @@ set server / agent [ up | down ] switch a server's state regardless of some slow agent checks for example. Note that the change is propagated to tracking servers if any. -set server / agent-addr +set server / agent-addr [port ] Change addr for servers agent checks. Allows to migrate agent-checks to another address at runtime. You can specify both IP and hostname, it will be resolved. + Optionally, change the port agent. + +set server / agent-port + Change the port used for agent checks. set server / agent-send Change agent string sent to agent check target. Allows to update string while diff --git a/src/server.c b/src/server.c index 74bd972e2..9a2a8c520 100644 --- a/src/server.c +++ b/src/server.c @@ -56,6 +56,8 @@ static int srv_state_get_version(FILE *f); static void srv_cleanup_connections(struct server *srv); static const char *update_server_check_addr_port(struct server *s, const char *addr, const char *port); +static const char *update_server_agent_addr_port(struct server *s, const char *addr, +const char *port); /* List head of all known server keywords */ static struct srv_kw_list srv_keywords = { @@ -3576,6 +3578,54 @@ int update_server_addr(struct server *s, void *ip, int ip_sin_family, const char return 0; } +/* update agent health check address and port + * addr can be ip4/ip6 or a hostname + * if one error occurs, don't apply anything + * must be called with the server lock held. + */ +static const char *update_server_agent_addr_port(struct server *s, const char *addr, +const char *port) +{ + struct sockaddr_storage sk; + struct buffer *msg; + int new_port; + + msg = get_trash_chunk(); + chunk_reset(msg); + + if (!(s->agent.state & CHK_ST_ENABLED)) { + chunk_strcat(msg, "agent checks are not enabled on this server"); + goto out; + } + if (addr) { + memset(, 0, sizeof(struct sockaddr_storage)); + if (str2ip(addr, ) == NULL) { + chunk_appendf(msg, "invalid addr '%s'", addr); + goto out; + } + } + if (port) { + if (strl2irc(port, strlen(port), _port) != 0) { + chunk_appendf(msg, "provided port is not an integer"); + goto out; + } + if (new_port < 0 || new_port > 65535) { + chunk_appendf(msg, "provided port is invalid"); + goto out; + } + } +out: + if (msg->data) + return msg->area; + else { + if (addr) + set_srv_agent_addr(s, ); + if (port) + set_srv_agent_port(s, new_port); + } + return NULL; +} + /* update server health check address and port * addr must be ip4 or ip6, it won't be resolved * if one error occurs, don't apply anything @@ -4443,15 +4493,31 @@ static int cli_parse_set_server(char **args, char *payload, struct appctx *appct cli_err(appctx, "'set server agent' expects 'up' or 'down'.\n"); } else if (strcmp(args[3], "agent-addr") == 0) { - struct sockaddr_storage sk; - - memset(, 0, sizeof(sk)); - if (!(sv->agent.state & CHK_ST_ENABLED)) - cli_err(appctx, "agent checks are not enabled on this server.\n"); - else if (str2ip(args[4], )) - set_srv_agent_addr(sv, ); - else - cli_err(appctx, "incorrect addr address given for agent.\n"); + char *addr = NULL; + char *port = NULL; + if (strlen(args[4]) == 0) { + cli_err(appctx, "set server / agent-addr requires" + " an address and optionally a port.\n"); + goto out_unlock; + } + addr = args[4]; + if (strcmp(args[5]
[PATCH v3 0/5] cli commands for checks and agent
Hello Christopher, Here is some work to finish you week. I believe I addressed all the points raised: - warning are no longer emitted when we have "0" or "-" values - I enhanced the warning output as well, and understood my mistake for my previous CLEANUP patch removing a space... so I removed this patch as well. - Fixed all the chunk_appendf issues. - I was a bit lazy to address the partial vs complete failure in parsing as I was a bit puzzled about the approach to take. I think it would be sad to duplicate the code for pre validation, but on the other hand I agree it was clear to assume the whole line was not applied at all. I however concluded it was acceptable to simply know the line was not fully applied. I believe in that case the user should worry. We can probably add a way to show where it stopped, but I felt discouraged with the srv resolution, in the middle of srv port, where it is harder to have a kinda generic way to know where we stopped. William Dauchy (5): MEDIUM: cli: add check-addr command MEDIUM: cli: add agent-port command MEDIUM: server: add server-states version 2 MEDIUM: server: support {check,agent}_addr, agent_port in server state MINOR: server: enhance error precision when applying server state doc/management.txt| 15 +- include/haproxy/server-t.h| 16 +- .../checks/1be_40srv_odd_health_checks.vtc| 2 +- .../checks/40be_2srv_odd_health_checks.vtc| 2 +- reg-tests/checks/4be_1srv_health_checks.vtc | 6 +- src/proxy.c | 41 +- src/server.c | 971 ++ 7 files changed, 612 insertions(+), 441 deletions(-) -- 2.30.0
Re: [PATCH v2 0/6] cli commands for checks and agent
On Wed, Feb 10, 2021 at 4:21 PM Christopher Faulet wrote: > To conclude, I may merge the first 4 patches if you want, but I guess you > will have to adapt the first ones to produce better warnings. So I will > wait your confirmation to proceed. However, I will merge the bug fix. > There is no reason to not take it. thanks for the review. don't worry I will come back with a v3. -- William
Re: [PATCH v2 3/6] BUG/MEDIUM: server: re-align state file fields number
Hello Christopher, On Mon, Feb 8, 2021 at 11:53 PM William Dauchy wrote: > Since commit 3169471964fdc49963e63f68c1fd88686821a0c4 ("MINOR: Add > server port field to server state file.") max_fields was not increased > on version number 1. So this patch aims to fix it. This should be > backported as far as v1.8, but the numbering should be adpated depending > on the version: simply increase the field by 1. this one is simply a mistake where I changed the subject line from MEDIUM to MINOR. So ignore this one. -- William
[PATCH v2 4/6] MEDIUM: server: add server-states version 2
Even if it is possibly too much work for the current usage, it makes sure we don't break states file from v2.3 to v2.4; indeed, since v2.3, we introduced two new fields, so we put them aside to guarantee we can easily reload from a version 1. The diff seems huge but there is no specific change apart from: - introduce v2 where it is needed (parsing, update) - move away from switch/case in update to be able to reuse code - move srv lock to the whole function to make it easier this patch confirm how painful it is to maintain this functionality. Signed-off-by: William Dauchy --- include/haproxy/server-t.h | 7 +- src/server.c | 742 ++--- 2 files changed, 367 insertions(+), 382 deletions(-) diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h index af52b6a25..89d8ecc06 100644 --- a/include/haproxy/server-t.h +++ b/include/haproxy/server-t.h @@ -101,9 +101,9 @@ enum srv_initaddr { } __attribute__((packed)); /* server-state-file version */ -#define SRV_STATE_FILE_VERSION 1 +#define SRV_STATE_FILE_VERSION 2 #define SRV_STATE_FILE_VERSION_MIN 1 -#define SRV_STATE_FILE_VERSION_MAX 1 +#define SRV_STATE_FILE_VERSION_MAX 2 #define SRV_STATE_FILE_FIELD_NAMES \ "be_id " \ "be_name "\ @@ -129,7 +129,8 @@ enum srv_initaddr { "srv_check_port" #define SRV_STATE_FILE_MAX_FIELDS 22 -#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22 +#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20 +#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 22 #define SRV_STATE_LINE_MAXLEN 512 /* server flags -- 32 bits */ diff --git a/src/server.c b/src/server.c index a865fb4dd..84b283567 100644 --- a/src/server.c +++ b/src/server.c @@ -2629,391 +2629,383 @@ static void srv_update_state(struct server *srv, int version, char **params) fqdn = NULL; port_svc = port_check = 0; msg = get_trash_chunk(); - switch (version) { - case 1: - /* -* now we can proceed with server's state update: -* srv_addr: params[0] -* srv_op_state: params[1] -* srv_admin_state: params[2] -* srv_uweight: params[3] -* srv_iweight: params[4] -* srv_last_time_change: params[5] -* srv_check_status: params[6] -* srv_check_result: params[7] -* srv_check_health: params[8] -* srv_check_state: params[9] -* srv_agent_state: params[10] -* bk_f_forced_id: params[11] -* srv_f_forced_id: params[12] -* srv_fqdn: params[13] -* srv_port: params[14] -* srvrecord:params[15] -* srv_use_ssl: params[16] -* srv_check_port: params[17] -*/ + HA_SPIN_LOCK(SERVER_LOCK, >lock); + + if (version >= 1) { + /* srv_addr: params[0] +* srv_op_state: params[1] +* srv_admin_state: params[2] +* srv_uweight: params[3] +* srv_iweight: params[4] +* srv_last_time_change: params[5] +* srv_check_status: params[6] +* srv_check_result: params[7] +* srv_check_health: params[8] +* srv_check_state: params[9] +* srv_agent_state: params[10] +* bk_f_forced_id: params[11] +* srv_f_forced_id: params[12] +* srv_fqdn: params[13] +* srv_port: params[14] +* srvrecord:params[15] +*/ - /* validating srv_op_state */ - p = NULL; - errno = 0; - srv_op_state = strtol(params[1], , 10); - if ((p == params[1]) || errno == EINVAL || errno == ERANGE || - (srv_op_state != SRV_ST_STOPPED && -srv_op_state != SRV_ST_STARTING && -srv_op_state != SRV_ST_RUNNING && -srv_op_state != SRV_ST_STOPPING)) { - chunk_appendf(msg, ", invalid srv_op_state value '%s'", params[1]); - } + /* validating srv_op_state */ + p = NULL; + errno = 0; + srv_op_state = strtol(params[1], , 10); +
[PATCH v2 5/6] MEDIUM: server: support {check,agent}_addr, agent_port in server state
logical followup from cli commands addition, so that the state server file stays compatible with the changes made at runtime; use previously added helper to load server attributes. also alloc a specific chunk to avoid mixing with other called functions using it Signed-off-by: William Dauchy --- doc/management.txt| 5 ++- include/haproxy/server-t.h| 9 ++-- .../checks/1be_40srv_odd_health_checks.vtc| 2 +- .../checks/40be_2srv_odd_health_checks.vtc| 2 +- reg-tests/checks/4be_1srv_health_checks.vtc | 6 +-- src/proxy.c | 41 +++ src/server.c | 40 +++--- 7 files changed, 66 insertions(+), 39 deletions(-) diff --git a/doc/management.txt b/doc/management.txt index 423c614b2..60e25c7e1 100644 --- a/doc/management.txt +++ b/doc/management.txt @@ -2455,7 +2455,10 @@ show servers state [] srv_port:Server port. srvrecord: DNS SRV record associated to this SRV. srv_use_ssl: use ssl for server connections. - srv_check_port: Server check port. + srv_check_port: Server health check port. + srv_check_addr: Server health check address. + srv_agent_addr: Server health agent address. + srv_agent_port: Server health agent port. show sess Dump all known sessions. Avoid doing this on slow connections as this can diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h index 89d8ecc06..7f1356024 100644 --- a/include/haproxy/server-t.h +++ b/include/haproxy/server-t.h @@ -126,11 +126,14 @@ enum srv_initaddr { "srv_port " \ "srvrecord " \ "srv_use_ssl "\ -"srv_check_port" +"srv_check_port " \ +"srv_check_addr " \ +"srv_agent_addr " \ +"srv_agent_port" -#define SRV_STATE_FILE_MAX_FIELDS 22 +#define SRV_STATE_FILE_MAX_FIELDS 25 #define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20 -#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 22 +#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 25 #define SRV_STATE_LINE_MAXLEN 512 /* server flags -- 32 bits */ diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc b/reg-tests/checks/1be_40srv_odd_health_checks.vtc index f01205295..c279972aa 100644 --- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc +++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc @@ -112,6 +112,6 @@ syslog S -wait haproxy h1 -cli { send "show servers state" -expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord srv_use_ssl srv_check_port\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s0_port} - 0 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s1_port} - 0 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s2_port} - 0 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0 0\n2 be1 5 srv4 ${s4_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0 0\n2 be1 6 srv5 ${s5_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0 0\n2 be1 7 srv6 ${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0 0\n2 be1 8 srv7 ${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s7_port} - 0 0\n2 be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s8_port} - 0 0\n2 be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s9_port} - 0 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s10_port} - 0 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0 0\n2 be1 13 srv12 ${s12_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0 0\n2 be1 14 srv13 ${s13_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} - 0 0\n2 be1 15 srv14 ${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s14_port} - 0 0\n2 be1 16 srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s15_port} - 0 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s16_port} - 0 0\n2 be1 18 srv17 ${s17_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0 0\n2 be1 19 srv18 ${s18_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0 0\n2 be1 20 srv19 ${s19_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} - 0 0\n2 be1 21 srv20 ${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s20_port} - 0 0\n2 be1 22 srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s21_port} - 0
[PATCH v2 6/6] CLEANUP: server: add missing space in server-state error output
a space was missing in the output to make it more readable. Signed-off-by: William Dauchy --- src/server.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/server.c b/src/server.c index 6b360291d..673844dd7 100644 --- a/src/server.c +++ b/src/server.c @@ -3015,7 +3015,7 @@ static void srv_update_state(struct server *srv, int version, char **params) HA_SPIN_UNLOCK(SERVER_LOCK, >lock); if (msg->data) { chunk_appendf(msg, "\n"); - ha_warning("server-state application failed for server '%s/%s'%s", + ha_warning("server-state application failed for server '%s/%s' %s", srv->proxy->id, srv->id, msg->area); } end: -- 2.30.0
[PATCH v2 3/6] BUG/MINOR: server: re-align state file fields number
Since commit 3169471964fdc49963e63f68c1fd88686821a0c4 ("MINOR: Add server port field to server state file.") max_fields was not increased on version number 1. So this patch aims to fix it. This should be backported as far as v1.8, but the numbering should be adpated depending on the version: simply increase the field by 1. Signed-off-by: William Dauchy --- include/haproxy/server-t.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h index 32697a9c4..af52b6a25 100644 --- a/include/haproxy/server-t.h +++ b/include/haproxy/server-t.h @@ -129,7 +129,7 @@ enum srv_initaddr { "srv_check_port" #define SRV_STATE_FILE_MAX_FIELDS 22 -#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21 +#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22 #define SRV_STATE_LINE_MAXLEN 512 /* server flags -- 32 bits */ -- 2.30.0
[PATCH v2 2/6] MEDIUM: cli: add agent-port command
this patch allows to set agent port at runtime. In order to align with both `addr` and `check-addr` commands, also add the possibility to optionnaly set port on `agent-addr` command. This led to a small refactor in order to use the same function for both `agent-addr` and `agent-port` commands. Signed-off-by: William Dauchy --- doc/management.txt | 6 +++- src/server.c | 76 -- 2 files changed, 72 insertions(+), 10 deletions(-) diff --git a/doc/management.txt b/doc/management.txt index bff770e4e..423c614b2 100644 --- a/doc/management.txt +++ b/doc/management.txt @@ -1828,10 +1828,14 @@ set server / agent [ up | down ] switch a server's state regardless of some slow agent checks for example. Note that the change is propagated to tracking servers if any. -set server / agent-addr +set server / agent-addr [port ] Change addr for servers agent checks. Allows to migrate agent-checks to another address at runtime. You can specify both IP and hostname, it will be resolved. + Optionally, change the port agent. + +set server / agent-port + Change the port used for agent checks. set server / agent-send Change agent string sent to agent check target. Allows to update string while diff --git a/src/server.c b/src/server.c index 58347d215..a865fb4dd 100644 --- a/src/server.c +++ b/src/server.c @@ -56,6 +56,8 @@ static int srv_state_get_version(FILE *f); static void srv_cleanup_connections(struct server *srv); static const char *update_server_check_addr_port(struct server *s, const char *addr, const char *port); +static const char *update_server_agent_addr_port(struct server *s, const char *addr, +const char *port); /* List head of all known server keywords */ static struct srv_kw_list srv_keywords = { @@ -3573,6 +3575,46 @@ int update_server_addr(struct server *s, void *ip, int ip_sin_family, const char return 0; } +/* update agent health check address and port + * addr can be ip4/ip6 or a hostname + * must be called with the server lock held. + */ +static const char *update_server_agent_addr_port(struct server *s, const char *addr, +const char *port) +{ + struct sockaddr_storage sk; + struct buffer *msg; + int new_port; + + msg = get_trash_chunk(); + + if (!(s->agent.state & CHK_ST_ENABLED)) { + chunk_appendf(msg, "agent checks are not enabled on this server.\n"); + goto out; + } + if (addr) { + memset(, 0, sizeof(struct sockaddr_storage)); + if (str2ip(addr, ) == NULL) { + chunk_appendf(msg, "invalid addr '%s'\n", addr); + goto out; + } + set_srv_agent_addr(s, ); + } + if (port) { + if (strl2irc(port, strlen(port), _port) != 0) { + chunk_appendf(msg, "provided port is not an integer\n"); + goto out; + } + if (new_port < 0 || new_port > 65535) { + chunk_appendf(msg, "provided port is invalid\n"); + goto out; + } + set_srv_agent_port(s, new_port); + } +out: + return msg->area; +} + /* update server health check address and port * addr must be ip4 or ip6, it won't be resolved * must be called with the server lock held. @@ -4432,15 +4474,31 @@ static int cli_parse_set_server(char **args, char *payload, struct appctx *appct cli_err(appctx, "'set server agent' expects 'up' or 'down'.\n"); } else if (strcmp(args[3], "agent-addr") == 0) { - struct sockaddr_storage sk; - - memset(, 0, sizeof(sk)); - if (!(sv->agent.state & CHK_ST_ENABLED)) - cli_err(appctx, "agent checks are not enabled on this server.\n"); - else if (str2ip(args[4], )) - set_srv_agent_addr(sv, ); - else - cli_err(appctx, "incorrect addr address given for agent.\n"); + char *addr = NULL; + char *port = NULL; + if (strlen(args[4]) == 0) { + cli_err(appctx, "set server / agent-addr requires" + " an address and optionally a port.\n"); + goto out_unlock; + } + addr = args[4]; + if (strcmp(args[5], "port") == 0) + port = args[6]; + warning = update_server_agent_addr_port(sv, addr, port); + if (warning) + cli_
[PATCH v2 1/6] MEDIUM: cli: add check-addr command
this patch allows to set server health check address at runtime. In order to align with `addr` command, also allow to set port optionnaly. This led to a small refactor in order to use the same function for both `check-addr` and `check-port` commands. for `check-port`, we however don't permit the change anymore if checks are not enabled on the server. This command becomes more and more useful for people having a consul like architecture: - the backend server is located on a container with its own IP - the health checks are done the consul instance located on the host with the host IP Signed-off-by: William Dauchy --- doc/management.txt | 4 +++ src/server.c | 87 ++ 2 files changed, 76 insertions(+), 15 deletions(-) diff --git a/doc/management.txt b/doc/management.txt index b74aba769..bff770e4e 100644 --- a/doc/management.txt +++ b/doc/management.txt @@ -1842,6 +1842,10 @@ set server / health [ up | stopping | down ] switch a server's state regardless of some slow health checks for example. Note that the change is propagated to tracking servers if any. +set server / check-addr [port ] + Change the IP address used for server health checks. + Optionally, change the port used for server health checks. + set server / check-port Change the port used for health checking to diff --git a/src/server.c b/src/server.c index da2325e9a..58347d215 100644 --- a/src/server.c +++ b/src/server.c @@ -54,6 +54,8 @@ static int srv_set_fqdn(struct server *srv, const char *fqdn, int dns_locked); static void srv_state_parse_line(char *buf, const int version, char **params, char **srv_params); static int srv_state_get_version(FILE *f); static void srv_cleanup_connections(struct server *srv); +static const char *update_server_check_addr_port(struct server *s, const char *addr, +const char *port); /* List head of all known server keywords */ static struct srv_kw_list srv_keywords = { @@ -3571,6 +3573,51 @@ int update_server_addr(struct server *s, void *ip, int ip_sin_family, const char return 0; } +/* update server health check address and port + * addr must be ip4 or ip6, it won't be resolved + * must be called with the server lock held. + */ +static const char *update_server_check_addr_port(struct server *s, const char *addr, +const char *port) +{ + struct sockaddr_storage sk; + struct buffer *msg; + int new_port; + + msg = get_trash_chunk(); + + if (!(s->check.state & CHK_ST_ENABLED)) { + chunk_appendf(msg, "health checks are not enabled on this server.\n"); + goto out; + } + if (addr) { + memset(, 0, sizeof(struct sockaddr_storage)); + if (str2ip2(addr, , 0) == NULL) { + chunk_appendf(msg, "invalid addr '%s'\n", addr); + goto out; + } + s->check.addr = sk; + } + if (port) { + if (strl2irc(port, strlen(port), _port) != 0) { + chunk_appendf(msg, "provided port is not an integer\n"); + goto out; + } + if (new_port < 0 || new_port > 65535) { + chunk_appendf(msg, "provided port is invalid\n"); + goto out; + } + /* prevent the update of port to 0 if MAPPORTS are in use */ + if ((s->flags & SRV_F_MAPPORTS) && new_port == 0) { + chunk_appendf(msg, "can't unset 'port' since MAPPORTS is in use\n"); + goto out; + } + s->check.port = new_port; + } +out: + return msg->area; +} + /* * This function update a server's addr and port only for AF_INET and AF_INET6 families. * @@ -4403,23 +4450,32 @@ static int cli_parse_set_server(char **args, char *payload, struct appctx *appct cli_err(appctx, "cannot allocate memory for new string.\n"); } } - else if (strcmp(args[3], "check-port") == 0) { - int i = 0; - if (strl2irc(args[4], strlen(args[4]), ) != 0) { - cli_err(appctx, "'set server check-port' expects an integer as argument.\n"); - goto out_unlock; - } - if ((i < 0) || (i > 65535)) { - cli_err(appctx, "provided port is not valid.\n"); + else if (strcmp(args[3], "check-addr") == 0) { + char *addr = NULL; + char *port = NULL; + if (strlen(args[4]) == 0) { + cli_err(appctx, "set se
[PATCH v2 3/6] BUG/MEDIUM: server: re-align state file fields number
Since commit 3169471964fdc49963e63f68c1fd88686821a0c4 ("MINOR: Add server port field to server state file.") max_fields was not increased on version number 1. So this patch aims to fix it. This should be backported as far as v1.8, but the numbering should be adpated depending on the version: simply increase the field by 1. Signed-off-by: William Dauchy --- include/haproxy/server-t.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h index 32697a9c4..af52b6a25 100644 --- a/include/haproxy/server-t.h +++ b/include/haproxy/server-t.h @@ -129,7 +129,7 @@ enum srv_initaddr { "srv_check_port" #define SRV_STATE_FILE_MAX_FIELDS 22 -#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21 +#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22 #define SRV_STATE_LINE_MAXLEN 512 /* server flags -- 32 bits */ -- 2.30.0
[PATCH v2 0/6] cli commands for checks and agent
Hello Christopher, Here is the v2 addressing the points raised yesterday. The patch 4/6 clearly looks scary but I made sure to not change anything crazy apart from adding support for a version 2. I will probably start to dream about a server-state-file burning every night. I hope this will be good enough, but otherwise, don't hesitate to put more comments, I will come back with a v3. William Dauchy (6): MEDIUM: cli: add check-addr command MEDIUM: cli: add agent-port command BUG/MEDIUM: server: re-align state file fields number MEDIUM: server: add server-states version 2 MEDIUM: server: support {check,agent}_addr, agent_port in server state CLEANUP: server: add missing space in server-state error output doc/management.txt| 15 +- include/haproxy/server-t.h| 16 +- .../checks/1be_40srv_odd_health_checks.vtc| 2 +- .../checks/40be_2srv_odd_health_checks.vtc| 2 +- reg-tests/checks/4be_1srv_health_checks.vtc | 6 +- src/proxy.c | 41 +- src/server.c | 929 ++ 7 files changed, 573 insertions(+), 438 deletions(-) -- 2.30.0
Re: [PATCH 0/6] cli commands coherency
Hi Christopher, On Mon, Feb 8, 2021 at 12:21 PM Christopher Faulet wrote: > First, there is a test to be sure the agent-check is enabled before updating > the > agent address and/or port. Do you think it should also be done for the > health-check? Because, for now, it is possible to set an address on a disabled > (or even not configured) health-check. But it is not possible for an > agent-check. Given that it is not possible to enable/disable an agent-check or > an health-check from the CLI, I guess the warning for both makes sense. yes ok. > Then, there are some problems when the server-state file is loaded. The trash > chunk used by srv_update_state() is overwritten when > update_server_agent_addr_port() is called. It is not obvious, but the > get_trash_chunk() function returns cyclically one of two trash chunks. > srv_update_state() uses a trash chunk. When called, > update_server_check_addr_port() uses the other one. So, > update_server_agent_addr_port() re-uses the first trash chunk, the same than > srv_update_state(). A solution may be to allocate a dedicated chunk in > srv_update_state(), using alloc_trash_chunk(). ah ok, I thought it was ok as long as there was no reset, but I indeed overlooked the second chunk. I will fix that. > A more annoying problem happens when an old state file is loaded (prior these > changes). Last parameters are not defined, params[18] (check-addr), params[19] > (agent-addr) and params[20] (agent-port) are set to NULL. Thus, HAProxy > crashes > when the health-check address is compared to "-". In fact, this comes from the > SRV_STATE_FILE_NB_FIELDS_VERSION_1 macro. It should be set to 24. You added 3 > new fields but just incremented it by 1. Hum... No, there is also another > problem. In srv_state_parse_line(). SRV_STATE_FILE_NB_FIELDS_VERSION_1 must be > equal to SRV_STATE_FILE_MAX_FIELDS. Or better, it should really reflect the > argument number of the version 1, so 21. And it must be compared to srv_arg > instead of arg. It is a bug since the 1.8 (commit 316947196). ah true I overlooked SRV_STATE_FILE_NB_FIELDS_VERSION_1 meaning. let me fix that as well. > However, this means with these changes, a cold restart is required. In the > 2.4, > 5 new fields were added to the server-state file. Is it expected to perform a > cold restart when upgrading to the 2.4 from a prior major version ? If so, I'm > ok. But, it may be a good idea to change the state file version then to not > silently ignore the state file. Otherwise, I guess it is possible to make > these > 5 new fields optional, isn't it ? good idea. We never did it in the past, but that's probably a good thing to do. > William, I'm sorry if I'm not really clear but the subject is a bit foggy for > me > :) Do you feel confident to handle all the changes ? Thanks for the review. I will come back with a v2. -- William
Re: [ANNOUNCE] haproxy-2.4-dev7
On Fri, Feb 5, 2021 at 4:14 PM Willy Tarreau wrote: > HAProxy 2.4-dev7 was released on 2021/02/05. It added 153 new commits > after version 2.4-dev6. > - Some significant lifting was done to the Prometheus exporter, including > new fields, better descriptions and some filtering. I've seen quite a > bunch pass in front of me but do not well understand what it does, all > that interests me is that some users are happy with these changes so I > guess they were long awaited :-) about that, please note two breaking changes: - objects' status are no longer a gauge value which you need to translate manually; instead the state is a label with a proper string value. The value of the metric simply informs whether the state is active or not. so we went from: haproxy_server_status{proxy="be_foo",server="srv0"} 2 (which meant MAINT) to: haproxy_server_status{proxy="be_foo",server="srv0",state="DOWN"} 0 haproxy_server_status{proxy="be_foo",server="srv0",state="UP"} 0 haproxy_server_status{proxy="be_foo",server="srv0",state="MAINT"} 1 haproxy_server_status{proxy="be_foo",server="srv0",state="DRAIN"} 0 haproxy_server_status{proxy="be_foo",server="srv0",state="NOLB"} 0 This change is valid for frontend, backend, server with different label values. - similar change with health checks where we put a state label: haproxy_server_check_status{proxy="be_foo",server="srv0",state="HANA"} 0 haproxy_server_check_status{proxy="be_foo",server="srv0",state="SOCKERR"} 0 haproxy_server_check_status{proxy="be_foo",server="srv0",state="L4OK"} 0 haproxy_server_check_status{proxy="be_foo",server="srv0",state="L4TOUT"} 0 haproxy_server_check_status{proxy="be_foo",server="srv0",state="L4CON"} 1 haproxy_server_check_status{proxy="be_foo",server="srv0",state="L6OK"} 0 haproxy_server_check_status{proxy="be_foo",server="srv0",state="L6TOUT"} 0 haproxy_server_check_status{proxy="be_foo",server="srv0",state="L6RSP"} 0 haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7TOUT"} 0 haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7RSP"} 0 haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7OK"} 0 haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7OKC"} 0 haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7STS"} 0 haproxy_server_check_status{proxy="be_foo",server="srv0",state="PROCERR"} 0 haproxy_server_check_status{proxy="be_foo",server="srv0",state="PROCTOUT"} 0 haproxy_server_check_status{proxy="be_foo",server="srv0",state="PROCOK"} 0 It means: * a lot more metrics for large setup (but you can still filter as explained in the doc) * easier use on prometheus side: you will be able to group per state very easily now. Generally speaking I'm very interested in feedback regarding this change. This was motivated by the usage I saw in several companies, where people were struggling making use of those metrics. Willy: it is probably a wise idea to keep it for the 2.4 final release notes, some people might want to know that during their update; a lot of people have their production alerts on those metrics. Thanks, -- William
[PATCH 2/2] MEDIUM: contrib/prometheus-exporter: export base stick table stats
I saw some people falling back to unix socket to collect some data they could not find in prometheus exporter. One of them is base info from stick tables (used/size). I do not plan to extend it more for now; keys are quite a mess to handle. This should resolve github issue #1008. Signed-off-by: William Dauchy --- contrib/prometheus-exporter/README| 10 ++ .../prometheus-exporter/service-prometheus.c | 148 +++--- reg-tests/contrib/prometheus.vtc | 4 + 3 files changed, 142 insertions(+), 20 deletions(-) diff --git a/contrib/prometheus-exporter/README b/contrib/prometheus-exporter/README index a85981597..d882b092f 100644 --- a/contrib/prometheus-exporter/README +++ b/contrib/prometheus-exporter/README @@ -72,6 +72,7 @@ exported. Here are examples: /metrics?scope=frontend=backend # ==> Frontend and backend metrics will be exported /metrics?scope=*= # ==> no metrics will be exported /metrics?scope==global # ==> global metrics will be exported + /metrics?scope=sticktable # ==> stick tables metrics will be exported * How do I prevent my prometheus instance to explode? @@ -320,3 +321,12 @@ See prometheus export for the description of each field. | haproxy_server_need_connections_current| | haproxy_server_uweight | ++ + +* Stick table metrics + +++ +|Metric name | +++ +| haproxy_sticktable_size| +| haproxy_sticktable_used| +++ diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c index 769389735..521fe1056 100644 --- a/contrib/prometheus-exporter/service-prometheus.c +++ b/contrib/prometheus-exporter/service-prometheus.c @@ -47,28 +47,33 @@ enum { /* Prometheus exporter dumper states (appctx->st1) */ enum { -PROMEX_DUMPER_INIT = 0, /* initialized */ -PROMEX_DUMPER_GLOBAL, /* dump metrics of globals */ -PROMEX_DUMPER_FRONT,/* dump metrics of frontend proxies */ -PROMEX_DUMPER_BACK, /* dump metrics of backend proxies */ -PROMEX_DUMPER_LI, /* dump metrics of listeners */ -PROMEX_DUMPER_SRV, /* dump metrics of servers */ - PROMEX_DUMPER_DONE, /* finished */ + PROMEX_DUMPER_INIT = 0, /* initialized */ + PROMEX_DUMPER_GLOBAL, /* dump metrics of globals */ + PROMEX_DUMPER_FRONT, /* dump metrics of frontend proxies */ + PROMEX_DUMPER_BACK, /* dump metrics of backend proxies */ + PROMEX_DUMPER_LI, /* dump metrics of listeners */ + PROMEX_DUMPER_SRV,/* dump metrics of servers */ + PROMEX_DUMPER_STICKTABLE, /* dump metrics of stick tables */ + PROMEX_DUMPER_DONE, /* finished */ }; /* Prometheus exporter flags (appctx->ctx.stats.flags) */ -#define PROMEX_FL_METRIC_HDR0x0001 -#define PROMEX_FL_INFO_METRIC 0x0002 -#define PROMEX_FL_FRONT_METRIC 0x0004 -#define PROMEX_FL_BACK_METRIC 0x0008 -#define PROMEX_FL_SRV_METRIC0x0010 -#define PROMEX_FL_SCOPE_GLOBAL 0x0020 -#define PROMEX_FL_SCOPE_FRONT 0x0040 -#define PROMEX_FL_SCOPE_BACK0x0080 -#define PROMEX_FL_SCOPE_SERVER 0x0100 -#define PROMEX_FL_NO_MAINT_SRV 0x0200 - -#define PROMEX_FL_SCOPE_ALL (PROMEX_FL_SCOPE_GLOBAL|PROMEX_FL_SCOPE_FRONT|PROMEX_FL_SCOPE_BACK|PROMEX_FL_SCOPE_SERVER) +#define PROMEX_FL_METRIC_HDR0x0001 +#define PROMEX_FL_INFO_METRIC 0x0002 +#define PROMEX_FL_FRONT_METRIC 0x0004 +#define PROMEX_FL_BACK_METRIC 0x0008 +#define PROMEX_FL_SRV_METRIC0x0010 +#define PROMEX_FL_SCOPE_GLOBAL 0x0020 +#define PROMEX_FL_SCOPE_FRONT 0x0040 +#define PROMEX_FL_SCOPE_BACK0x0080 +#define PROMEX_FL_SCOPE_SERVER 0x0100 +#define PROMEX_FL_NO_MAINT_SRV 0x0200 +#define PROMEX_FL_STICKTABLE_METRIC 0x0400 +#define PROMEX_FL_SCOPE_STICKTABLE 0x0800 + +#define PROMEX_FL_SCOPE_ALL (PROMEX_FL_SCOPE_GLOBAL | PROMEX_FL_SCOPE_FRONT | \ +PROMEX_FL_SCOPE_BACK | PROMEX_FL_SCOPE_SERVER | \ +PROMEX_FL_SCOPE_STICKTABLE) /* Promtheus metric type (gauge or counter) */ enum promex_mt_type { @@ -298,6 +303,25 @@ const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = { [ST_F_TT_MAX] = IST("Maximum observed total request+response time (request+queue+connect+response+processing)"), }; +/* stick table base fields */ +enum sticktable_field { + STICKTABLE_SIZE = 0, + STICKTABLE_USED, + /* must always be the last one */ + STICKTABLE_TOTA
[PATCH 1/2] MINOR: contrib/prometheus-exporter: use stats desc when possible followup
Remove remaining descrition which are common to stats.c. This patch is a followup of commit 82b2ce2f967d967139adb7afab064416fadad615 ("MINOR: contrib/prometheus-exporter: use stats desc when possible"). I probably messed up with one of my rebase because I'm pretty sure I removed them at some point, but who knows what happened. Signed-off-by: William Dauchy --- .../prometheus-exporter/service-prometheus.c | 35 --- 1 file changed, 35 deletions(-) diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c index 126962f5e..769389735 100644 --- a/contrib/prometheus-exporter/service-prometheus.c +++ b/contrib/prometheus-exporter/service-prometheus.c @@ -284,42 +284,7 @@ const struct promex_metric promex_st_metrics[ST_F_TOTAL_FIELDS] = { /* Description of overriden stats fields */ const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = { - [ST_F_PXNAME] = IST("The proxy name."), - [ST_F_SVNAME] = IST("The service name (FRONTEND for frontend, BACKEND for backend, any name for server/listener)."), - [ST_F_QCUR] = IST("Current number of queued requests."), - [ST_F_QMAX] = IST("Maximum observed number of queued requests."), - [ST_F_SCUR] = IST("Current number of active sessions."), - [ST_F_SMAX] = IST("Maximum observed number of active sessions."), - [ST_F_SLIM] = IST("Configured session limit."), - [ST_F_STOT] = IST("Total number of sessions."), - [ST_F_BIN]= IST("Current total of incoming bytes."), - [ST_F_BOUT] = IST("Current total of outgoing bytes."), - [ST_F_DREQ] = IST("Total number of denied requests."), - [ST_F_DRESP] = IST("Total number of denied responses."), - [ST_F_EREQ] = IST("Total number of request errors."), - [ST_F_ECON] = IST("Total number of connection errors."), - [ST_F_ERESP] = IST("Total number of response errors."), - [ST_F_WRETR] = IST("Total number of retry warnings."), - [ST_F_WREDIS] = IST("Total number of redispatch warnings."), [ST_F_STATUS] = IST("Current status of the service, per state label value."), - [ST_F_WEIGHT] = IST("Service weight."), - [ST_F_ACT]= IST("Current number of active servers."), - [ST_F_BCK]= IST("Current number of backup servers."), - [ST_F_CHKFAIL]= IST("Total number of failed check (Only counts checks failed when the server is up)."), - [ST_F_CHKDOWN]= IST("Total number of UP->DOWN transitions."), - [ST_F_LASTCHG]= IST("Number of seconds since the last UP<->DOWN transition."), - [ST_F_DOWNTIME] = IST("Total downtime (in seconds) for the service."), - [ST_F_QLIMIT] = IST("Configured maxqueue for the server (0 meaning no limit)."), - [ST_F_PID]= IST("Process id (0 for first instance, 1 for second, ...)"), - [ST_F_IID]= IST("Unique proxy id."), - [ST_F_SID]= IST("Server id (unique inside a proxy)."), - [ST_F_THROTTLE] = IST("Current throttle percentage for the server, when slowstart is active, or no value if not in slowstart."), - [ST_F_LBTOT] = IST("Total number of times a service was selected, either for new sessions, or when redispatching."), - [ST_F_TRACKED]= IST("Id of proxy/server if tracking is enabled."), - [ST_F_TYPE] = IST("Service type (0=frontend, 1=backend, 2=server, 3=socket/listener)."), - [ST_F_RATE] = IST("Current number of sessions per second over last elapsed second."), - [ST_F_RATE_LIM] = IST("Configured limit on new sessions per second."), - [ST_F_RATE_MAX] = IST("Maximum observed number of sessions per second."), [ST_F_CHECK_STATUS] = IST("Status of last health check, per state label value."), [ST_F_CHECK_CODE] = IST("layer5-7 code, if available of the last health check."), [ST_F_CHECK_DURATION] = IST("Total duration of the latest server health check, in seconds."), -- 2.30.0
[PATCH 2/6] CLEANUP: tools: typo in `strl2irc` mention
`str2irc` does not exist Signed-off-by: William Dauchy --- src/tools.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/tools.c b/src/tools.c index 8fef15b4d..2d40d8910 100644 --- a/src/tools.c +++ b/src/tools.c @@ -2178,7 +2178,7 @@ int strl2irc(const char *s, int len, int *ret) * applications designed for hostile environments. It returns zero when the * number has successfully been converted, non-zero otherwise. When an error * is returned, the value is left untouched. It is about 3 times slower - * than str2irc(). + * than strl2irc(). */ int strl2llrc(const char *s, int len, long long *ret) -- 2.30.0
[PATCH 5/6] MEDIUM: server: support {check,agent}_addr, agent_port in server state
logical followup from cli commands addition, so that the state server file stays compatible with the changes made at runtime; use previously added helper to load server attributes. Signed-off-by: William Dauchy --- doc/management.txt| 5 ++- include/haproxy/server-t.h| 9 ++-- .../checks/1be_40srv_odd_health_checks.vtc| 2 +- .../checks/40be_2srv_odd_health_checks.vtc| 2 +- reg-tests/checks/4be_1srv_health_checks.vtc | 6 +-- src/proxy.c | 41 +++ src/server.c | 30 -- 7 files changed, 57 insertions(+), 38 deletions(-) diff --git a/doc/management.txt b/doc/management.txt index 423c614b2..60e25c7e1 100644 --- a/doc/management.txt +++ b/doc/management.txt @@ -2455,7 +2455,10 @@ show servers state [] srv_port:Server port. srvrecord: DNS SRV record associated to this SRV. srv_use_ssl: use ssl for server connections. - srv_check_port: Server check port. + srv_check_port: Server health check port. + srv_check_addr: Server health check address. + srv_agent_addr: Server health agent address. + srv_agent_port: Server health agent port. show sess Dump all known sessions. Avoid doing this on slow connections as this can diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h index 32697a9c4..102eb4483 100644 --- a/include/haproxy/server-t.h +++ b/include/haproxy/server-t.h @@ -126,10 +126,13 @@ enum srv_initaddr { "srv_port " \ "srvrecord " \ "srv_use_ssl "\ -"srv_check_port" +"srv_check_port " \ +"srv_check_addr " \ +"srv_agent_addr " \ +"srv_agent_port" -#define SRV_STATE_FILE_MAX_FIELDS 22 -#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21 +#define SRV_STATE_FILE_MAX_FIELDS 25 +#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22 #define SRV_STATE_LINE_MAXLEN 512 /* server flags -- 32 bits */ diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc b/reg-tests/checks/1be_40srv_odd_health_checks.vtc index f01205295..c279972aa 100644 --- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc +++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc @@ -112,6 +112,6 @@ syslog S -wait haproxy h1 -cli { send "show servers state" -expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord srv_use_ssl srv_check_port\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s0_port} - 0 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s1_port} - 0 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s2_port} - 0 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0 0\n2 be1 5 srv4 ${s4_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0 0\n2 be1 6 srv5 ${s5_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0 0\n2 be1 7 srv6 ${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0 0\n2 be1 8 srv7 ${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s7_port} - 0 0\n2 be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s8_port} - 0 0\n2 be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s9_port} - 0 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s10_port} - 0 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0 0\n2 be1 13 srv12 ${s12_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0 0\n2 be1 14 srv13 ${s13_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} - 0 0\n2 be1 15 srv14 ${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s14_port} - 0 0\n2 be1 16 srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s15_port} - 0 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s16_port} - 0 0\n2 be1 18 srv17 ${s17_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0 0\n2 be1 19 srv18 ${s18_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0 0\n2 be1 20 srv19 ${s19_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} - 0 0\n2 be1 21 srv20 ${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s20_port} - 0 0\n2 be1 22 srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s21_port} - 0 0\n2 be1 23 srv22 ${s22_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s22_port} - 0 0\n2 be1 24 srv23 ${s23_addr} 2 0 1 1 [[:digit:
[PATCH 4/6] MEDIUM: cli: add agent-port command
this patch allows to set agent port at runtime. In order to align with both `addr` and `check-addr` commands, also add the possibility to optionnaly set port on `agent-addr` command. This led to a small refactor in order to use the same function for both `agent-addr` and `agent-port` commands. Signed-off-by: William Dauchy --- doc/management.txt | 6 +++- src/server.c | 77 -- 2 files changed, 73 insertions(+), 10 deletions(-) diff --git a/doc/management.txt b/doc/management.txt index bff770e4e..423c614b2 100644 --- a/doc/management.txt +++ b/doc/management.txt @@ -1828,10 +1828,14 @@ set server / agent [ up | down ] switch a server's state regardless of some slow agent checks for example. Note that the change is propagated to tracking servers if any. -set server / agent-addr +set server / agent-addr [port ] Change addr for servers agent checks. Allows to migrate agent-checks to another address at runtime. You can specify both IP and hostname, it will be resolved. + Optionally, change the port agent. + +set server / agent-port + Change the port used for agent checks. set server / agent-send Change agent string sent to agent check target. Allows to update string while diff --git a/src/server.c b/src/server.c index 533755f1e..a983d5d68 100644 --- a/src/server.c +++ b/src/server.c @@ -56,6 +56,8 @@ static int srv_state_get_version(FILE *f); static void srv_cleanup_connections(struct server *srv); static const char *update_server_check_addr_port(struct server *s, const char *addr, const char *port); +static const char *update_server_agent_addr_port(struct server *s, const char *addr, +const char *port); /* List head of all known server keywords */ static struct srv_kw_list srv_keywords = { @@ -3573,6 +3575,47 @@ int update_server_addr(struct server *s, void *ip, int ip_sin_family, const char return 0; } +/* update agent health check address and port + * addr can be ip4/ip6 or a hostname + * must be called with the server lock held. + */ +static const char *update_server_agent_addr_port(struct server *s, const char *addr, +const char *port) +{ + struct sockaddr_storage sk; + struct buffer *msg; + int new_port; + + msg = get_trash_chunk(); + + if (!(s->agent.state & CHK_ST_ENABLED)) { + chunk_appendf(msg, "agent checks are not enabled on this server.\n"); + goto out; + } + + if (addr) { + memset(, 0, sizeof(struct sockaddr_storage)); + if (str2ip(addr, ) == NULL) { + chunk_appendf(msg, "invalid addr '%s'\n", addr); + goto out; + } + set_srv_agent_addr(s, ); + } + if (port) { + if (strl2irc(port, strlen(port), _port) != 0) { + chunk_appendf(msg, "provided port is not an integer\n"); + goto out; + } + if (new_port < 0 || new_port > 65535) { + chunk_appendf(msg, "provided port is invalid\n"); + goto out; + } + set_srv_agent_port(s, new_port); + } +out: + return msg->area; +} + /* update server health check address and port * addr must be ip4 or ip6, it won't be resolved * must be called with the server lock held. @@ -4428,15 +4471,31 @@ static int cli_parse_set_server(char **args, char *payload, struct appctx *appct cli_err(appctx, "'set server agent' expects 'up' or 'down'.\n"); } else if (strcmp(args[3], "agent-addr") == 0) { - struct sockaddr_storage sk; - - memset(, 0, sizeof(sk)); - if (!(sv->agent.state & CHK_ST_ENABLED)) - cli_err(appctx, "agent checks are not enabled on this server.\n"); - else if (str2ip(args[4], )) - set_srv_agent_addr(sv, ); - else - cli_err(appctx, "incorrect addr address given for agent.\n"); + char *addr = NULL; + char *port = NULL; + if (strlen(args[4]) == 0) { + cli_err(appctx, "set server / agent-addr requires" + " an address and optionally a port.\n"); + goto out_unlock; + } + addr = args[4]; + if (strcmp(args[5], "port") == 0) + port = args[6]; + warning = update_server_agent_addr_port(sv, addr, port); + if (warning) + cli_
[PATCH 0/6] cli commands coherency
Hello, This is a followup from last week cleaning regarding check and agent check. This patch series brings some more coherency on the CLI side. I also put some minor cleaning. William Dauchy (6): CLEANUP: check: fix some typo in comments CLEANUP: tools: typo in `strl2irc` mention MEDIUM: cli: add check-addr command MEDIUM: cli: add agent-port command MEDIUM: server: support {check,agent}_addr, agent_port in server state CLEANUP: server: add missing space in server-state error output doc/management.txt| 15 +- include/haproxy/server-t.h| 9 +- .../checks/1be_40srv_odd_health_checks.vtc| 2 +- .../checks/40be_2srv_odd_health_checks.vtc| 2 +- reg-tests/checks/4be_1srv_health_checks.vtc | 6 +- src/check.c | 18 +- src/proxy.c | 41 ++-- src/server.c | 192 ++ src/tools.c | 2 +- 9 files changed, 213 insertions(+), 74 deletions(-) -- 2.30.0
[PATCH 1/6] CLEANUP: check: fix some typo in comments
a few obvious english typo in comments, some of which introduced by myself quite recently Signed-off-by: William Dauchy --- src/check.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/src/check.c b/src/check.c index edb2ac29f..5de867d7f 100644 --- a/src/check.c +++ b/src/check.c @@ -1004,7 +1004,7 @@ int check_buf_available(void *target) } /* - * Allocate a buffer. If if fails, it adds the check in buffer wait queue. + * Allocate a buffer. If it fails, it adds the check in buffer wait queue. */ struct buffer *check_get_buf(struct check *check, struct buffer *bptr) { @@ -1211,10 +1211,10 @@ static int start_checks() srand((unsigned)time(NULL)); - /* -* 2- start them as far as possible from each others. For this, we will -* start them after their interval set to the min interval divided by -* the number of servers, weighted by the server's position in the list. + /* 2- start them as far as possible from each other. For this, we will +* start them after their interval is set to the min interval divided +* by the number of servers, weighted by the server's position in the +* list. */ for (px = proxies_list; px; px = px->next) { if ((px->options2 & PR_O2_CHK_ANY) == PR_O2_EXT_CHK) { @@ -1261,7 +1261,7 @@ static int srv_check_healthcheck_port(struct check *chk) srv = chk->server; - /* by default, we use the health check port ocnfigured */ + /* by default, we use the health check port configured */ if (chk->port > 0) return chk->port; @@ -1734,14 +1734,14 @@ int set_srv_agent_send(struct server *srv, const char *send) return 0; } -/* set agent addr and apprropriate flag */ +/* set agent addr and appropriate flag */ inline void set_srv_agent_addr(struct server *srv, struct sockaddr_storage *sk) { srv->agent.addr = *sk; srv->flags |= SRV_F_AGENTADDR; } -/* set agent port and apprropriate flag */ +/* set agent port and appropriate flag */ inline void set_srv_agent_port(struct server *srv, int port) { srv->agent.port = port; @@ -2092,7 +2092,7 @@ static struct srv_kw_list srv_kws = { "CHK", { }, { { "check-via-socks4",srv_parse_check_via_socks4,0, 1 }, /* Enable socks4 proxy for health checks */ { "no-agent-check", srv_parse_no_agent_check, 0, 1 }, /* Do not enable any auxiliary agent check */ { "no-check",srv_parse_no_check,0, 1 }, /* Disable health checks */ - { "no-check-send-proxy", srv_parse_no_check_send_proxy, 0, 1 }, /* Disable PROXY protol for health checks */ + { "no-check-send-proxy", srv_parse_no_check_send_proxy, 0, 1 }, /* Disable PROXY protocol for health checks */ { "rise",srv_parse_check_rise, 1, 1 }, /* Set rise value for health checks */ { "fall",srv_parse_check_fall, 1, 1 }, /* Set fall value for health checks */ { "inter", srv_parse_check_inter, 1, 1 }, /* Set inter value for health checks */ -- 2.30.0
[PATCH 6/6] CLEANUP: server: add missing space in server-state error output
a space was missing in the output to make it more readable. Signed-off-by: William Dauchy --- src/server.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/server.c b/src/server.c index 42191eda5..33375e638 100644 --- a/src/server.c +++ b/src/server.c @@ -3017,7 +3017,7 @@ static void srv_update_state(struct server *srv, int version, char **params) out: if (msg->data) { chunk_appendf(msg, "\n"); - ha_warning("server-state application failed for server '%s/%s'%s", + ha_warning("server-state application failed for server '%s/%s' %s", srv->proxy->id, srv->id, msg->area); } } -- 2.30.0
[PATCH 3/6] MEDIUM: cli: add check-addr command
this patch allows to set server health check address at runtime. In order to align with `addr` command, also allow to set port optionnaly. This led to a small refactor in order to use the same function for both `check-addr` and `check-port` commands. This command becomes more and more useful for people having a consul like architecture: - the backend server is located on a container with its own IP - the health checks are done the consul instance located on the host with the host IP Signed-off-by: William Dauchy --- doc/management.txt | 4 +++ src/server.c | 83 +- 2 files changed, 72 insertions(+), 15 deletions(-) diff --git a/doc/management.txt b/doc/management.txt index b74aba769..bff770e4e 100644 --- a/doc/management.txt +++ b/doc/management.txt @@ -1842,6 +1842,10 @@ set server / health [ up | stopping | down ] switch a server's state regardless of some slow health checks for example. Note that the change is propagated to tracking servers if any. +set server / check-addr [port ] + Change the IP address used for server health checks. + Optionally, change the port used for server health checks. + set server / check-port Change the port used for health checking to diff --git a/src/server.c b/src/server.c index da2325e9a..533755f1e 100644 --- a/src/server.c +++ b/src/server.c @@ -54,6 +54,8 @@ static int srv_set_fqdn(struct server *srv, const char *fqdn, int dns_locked); static void srv_state_parse_line(char *buf, const int version, char **params, char **srv_params); static int srv_state_get_version(FILE *f); static void srv_cleanup_connections(struct server *srv); +static const char *update_server_check_addr_port(struct server *s, const char *addr, +const char *port); /* List head of all known server keywords */ static struct srv_kw_list srv_keywords = { @@ -3571,6 +3573,47 @@ int update_server_addr(struct server *s, void *ip, int ip_sin_family, const char return 0; } +/* update server health check address and port + * addr must be ip4 or ip6, it won't be resolved + * must be called with the server lock held. + */ +static const char *update_server_check_addr_port(struct server *s, const char *addr, +const char *port) +{ + struct sockaddr_storage sk; + struct buffer *msg; + int new_port; + + msg = get_trash_chunk(); + + if (addr) { + memset(, 0, sizeof(struct sockaddr_storage)); + if (str2ip2(addr, , 0) == NULL) { + chunk_appendf(msg, "invalid addr '%s'\n", addr); + goto out; + } + s->check.addr = sk; + } + if (port) { + if (strl2irc(port, strlen(port), _port) != 0) { + chunk_appendf(msg, "provided port is not an integer\n"); + goto out; + } + if (new_port < 0 || new_port > 65535) { + chunk_appendf(msg, "provided port is invalid\n"); + goto out; + } + /* prevent the update of port to 0 if MAPPORTS are in use */ + if ((s->flags & SRV_F_MAPPORTS) && new_port == 0) { + chunk_appendf(msg, "can't unset 'port' since MAPPORTS is in use\n"); + goto out; + } + s->check.port = new_port; + } +out: + return msg->area; +} + /* * This function update a server's addr and port only for AF_INET and AF_INET6 families. * @@ -4403,23 +4446,32 @@ static int cli_parse_set_server(char **args, char *payload, struct appctx *appct cli_err(appctx, "cannot allocate memory for new string.\n"); } } - else if (strcmp(args[3], "check-port") == 0) { - int i = 0; - if (strl2irc(args[4], strlen(args[4]), ) != 0) { - cli_err(appctx, "'set server check-port' expects an integer as argument.\n"); - goto out_unlock; - } - if ((i < 0) || (i > 65535)) { - cli_err(appctx, "provided port is not valid.\n"); + else if (strcmp(args[3], "check-addr") == 0) { + char *addr = NULL; + char *port = NULL; + if (strlen(args[4]) == 0) { + cli_err(appctx, "set server / check-addr requires" + " an address and optionally a port.\n"); goto out_unlock; } - /* prevent the update of port to 0 if MAPPORTS are in use */ - if ((sv->flags & SRV_F_MAPPORTS) && (
Re: [PATCH v3 0/5] fix check port/addr consistency
Hello Christopher, Thanks for your continuous support on this review :) On Thu, Feb 4, 2021 at 10:09 AM Christopher Faulet wrote: > Thanks William, it seems good. I've just a question (sorry :). When the server > state file is loaded, why the check port is set only if there is a DNS > resolution ? I know it was done this way before the support of check_port > parameter in the server state. But I suppose that now we support it, it should > always be set, isn't it ? yes you are totally right. > And is there any usage to add "agent-addr" in the server state file? Because > it > can also be set on the CLI. And later, same question will appear for > "check-addr" and "agent-port". the general rule is indeed, anything which can be changed through the CLI should be loaded through server states. Truth is server states becomes a nightmare to manage; I don't remember if you were in the discussions a few months ago, but we concluded we should change that long term by https://github.com/haproxy/haproxy/issues/953 So while waiting for a proper solution, we are indeed supposed to keep server states up to date. > I don't want to bother you again. So, I propose you to merge the first patch > and > to add a new one to not set the check port when the server state file is > loaded. > Then I can merge the third patch and amend the second one to move the check > port > assignment before merging it. And finally I can merge the fourth and fifth > patches. Don't worry, I can send you a v4. -- William
[PATCH v3 5/5] MEDIUM: check: align agentaddr and agentport behaviour
in the same manner of agentaddr, we now: - permit to set agentport through `port` keyword, like it is the case for agentaddr through `addr` - set the priority on `agent-port` keyword when used - add a flag to be able to test when the value is set like for agentaddr it makes the behaviour between `addr` and `port` more consistent. Signed-off-by: William Dauchy --- doc/configuration.txt | 10 +- include/haproxy/check.h| 1 + include/haproxy/server-t.h | 1 + src/check.c| 12 +++- 4 files changed, 18 insertions(+), 6 deletions(-) diff --git a/doc/configuration.txt b/doc/configuration.txt index 2abd684fa..eb685785d 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -14061,11 +14061,11 @@ pool-purge-delay port Using the "port" parameter, it becomes possible to use a different port to - send health-checks. On some servers, it may be desirable to dedicate a port - to a specific component able to perform complex tests which are more suitable - to health-checks than the application. It is common to run a simple script in - inetd for instance. This parameter is ignored if the "check" parameter is not - set. See also the "addr" parameter. + send health-checks or to probe the agent-check. On some servers, it may be + desirable to dedicate a port to a specific component able to perform complex + tests which are more suitable to health-checks than the application. It is + common to run a simple script in inetd for instance. This parameter is + ignored if the "check" parameter is not set. See also the "addr" parameter. proto Forces the multiplexer's protocol to use for the outgoing connections to this diff --git a/include/haproxy/check.h b/include/haproxy/check.h index ed8124470..ffeef4e22 100644 --- a/include/haproxy/check.h +++ b/include/haproxy/check.h @@ -53,6 +53,7 @@ int spoe_handle_healthcheck_response(char *frame, size_t size, char *err, int er int set_srv_agent_send(struct server *srv, const char *send); void set_srv_agent_addr(struct server *srv, struct sockaddr_storage *sk); +void set_srv_agent_port(struct server *srv, int port); /* Use this one only. This inline version only ensures that we don't * call the function when the observe mode is disabled. diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h index 45f41959c..32697a9c4 100644 --- a/include/haproxy/server-t.h +++ b/include/haproxy/server-t.h @@ -138,6 +138,7 @@ enum srv_initaddr { #define SRV_F_NON_STICK0x0004/* never add connections allocated to this server to a stick table */ #define SRV_F_USE_NS_FROM_PP 0x0008 /* use namespace associated with connection if present */ #define SRV_F_FORCED_ID0x0010/* server's ID was forced in the configuration */ +#define SRV_F_AGENTPORT0x0040/* this server has a agent port configured */ #define SRV_F_AGENTADDR0x0080/* this server has a agent addr configured */ #define SRV_F_COOKIESET0x0100/* this server has a cookie configured, so don't generate dynamic cookies */ #define SRV_F_FASTOPEN 0x0200/* Use TCP Fast Open to connect to server */ diff --git a/src/check.c b/src/check.c index 31f108bff..194f3b4a6 100644 --- a/src/check.c +++ b/src/check.c @@ -1697,7 +1697,7 @@ static int srv_parse_agent_port(char **args, int *cur_arg, struct proxy *curpx, } global.maxsock++; - srv->agent.port = atol(args[*cur_arg+1]); + set_srv_agent_port(srv, atol(args[*cur_arg + 1])); out: return err_code; @@ -1741,6 +1741,13 @@ inline void set_srv_agent_addr(struct server *srv, struct sockaddr_storage *sk) srv->flags |= SRV_F_AGENTADDR; } +/* set agent port and apprropriate flag */ +inline void set_srv_agent_port(struct server *srv, int port) +{ + srv->agent.port = port; + srv->flags |= SRV_F_AGENTPORT; +} + /* Parse the "agent-send" server keyword */ static int srv_parse_agent_send(char **args, int *cur_arg, struct proxy *curpx, struct server *srv, char **errmsg) @@ -2060,6 +2067,9 @@ static int srv_parse_check_port(char **args, int *cur_arg, struct proxy *curpx, global.maxsock++; srv->check.port = atol(args[*cur_arg+1]); + /* if agentport was never set, we can use port */ + if (!(srv->flags & SRV_F_AGENTPORT)) + set_srv_agent_port(srv, srv->check.port); out: return err_code; -- 2.30.0
[PATCH v3 3/5] MEDIUM: check: remove checkport checkaddr flag
While trying to fix some consistency problem with the config file/cli (e.g. check-port cli command does not set the flag), we realised checkport flag was not necessarily needed. Indeed tcpcheck uses service port as the last choice if check.port is zero. So we can assume if check.port is zero, it means it was never set by the user, regardless if it is by the cli or config file. In the longterm this will avoid to introduce a new consistency issue if we forget to set the flag. in the same manner of checkport flag, we don't really need checkaddr flag. We can assume if checkaddr is not set, it means it was never set by the user or config. Signed-off-by: William Dauchy --- include/haproxy/server-t.h | 2 -- src/check.c| 2 -- src/dns.c | 3 +-- src/server.c | 7 ++- 4 files changed, 3 insertions(+), 11 deletions(-) diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h index e42b1c7ed..45f41959c 100644 --- a/include/haproxy/server-t.h +++ b/include/haproxy/server-t.h @@ -138,8 +138,6 @@ enum srv_initaddr { #define SRV_F_NON_STICK0x0004/* never add connections allocated to this server to a stick table */ #define SRV_F_USE_NS_FROM_PP 0x0008 /* use namespace associated with connection if present */ #define SRV_F_FORCED_ID0x0010/* server's ID was forced in the configuration */ -#define SRV_F_CHECKADDR0x0020/* this server has a check addr configured */ -#define SRV_F_CHECKPORT0x0040/* this server has a check port configured */ #define SRV_F_AGENTADDR0x0080/* this server has a agent addr configured */ #define SRV_F_COOKIESET0x0100/* this server has a cookie configured, so don't generate dynamic cookies */ #define SRV_F_FASTOPEN 0x0200/* Use TCP Fast Open to connect to server */ diff --git a/src/check.c b/src/check.c index 879fe84ce..e24050ff2 100644 --- a/src/check.c +++ b/src/check.c @@ -1527,7 +1527,6 @@ static int srv_parse_addr(char **args, int *cur_arg, struct proxy *curpx, struct } srv->check.addr = srv->agent.addr = *sk; - srv->flags |= SRV_F_CHECKADDR; srv->flags |= SRV_F_AGENTADDR; out: @@ -2050,7 +2049,6 @@ static int srv_parse_check_port(char **args, int *cur_arg, struct proxy *curpx, global.maxsock++; srv->check.port = atol(args[*cur_arg+1]); - srv->flags |= SRV_F_CHECKPORT; out: return err_code; diff --git a/src/dns.c b/src/dns.c index 8c97df46b..8fc9089e7 100644 --- a/src/dns.c +++ b/src/dns.c @@ -712,8 +712,7 @@ static void dns_check_dns_response(struct dns_resolution *res) srv->svc_port = item->port; srv->flags &= ~SRV_F_MAPPORTS; - if ((srv->check.state & CHK_ST_CONFIGURED) && - !(srv->flags & SRV_F_CHECKPORT)) + if (!srv->check.port) srv->check.port = item->port; if (!srv->dns_opts.ignore_weight) { diff --git a/src/server.c b/src/server.c index d8216058f..1fd71e403 100644 --- a/src/server.c +++ b/src/server.c @@ -1660,8 +1660,6 @@ static void srv_settings_cpy(struct server *srv, struct server *src, int srv_tmp srv->flags |= src->flags; srv->do_check = src->do_check; srv->do_agent = src->do_agent; - if (srv->check.port) - srv->flags |= SRV_F_CHECKPORT; srv->check.inter = src->check.inter; srv->check.fastinter = src->check.fastinter; srv->check.downinter = src->check.downinter; @@ -2985,8 +2983,7 @@ static void srv_update_state(struct server *srv, int version, char **params) } /* configure check.port accordingly */ - if ((srv->check.state & CHK_ST_CONFIGURED) && - !(srv->flags & SRV_F_CHECKPORT)) + if (!srv->check.port) srv->check.port = port_check; /* Unset SRV_F_MAPPORTS for SRV records. @@ -3673,7 +3670,7 @@ const char *update_server_addr_port(struct server *s, const char *addr, const ch * we're switching from a fixed port to a SRV_F_MAPPORTS (mapped) port * prevent PORT change if check doesn't have it's dedicated port while switching * to port mapping */ - if ((s->check.state & CHK_ST_CONFIGURED) && !(s->flags & SRV_F_CHECKPORT)) { +
[PATCH v3 0/5] fix check port/addr consistency
Hello Christopher, Here is my last update on port/addr consistency. I addressed all the point you mentioned. I hope I did not forgot anything. I will come back with `check-addr` and `agent-port` on the cli once those patches are accepted. William Dauchy (5): BUG/MINOR: cli: fix set server addr/port coherency with health checks MEDIUM: server: adding support for check_port in server state MEDIUM: check: remove checkport checkaddr flag BUG/MINOR: check: consitent way to set agentaddr MEDIUM: check: align agentaddr and agentport behaviour doc/configuration.txt | 10 +-- doc/management.txt| 1 + include/haproxy/check.h | 2 + include/haproxy/server-t.h| 10 +-- .../checks/1be_40srv_odd_health_checks.vtc| 2 +- .../checks/40be_2srv_odd_health_checks.vtc| 2 +- reg-tests/checks/4be_1srv_health_checks.vtc | 6 +- src/check.c | 33 -- src/dns.c | 3 +- src/proxy.c | 4 +- src/server.c | 64 +-- 11 files changed, 78 insertions(+), 59 deletions(-) -- 2.30.0
[PATCH v3 1/5] BUG/MINOR: cli: fix set server addr/port coherency with health checks
while reading `update_server_addr_port` I found out some things which can be seen as incoherency. I hope I did not overlooked anything: - one comment is stating check's address should be updated if it uses the server one; however the condition checks if `SRV_F_CHECKADDR` is set; this flag is set when a check address is set; result is that we override the check address where I was not expecting it. In fact we don't need to update anything here as server addr is used when check addr is not set. - same goes for check agent addr - for port, it is a bit different, we update the check port if it is unset. This is harmless because we also use server port if check port is unset. However it creates some incoherency before/after using this command, as check port should stay unset througout the life of the process unless it is is set by `set server check-port` command. quite hard to locate the origin of this this issue but the function was introduced in commit d458adcc52b74608e2fe6a2a95f09ce5e94932b7 ("MINOR: new update_server_addr_port() function to change both server's ADDR and service PORT"). I was however not able to determine whether this is due to a change of behavior along the years. So this patch can potentially be backported up to v1.8 but we must be careful while doing so, as the code has changed a lot. That being said, the bug being not very impacting I would be fine keeping it for 2.4 only. Signed-off-by: William Dauchy --- src/server.c | 15 --- 1 file changed, 15 deletions(-) diff --git a/src/server.c b/src/server.c index 10f528640..99b7e9181 100644 --- a/src/server.c +++ b/src/server.c @@ -3625,16 +3625,6 @@ const char *update_server_addr_port(struct server *s, const char *addr, const ch ipcpy(, >addr); changed = 1; - /* we also need to update check's ADDR only if it uses the server's one */ - if ((s->check.state & CHK_ST_CONFIGURED) && (s->flags & SRV_F_CHECKADDR)) { - ipcpy(, >check.addr); - } - - /* we also need to update agent ADDR only if it use the server's one */ - if ((s->agent.state & CHK_ST_CONFIGURED) && (s->flags & SRV_F_AGENTADDR)) { - ipcpy(, >agent.addr); - } - /* update report for caller */ chunk_printf(msg, "IP changed from '%s' to '%s'", current_addr, addr); } @@ -3714,11 +3704,6 @@ const char *update_server_addr_port(struct server *s, const char *addr, const ch } chunk_appendf(msg, "%d'", new_port); - - /* we also need to update health checks port only if it uses server's realport */ - if ((s->check.state & CHK_ST_CONFIGURED) && !(s->flags & SRV_F_CHECKPORT)) { - s->check.port = new_port; - } } else { chunk_appendf(msg, "no need to change the port"); -- 2.30.0
[PATCH v3 4/5] BUG/MINOR: check: consitent way to set agentaddr
small consistency problem with `addr` and `agent-addr` options: for the both options, the last one parsed is always used to set the agent-check addr. Thus these two lines don't have the same behavior: server ... addr agent-addr server ... agent-addr addr After this patch `agent-addr` will always be the priority option over `addr`. It means we test the flag before setting agentaddr. We also fix all the places where we did not set the flag to be coherent everywhere. I was not really able to determine where this issue is coming from. So it is probable we may backport it to all stable version where the agent is supported. Signed-off-by: William Dauchy --- include/haproxy/check.h | 1 + src/check.c | 19 +++ src/server.c| 7 ++- 3 files changed, 22 insertions(+), 5 deletions(-) diff --git a/include/haproxy/check.h b/include/haproxy/check.h index 8d70040a3..ed8124470 100644 --- a/include/haproxy/check.h +++ b/include/haproxy/check.h @@ -52,6 +52,7 @@ int spoe_prepare_healthcheck_request(char **req, int *len); int spoe_handle_healthcheck_response(char *frame, size_t size, char *err, int errlen); int set_srv_agent_send(struct server *srv, const char *send); +void set_srv_agent_addr(struct server *srv, struct sockaddr_storage *sk); /* Use this one only. This inline version only ensures that we don't * call the function when the observe mode is disabled. diff --git a/src/check.c b/src/check.c index e24050ff2..31f108bff 100644 --- a/src/check.c +++ b/src/check.c @@ -1526,8 +1526,10 @@ static int srv_parse_addr(char **args, int *cur_arg, struct proxy *curpx, struct goto error; } - srv->check.addr = srv->agent.addr = *sk; - srv->flags |= SRV_F_AGENTADDR; + srv->check.addr = *sk; + /* if agentaddr was never set, we can use addr */ + if (!(srv->flags & SRV_F_AGENTADDR)) + set_srv_agent_addr(srv, sk); out: return err_code; @@ -1537,21 +1539,23 @@ static int srv_parse_addr(char **args, int *cur_arg, struct proxy *curpx, struct goto out; } - /* Parse the "agent-addr" server keyword */ static int srv_parse_agent_addr(char **args, int *cur_arg, struct proxy *curpx, struct server *srv, char **errmsg) { + struct sockaddr_storage sk; int err_code = 0; if (!*(args[*cur_arg+1])) { memprintf(errmsg, "'%s' expects an address as argument.", args[*cur_arg]); goto error; } - if(str2ip(args[*cur_arg+1], >agent.addr) == NULL) { + memset(, 0, sizeof(sk)); + if (str2ip(args[*cur_arg + 1], ) == NULL) { memprintf(errmsg, "parsing agent-addr failed. Check if '%s' is correct address.", args[*cur_arg+1]); goto error; } + set_srv_agent_addr(srv, ); out: return err_code; @@ -1730,6 +1734,13 @@ int set_srv_agent_send(struct server *srv, const char *send) return 0; } +/* set agent addr and apprropriate flag */ +inline void set_srv_agent_addr(struct server *srv, struct sockaddr_storage *sk) +{ + srv->agent.addr = *sk; + srv->flags |= SRV_F_AGENTADDR; +} + /* Parse the "agent-send" server keyword */ static int srv_parse_agent_send(char **args, int *cur_arg, struct proxy *curpx, struct server *srv, char **errmsg) diff --git a/src/server.c b/src/server.c index 1fd71e403..b6b3b0893 100644 --- a/src/server.c +++ b/src/server.c @@ -4384,9 +4384,14 @@ static int cli_parse_set_server(char **args, char *payload, struct appctx *appct cli_err(appctx, "'set server agent' expects 'up' or 'down'.\n"); } else if (strcmp(args[3], "agent-addr") == 0) { + struct sockaddr_storage sk; + + memset(, 0, sizeof(sk)); if (!(sv->agent.state & CHK_ST_ENABLED)) cli_err(appctx, "agent checks are not enabled on this server.\n"); - else if (str2ip(args[4], >agent.addr) == NULL) + else if (str2ip(args[4], )) + set_srv_agent_addr(sv, ); + else cli_err(appctx, "incorrect addr address given for agent.\n"); } else if (strcmp(args[3], "agent-send") == 0) { -- 2.30.0
[PATCH v3 2/5] MEDIUM: server: adding support for check_port in server state
We can currently change the check-port using the cli command `set server check-port` but there is a consistency issue when using server state. This patch aims to fix this problem but will be also a good preparation work to get rid of checkport flag, so we are able to know when checkport was set by config. I am fully aware this is not making github #953 moving forward, I however think this might be acceptable while waiting for a proper solution and resolve consistency problem faced with port settings. Signed-off-by: William Dauchy --- doc/management.txt| 1 + include/haproxy/server-t.h| 7 ++-- .../checks/1be_40srv_odd_health_checks.vtc| 2 +- .../checks/40be_2srv_odd_health_checks.vtc| 2 +- reg-tests/checks/4be_1srv_health_checks.vtc | 6 ++-- src/proxy.c | 4 +-- src/server.c | 35 --- 7 files changed, 35 insertions(+), 22 deletions(-) diff --git a/doc/management.txt b/doc/management.txt index de3fbf607..b74aba769 100644 --- a/doc/management.txt +++ b/doc/management.txt @@ -2447,6 +2447,7 @@ show servers state [] srv_port:Server port. srvrecord: DNS SRV record associated to this SRV. srv_use_ssl: use ssl for server connections. + srv_check_port: Server check port. show sess Dump all known sessions. Avoid doing this on slow connections as this can diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h index b29c75c0b..e42b1c7ed 100644 --- a/include/haproxy/server-t.h +++ b/include/haproxy/server-t.h @@ -125,10 +125,11 @@ enum srv_initaddr { "srv_fqdn " \ "srv_port " \ "srvrecord " \ -"srv_use_ssl" +"srv_use_ssl "\ +"srv_check_port" -#define SRV_STATE_FILE_MAX_FIELDS 21 -#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20 +#define SRV_STATE_FILE_MAX_FIELDS 22 +#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21 #define SRV_STATE_LINE_MAXLEN 512 /* server flags -- 32 bits */ diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc b/reg-tests/checks/1be_40srv_odd_health_checks.vtc index bd07d8840..f01205295 100644 --- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc +++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc @@ -112,6 +112,6 @@ syslog S -wait haproxy h1 -cli { send "show servers state" -expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord srv_use_ssl\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s0_port} - 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s1_port} - 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s2_port} - 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0\n2 be1 5 srv4 ${s4_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0\n2 be1 6 srv5 ${s5_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0\n2 be1 7 srv6 ${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0\n2 be1 8 srv7 ${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s7_port} - 0\n2 be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s8_port} - 0\n2 be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s9_port} - 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s10_port} - 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0\n2 be1 13 srv12 ${s12_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0\n2 be1 14 srv13 ${s13_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} - 0\n2 be1 15 srv14 ${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s14_port} - 0\n2 be1 16 srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s15_port} - 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s16_port} - 0\n2 be1 18 srv17 ${s17_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0\n2 be1 19 srv18 ${s18_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0\n2 be1 20 srv19 ${s19_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} - 0\n2 be1 21 srv20 ${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s20_port} - 0\n2 be1 22 srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s21_port} - 0\n2 be1 23 srv22 ${s22_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s22_port} - 0\n2 be1 24 srv23 ${s23_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s23_port} - 0\n2 be1 25 srv24 ${s24_addr} 2 0 1 1 [[:digit:]]+ 1
Re: [PATCH v2 0/5] fix check port/addr consistency
On Wed, Feb 3, 2021 at 9:59 AM Christopher Faulet wrote: > At first glance, I'm just a bit annoyed with the patch 5. In the > documentation, > it is stated that "addr" option will be used for agent-check too. And there is > no info about interactions between "addr" and "agent-addr" options when both > are > configured. For me, for an agent-check, the "agent-addr" option must be used > in > priority, regardless the options order. If not defined, the "addr" option must > be used, if defined. And at the end, we use the server address by default if > none is specified. ok I missed that part. it is a bit sad honestly, it makes things less explicit. > There is another problem with "port" and "agent-port" options. It is mentioned > in the documentation that "agent-port" is required to perform agent-checks. > And > "port" option is not used for agent-check. It is not really consistent. I > propose to deal with port/agent-port options in the same way than > addr/agent-addr ones. And we keep the test to be sure to have a dedicated port > for the agent-check to not use the server's one. This way, we keep backward > compatibility and improve consistency. > Thus, if you agree with these changes, I guess we should keep SRV_F_AGENTADDR > flag and add SRV_F_AGENTPORT. ok I will come back with a v3. I honestly don't like to have port/addr used for agent, as stated previously because it is creating some non explicit things. But indeed as we need to keep backward compat, I will fix that. > About the CLI. I agree, the "check-addr" parameter on the "set server" command > must be added. And the "agent-port" parameter must bee added too. For me, it > is > a consistency matter. But I understand it is also mandatory for dynamic > environments. thanks, let me come back with a v3, so we can move forward for the next improvements. -- William
Re: [PATCH v2 4/5] MEDIUM: get rid of checkaddr and agentaddr flag
On Tue, Feb 2, 2021 at 10:57 PM William Dauchy wrote: > in the same manner of checkport flag, we found out some consistency > problem. For instance, SRV_F_AGENTADDR flag is set when the "addr" > option is set on a server line but not for the "agent-addr" option. > The issue is similar in the cli where we never set the flag. > A consequence of this patch is that agentaddr is no longer used, at > least for now. We might reintroduce it later if we need it. sorry for that, I forgot "server:" in the subject. will send v3 if there are other feedbacks -- William
[PATCH v2 4/5] MEDIUM: get rid of checkaddr and agentaddr flag
in the same manner of checkport flag, we found out some consistency problem. For instance, SRV_F_AGENTADDR flag is set when the "addr" option is set on a server line but not for the "agent-addr" option. The issue is similar in the cli where we never set the flag. A consequence of this patch is that agentaddr is no longer used, at least for now. We might reintroduced it later if we need it. Signed-off-by: William Dauchy --- include/haproxy/server-t.h | 2 -- src/check.c| 2 -- 2 files changed, 4 deletions(-) diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h index 131b97cb6..382caa218 100644 --- a/include/haproxy/server-t.h +++ b/include/haproxy/server-t.h @@ -138,8 +138,6 @@ enum srv_initaddr { #define SRV_F_NON_STICK0x0004/* never add connections allocated to this server to a stick table */ #define SRV_F_USE_NS_FROM_PP 0x0008 /* use namespace associated with connection if present */ #define SRV_F_FORCED_ID0x0010/* server's ID was forced in the configuration */ -#define SRV_F_CHECKADDR0x0020/* this server has a check addr configured */ -#define SRV_F_AGENTADDR0x0080/* this server has a agent addr configured */ #define SRV_F_COOKIESET0x0100/* this server has a cookie configured, so don't generate dynamic cookies */ #define SRV_F_FASTOPEN 0x0200/* Use TCP Fast Open to connect to server */ #define SRV_F_SOCKS4_PROXY 0x0400/* this server uses SOCKS4 proxy */ diff --git a/src/check.c b/src/check.c index f4b11fc46..f4d5bd452 100644 --- a/src/check.c +++ b/src/check.c @@ -1527,8 +1527,6 @@ static int srv_parse_addr(char **args, int *cur_arg, struct proxy *curpx, struct } srv->check.addr = srv->agent.addr = *sk; - srv->flags |= SRV_F_CHECKADDR; - srv->flags |= SRV_F_AGENTADDR; out: return err_code; -- 2.29.2
[PATCH v2 2/5] MEDIUM: server: adding support for check_port in server state
We can currently change the check-port using the cli command `set server check-port` but there is a consistency issue when using server state. This patch aims to fix this problem but will be also a good preparation work to get rid of checkport flag, so we are able to know when checkport was set by config. I am fully aware this is not making github #953 moving forward, I however think this might be acceptable while waiting for a proper solution and resolve consistency problem faced with port settings. Signed-off-by: William Dauchy --- doc/management.txt| 1 + include/haproxy/server-t.h| 7 ++-- .../checks/1be_40srv_odd_health_checks.vtc| 2 +- .../checks/40be_2srv_odd_health_checks.vtc| 2 +- reg-tests/checks/4be_1srv_health_checks.vtc | 6 ++-- src/proxy.c | 4 +-- src/server.c | 35 --- 7 files changed, 35 insertions(+), 22 deletions(-) diff --git a/doc/management.txt b/doc/management.txt index de3fbf607..b74aba769 100644 --- a/doc/management.txt +++ b/doc/management.txt @@ -2447,6 +2447,7 @@ show servers state [] srv_port:Server port. srvrecord: DNS SRV record associated to this SRV. srv_use_ssl: use ssl for server connections. + srv_check_port: Server check port. show sess Dump all known sessions. Avoid doing this on slow connections as this can diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h index b29c75c0b..e42b1c7ed 100644 --- a/include/haproxy/server-t.h +++ b/include/haproxy/server-t.h @@ -125,10 +125,11 @@ enum srv_initaddr { "srv_fqdn " \ "srv_port " \ "srvrecord " \ -"srv_use_ssl" +"srv_use_ssl "\ +"srv_check_port" -#define SRV_STATE_FILE_MAX_FIELDS 21 -#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20 +#define SRV_STATE_FILE_MAX_FIELDS 22 +#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21 #define SRV_STATE_LINE_MAXLEN 512 /* server flags -- 32 bits */ diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc b/reg-tests/checks/1be_40srv_odd_health_checks.vtc index bd07d8840..f01205295 100644 --- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc +++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc @@ -112,6 +112,6 @@ syslog S -wait haproxy h1 -cli { send "show servers state" -expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord srv_use_ssl\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s0_port} - 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s1_port} - 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s2_port} - 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0\n2 be1 5 srv4 ${s4_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0\n2 be1 6 srv5 ${s5_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0\n2 be1 7 srv6 ${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0\n2 be1 8 srv7 ${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s7_port} - 0\n2 be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s8_port} - 0\n2 be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s9_port} - 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s10_port} - 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0\n2 be1 13 srv12 ${s12_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0\n2 be1 14 srv13 ${s13_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} - 0\n2 be1 15 srv14 ${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s14_port} - 0\n2 be1 16 srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s15_port} - 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s16_port} - 0\n2 be1 18 srv17 ${s17_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0\n2 be1 19 srv18 ${s18_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0\n2 be1 20 srv19 ${s19_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} - 0\n2 be1 21 srv20 ${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s20_port} - 0\n2 be1 22 srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s21_port} - 0\n2 be1 23 srv22 ${s22_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s22_port} - 0\n2 be1 24 srv23 ${s23_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s23_port} - 0\n2 be1 25 srv24 ${s24_addr} 2 0 1 1 [[:digit:]]+ 1
[PATCH v2 3/5] MEDIUM: server: get rid of checkport flag
While trying to fix some consistency problem with the config file/cli (e.g. check-port cli command does not set the flag), we realised checkport flag was not necessarily needed. Indeed tcpcheck uses service port as the last choice if check.port is zero. So we can assume if check.port is zero, it means it was never set by the user, regardless if it is by the cli or config file. In the longterm this will avoid to introduce a new consistency issue if we forget to set the flag. Signed-off-by: William Dauchy --- include/haproxy/server-t.h | 1 - src/check.c| 1 - src/dns.c | 3 +-- src/server.c | 7 ++- 4 files changed, 3 insertions(+), 9 deletions(-) diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h index e42b1c7ed..131b97cb6 100644 --- a/include/haproxy/server-t.h +++ b/include/haproxy/server-t.h @@ -139,7 +139,6 @@ enum srv_initaddr { #define SRV_F_USE_NS_FROM_PP 0x0008 /* use namespace associated with connection if present */ #define SRV_F_FORCED_ID0x0010/* server's ID was forced in the configuration */ #define SRV_F_CHECKADDR0x0020/* this server has a check addr configured */ -#define SRV_F_CHECKPORT0x0040/* this server has a check port configured */ #define SRV_F_AGENTADDR0x0080/* this server has a agent addr configured */ #define SRV_F_COOKIESET0x0100/* this server has a cookie configured, so don't generate dynamic cookies */ #define SRV_F_FASTOPEN 0x0200/* Use TCP Fast Open to connect to server */ diff --git a/src/check.c b/src/check.c index 879fe84ce..f4b11fc46 100644 --- a/src/check.c +++ b/src/check.c @@ -2050,7 +2050,6 @@ static int srv_parse_check_port(char **args, int *cur_arg, struct proxy *curpx, global.maxsock++; srv->check.port = atol(args[*cur_arg+1]); - srv->flags |= SRV_F_CHECKPORT; out: return err_code; diff --git a/src/dns.c b/src/dns.c index 8c97df46b..8fc9089e7 100644 --- a/src/dns.c +++ b/src/dns.c @@ -712,8 +712,7 @@ static void dns_check_dns_response(struct dns_resolution *res) srv->svc_port = item->port; srv->flags &= ~SRV_F_MAPPORTS; - if ((srv->check.state & CHK_ST_CONFIGURED) && - !(srv->flags & SRV_F_CHECKPORT)) + if (!srv->check.port) srv->check.port = item->port; if (!srv->dns_opts.ignore_weight) { diff --git a/src/server.c b/src/server.c index d8216058f..1fd71e403 100644 --- a/src/server.c +++ b/src/server.c @@ -1660,8 +1660,6 @@ static void srv_settings_cpy(struct server *srv, struct server *src, int srv_tmp srv->flags |= src->flags; srv->do_check = src->do_check; srv->do_agent = src->do_agent; - if (srv->check.port) - srv->flags |= SRV_F_CHECKPORT; srv->check.inter = src->check.inter; srv->check.fastinter = src->check.fastinter; srv->check.downinter = src->check.downinter; @@ -2985,8 +2983,7 @@ static void srv_update_state(struct server *srv, int version, char **params) } /* configure check.port accordingly */ - if ((srv->check.state & CHK_ST_CONFIGURED) && - !(srv->flags & SRV_F_CHECKPORT)) + if (!srv->check.port) srv->check.port = port_check; /* Unset SRV_F_MAPPORTS for SRV records. @@ -3673,7 +3670,7 @@ const char *update_server_addr_port(struct server *s, const char *addr, const ch * we're switching from a fixed port to a SRV_F_MAPPORTS (mapped) port * prevent PORT change if check doesn't have it's dedicated port while switching * to port mapping */ - if ((s->check.state & CHK_ST_CONFIGURED) && !(s->flags & SRV_F_CHECKPORT)) { + if (!s->check.port) { chunk_appendf(msg, "can't change to port map because it is incompatible with current health check port configuration (use 'port' statement from the 'server' directive."); goto out; } -- 2.29.2
[PATCH v2 0/5] fix check port/addr consistency
Hello Christopher, As discussed, I revisited my previous series regarding check addr and port consistency. I don't think I missed anything. I won't hide my aim here, I would like to add support to set `check.addr` on the cli like it is possible for the service port. More and more people have a setup with containers having their own IP, but with a health check on the host (e.g. on the consul side for example). So it becomes more and more needed. That's mostly why I wanted to clarify the situation first with the different things I've seen while revisting this part of the code. William Dauchy (5): BUG/MINOR: cli: fix set server addr/port coherency with health checks MEDIUM: server: adding support for check_port in server state MEDIUM: server: get rid of checkport flag MEDIUM: get rid of checkaddr and agentaddr flag BUG/MINOR: server: don't set agent addr for addr parsing doc/management.txt| 1 + include/haproxy/server-t.h| 10 ++-- .../checks/1be_40srv_odd_health_checks.vtc| 2 +- .../checks/40be_2srv_odd_health_checks.vtc| 2 +- reg-tests/checks/4be_1srv_health_checks.vtc | 6 +- src/check.c | 5 +- src/dns.c | 3 +- src/proxy.c | 4 +- src/server.c | 57 --- 9 files changed, 39 insertions(+), 51 deletions(-) -- 2.29.2
[PATCH v2 5/5] BUG/MINOR: server: don't set agent addr for addr parsing
small consistency problem with `addr` and `agent-addr` options: for the both options, the last one parsed is always used to set the agent-check addr. Thus these two lines don't have the same behavior: server ... addr agent-addr server ... agent-addr addr I was not really able to determine where this issue is coming from. So it is probable we may backport it to all stable version where the agent is supported. Signed-off-by: William Dauchy --- src/check.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/check.c b/src/check.c index f4d5bd452..aab86dc8c 100644 --- a/src/check.c +++ b/src/check.c @@ -1526,7 +1526,7 @@ static int srv_parse_addr(char **args, int *cur_arg, struct proxy *curpx, struct goto error; } - srv->check.addr = srv->agent.addr = *sk; + srv->check.addr = *sk; out: return err_code; -- 2.29.2
[PATCH v2 1/5] BUG/MINOR: cli: fix set server addr/port coherency with health checks
while reading `update_server_addr_port` I found out some things which can be seen as incoherency. I hope I did not overlooked anything: - one comment is stating check's address should be updated if it uses the server one; however the condition checks if `SRV_F_CHECKADDR` is set; this flag is set when a check address is set; result is that we override the check address where I was not expecting it. In fact we don't need to update anything here as server addr is used when check addr is not set. - same goes for check agent addr - for port, it is a bit different, we update the check port if it is unset. This is harmless because we also use server port if check port is unset. However it creates some incoherency before/after using this command, as check port should stay unset througout the life of the process unless it is is set by `set server check-port` command. quite hard to locate the origin of this this issue but the function was introduced in commit d458adcc52b74608e2fe6a2a95f09ce5e94932b7 ("MINOR: new update_server_addr_port() function to change both server's ADDR and service PORT"). I was however not able to determine whether this is due to a change of behavior along the years. So this patch can potentially be backported up to v1.8 but we must be careful while doing so, as the code has changed a lot. That being said, the bug being not very impacting I would be fine keeping it for 2.4 only. Signed-off-by: William Dauchy --- src/server.c | 15 --- 1 file changed, 15 deletions(-) diff --git a/src/server.c b/src/server.c index 10f528640..99b7e9181 100644 --- a/src/server.c +++ b/src/server.c @@ -3625,16 +3625,6 @@ const char *update_server_addr_port(struct server *s, const char *addr, const ch ipcpy(, >addr); changed = 1; - /* we also need to update check's ADDR only if it uses the server's one */ - if ((s->check.state & CHK_ST_CONFIGURED) && (s->flags & SRV_F_CHECKADDR)) { - ipcpy(, >check.addr); - } - - /* we also need to update agent ADDR only if it use the server's one */ - if ((s->agent.state & CHK_ST_CONFIGURED) && (s->flags & SRV_F_AGENTADDR)) { - ipcpy(, >agent.addr); - } - /* update report for caller */ chunk_printf(msg, "IP changed from '%s' to '%s'", current_addr, addr); } @@ -3714,11 +3704,6 @@ const char *update_server_addr_port(struct server *s, const char *addr, const ch } chunk_appendf(msg, "%d'", new_port); - - /* we also need to update health checks port only if it uses server's realport */ - if ((s->check.state & CHK_ST_CONFIGURED) && !(s->flags & SRV_F_CHECKPORT)) { - s->check.port = new_port; - } } else { chunk_appendf(msg, "no need to change the port"); -- 2.29.2
Re: [PATCH 1/2] BUG/MINOR: cli: fix set server addr/port coherency with health checks
Hi Christopher, Thanks for the review. On Tue, Feb 2, 2021 at 10:21 AM Christopher Faulet wrote: > So, it may be good to take a global look at this stuff. I may missed > something. > And be carefull for the backports because the health-checks were refactored in > the 2.2. ok I will come back with a more global look and see whether I can get rid of those flags. Thanks, -- William
Re: [PATCH 1/9] MAJOR: contrib/prometheus-exporter: move health check status to labels
On Mon, Feb 1, 2021 at 12:05 PM Pierre Cheynier wrote: > In addition to this update, I would add some recommendations about the user > setup in the README ("how do I prevent my prometheus instance to explode > when scrapping this endpoint?"). > For server-template users: > - > params: > no-maint: > - empty > > Generally speaking, to prevent all server metrics to be saved, except this > one: > - >metric_relabel_configs: >- source_labels: ['__name__'] > regex: 'haproxy_(process_|frontend_|backend_|server_check_status).*' > action: keep agreed taken into account in v2 > "Status of last health check, per state value" ? updated in v2. -- William
Re: [PATCH 0/9] prometheus: health check as labels + cleanup
On Mon, Feb 01, 2021 at 11:30:18AM +0100, Christopher Faulet wrote: > Don't be sorry to send patches ! If you think the number of labels for the > check_status metric is not a problem, even for huge configurations, I trust > you. > And I guess we can reduce this number to 16 status only, removing all status > with an unknown check result (HCHK_STATUS_UNKNOWN, HCHK_STATUS_INI, > HCHK_STATUS_START, HCHK_STATUS_L57DATA) and the status regarding the > agent-check > only (HCHK_STATUS_CHECKED). > > For info, HCHK_STATUS_UNKNOWN is only used for uninitialized health-check > (during configuration parsing). HCHK_STATUS_INI is used for initialized > health-check and before the first run. HCHK_STATUS_START and > HCHK_STATUS_L57DATA > are dummy status and never assigned to a health-check. And as said, > HCHK_STATUS_CHECKED is only used for agent-check. > > It is a slight improvement, but still better than nothing. To do so, I propose > you to add a function in check.c to get the corresponding check result of a > status (the attached patch). This way, in your first patch, we can filter on > check result. I may amend your first patch this way if you're ok: definitely worth it indeed. I let you do the amend > --- a/contrib/prometheus-exporter/service-prometheus.c > +++ b/contrib/prometheus-exporter/service-prometheus.c > @@ -880,6 +880,8 @@ static int promex_dump_srv_metrics(struct appctx *appctx, > struct htx *htx) > goto next_sv; > for (i = 0; i < > HCHK_STATUS_SIZE; i++) { > + if > (get_check_status_result(i) < CHK_RES_FAILED) > + continue; > val = > mkf_u32(FO_STATUS, sv->check.status == i); > check_state = > get_check_status_info(i); > labels[2].name = > ist("state"); > > No comments about the other patches. Except the README is now outdated and > does not reflect recent changes. It could be good to keep it up-to-date as far > as possible. ah true, I will send an update about it soon. -- William
[PATCH 1/2] BUG/MINOR: cli: fix set server addr/port coherency with health checks
while reading `update_server_addr_port` I found out some things which can be seen as incoherency. I hope I did not overlooked anything: - one comment is stating check's address should be updated if it uses the server one; however the condition checks if `SRV_F_CHECKADDR` is set; this flag is set when a check address is set; result is that we override the check address where I was not expecting it. In fact we don't need to update anything here as server addr is used when check addr is not set. - same goes for check agent addr - for port, it is a bit different, we update the check port if it is unset. This is harmless because we also use server port if check port is unset. However it creates some incoherency before/after using this command, as check port should stay unset througout the life of the process unless it is is set by `set server check-port` command. quite hard to locate the origin of this this issue but the function was introduced in commit d458adcc52b74608e2fe6a2a95f09ce5e94932b7 ("MINOR: new update_server_addr_port() function to change both server's ADDR and service PORT"). I was however not able to determine whether this is due to a change of behavior along the years. So this patch can potentially be backported up to v1.8. Signed-off-by: William Dauchy --- src/server.c | 15 --- 1 file changed, 15 deletions(-) diff --git a/src/server.c b/src/server.c index 10f528640..99b7e9181 100644 --- a/src/server.c +++ b/src/server.c @@ -3625,16 +3625,6 @@ const char *update_server_addr_port(struct server *s, const char *addr, const ch ipcpy(, >addr); changed = 1; - /* we also need to update check's ADDR only if it uses the server's one */ - if ((s->check.state & CHK_ST_CONFIGURED) && (s->flags & SRV_F_CHECKADDR)) { - ipcpy(, >check.addr); - } - - /* we also need to update agent ADDR only if it use the server's one */ - if ((s->agent.state & CHK_ST_CONFIGURED) && (s->flags & SRV_F_AGENTADDR)) { - ipcpy(, >agent.addr); - } - /* update report for caller */ chunk_printf(msg, "IP changed from '%s' to '%s'", current_addr, addr); } @@ -3714,11 +3704,6 @@ const char *update_server_addr_port(struct server *s, const char *addr, const ch } chunk_appendf(msg, "%d'", new_port); - - /* we also need to update health checks port only if it uses server's realport */ - if ((s->check.state & CHK_ST_CONFIGURED) && !(s->flags & SRV_F_CHECKPORT)) { - s->check.port = new_port; - } } else { chunk_appendf(msg, "no need to change the port"); -- 2.29.2
[PATCH 2/2] BUG/MINOR: cli: set server check-port missing flag
when we set check port flag, we need to set a flag on server. this is missing since the origin of the feature: commit 5094656a67fa1b56f30cd2316adb675ca9d2a79a ("MINOR: cli: change a server health check port through the stats socket"). So it can potentially be backport up to v1.8. Signed-off-by: William Dauchy --- src/server.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/server.c b/src/server.c index 99b7e9181..ded3d47ff 100644 --- a/src/server.c +++ b/src/server.c @@ -4405,6 +4405,7 @@ static int cli_parse_set_server(char **args, char *payload, struct appctx *appct goto out_unlock; } sv->check.port = i; + sv->flags |= SRV_F_CHECKPORT; cli_msg(appctx, LOG_NOTICE, "health check port updated.\n"); } else if (strcmp(args[3], "addr") == 0) { -- 2.29.2
[PATCH 9/9] CLEANUP: contrib/prometheus-exporter: align and reorder fields
- align safe_idle_connections_current field fix minor typo added in commit 37286a5ac595069a491c3e8a7a7e4faf3d9ea8ad ("MEDIUM: contrib/prometheus-exporter: Rework matrices defining Promex metrics") - reorder info fields to be able to compare them easily - add missing ignored info fields as comment Signed-off-by: William Dauchy --- contrib/prometheus-exporter/service-prometheus.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c index 98f6fca62..29524cd91 100644 --- a/contrib/prometheus-exporter/service-prometheus.c +++ b/contrib/prometheus-exporter/service-prometheus.c @@ -108,7 +108,6 @@ const struct promex_metric promex_global_metrics[INF_TOTAL_FIELDS] = { //[INF_NAME] ignored //[INF_VERSION], ignored //[INF_RELEASE_DATE] ignored - [INF_BUILD_INFO] = { .n = IST("build_info"), .type = PROMEX_MT_GAUGE, .flags = PROMEX_FL_INFO_METRIC }, [INF_NBTHREAD] = { .n = IST("nbthread"), .type = PROMEX_MT_GAUGE, .flags = PROMEX_FL_INFO_METRIC }, [INF_NBPROC] = { .n = IST("nbproc"), .type = PROMEX_MT_GAUGE, .flags = PROMEX_FL_INFO_METRIC }, [INF_PROCESS_NUM]= { .n = IST("relative_process_id"), .type = PROMEX_MT_GAUGE, .flags = PROMEX_FL_INFO_METRIC }, @@ -116,8 +115,11 @@ const struct promex_metric promex_global_metrics[INF_TOTAL_FIELDS] = { //[INF_UPTIME] ignored [INF_UPTIME_SEC] = { .n = IST("uptime_seconds"), .type = PROMEX_MT_GAUGE, .flags = PROMEX_FL_INFO_METRIC }, [INF_START_TIME_SEC] = { .n = IST("start_time_seconds"),.type = PROMEX_MT_GAUGE, .flags = PROMEX_FL_INFO_METRIC }, + //[INF_MEMMAX_MB] ignored [INF_MEMMAX_BYTES] = { .n = IST("max_memory_bytes"), .type = PROMEX_MT_GAUGE, .flags = PROMEX_FL_INFO_METRIC }, + //[INF_POOL_ALLOC_MB] ignored [INF_POOL_ALLOC_BYTES] = { .n = IST("pool_allocated_bytes"), .type = PROMEX_MT_GAUGE, .flags = PROMEX_FL_INFO_METRIC }, + //[INF_POOL_USED_MB] ignored [INF_POOL_USED_BYTES]= { .n = IST("pool_used_bytes"), .type = PROMEX_MT_GAUGE, .flags = PROMEX_FL_INFO_METRIC }, [INF_POOL_FAILED]= { .n = IST("pool_failures_total"), .type = PROMEX_MT_COUNTER, .flags = PROMEX_FL_INFO_METRIC }, [INF_ULIMIT_N] = { .n = IST("max_fds"), .type = PROMEX_MT_GAUGE, .flags = PROMEX_FL_INFO_METRIC }, @@ -173,6 +175,7 @@ const struct promex_metric promex_global_metrics[INF_TOTAL_FIELDS] = { [INF_BYTES_OUT_RATE] = { .n = IST("bytes_out_rate"), .type = PROMEX_MT_GAUGE, .flags = PROMEX_FL_INFO_METRIC }, //[INF_DEBUG_COMMANDS_ISSUED] ignored [INF_CUM_LOG_MSGS] = { .n = IST("recv_logs_total"), .type = PROMEX_MT_COUNTER, .flags = PROMEX_FL_INFO_METRIC }, + [INF_BUILD_INFO] = { .n = IST("build_info"), .type = PROMEX_MT_GAUGE, .flags = PROMEX_FL_INFO_METRIC }, }; /* frontend/backend/server fields */ @@ -273,7 +276,7 @@ const struct promex_metric promex_st_metrics[ST_F_TOTAL_FIELDS] = { [ST_F_TT_MAX] = { .n = IST("max_total_time_seconds"), .type = PROMEX_MT_GAUGE,.flags = ( PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, [ST_F_EINT] = { .n = IST("internal_errors_total"), .type = PROMEX_MT_COUNTER, .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, [ST_F_IDLE_CONN_CUR] = { .n = IST("unsafe_idle_connections_current"), .type = PROMEX_MT_GAUGE,.flags = ( PROMEX_FL_SRV_METRIC) }, - [ST_F_SAFE_CONN_CUR]= { .n = IST("safe_idle_connections_current"), .type = PROMEX_MT_GAUGE,.flags = ( PROMEX_FL_SRV_METRIC) }, + [ST_F_SAFE_CONN_CUR] = { .n = IST("safe_idle_connections_current"), .type = PROMEX_MT_GAUGE,.flags = ( PROMEX_FL_SRV_METRIC) }, [ST_F_USED_CONN_CUR] = { .n = IST("used_connections_current"), .type =
[PATCH 7/9] MINOR: contrib/prometheus-exporter: add recv logs_logs_total field
this field was added by commit 45c457a62941a7c4a86ce4327d7755edcd4b230e ("MINOR: log: adds counters on received syslog messages.") Signed-off-by: William Dauchy --- contrib/prometheus-exporter/service-prometheus.c | 1 + 1 file changed, 1 insertion(+) diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c index 0db75b6ae..6c5072b16 100644 --- a/contrib/prometheus-exporter/service-prometheus.c +++ b/contrib/prometheus-exporter/service-prometheus.c @@ -177,6 +177,7 @@ const struct promex_metric promex_global_metrics[INF_TOTAL_FIELDS] = { [INF_TOTAL_SPLICED_BYTES_OUT]= { .n = IST("spliced_bytes_out_total"), .type = PROMEX_MT_COUNTER, .flags = PROMEX_FL_INFO_METRIC }, [INF_BYTES_OUT_RATE] = { .n = IST("bytes_out_rate"), .type = PROMEX_MT_GAUGE, .flags = PROMEX_FL_INFO_METRIC }, //[INF_DEBUG_COMMANDS_ISSUED] ignored + [INF_CUM_LOG_MSGS] = { .n = IST("recv_logs_total"), .type = PROMEX_MT_COUNTER, .flags = PROMEX_FL_INFO_METRIC }, }; /* frontend/backend/server fields */ -- 2.29.2
[PATCH 8/9] CLEANUP: contrib/prometheus-exporter: remove unused includes
unless I'm wrong, those includes are no longer needed. The only recent one I remember is ssl-sock include since commit 5d9b8f3c9347a1a10b86f81d70b22c3cab0e6925 ("MINOR: contrib/prometheus-exporter: use fill_info for process dump") where we make use of the code from stats.c Signed-off-by: William Dauchy --- contrib/prometheus-exporter/service-prometheus.c | 5 - 1 file changed, 5 deletions(-) diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c index 6c5072b16..98f6fca62 100644 --- a/contrib/prometheus-exporter/service-prometheus.c +++ b/contrib/prometheus-exporter/service-prometheus.c @@ -19,8 +19,6 @@ #include #include #include -#include -#include #include #include #include @@ -29,12 +27,9 @@ #include #include #include -#include -#include #include #include #include -#include #include #include #include -- 2.29.2
[PATCH 5/9] MINOR: contrib/prometheus-exporter: use stats desc when possible
It is a followup work of commit a191b77e54c26a97064cb42ab4927d4f5c95b896 ("MINOR: contrib/prometheus-exporter: merge info description from stats") but for all other stats fields; we however keep a way to override them when needed (e.g. units, specific cases) this is another step which will avoid duplicating work between stats.c and prometheus. Signed-off-by: William Dauchy --- .../prometheus-exporter/service-prometheus.c | 107 ++ 1 file changed, 10 insertions(+), 97 deletions(-) diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c index 8baef82d7..f10e78f49 100644 --- a/contrib/prometheus-exporter/service-prometheus.c +++ b/contrib/prometheus-exporter/service-prometheus.c @@ -282,107 +282,20 @@ const struct promex_metric promex_st_metrics[ST_F_TOTAL_FIELDS] = { [ST_F_NEED_CONN_EST] = { .n = IST("need_connections_current"), .type = PROMEX_MT_GAUGE,.flags = ( PROMEX_FL_SRV_METRIC) }, }; -/* Description of all stats fields */ +/* Description of overriden stats fields */ const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = { - [ST_F_PXNAME] = IST("The proxy name."), - [ST_F_SVNAME] = IST("The service name (FRONTEND for frontend, BACKEND for backend, any name for server/listener)."), - [ST_F_QCUR] = IST("Current number of queued requests."), - [ST_F_QMAX] = IST("Maximum observed number of queued requests."), - [ST_F_SCUR] = IST("Current number of active sessions."), - [ST_F_SMAX] = IST("Maximum observed number of active sessions."), - [ST_F_SLIM] = IST("Configured session limit."), - [ST_F_STOT] = IST("Total number of sessions."), - [ST_F_BIN]= IST("Current total of incoming bytes."), - [ST_F_BOUT] = IST("Current total of outgoing bytes."), - [ST_F_DREQ] = IST("Total number of denied requests."), - [ST_F_DRESP] = IST("Total number of denied responses."), - [ST_F_EREQ] = IST("Total number of request errors."), - [ST_F_ECON] = IST("Total number of connection errors."), - [ST_F_ERESP] = IST("Total number of response errors."), - [ST_F_WRETR] = IST("Total number of retry warnings."), - [ST_F_WREDIS] = IST("Total number of redispatch warnings."), [ST_F_STATUS] = IST("Current status of the service (0/1 depending on current `state` label value)."), - [ST_F_WEIGHT] = IST("Service weight."), - [ST_F_ACT]= IST("Current number of active servers."), - [ST_F_BCK]= IST("Current number of backup servers."), - [ST_F_CHKFAIL]= IST("Total number of failed check (Only counts checks failed when the server is up)."), - [ST_F_CHKDOWN]= IST("Total number of UP->DOWN transitions."), - [ST_F_LASTCHG]= IST("Number of seconds since the last UP<->DOWN transition."), - [ST_F_DOWNTIME] = IST("Total downtime (in seconds) for the service."), - [ST_F_QLIMIT] = IST("Configured maxqueue for the server (0 meaning no limit)."), - [ST_F_PID]= IST("Process id (0 for first instance, 1 for second, ...)"), - [ST_F_IID]= IST("Unique proxy id."), - [ST_F_SID]= IST("Server id (unique inside a proxy)."), - [ST_F_THROTTLE] = IST("Current throttle percentage for the server, when slowstart is active, or no value if not in slowstart."), - [ST_F_LBTOT] = IST("Total number of times a service was selected, either for new sessions, or when redispatching."), - [ST_F_TRACKED]= IST("Id of proxy/server if tracking is enabled."), - [ST_F_TYPE] = IST("Service type (0=frontend, 1=backend, 2=server, 3=socket/listener)."), - [ST_F_RATE] = IST("Current number of sessions per second over last elapsed second."), - [ST_F_RATE_LIM] = IST("Configured limit on new sessions per second."), - [ST_F_RATE_MAX] = IST("Maximum observed number of sessions per second."), [ST_F_CHECK_STATUS] = IST("Status of last health check (0/1 depending on current `state` label value)."), [ST_F_CHECK_CODE] = IST("layer5-7 code, if available of the last health check."), [ST_F_CHECK_DURATION] = IST("Total duration of the latest server health che
[PATCH 6/9] MINOR: contrib/prometheus-exporter: add uweight field
this field was added in commit bd7151002437af1a034a9fdbb582b3cbef5a78d1 ("MINOR: stats: report server's user-configured weight next to effective weight") Signed-off-by: William Dauchy --- contrib/prometheus-exporter/service-prometheus.c | 1 + 1 file changed, 1 insertion(+) diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c index f10e78f49..0db75b6ae 100644 --- a/contrib/prometheus-exporter/service-prometheus.c +++ b/contrib/prometheus-exporter/service-prometheus.c @@ -280,6 +280,7 @@ const struct promex_metric promex_st_metrics[ST_F_TOTAL_FIELDS] = { [ST_F_SAFE_CONN_CUR]= { .n = IST("safe_idle_connections_current"), .type = PROMEX_MT_GAUGE,.flags = ( PROMEX_FL_SRV_METRIC) }, [ST_F_USED_CONN_CUR] = { .n = IST("used_connections_current"), .type = PROMEX_MT_GAUGE,.flags = ( PROMEX_FL_SRV_METRIC) }, [ST_F_NEED_CONN_EST] = { .n = IST("need_connections_current"), .type = PROMEX_MT_GAUGE,.flags = ( PROMEX_FL_SRV_METRIC) }, + [ST_F_UWEIGHT]= { .n = IST("uweight"), .type = PROMEX_MT_GAUGE,.flags = ( PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) }, }; /* Description of overriden stats fields */ -- 2.29.2
[PATCH 4/9] MINOR: stats: improve max stats descriptions
In order to unify prometheus and stats description, we need to remove some field reference which are specific to stats implementation: - `scur` in max current sessions (also reword current session) - `rate` in max sessions - `req_rate` in max requests - `conn_rate` in max connections Signed-off-by: William Dauchy --- src/stats.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/src/stats.c b/src/stats.c index fdb8e9adf..6fa59a9ef 100644 --- a/src/stats.c +++ b/src/stats.c @@ -160,8 +160,8 @@ const struct name_desc stat_fields[ST_F_TOTAL_FIELDS] = { [ST_F_SVNAME]= { .name = "svname", .desc = "Server name" }, [ST_F_QCUR] = { .name = "qcur", .desc = "Number of current queued connections" }, [ST_F_QMAX] = { .name = "qmax", .desc = "Highest value of queued connections encountered since process started" }, - [ST_F_SCUR] = { .name = "scur", .desc = "Current number of sessions on the frontend, backend or server" }, - [ST_F_SMAX] = { .name = "smax", .desc = "Highest value of scur encountered since process started" }, + [ST_F_SCUR] = { .name = "scur", .desc = "Number of current sessions on the frontend, backend or server" }, + [ST_F_SMAX] = { .name = "smax", .desc = "Highest value of current sessions encountered since process started" }, [ST_F_SLIM] = { .name = "slim", .desc = "Frontend/listener/server's maxconn, backend's fullconn" }, [ST_F_STOT] = { .name = "stot", .desc = "Total number of sessions since process started" }, [ST_F_BIN] = { .name = "bin", .desc = "Total number of request bytes since process started" }, @@ -191,7 +191,7 @@ const struct name_desc stat_fields[ST_F_TOTAL_FIELDS] = { [ST_F_TYPE] = { .name = "type", .desc = "Type of the object (Listener, Frontend, Backend, Server)" }, [ST_F_RATE] = { .name = "rate", .desc = "Total number of sessions processed by this object over the last second (sessions for listeners/frontends, requests for backends/servers)" }, [ST_F_RATE_LIM] = { .name = "rate_lim", .desc = "Limit on the number of sessions accepted in a second (frontend only, 'rate-limit sessions' setting)" }, - [ST_F_RATE_MAX] = { .name = "rate_max", .desc = "Highest value of 'rate' observed since the worker process started" }, + [ST_F_RATE_MAX] = { .name = "rate_max", .desc = "Highest value of sessions per second observed since the worker process started" }, [ST_F_CHECK_STATUS] = { .name = "check_status", .desc = "Status report of the server's latest health check, prefixed with '*' if a check is currently in progress" }, [ST_F_CHECK_CODE]= { .name = "check_code", .desc = "HTTP/SMTP/LDAP status code reported by the latest server health check" }, [ST_F_CHECK_DURATION]= { .name = "check_duration", .desc = "Total duration of the latest server health check, in milliseconds" }, @@ -203,7 +203,7 @@ const struct name_desc stat_fields[ST_F_TOTAL_FIELDS] = { [ST_F_HRSP_OTHER]= { .name = "hrsp_other", .desc = "Total number of HTTP responses with status <100, >599 returned by this object since the worker process started (error -1 included)" }, [ST_F_HANAFAIL] = { .name = "hanafail", .desc = "Total number of failed checks caused by an 'on-error' directive after an 'observe' condition matched" }, [ST_F_REQ_RATE] = { .name = "req_rate", .desc = "Number of HTTP requests processed over the last second on this object" }, - [ST_F_REQ_RATE_MAX] = { .name = "req_rate_max", .desc = "Highest value of 'req_rate' observed since the worker p
[PATCH 3/9] MINOR: stats: improve pending connections description
In order to unify prometheus and stats description, we need to clarify the description for pending connections. - remove the BE reference in counters struct, as it is also used in servers - remove reference of `qcur` field in description as it is specific to stats implemention - try to reword cur and max pending connections description Signed-off-by: William Dauchy --- include/haproxy/counters-t.h | 2 +- src/stats.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/include/haproxy/counters-t.h b/include/haproxy/counters-t.h index b896c4d36..1ea68dd6c 100644 --- a/include/haproxy/counters-t.h +++ b/include/haproxy/counters-t.h @@ -73,7 +73,7 @@ struct be_counters { unsigned int cps_max; /* maximum of new connections received per second */ unsigned int sps_max; /* maximum of new connections accepted per second (sessions) */ - unsigned int nbpend_max;/* max number of pending connections with no server assigned yet (BE only) */ + unsigned int nbpend_max;/* max number of pending connections with no server assigned yet */ unsigned int cur_sess_max; /* max number of currently active sessions */ long long bytes_in; /* number of bytes transferred from the client to the server */ diff --git a/src/stats.c b/src/stats.c index 4b8bd89b4..fdb8e9adf 100644 --- a/src/stats.c +++ b/src/stats.c @@ -158,8 +158,8 @@ const struct name_desc info_fields[INF_TOTAL_FIELDS] = { const struct name_desc stat_fields[ST_F_TOTAL_FIELDS] = { [ST_F_PXNAME]= { .name = "pxname", .desc = "Proxy name" }, [ST_F_SVNAME]= { .name = "svname", .desc = "Server name" }, - [ST_F_QCUR] = { .name = "qcur", .desc = "Current number of connections waiting in the server of backend queue" }, - [ST_F_QMAX] = { .name = "qmax", .desc = "Highest value of qcur encountered since process started" }, + [ST_F_QCUR] = { .name = "qcur", .desc = "Number of current queued connections" }, + [ST_F_QMAX] = { .name = "qmax", .desc = "Highest value of queued connections encountered since process started" }, [ST_F_SCUR] = { .name = "scur", .desc = "Current number of sessions on the frontend, backend or server" }, [ST_F_SMAX] = { .name = "smax", .desc = "Highest value of scur encountered since process started" }, [ST_F_SLIM] = { .name = "slim", .desc = "Frontend/listener/server's maxconn, backend's fullconn" }, -- 2.29.2
[PATCH 1/9] MAJOR: contrib/prometheus-exporter: move health check status to labels
this patch is a breaking change between v2.3 and v2.4: we move from using gauge value for health check states to labels values. The diff is quite small thanks to the preparation work from Christopher to allow more flexibility in labels, see commit 5a2f938732126f43bbf4cea5482c01552b0d0314 ("MEDIUM: contrib/prometheus-exporter: Use dynamic labels instead of static ones") this is a follow up of commit c6464591a365bfcf509b322bdaa4d608c9395d75 ("MAJOR: contrib/prometheus-exporter: move ftd/bkd/srv states to labels"). The main goal being to be better aligned with prometheus use cases in terms of queries. More specifically to health checks, Pierre C. mentioned the possible quirks he had to put in place in order to make use of those metrics through prometheus: by(proxy, check_status) (count_values by(proxy, instance) ("check_status", haproxy_server_check_status)) I am perfectly aware this introduces a lot more metrics but I don't see how we can improve the usability without it. The main issue remains in the cardinality of the states which are > 20. Prometheus recommends to stay below a cardinality of 10 for a given metric but I consider our case very specific, because highly linked to the level of precision haproxy exposes. Even before this patch I saw several large production setup (a few hundreds of MB in output) which are making use of the scope parameter to simply ignore the server metrics, so that the scrapping can be faster, and memory consumed on client side not too high. So I believe we should eventually continue in that direction and offer more granularity of filtering of the output. That being said it is already possible to filter out the data on prometheus client side. this is related to github issue #1029 Signed-off-by: William Dauchy --- .../prometheus-exporter/service-prometheus.c| 17 ++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c index dbf4c7f39..a3141d39d 100644 --- a/contrib/prometheus-exporter/service-prometheus.c +++ b/contrib/prometheus-exporter/service-prometheus.c @@ -18,6 +18,7 @@ #include #include #include +#include #include #include #include @@ -319,7 +320,7 @@ const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = { [ST_F_RATE] = IST("Current number of sessions per second over last elapsed second."), [ST_F_RATE_LIM] = IST("Configured limit on new sessions per second."), [ST_F_RATE_MAX] = IST("Maximum observed number of sessions per second."), - [ST_F_CHECK_STATUS] = IST("Status of last health check (HCHK_STATUS_* values)."), + [ST_F_CHECK_STATUS] = IST("Status of last health check (0/1 depending on current `state` label value)."), [ST_F_CHECK_CODE] = IST("layer5-7 code, if available of the last health check."), [ST_F_CHECK_DURATION] = IST("Total duration of the latest server health check, in seconds."), [ST_F_HRSP_1XX] = IST("Total number of HTTP responses."), @@ -886,6 +887,7 @@ static int promex_dump_srv_metrics(struct appctx *appctx, struct htx *htx) int ret = 1; double secs; enum promex_srv_state state; + const char *check_state; int i; for (;appctx->st2 < ST_F_TOTAL_FIELDS; appctx->st2++) { @@ -963,8 +965,17 @@ static int promex_dump_srv_metrics(struct appctx *appctx, struct htx *htx) case ST_F_CHECK_STATUS: if ((sv->check.state & (CHK_ST_ENABLED|CHK_ST_PAUSED)) != CHK_ST_ENABLED) goto next_sv; - val = mkf_u32(FN_OUTPUT, sv->check.status); - break; + + for (i = 0; i < HCHK_STATUS_SIZE; i++) { + val = mkf_u32(FO_STATUS, sv->check.status == i); + check_state = get_check_status_info(i); + labels[2].name = ist("state"); + labels[2].value = ist2(check_state, strlen(check_state)); + if (!promex_dump_metric(appctx, htx, prefix, _st_metrics[appctx->st2], + , labels, , max)) + goto full; + } + goto next_sv;
[PATCH 2/9] MINOR: contrib/prometheus-exporter: improve service status description field
Since we changed the behaviour of this metric, improve the description to better explain what is the meaning of the new gauge value; it also reflects the description we did for health check status. Signed-off-by: William Dauchy --- contrib/prometheus-exporter/service-prometheus.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c index a3141d39d..8baef82d7 100644 --- a/contrib/prometheus-exporter/service-prometheus.c +++ b/contrib/prometheus-exporter/service-prometheus.c @@ -301,7 +301,7 @@ const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = { [ST_F_ERESP] = IST("Total number of response errors."), [ST_F_WRETR] = IST("Total number of retry warnings."), [ST_F_WREDIS] = IST("Total number of redispatch warnings."), - [ST_F_STATUS] = IST("Current status of the service."), + [ST_F_STATUS] = IST("Current status of the service (0/1 depending on current `state` label value)."), [ST_F_WEIGHT] = IST("Service weight."), [ST_F_ACT]= IST("Current number of active servers."), [ST_F_BCK]= IST("Current number of backup servers."), -- 2.29.2
[PATCH 0/9] prometheus: health check as labels + cleanup
Hi Christopher, Sorry for this long series but I believe this one is fairly easy to review: - the major change is for health checks on prometheus side, there are more details in the commit message; this is something I wanted early so people can start testing it, especially on large setup. I however think we will need to improve the `scope` filtering. The good point is that after this patch, this metric will be way easier to use; the bad point is that it is a breaking change between v2.3 and v2.4. But I believe this is acceptable following the previous state changes already merged. - next patch is the continuation of the cleaning work: * try to improve descriptions so we can use them in both case * merge them when possible, override otherwise - I finished with a bit of minor cleaning which pointed me a few missing fields. I know we probably can improve that to avoid forgetting adding those fields in the future, but I assume it is ok for now to simply come back with a sane result. The postive point is I did not had to add description and implementation, as we make use of stats.c William Dauchy (9): MAJOR: contrib/prometheus-exporter: move health check status to labels MINOR: contrib/prometheus-exporter: improve service status description field MINOR: stats: improve pending connections description MINOR: stats: improve max stats descriptions MINOR: contrib/prometheus-exporter: use stats desc when possible MINOR: contrib/prometheus-exporter: add uweight field MINOR: contrib/prometheus-exporter: add recv logs_logs_total field CLEANUP: contrib/prometheus-exporter: remove unused includes CLEANUP: contrib/prometheus-exporter: align and reorder fields .../prometheus-exporter/service-prometheus.c | 140 -- include/haproxy/counters-t.h | 2 +- src/stats.c | 14 +- 3 files changed, 40 insertions(+), 116 deletions(-) -- 2.29.2
Re: lua function core.get_info() broken in haproxy 2.2.7
Hi James, On Sat, Jan 30, 2021 at 3:07 AM James Brown wrote: > Ah, never mind, I see that this was already fixed in master in > 3ddec3ee7d344112b4e4fbde317f8886a20d66a0. yeah, sorry about that. This will be fixed in v2.2.8 and v2.3.4 -- William
Re: Makefile, environment variables and REGTESTS_TYPES
On Fri, Jan 29, 2021 at 2:46 PM William Lallemand wrote: > According to `make reg-tests-help` the REGTESTS_TYPES parameter must be > configured as an environment variable and not a make argument. > However since patch 3bad3d5 ("BUILD: Makefile: exclude broken tests by > default"), it does not work anymore with an environment variable. > Looking at the several CI files we have, it is used as a make > argument everywhere. wow indeed, I did not realise that; yet another thing I broke ;-) > I'm going to update the `make reg-tests-help` command with the correct > syntax if there isn't any complain. Thanks for looking into this! -- William