Re: Remaining issues on 2.5 for the release

2021-11-15 Thread William Dauchy
Hi Willy,

On Fri, Nov 12, 2021 at 3:48 PM Willy Tarreau  wrote:
> I intended to emit the final 2.5 this week-end, but a few users having
> upgraded to the latest 2.4, 2.3 or 2.2 reported strange issues that we
> couldn't reproduce and for which we don't have more info yet. Some seem
> related to connections taking longer to vanish, others to possibly
> truncated responses. These include some of the recent 2.5 fixes, and
> it's very possible that a bug has been hiding another one, and that
> fixing one revealed the other one.
> For now I'm failing to reproduce any such problem on any of these
> versions, and the information gathered for now are insufficient to draw
> any conclusion. I'm not thrilled at the idea of releasing 2.5 and having
> to emit yet another a few days one later. We're not late in the cycle so
> I prefer that we focus our efforts on finding that problem and fixing
> the affected versions first.
> For the record, the related issues are #1448, #1451 and #1453 for the
> strange behavior on the connections, and #1450 for the truncation. If
> anyone manages to set up any reliable reproducer, do not hesitate to
> share it!

Do we know whether v2.4.7 is affected by this regression?

Thanks,
-- 
William



Re: [PATCH] MINOR: promex: backend aggregated server check status

2021-11-08 Thread William Dauchy
On Mon, Nov 8, 2021 at 1:52 PM Willy Tarreau  wrote:
> Just to be sure, is this something you want to merge into 2.5 or is it
> to be queued next ? I'm fine with both, but I prefer to ask as it's not
> just a one-liner and I don't know the impact on promex.

I know Christopher was possibly thinking about a more advanced
implementation but I thought it could be ok for a first version.
Probably a good idea to wait for Christopher feedbacks anyway.
I was indeed targeting a late v2.5, despite being a new thing. Sorry
for that and I forgot to add a message about it.
If it works well enough, I will also probably ask for a backport in
2.4 before the end of the year as I know large clusters are waiting
for this patch to lower their memory consumption in prometheus.

Thanks,
-- 
William



[PATCH] MINOR: promex: backend aggregated server check status

2021-11-07 Thread William Dauchy
- add new metric: `haproxy_backend_agg_server_check_status`
  it counts the number of servers matching a specific check status
  this permits to exclude per server check status as the usage is often
  to rely on the total. Indeed in large setup having thousands of
  servers per backend the memory impact is not neglible to store the per
  server metric.
- realign promex_str_metrics array

quite simple implementation - we could improve it later by adding an
internal state to the prometheus exporter, thus to avoid counting at
every dump.

this patch is an attempt to close github issue #1312

Signed-off-by: William Dauchy 
---
 addons/promex/service-prometheus.c | 229 -
 include/haproxy/stats-t.h  |   1 +
 src/stats.c|   4 +
 3 files changed, 131 insertions(+), 103 deletions(-)

diff --git a/addons/promex/service-prometheus.c 
b/addons/promex/service-prometheus.c
index 221e40705..ef217007d 100644
--- a/addons/promex/service-prometheus.c
+++ b/addons/promex/service-prometheus.c
@@ -189,106 +189,107 @@ const struct promex_metric 
promex_global_metrics[INF_TOTAL_FIELDS] = {
 
 /* frontend/backend/server fields */
 const struct promex_metric promex_st_metrics[ST_F_TOTAL_FIELDS] = {
-   //[ST_F_PXNAME] ignored
-   //[ST_F_SVNAME] ignored
-   [ST_F_QCUR]   = { .n = IST("current_queue"),
.type = PROMEX_MT_GAUGE,.flags = (  
 PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_QMAX]   = { .n = IST("max_queue"),
.type = PROMEX_MT_GAUGE,.flags = (  
 PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_SCUR]   = { .n = IST("current_sessions"), 
.type = PROMEX_MT_GAUGE,.flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_SMAX]   = { .n = IST("max_sessions"), 
.type = PROMEX_MT_GAUGE,.flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_SLIM]   = { .n = IST("limit_sessions"),   
.type = PROMEX_MT_GAUGE,.flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_STOT]   = { .n = IST("sessions_total"),   
.type = PROMEX_MT_COUNTER,  .flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_BIN]= { .n = IST("bytes_in_total"),   
.type = PROMEX_MT_COUNTER,  .flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_BOUT]   = { .n = IST("bytes_out_total"),  
.type = PROMEX_MT_COUNTER,  .flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_DREQ]   = { .n = IST("requests_denied_total"),
.type = PROMEX_MT_COUNTER,  .flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC   ) },
-   [ST_F_DRESP]  = { .n = IST("responses_denied_total"),   
.type = PROMEX_MT_COUNTER,  .flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_EREQ]   = { .n = IST("request_errors_total"), 
.type = PROMEX_MT_COUNTER,  .flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC   ) },
-   [ST_F_ECON]   = { .n = IST("connection_errors_total"),  
.type = PROMEX_MT_COUNTER,  .flags = (  
 PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_ERESP]  = { .n = IST("response_errors_total"),
.type = PROMEX_MT_COUNTER,  .flags = (  
 PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_WRETR]  = { .n = IST("retry_warnings_total"), 
.type = PROMEX_MT_COUNTER,  .flags = (  
 PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_WREDIS] = { .n = IST("redispatch_warnings_total"),
.type = PROMEX_MT_COUNTER,  .flags = (  
 PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_STATUS] = { .n = IST("status"),   
.type = PROMEX_MT_GAUGE,.flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_WEIGHT]   

[PATCH] DOC: stats: fix location of the text representation

2021-11-06 Thread William Dauchy
`info_field_names` and `stat_field_names` no longer exist and have been
moved in stats.c
To avoid changing this comment, just mention the name of the new table
`info_fields` and `stat_fields`

Signed-off-by: William Dauchy 
---
 include/haproxy/stats-t.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/haproxy/stats-t.h b/include/haproxy/stats-t.h
index 312013364..9ac875e73 100644
--- a/include/haproxy/stats-t.h
+++ b/include/haproxy/stats-t.h
@@ -253,8 +253,8 @@ enum field_scope {
FS_MASK = 0xFF00,
 };
 
-/* Show Info fields for CLI output. For any field added here, please add the 
text
- * representation in the info_field_names array below. Please only append at 
the end,
+/* Show info fields for CLI output. For any field added here, please add the
+ * text representation in the info_fields array. Please only append at the end,
  * before the INF_TOTAL_FIELDS entry, and never insert anything in the middle
  * nor at the beginning.
  */
@@ -338,7 +338,7 @@ enum info_field {
 
 
 /* Stats fields for CSV output. For any field added here, please add the text
- * representation in the stat_field_names array below. Please only append at 
the end,
+ * representation in the stat_fields array. Please only append at the end,
  * before the ST_F_TOTAL_FIELDS entry, and never insert anything in the middle
  * nor at the beginning.
  */
-- 
2.30.2




Re: http-request return bytes_read from v2.3 to v2.4

2021-07-26 Thread William Dauchy
On Mon, Jul 26, 2021 at 8:35 AM Christopher Faulet  wrote:
> The response size reported in the log messages is related to the internal
> representation of the HTTP message. It is a limitation of the current design. 
> It
> means this size is not equal to the raw size of the message and may change 
> when
> the HTX representation changes. In this case, in 2.4, the start-line
> representation was changed, a 32-bits integer was removed from hx_sl 
> structure.
> In addition, the end-of-message block (EOM) was removed from the HTX
> representation. It counted for 1 byte. This explains the difference of 5 bytes
> between 2.3 and 2.4.

Thanks Christophe, crystal clear explanation!

-- 
William



http-request return bytes_read from v2.3 to v2.4

2021-07-25 Thread William Dauchy
Hello,

While upgrading from v2.3.x to v2.4.x I noticed a difference of bytes
read in requests http logs with this simple frontend config:

frontend foo
bind 127.0.0.1:8000
http-request return

in v2.3, the logs (%B) tells me 54 bytes, in v2.4, I am getting 49. So
a difference of 5 bytes per request.

I was simply curious to understand that behavior change. Is it a
change in the way we count bytes in stats, or is there really a
difference in the payload we write?
When I look at a dump, the TCP payloads are identical with 57 bytes in
both versions. And the overall length of the packet is 123 bytes.
Maybe that's expected, but I wanted some clarification.

Thanks,
-- 
William



Re: [ANNOUNCE] haproxy-2.4.0

2021-05-26 Thread William Dauchy
On Fri, May 14, 2021 at 11:58 AM Willy Tarreau  wrote:
>   - interoperability / protocol support: WebSocket over HTTP/2 (RFC8441)
> is now supported on both sides, regardless of the version on the other
> side. The cache now supports the "Vary" header with a few commonly
> used headers, including "Accept-encoding" which gets normalized for
> optimal cache hit ratio. The Prometheus exporter got a significant
> liftup, requires less tricks on the Prometheus side, and supports
> listing only certain metrics for faster retrieval. Optional native
> support for Opentracing was also integrated (via USE_OT=1). The DNS
> resolvers now support talking to servers over TCP. Basic support for
> extracting information from MQTT and FIX protocol was added. Timeouts
> can now be adjusted on the fly and per-request in order to adapt to
> particuarly slow servers or special protocols.

Regarding prometheus, it should probably noted some major changes
regarding some metrics, such as for health check, where the value is
now located in a label, instead of the value of the metric itself:
see also 
http://git.haproxy.org/?p=haproxy.git;a=commit;h=de3c32638925c2071a5a84cbdafe2f112d2c4261
-- 
William



Re: [ANNOUNCE] haproxy-2.3.9

2021-03-31 Thread William Dauchy
On Tue, Mar 30, 2021 at 6:59 PM Willy Tarreau  wrote:
> HAProxy 2.3.9 was released on 2021/03/30. It added 5 new commits
> after version 2.3.8.
>
> This essentially fixes the rate counters issue that popped up in 2.3.8
> after the previous fix for the rate counters already.
>
> What happened is that the internal time in millisecond wraps every 49.7
> days and that the new global counter used to make sure rate counters are
> now stable across threads starts at zero and is initialized when older
> than the current thread's current date. It just happens that the wrapping
> happened a few hours ago at "Mon Mar 29 23:59:46 CEST 2021" exactly and
> that any process started since this date and for the next 24 days doesn't
> validate this condition anymore, hence doesn't rotate its rate counters
> anymore.

Thanks Willy for the quick update. That's a good example to avoid
pushing stable versions at the same time, so we have opportunities to
find those regressions.

-- 
William



Re: Table sticky counters decrementation problem

2021-03-30 Thread William Dauchy
On Tue, Mar 30, 2021 at 5:57 PM Willy Tarreau  wrote:
> out of curiosity I wanted to check when the overflow happened:
>
> $ date --date=@$$(date +%s) * 1000) & -0x800) / 1000))
> Mon Mar 29 23:59:46 CEST 2021
>
> So it only affects processes started since today. I'm quite tempted not
> to wait further and to emit 2.3.9 urgently to fix this before other
> people get trapped after reloading their process. Any objection ?

I do confirm the timestamp on our side but do not have the necessary
tooling to test the fix.

Thanks,
-- 
William



Re: "[ANNOUNCE] haproxy-2.3.6

2021-03-05 Thread William Dauchy
Hi,

On Wed, Mar 3, 2021 at 4:09 PM Christopher Faulet  wrote:
>- An issue leading to possible infinite loops because of a double locking
>  effect in the mt lists was fixed by Olivier. If MT_LIST_TRY_ADDQ()
>  macro, it was possible to try to lock twice the same element, making the
>  second lock attempt to fail in loop.
> Olivier Houchard (1):
>BUG/MEDIUM: lists: Avoid an infinite loop in MT_LIST_TRY_ADDQ().

not very clear in which conditions it can be triggered. Do you have
more details about it?

Thanks,

-- 
William



[PATCH] BUG/MEDIUM: contrib/prometheus-exporter: fix segfault in listener name dump

2021-02-24 Thread William Dauchy
We need to check whether listener is empty before doing anything; in
that case, we were trying to dump listerner name while name is null. So
simply move the counters check above, which validate all possible cases
when the listener is empty. This is very similar to what is done in
stats.c

see also the trace:

  Thread 1 "haproxy" received signal SIGSEGV, Segmentation fault.
  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:120
  120     ../sysdeps/x86_64/multiarch/../strlen.S: No such file or directory.
  (gdb) bt
  #0  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:120
  #1  0x555b716b in promex_dump_listener_metrics (htx=0x558fadf0, 
appctx=0x55926070) at contrib/prometheus-exporter/service-prometheus.c:722
  #2  promex_dump_metrics (htx=0x558fadf0, si=0x55925920, 
appctx=0x55926070) at contrib/prometheus-exporter/service-prometheus.c:1200
  #3  promex_appctx_handle_io (appctx=0x55926070) at 
contrib/prometheus-exporter/service-prometheus.c:1477
  #4  0x556f0c94 in task_run_applet (t=0x55926180, 
context=0x55926070, state=) at src/applet.c:88
  #5  0x556bc6d8 in run_tasks_from_lists 
(budgets=budgets@entry=0x7fffe374) at src/task.c:548
  #6  0x556bd1a0 in process_runnable_tasks () at src/task.c:750
  #7  0x55696cdd in run_poll_loop () at src/haproxy.c:2870
  #8  0x55697025 in run_thread_poll_loop (data=data@entry=0x0) at 
src/haproxy.c:3035
  #9  0x55596c90 in main (argc=, argv=0x7fffe818) at 
src/haproxy.c:3723
  quit)

this bug was introduced by commit
e3f7bd5ae9e969cbfe87e4130d06bff7a3e814c6 ("MEDIUM:
contrib/prometheus-exporter: add listen stats"), which is present for
2.4 only, so no backport needed.

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 7cf30c1f3..e6023d353 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -718,12 +718,12 @@ static int promex_dump_listener_metrics(struct appctx 
*appctx, struct htx *htx)
li = appctx->ctx.stats.obj2;
list_for_each_entry_from(li, >conf.listeners, 
by_fe) {
 
-   labels[1].name  = ist("listener");
-   labels[1].value = ist2(li->name, 
strlen(li->name));
-
if (!li->counters)
continue;
 
+   labels[1].name  = ist("listener");
+   labels[1].value = ist2(li->name, 
strlen(li->name));
+
if (!stats_fill_li_stats(px, li, 0, stats,
 ST_F_TOTAL_FIELDS, 
&(appctx->st2)))
return -1;
-- 
2.30.0




[PATCH 2/3] REGTESTS: contrib/prometheus-exporter: test NaN values

2021-02-18 Thread William Dauchy
In order to make sure we detect when we change default behaviour for
some metrics, test the NaN value when it is expected.

Those metrics were listed since our last rework as their default value
changed, unless the appropriate config is set.

Signed-off-by: William Dauchy 
---
 reg-tests/contrib/prometheus.vtc | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/reg-tests/contrib/prometheus.vtc b/reg-tests/contrib/prometheus.vtc
index cdd0f0f55..1ebeb29cb 100644
--- a/reg-tests/contrib/prometheus.vtc
+++ b/reg-tests/contrib/prometheus.vtc
@@ -10,6 +10,11 @@ server s1 {
txresp
 } -repeat 2 -start
 
+server s2 {
+   rxreq
+   txresp
+} -repeat 2 -start
+
 haproxy h1 -conf {
 defaults
mode http
@@ -29,11 +34,13 @@ haproxy h1 -conf {
 backend be
stick-table type ip size 1m expire 10s store http_req_rate(10s)
server s1 ${s1_addr}:${s1_port}
+   server s2 ${s2_addr}:${s2_port} check maxqueue 10 maxconn 12 
pool-max-conn 42
 } -start
 
 client c1 -connect ${h1_stats_sock} {
txreq -url "/metrics"
rxresp
+   # test general metrics
expect resp.status == 200
expect resp.body ~ ".*haproxy_process.*"
expect resp.body ~ ".*haproxy_frontend.*"
@@ -42,6 +49,29 @@ client c1 -connect ${h1_stats_sock} {
expect resp.body ~ ".*haproxy_server.*"
expect resp.body ~ ".*haproxy_sticktable.*"
 
+   # test expected NaN values
+   expect resp.body ~ 
".*haproxy_server_check_failures_total{proxy=\"be\",server=\"s1\"} NaN.*"
+   expect resp.body ~ 
".*haproxy_server_check_up_down_total{proxy=\"be\",server=\"s1\"} NaN.*"
+   expect resp.body ~ 
".*haproxy_server_check_failures_total{proxy=\"be\",server=\"s2\"} 0.*"
+   expect resp.body ~ 
".*haproxy_server_check_up_down_total{proxy=\"be\",server=\"s2\"} 0.*"
+
+   expect resp.body ~ 
".*haproxy_server_queue_limit{proxy=\"be\",server=\"s1\"} NaN.*"
+   expect resp.body ~ 
".*haproxy_server_queue_limit{proxy=\"be\",server=\"s2\"} 10.*"
+
+   expect resp.body ~ 
".*haproxy_server_limit_sessions{proxy=\"be\",server=\"s1\"} NaN.*"
+   expect resp.body ~ 
".*haproxy_server_limit_sessions{proxy=\"be\",server=\"s2\"} 12.*"
+
+   expect resp.body ~ 
".*haproxy_backend_downtime_seconds_total{proxy=\"stats\"} NaN.*"
+   expect resp.body ~ 
".*haproxy_backend_downtime_seconds_total{proxy=\"be\"} 0.*"
+   expect resp.body ~ 
".*haproxy_server_downtime_seconds_total{proxy=\"be\",server=\"s1\"} NaN.*"
+   expect resp.body ~ 
".*haproxy_server_downtime_seconds_total{proxy=\"be\",server=\"s2\"} 0.*"
+
+   expect resp.body ~ 
".*haproxy_server_current_throttle{proxy=\"be\",server=\"s1\"} NaN.*"
+
+   expect resp.body ~ 
".*haproxy_server_idle_connections_limit{proxy=\"be\",server=\"s1\"} NaN.*"
+   expect resp.body ~ 
".*haproxy_server_idle_connections_limit{proxy=\"be\",server=\"s2\"} 42.*"
+
+   # test scope
txreq -url "/metrics?scope="
rxresp
expect resp.status == 200
-- 
2.30.0




[PATCH 1/3] DOC: contrib/prometheus-exporter: remove htx reference

2021-02-18 Thread William Dauchy
now that htx is the default everywhere, we can remove the need to put
htx as a mandatory option to setup prometheus.

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/README | 1 -
 1 file changed, 1 deletion(-)

diff --git a/contrib/prometheus-exporter/README 
b/contrib/prometheus-exporter/README
index 30154de95..fdbc50203 100644
--- a/contrib/prometheus-exporter/README
+++ b/contrib/prometheus-exporter/README
@@ -27,7 +27,6 @@ and the corresponding HTTP proxy must enable the HTX support. 
For instance:
 frontend test
 mode http
 ...
-option http-use-htx
 http-request use-service prometheus-exporter if { path /metrics }
 ...
 
-- 
2.30.0




[PATCH 3/3] REGTESTS: contrib/prometheus-exporter: test well known labels

2021-02-18 Thread William Dauchy
as we previously briefly broke labels handling, test them to make sure
we don't introduce regressions in the future.
see also commit 040b1195f70d6a24204ede081451fd1dd71e6a34 ("BUG/MINOR:
contrib/prometheus-exporter: Restart labels dump at the right pos") for
reference

Signed-off-by: William Dauchy 
---
 reg-tests/contrib/prometheus.vtc | 9 +
 1 file changed, 9 insertions(+)

diff --git a/reg-tests/contrib/prometheus.vtc b/reg-tests/contrib/prometheus.vtc
index 1ebeb29cb..ebe0b8753 100644
--- a/reg-tests/contrib/prometheus.vtc
+++ b/reg-tests/contrib/prometheus.vtc
@@ -71,6 +71,15 @@ client c1 -connect ${h1_stats_sock} {
expect resp.body ~ 
".*haproxy_server_idle_connections_limit{proxy=\"be\",server=\"s1\"} NaN.*"
expect resp.body ~ 
".*haproxy_server_idle_connections_limit{proxy=\"be\",server=\"s2\"} 42.*"
 
+   # test well known labels presence
+   expect resp.body ~ ".*haproxy_process_build_info{version=\".*\"} 1.*"
+   expect resp.body ~ 
".*haproxy_frontend_http_responses_total{proxy=\"stats\",code=\"4xx\"} 0.*"
+   expect resp.body ~ 
".*haproxy_frontend_status{proxy=\"fe\",state=\"UP\"} 1.*"
+   expect resp.body ~ 
".*haproxy_listener_status{proxy=\"stats\",listener=\"sock-1\",state=\"WAITING\"}
 0.*"
+   expect resp.body ~ ".*haproxy_backend_status{proxy=\"be\",state=\"UP\"} 
1.*"
+   expect resp.body ~ 
".*haproxy_server_status{proxy=\"be\",server=\"s1\",state=\"DOWN\"} 0.*"
+   expect resp.body ~ 
".*haproxy_server_check_status{proxy=\"be\",server=\"s2\",state=\"HANA\"} 0.*"
+
# test scope
txreq -url "/metrics?scope="
rxresp
-- 
2.30.0




[PATCH] MINOR: cli: add missing agent commands for set server

2021-02-15 Thread William Dauchy
we previously forgot to add `agent-*` commands.
Take this opportunity to rewrite the help string in a simpler way for
readability (mainly removing simple quotes)

Signed-off-by: William Dauchy 
---
 src/server.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/server.c b/src/server.c
index da6ee52ad..1c1eeab73 100644
--- a/src/server.c
+++ b/src/server.c
@@ -4618,9 +4618,10 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
 #endif
} else {
cli_err(appctx,
-   "'set server ' only supports 'agent', 'health', "
-   "'state', 'weight', 'addr', 'fqdn', 'check-addr', "
-   "'check-port' and 'ssl'.\n");
+   "usage: set server / "
+   "addr | agent | agent-addr | agent-port | agent-send | "
+   "check-addr | check-port | fqdn | health | ssl | "
+   "state | weight\n");
}
  out_unlock:
HA_SPIN_UNLOCK(SERVER_LOCK, >lock);
-- 
2.30.0




[PATCH 2/3] MINOR: stats: add helper to get status string

2021-02-14 Thread William Dauchy
move listen status to a helper, defining both status enum and string
definition.
this will be helpful to be reused in prometheus code. It also removes
this hard-to-read nested ternary.

Signed-off-by: William Dauchy 
---
 include/haproxy/listener-t.h |  9 +
 include/haproxy/listener.h   |  3 +++
 src/listener.c   | 18 ++
 src/stats.c  |  2 +-
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/include/haproxy/listener-t.h b/include/haproxy/listener-t.h
index ce5ed408f..7d3998ece 100644
--- a/include/haproxy/listener-t.h
+++ b/include/haproxy/listener-t.h
@@ -84,6 +84,15 @@ enum li_state {
  * not rely on this state.
  */
 
+/* listener status for stats */
+enum li_status {
+   LI_STATUS_WAITING = 0,
+   LI_STATUS_OPEN,
+   LI_STATUS_FULL,
+
+   LI_STATE_COUNT /* must be last */
+};
+
 /* listener socket options */
 #define LI_O_NONE   0x
 #define LI_O_NOLINGER   0x0001  /* disable linger on this socket */
diff --git a/include/haproxy/listener.h b/include/haproxy/listener.h
index 1be8551c3..f229efae4 100644
--- a/include/haproxy/listener.h
+++ b/include/haproxy/listener.h
@@ -218,6 +218,9 @@ struct task *manage_global_listener_queue(struct task *t, 
void *context, unsigne
 
 extern struct accept_queue_ring accept_queue_rings[MAX_THREADS] 
__attribute__((aligned(64)));
 
+extern const char* li_status_st[LI_STATE_COUNT];
+enum li_status get_li_status(struct listener *l);
+
 #endif /* _HAPROXY_LISTENER_H */
 
 /*
diff --git a/src/listener.c b/src/listener.c
index 0b929b906..9ca910b37 100644
--- a/src/listener.c
+++ b/src/listener.c
@@ -46,6 +46,12 @@ static struct bind_kw_list bind_keywords = {
 static struct mt_list global_listener_queue = 
MT_LIST_HEAD_INIT(global_listener_queue);
 static struct task *global_listener_queue_task;
 
+/* listener status for stats */
+const char* li_status_st[LI_STATE_COUNT] = {
+   [LI_STATUS_WAITING] = "WAITING",
+   [LI_STATUS_OPEN]= "OPEN",
+   [LI_STATUS_FULL]= "FULL",
+};
 
 #if defined(USE_THREAD)
 
@@ -183,6 +189,18 @@ REGISTER_CONFIG_POSTPARSER("multi-threaded accept queue", 
accept_queue_init);
 
 #endif // USE_THREAD
 
+/* helper to get listener status for stats */
+enum li_status get_li_status(struct listener *l)
+{
+   if (!l->maxconn || l->nbconn < l->maxconn) {
+   if (l->state == LI_LIMITED)
+   return LI_STATUS_WAITING;
+   else
+   return LI_STATUS_OPEN;
+   }
+   return LI_STATUS_FULL;
+}
+
 /* adjust the listener's state and its proxy's listener counters if needed.
  * It must be called under the listener's lock, but uses atomic ops to change
  * the proxy's counters so that the proxy lock is not needed.
diff --git a/src/stats.c b/src/stats.c
index 1e0db0b24..a63178d5a 100644
--- a/src/stats.c
+++ b/src/stats.c
@@ -1898,7 +1898,7 @@ int stats_fill_li_stats(struct proxy *px, struct listener 
*l, int flags,
metric = mkf_u64(FN_COUNTER, 
l->counters->denied_sess);
break;
case ST_F_STATUS:
-   metric = mkf_str(FO_STATUS, (!l->maxconn || 
l->nbconn < l->maxconn) ? (l->state == LI_LIMITED) ? "WAITING" : "OPEN" : 
"FULL");
+   metric = mkf_str(FO_STATUS, 
li_status_st[get_li_status(l)]);
break;
case ST_F_PID:
metric = mkf_u32(FO_KEY, relative_pid);
-- 
2.30.0




[PATCH 1/3] MEDIUM: stats: allow to select one field in `stats_fill_li_stats`

2021-02-14 Thread William Dauchy
prometheus approach requires to output all values for a given metric
name; meaning we iterate through all metrics, and then iterate in the
inner loop on all objects for this metric.
In order to allow more code reuse, adapt the stats API to be able to
select one field or fill them all otherwise.
>From this patch it should be possible to add support for listen stats in
prometheus.

Signed-off-by: William Dauchy 
---
 include/haproxy/stats.h |   2 +-
 src/hlua_fcn.c  |   3 +-
 src/stats.c | 166 +++-
 3 files changed, 116 insertions(+), 55 deletions(-)

diff --git a/include/haproxy/stats.h b/include/haproxy/stats.h
index 6115dca8f..cbc33f279 100644
--- a/include/haproxy/stats.h
+++ b/include/haproxy/stats.h
@@ -49,7 +49,7 @@ int stats_fill_info(struct field *info, int len);
 int stats_fill_fe_stats(struct proxy *px, struct field *stats, int len,
enum stat_field *selected_field);
 int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags,
-struct field *stats, int len);
+struct field *stats, int len, enum stat_field 
*selected_field);
 int stats_fill_sv_stats(struct proxy *px, struct server *sv, int flags,
 struct field *stats, int len, enum stat_field 
*selected_field);
 int stats_fill_be_stats(struct proxy *px, int flags, struct field *stats, int 
len,
diff --git a/src/hlua_fcn.c b/src/hlua_fcn.c
index 6dd9efb21..38a9cfd21 100644
--- a/src/hlua_fcn.c
+++ b/src/hlua_fcn.c
@@ -866,7 +866,8 @@ int hlua_listener_get_stats(lua_State *L)
return 1;
}
 
-   stats_fill_li_stats(li->bind_conf->frontend, li, STAT_SHLGNDS, stats, 
STATS_LEN);
+   stats_fill_li_stats(li->bind_conf->frontend, li, STAT_SHLGNDS, stats,
+   STATS_LEN, NULL);
 
lua_newtable(L);
for (i=0; i with the listener statistics.  is
- * preallocated array of length . The length of the array
- * must be at least ST_F_TOTAL_FIELDS. If this length is less
- * then this value, the function returns 0, otherwise, it
- * returns 1.  can take the value STAT_SHLGNDS.
+/* Fill  with the listener statistics.  is preallocated array of
+ * length . The length of the array must be at least ST_F_TOTAL_FIELDS. If
+ * this length is less then this value, the function returns 0, otherwise, it
+ * returns 1.  If selected_field is != NULL, only fill this one.  can
+ * take the value STAT_SHLGNDS.
  */
 int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags,
-struct field *stats, int len)
+struct field *stats, int len, enum stat_field 
*selected_field)
 {
+   enum stat_field current_field = (selected_field != NULL ? 
*selected_field : 0);
struct buffer *out = get_trash_chunk();
 
if (len < ST_F_TOTAL_FIELDS)
@@ -1850,54 +1851,112 @@ int stats_fill_li_stats(struct proxy *px, struct 
listener *l, int flags,
 
chunk_reset(out);
 
-   stats[ST_F_PXNAME]   = mkf_str(FO_KEY|FN_NAME|FS_SERVICE, px->id);
-   stats[ST_F_SVNAME]   = mkf_str(FO_KEY|FN_NAME|FS_SERVICE, l->name);
-   stats[ST_F_MODE] = mkf_str(FO_CONFIG|FS_SERVICE, 
proxy_mode_str(px->mode));
-   stats[ST_F_SCUR] = mkf_u32(0, l->nbconn);
-   stats[ST_F_SMAX] = mkf_u32(FN_MAX, l->counters->conn_max);
-   stats[ST_F_SLIM] = mkf_u32(FO_CONFIG|FN_LIMIT, l->maxconn);
-   stats[ST_F_STOT] = mkf_u64(FN_COUNTER, l->counters->cum_conn);
-   stats[ST_F_BIN]  = mkf_u64(FN_COUNTER, l->counters->bytes_in);
-   stats[ST_F_BOUT] = mkf_u64(FN_COUNTER, l->counters->bytes_out);
-   stats[ST_F_DREQ] = mkf_u64(FN_COUNTER, l->counters->denied_req);
-   stats[ST_F_DRESP]= mkf_u64(FN_COUNTER, l->counters->denied_resp);
-   stats[ST_F_EREQ] = mkf_u64(FN_COUNTER, l->counters->failed_req);
-   stats[ST_F_DCON] = mkf_u64(FN_COUNTER, l->counters->denied_conn);
-   stats[ST_F_DSES] = mkf_u64(FN_COUNTER, l->counters->denied_sess);
-   stats[ST_F_STATUS]   = mkf_str(FO_STATUS, (!l->maxconn || l->nbconn < 
l->maxconn) ? (l->state == LI_LIMITED) ? "WAITING" : "OPEN" : "FULL");
-   stats[ST_F_PID]  = mkf_u32(FO_KEY, relative_pid);
-   stats[ST_F_IID]  = mkf_u32(FO_KEY|FS_SERVICE, px->uuid);
-   stats[ST_F_SID]  = mkf_u32(FO_KEY|FS_SERVICE, l->luid);
-   stats[ST_F_TYPE] = mkf_u32(FO_CONFIG|FS_SERVICE, STATS_TYPE_SO);
-   stats[ST_F_WREW] = mkf_u64(FN_COUNTER, 
l->counters->failed_rewrites);
-   stats[ST_F_EINT] = mkf_u64(FN_COUNTER, 
l->counters->internal_errors);
-
-   if (flags & STAT_SHLGNDS) {
-   char str[INET6_ADDRSTRLEN];
-   int port;
-
-   port = get_host_port(>rx.addr);

[PATCH 0/3] prometheus: add listen stats

2021-02-14 Thread William Dauchy
Hello Christopher,

I know I'm a bit late regarding the merge window; this is however the
logical followup of my prometheus work, now adding listen stats.
patch 2/3 is a proposition but I'm ok to revisit.

William Dauchy (3):
  MEDIUM: stats: allow to select one field in `stats_fill_li_stats`
  MINOR: stats: add helper to get status string
  MEDIUM: contrib/prometheus-exporter: add listen stats

 contrib/prometheus-exporter/README|  24 +-
 .../prometheus-exporter/service-prometheus.c  | 274 --
 include/haproxy/listener-t.h  |   9 +
 include/haproxy/listener.h|   3 +
 include/haproxy/stats.h   |   2 +-
 reg-tests/contrib/prometheus.vtc  |   4 +
 src/hlua_fcn.c|   3 +-
 src/listener.c|  18 ++
 src/stats.c   | 166 +++
 9 files changed, 365 insertions(+), 138 deletions(-)

-- 
2.30.0




[PATCH 3/3] MEDIUM: contrib/prometheus-exporter: add listen stats

2021-02-14 Thread William Dauchy
this was a missing piece for a while now even though it was planned. This
patch adds listen stats.
Nothing in particular but we make use of the status helper previously
added.  `promex_st_metrics` diff also looks scary, but I had to realign
all lines.

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/README|  24 +-
 .../prometheus-exporter/service-prometheus.c  | 274 --
 reg-tests/contrib/prometheus.vtc  |   4 +
 3 files changed, 219 insertions(+), 83 deletions(-)

diff --git a/contrib/prometheus-exporter/README 
b/contrib/prometheus-exporter/README
index d882b092f..f08cceba6 100644
--- a/contrib/prometheus-exporter/README
+++ b/contrib/prometheus-exporter/README
@@ -70,6 +70,7 @@ exported. Here are examples:
 
   /metrics?scope=server # ==> server metrics will be exported
   /metrics?scope=frontend=backend # ==> Frontend and backend metrics 
will be exported
+  /metrics?scope=listen # ==> listen metrics will be exported
   /metrics?scope=*=   # ==> no metrics will be exported
   /metrics?scope==global  # ==> global metrics will be exported
   /metrics?scope=sticktable # ==> stick tables metrics will be 
exported
@@ -102,7 +103,7 @@ except the server_check_status, you may configure 
prometheus that way:
 - 
metric_relabel_configs:
- source_labels: ['__name__']
-  regex: 'haproxy_(process_|frontend_|backend_|server_check_status).*'
+  regex: 
'haproxy_(process_|frontend_|listen_|backend_|server_check_status).*'
   action: keep
 
 Exported metrics
@@ -212,6 +213,27 @@ See prometheus export for the description of each field.
 | haproxy_frontend_internal_errors_total  |
 +-+
 
+* Listen metrics
+
++-+
+|Metric name  |
++-+
+| haproxy_listen_current_sessions |
+| haproxy_listen_max_sessions |
+| haproxy_listen_limit_sessions   |
+| haproxy_listen_sessions_total   |
+| haproxy_listen_bytes_in_total   |
+| haproxy_listen_bytes_out_total  |
+| haproxy_listen_requests_denied_total|
+| haproxy_listen_responses_denied_total   |
+| haproxy_listen_request_errors_total |
+| haproxy_listen_status   |
+| haproxy_listen_denied_connections_total |
+| haproxy_listen_denied_sessions_total|
+| haproxy_listen_failed_header_rewriting_total|
+| haproxy_listen_internal_errors_total|
++-+
+
 * Backend metrics
 
 +-+
diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 403f5a619..70ea1b7b4 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -63,17 +63,19 @@ enum {
 #define PROMEX_FL_FRONT_METRIC  0x0004
 #define PROMEX_FL_BACK_METRIC   0x0008
 #define PROMEX_FL_SRV_METRIC0x0010
-#define PROMEX_FL_SCOPE_GLOBAL  0x0020
-#define PROMEX_FL_SCOPE_FRONT   0x0040
-#define PROMEX_FL_SCOPE_BACK0x0080
-#define PROMEX_FL_SCOPE_SERVER  0x0100
-#define PROMEX_FL_NO_MAINT_SRV  0x0200
-#define PROMEX_FL_STICKTABLE_METRIC 0x0400
-#define PROMEX_FL_SCOPE_STICKTABLE  0x0800
+#define PROMEX_FL_LI_METRIC 0x0020
+#define PROMEX_FL_STICKTABLE_METRIC 0x0040
+#define PROMEX_FL_SCOPE_GLOBAL  0x0080
+#define PROMEX_FL_SCOPE_FRONT   0x0100
+#define PROMEX_FL_SCOPE_BACK0x0200
+#define PROMEX_FL_SCOPE_SERVER  0x0400
+#define PROMEX_FL_SCOPE_LI  0x0800
+#define PROMEX_FL_SCOPE_STICKTABLE  0x1000
+#define PROMEX_FL_NO_MAINT_SRV  0x2000
 
 #define PROMEX_FL_SCOPE_ALL (PROMEX_FL_SCOPE_GLOBAL | PROMEX_FL_SCOPE_FRONT | \
-PROMEX_FL_SCOPE_BACK | PROMEX_FL_SCOPE_SERVER | \
-PROMEX_FL_SCOPE_STICKTABLE)
+PROMEX_FL_SCOPE_LI | PROMEX_FL_SCOPE_BACK | \
+PROMEX_FL_SCOPE_SERVER | 
PROMEX_FL_SCOPE_STICKTABLE)
 
 /* Promtheus metric type (gauge or counter) */
 enum promex_mt_type {
@@ -187,66 +189,66 @@ const struct promex_metric 
promex_global_metrics[INF_TOTAL_FIELDS] = {
 const struct promex_metric promex_st_metrics[ST_F_TOTAL_FIELDS] = {
//[ST_F_PXNAME] ignored
//[ST_F_SVNAME] ignored
-   [ST_F_QCUR]   = { .n = IST("current_queue"),
.type = PROMEX_MT_GAUGE,.flags = ( 
PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-   [ST_F_QMAX]  

[PATCH 2/2] CLEANUP: contrib/prometheus-exporter: align for with srv status case

2021-02-14 Thread William Dauchy
the for loop was wrongly indented following our recent rework

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 403f5a619..1b08fcfb4 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -861,7 +861,7 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
struct htx *htx)
switch (appctx->st2) {
case ST_F_STATUS:
state = promex_srv_status(sv);
-   for (; appctx->ctx.stats.st_code < 
PROMEX_SRV_STATE_COUNT; appctx->ctx.stats.st_code++) {
+   for (; 
appctx->ctx.stats.st_code < PROMEX_SRV_STATE_COUNT; 
appctx->ctx.stats.st_code++) {
val = 
mkf_u32(FO_STATUS, state == appctx->ctx.stats.st_code);
labels[2].name = 
ist("state");
labels[2].value = 
promex_srv_st[appctx->ctx.stats.st_code];
-- 
2.30.0




[PATCH 1/2] CLEANUP: check: fix get_check_status_info declaration

2021-02-14 Thread William Dauchy
we always put a \n between function name and `{`

Signed-off-by: William Dauchy 
---
 src/check.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/check.c b/src/check.c
index d37a55201..d17f747fe 100644
--- a/src/check.c
+++ b/src/check.c
@@ -178,8 +178,8 @@ const char *get_check_status_description(short 
check_status) {
 }
 
 /* Converts check_status code to short info */
-const char *get_check_status_info(short check_status) {
-
+const char *get_check_status_info(short check_status)
+{
const char *info;
 
if (check_status < HCHK_STATUS_SIZE)
-- 
2.30.0




[PATCH] DOC: tune: explain the origin of block size for ssl.cachesize

2021-02-12 Thread William Dauchy
A user could eventually ask himself where those 200 bytes block size are
coming from. This patch tries to better explain the origin in case
people are curious or want to double check the reality.

Signed-off-by: William Dauchy 
---
 doc/configuration.txt | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 391e074a7..b21c56091 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -2451,16 +2451,17 @@ tune.sndbuf.server 
 
 tune.ssl.cachesize 
   Sets the size of the global SSL session cache, in a number of blocks. A block
-  is large enough to contain an encoded session without peer certificate.
-  An encoded session with peer certificate is stored in multiple blocks
-  depending on the size of the peer certificate. A block uses approximately
-  200 bytes of memory. The default value may be forced at build time, otherwise
-  defaults to 2. When the cache is full, the most idle entries are purged
-  and reassigned. Higher values reduce the occurrence of such a purge, hence
-  the number of CPU-intensive SSL handshakes by ensuring that all users keep
-  their session as long as possible. All entries are pre-allocated upon startup
-  and are shared between all processes if "nbproc" is greater than 1. Setting
-  this value to 0 disables the SSL session cache.
+  is large enough to contain an encoded session without peer certificate.  An
+  encoded session with peer certificate is stored in multiple blocks depending
+  on the size of the peer certificate. A block uses approximately 200 bytes of
+  memory (based on `sizeof(struct sh_ssl_sess_hdr) + SHSESS_BLOCK_MIN_SIZE`
+  calculation used for `shctx_init` function). The default value may be forced
+  at build time, otherwise defaults to 2. When the cache is full, the most
+  idle entries are purged and reassigned. Higher values reduce the occurrence
+  of such a purge, hence the number of CPU-intensive SSL handshakes by ensuring
+  that all users keep their session as long as possible. All entries are
+  pre-allocated upon startup and are shared between all processes if "nbproc"
+  is greater than 1. Setting this value to 0 disables the SSL session cache.
 
 tune.ssl.force-private-cache
   This option disables SSL session cache sharing between all processes. It
-- 
2.30.0




Re: [PATCH v3 0/5] cli commands for checks and agent

2021-02-12 Thread William Dauchy
On Fri, Feb 12, 2021 at 3:06 PM Christopher Faulet  wrote:
> I just slightly amended the 3rd patch to handle the v2 in 
> apply_server_state().
> There is a test on the version when a state-file is local to a proxy. Just a
> minor change.

ok, thanks for that.

> And in the last one, I removed the "chunk_appendf(msg, "\n");" to
> move the LF in ha_warning() calls.

yes ok it is probably clearer that way.

Glad to see the end of this series, now I'm ready to receive related
bug reports ;)

-- 
William



Re: stats / "show servers conn" looses counter after reload

2021-02-12 Thread William Dauchy
Hi Christian,

On Fri, Feb 12, 2021 at 11:59 AM Christian Ruppert  wrote:
> Is this a bug? Can you confirm this behavior? Is there any other way I
> could figure out whether a backend is currently in use?

unfortunately reload does not recover stats values; it is a known
problem; see also https://github.com/haproxy/haproxy/issues/954

-- 
William



[PATCH v3 4/5] MEDIUM: server: support {check,agent}_addr, agent_port in server state

2021-02-11 Thread William Dauchy
logical followup from cli commands addition, so that the state server
file stays compatible with the changes made at runtime; use previously
added helper to load server attributes.

also alloc a specific chunk to avoid mixing with other called functions
using it

Signed-off-by: William Dauchy 
---
 doc/management.txt|  5 +-
 include/haproxy/server-t.h|  9 ++-
 .../checks/1be_40srv_odd_health_checks.vtc|  2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|  2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |  6 +-
 src/proxy.c   | 41 ++-
 src/server.c  | 68 +--
 7 files changed, 87 insertions(+), 46 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index 423c614b2..60e25c7e1 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -2455,7 +2455,10 @@ show servers state []
  srv_port:Server port.
  srvrecord:   DNS SRV record associated to this SRV.
  srv_use_ssl: use ssl for server connections.
- srv_check_port:  Server check port.
+ srv_check_port:  Server health check port.
+ srv_check_addr:  Server health check address.
+ srv_agent_addr:  Server health agent address.
+ srv_agent_port:  Server health agent port.
 
 show sess
   Dump all known sessions. Avoid doing this on slow connections as this can
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index 2ec70dde4..11cb71489 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -126,11 +126,14 @@ enum srv_initaddr {
 "srv_port "   \
 "srvrecord "  \
 "srv_use_ssl "\
-"srv_check_port"
+"srv_check_port " \
+"srv_check_addr " \
+"srv_agent_addr " \
+"srv_agent_port"
 
-#define SRV_STATE_FILE_MAX_FIELDS 22
+#define SRV_STATE_FILE_MAX_FIELDS 25
 #define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 22
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 25
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc 
b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
index f01205295..c279972aa 100644
--- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc
+++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
@@ -112,6 +112,6 @@ syslog S -wait
 
 haproxy h1 -cli {
 send "show servers state"
-expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state 
srv_admin_state srv_uweight srv_iweight srv_time_since_last_change 
srv_check_status srv_check_result srv_check_health srv_check_state 
srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord 
srv_use_ssl srv_check_port\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 
0 0 0 0 - ${s0_port} - 0 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s1_port} - 0 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s2_port} - 0 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0 0\n2 be1 5 srv4 
${s4_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0 0\n2 be1 6 srv5 
${s5_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0 0\n2 
be1 7 srv6 ${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0 0\n2 
be1 8 srv7 ${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - 
${s7_port} - 0 0\n2 be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s8_port} - 0 0\n2 be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ 
){3}0 0 0 - ${s9_port} - 0 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 
0 1 0 0 0 0 - ${s10_port} - 0 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0 0\n2 be1 13 srv12 
${s12_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0 0\n2 be1 14 
srv13 ${s13_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} 
- 0 0\n2 be1 15 srv14 ${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s14_port} - 0 0\n2 be1 16 srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s15_port} - 0 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s16_port} - 0 0\n2 be1 18 srv17 ${s17_addr} 2 0 
1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0 0\n2 be1 19 srv18 
${s18_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0 0\n2 be1 20 
srv19 ${s19_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} 
- 0 0\n2 be1 21 srv20 ${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s20_port} - 0 0\n2 be1 22 srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s21_port} - 0 0\n2 b

[PATCH v3 1/5] MEDIUM: cli: add check-addr command

2021-02-11 Thread William Dauchy
this patch allows to set server health check address at runtime. In
order to align with `addr` command, also allow to set port optionnaly.
This led to a small refactor in order to use the same function for both
`check-addr` and `check-port` commands.
for `check-port`, we however don't permit the change anymore if checks
are not enabled on the server.

This command becomes more and more useful for people having a consul
like architecture:
- the backend server is located on a container with its own IP
- the health checks are done the consul instance located on the host
  with the host IP

Signed-off-by: William Dauchy 
---
 doc/management.txt |  4 ++
 src/server.c   | 95 ++
 2 files changed, 84 insertions(+), 15 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index b74aba769..bff770e4e 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -1842,6 +1842,10 @@ set server / health [ up | stopping | 
down ]
   switch a server's state regardless of some slow health checks for example.
   Note that the change is propagated to tracking servers if any.
 
+set server / check-addr  [port ]
+  Change the IP address used for server health checks.
+  Optionally, change the port used for server health checks.
+
 set server / check-port 
   Change the port used for health checking to 
 
diff --git a/src/server.c b/src/server.c
index 453c59866..74bd972e2 100644
--- a/src/server.c
+++ b/src/server.c
@@ -54,6 +54,8 @@ static int srv_set_fqdn(struct server *srv, const char *fqdn, 
int dns_locked);
 static void srv_state_parse_line(char *buf, const int version, char **params, 
char **srv_params);
 static int srv_state_get_version(FILE *f);
 static void srv_cleanup_connections(struct server *srv);
+static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
+const char *port);
 
 /* List head of all known server keywords */
 static struct srv_kw_list srv_keywords = {
@@ -3574,6 +3576,59 @@ int update_server_addr(struct server *s, void *ip, int 
ip_sin_family, const char
return 0;
 }
 
+/* update server health check address and port
+ * addr must be ip4 or ip6, it won't be resolved
+ * if one error occurs, don't apply anything
+ * must be called with the server lock held.
+ */
+static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
+const char *port)
+{
+   struct sockaddr_storage sk;
+   struct buffer *msg;
+   int new_port;
+
+   msg = get_trash_chunk();
+   chunk_reset(msg);
+
+   if (!(s->check.state & CHK_ST_ENABLED)) {
+   chunk_strcat(msg, "health checks are not enabled on this 
server");
+   goto out;
+   }
+   if (addr) {
+   memset(, 0, sizeof(struct sockaddr_storage));
+   if (str2ip2(addr, , 0) == NULL) {
+   chunk_appendf(msg, "invalid addr '%s'", addr);
+   goto out;
+   }
+   }
+   if (port) {
+   if (strl2irc(port, strlen(port), _port) != 0) {
+   chunk_appendf(msg, "provided port is not an integer");
+   goto out;
+   }
+   if (new_port < 0 || new_port > 65535) {
+   chunk_appendf(msg, "provided port is invalid");
+   goto out;
+   }
+   /* prevent the update of port to 0 if MAPPORTS are in use */
+   if ((s->flags & SRV_F_MAPPORTS) && new_port == 0) {
+   chunk_appendf(msg, "can't unset 'port' since MAPPORTS 
is in use");
+   goto out;
+   }
+   }
+out:
+   if (msg->data)
+   return msg->area;
+   else {
+   if (addr)
+   s->check.addr = sk;
+   if (port)
+   s->check.port = new_port;
+   }
+   return NULL;
+}
+
 /*
  * This function update a server's addr and port only for AF_INET and AF_INET6 
families.
  *
@@ -4406,23 +4461,32 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
cli_err(appctx, "cannot allocate memory for new 
string.\n");
}
}
-   else if (strcmp(args[3], "check-port") == 0) {
-   int i = 0;
-   if (strl2irc(args[4], strlen(args[4]), ) != 0) {
-   cli_err(appctx, "'set server  check-port' expects 
an integer as argument.\n");
-   goto out_unlock;
-   }
-   if ((i < 0) || (i > 65535)) {
-   cli_err(appctx, "provided port is not valid.\n");
+   else if (strcmp(args[

[PATCH v3 5/5] MINOR: server: enhance error precision when applying server state

2021-02-11 Thread William Dauchy
server health checks and agent parameters are written the same way as
others to be able to enahcne code reuse: basically we make use of
parsing and assignment at the same place. It makes it difficult for
error handling to know whether srv object was modified partially or not.
The problem was already present with SRV resolution though.

I was a bit puzzled about the approach to take to be honest, and I did
not wanted to go into a full refactor, so I assumed it was ok to simply
notify whether the line was failed or partially applied.

Signed-off-by: William Dauchy 
---
 src/server.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/server.c b/src/server.c
index 739bcfba6..b0a8f90d4 100644
--- a/src/server.c
+++ b/src/server.c
@@ -2625,6 +2625,7 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
unsigned int port_svc;
char *srvrecord;
char *addr;
+   int partial_apply = 0;
 #ifdef USE_OPENSSL
int use_ssl;
 #endif
@@ -2795,6 +2796,7 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
/* don't apply anything if one error has been detected */
if (msg->data)
goto out;
+   partial_apply = 1;
 
/* recover operational state and apply it to this server
 * and all servers tracking this one */
@@ -3032,8 +3034,12 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
HA_SPIN_UNLOCK(SERVER_LOCK, >lock);
if (msg->data) {
chunk_appendf(msg, "\n");
-   ha_warning("server-state application failed for server 
'%s/%s'%s",
-  srv->proxy->id, srv->id, msg->area);
+   if (partial_apply == 1)
+   ha_warning("server-state partially applied for server 
'%s/%s'%s",
+  srv->proxy->id, srv->id, msg->area);
+   else
+   ha_warning("server-state application failed for server 
'%s/%s'%s",
+  srv->proxy->id, srv->id, msg->area);
}
  end:
free_trash_chunk(msg);
-- 
2.30.0




[PATCH v3 3/5] MEDIUM: server: add server-states version 2

2021-02-11 Thread William Dauchy
Even if it is possibly too much work for the current usage, it makes
sure we don't break states file from v2.3 to v2.4; indeed, since v2.3,
we introduced two new fields, so we put them aside to guarantee we can
easily reload from a version 1.
The diff seems huge but there is no specific change apart from:
- introduce v2 where it is needed (parsing, update)
- move away from switch/case in update to be able to reuse code
- move srv lock to the whole function to make it easier

this patch confirm how painful it is to maintain this functionality.

Signed-off-by: William Dauchy 
---
 include/haproxy/server-t.h |   7 +-
 src/server.c   | 742 ++---
 2 files changed, 367 insertions(+), 382 deletions(-)

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index b326305e8..2ec70dde4 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -101,9 +101,9 @@ enum srv_initaddr {
 } __attribute__((packed));
 
 /* server-state-file version */
-#define SRV_STATE_FILE_VERSION 1
+#define SRV_STATE_FILE_VERSION 2
 #define SRV_STATE_FILE_VERSION_MIN 1
-#define SRV_STATE_FILE_VERSION_MAX 1
+#define SRV_STATE_FILE_VERSION_MAX 2
 #define SRV_STATE_FILE_FIELD_NAMES \
 "be_id "  \
 "be_name "\
@@ -129,7 +129,8 @@ enum srv_initaddr {
 "srv_check_port"
 
 #define SRV_STATE_FILE_MAX_FIELDS 22
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 22
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
diff --git a/src/server.c b/src/server.c
index 9a2a8c520..c2b272068 100644
--- a/src/server.c
+++ b/src/server.c
@@ -2632,391 +2632,383 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
fqdn = NULL;
port_svc = port_check = 0;
msg = get_trash_chunk();
-   switch (version) {
-   case 1:
-   /*
-* now we can proceed with server's state update:
-* srv_addr: params[0]
-* srv_op_state: params[1]
-* srv_admin_state:  params[2]
-* srv_uweight:  params[3]
-* srv_iweight:  params[4]
-* srv_last_time_change: params[5]
-* srv_check_status: params[6]
-* srv_check_result: params[7]
-* srv_check_health: params[8]
-* srv_check_state:  params[9]
-* srv_agent_state:  params[10]
-* bk_f_forced_id:   params[11]
-* srv_f_forced_id:  params[12]
-* srv_fqdn: params[13]
-* srv_port: params[14]
-* srvrecord:params[15]
-* srv_use_ssl:  params[16]
-* srv_check_port:   params[17]
-*/
+   HA_SPIN_LOCK(SERVER_LOCK, >lock);
+
+   if (version >= 1) {
+   /* srv_addr: params[0]
+* srv_op_state: params[1]
+* srv_admin_state:  params[2]
+* srv_uweight:  params[3]
+* srv_iweight:  params[4]
+* srv_last_time_change: params[5]
+* srv_check_status: params[6]
+* srv_check_result: params[7]
+* srv_check_health: params[8]
+* srv_check_state:  params[9]
+* srv_agent_state:  params[10]
+* bk_f_forced_id:   params[11]
+* srv_f_forced_id:  params[12]
+* srv_fqdn: params[13]
+* srv_port: params[14]
+* srvrecord:params[15]
+*/
 
-   /* validating srv_op_state */
-   p = NULL;
-   errno = 0;
-   srv_op_state = strtol(params[1], , 10);
-   if ((p == params[1]) || errno == EINVAL || errno == 
ERANGE ||
-   (srv_op_state != SRV_ST_STOPPED &&
-srv_op_state != SRV_ST_STARTING &&
-srv_op_state != SRV_ST_RUNNING &&
-srv_op_state != SRV_ST_STOPPING)) {
-   chunk_appendf(msg, ", invalid srv_op_state 
value '%s'", params[1]);
-   }
+   /* validating srv_op_state */
+   p = NULL;
+   errno = 0;
+   srv_op_state = strtol(params[1], , 10);
+   

[PATCH v3 2/5] MEDIUM: cli: add agent-port command

2021-02-11 Thread William Dauchy
this patch allows to set agent port at runtime. In order to align with
both `addr` and `check-addr` commands, also add the possibility to
optionnaly set port on `agent-addr` command. This led to a small
refactor in order to use the same function for both `agent-addr` and
`agent-port` commands.

Signed-off-by: William Dauchy 
---
 doc/management.txt |  6 +++-
 src/server.c   | 84 +-
 2 files changed, 80 insertions(+), 10 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index bff770e4e..423c614b2 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -1828,10 +1828,14 @@ set server / agent [ up | down ]
   switch a server's state regardless of some slow agent checks for example.
   Note that the change is propagated to tracking servers if any.
 
-set server / agent-addr 
+set server / agent-addr  [port ]
   Change addr for servers agent checks. Allows to migrate agent-checks to
   another address at runtime. You can specify both IP and hostname, it will be
   resolved.
+  Optionally, change the port agent.
+
+set server / agent-port 
+  Change the port used for agent checks.
 
 set server / agent-send 
   Change agent string sent to agent check target. Allows to update string while
diff --git a/src/server.c b/src/server.c
index 74bd972e2..9a2a8c520 100644
--- a/src/server.c
+++ b/src/server.c
@@ -56,6 +56,8 @@ static int srv_state_get_version(FILE *f);
 static void srv_cleanup_connections(struct server *srv);
 static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
 const char *port);
+static const char *update_server_agent_addr_port(struct server *s, const char 
*addr,
+const char *port);
 
 /* List head of all known server keywords */
 static struct srv_kw_list srv_keywords = {
@@ -3576,6 +3578,54 @@ int update_server_addr(struct server *s, void *ip, int 
ip_sin_family, const char
return 0;
 }
 
+/* update agent health check address and port
+ * addr can be ip4/ip6 or a hostname
+ * if one error occurs, don't apply anything
+ * must be called with the server lock held.
+ */
+static const char *update_server_agent_addr_port(struct server *s, const char 
*addr,
+const char *port)
+{
+   struct sockaddr_storage sk;
+   struct buffer *msg;
+   int new_port;
+
+   msg = get_trash_chunk();
+   chunk_reset(msg);
+
+   if (!(s->agent.state & CHK_ST_ENABLED)) {
+   chunk_strcat(msg, "agent checks are not enabled on this 
server");
+   goto out;
+   }
+   if (addr) {
+   memset(, 0, sizeof(struct sockaddr_storage));
+   if (str2ip(addr, ) == NULL) {
+   chunk_appendf(msg, "invalid addr '%s'", addr);
+   goto out;
+   }
+   }
+   if (port) {
+   if (strl2irc(port, strlen(port), _port) != 0) {
+   chunk_appendf(msg, "provided port is not an integer");
+   goto out;
+   }
+   if (new_port < 0 || new_port > 65535) {
+   chunk_appendf(msg, "provided port is invalid");
+   goto out;
+   }
+   }
+out:
+   if (msg->data)
+   return msg->area;
+   else {
+   if (addr)
+   set_srv_agent_addr(s, );
+   if (port)
+   set_srv_agent_port(s, new_port);
+   }
+   return NULL;
+}
+
 /* update server health check address and port
  * addr must be ip4 or ip6, it won't be resolved
  * if one error occurs, don't apply anything
@@ -4443,15 +4493,31 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
cli_err(appctx, "'set server  agent' expects 'up' 
or 'down'.\n");
}
else if (strcmp(args[3], "agent-addr") == 0) {
-   struct sockaddr_storage sk;
-
-   memset(, 0, sizeof(sk));
-   if (!(sv->agent.state & CHK_ST_ENABLED))
-   cli_err(appctx, "agent checks are not enabled on this 
server.\n");
-   else if (str2ip(args[4], ))
-   set_srv_agent_addr(sv, );
-   else
-   cli_err(appctx, "incorrect addr address given for 
agent.\n");
+   char *addr = NULL;
+   char *port = NULL;
+   if (strlen(args[4]) == 0) {
+   cli_err(appctx, "set server / agent-addr requires"
+   " an address and optionally a port.\n");
+   goto out_unlock;
+   }
+   addr = args[4];
+   if (strcmp(args[5]

[PATCH v3 0/5] cli commands for checks and agent

2021-02-11 Thread William Dauchy
Hello Christopher,

Here is some work to finish you week.
I believe I addressed all the points raised:
- warning are no longer emitted when we have "0" or "-" values
- I enhanced the warning output as well, and understood my mistake for
  my previous CLEANUP patch removing a space... so I removed this patch
  as well.
- Fixed all the chunk_appendf issues.
- I was a bit lazy to address the partial vs complete failure in parsing
  as I was a bit puzzled about the approach to take. I think it would be
  sad to duplicate the code for pre validation, but on the other hand I
  agree it was clear to assume the whole line was not applied at all. I
  however concluded it was acceptable to simply know the line was not
  fully applied. I believe in that case the user should worry. We can
  probably add a way to show where it stopped, but I felt discouraged
  with the srv resolution, in the middle of srv port, where it is harder
  to have a kinda generic way to know where we stopped.

William Dauchy (5):
  MEDIUM: cli: add check-addr command
  MEDIUM: cli: add agent-port command
  MEDIUM: server: add server-states version 2
  MEDIUM: server: support {check,agent}_addr, agent_port in server state
  MINOR: server: enhance error precision when applying server state

 doc/management.txt|  15 +-
 include/haproxy/server-t.h|  16 +-
 .../checks/1be_40srv_odd_health_checks.vtc|   2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|   2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |   6 +-
 src/proxy.c   |  41 +-
 src/server.c  | 971 ++
 7 files changed, 612 insertions(+), 441 deletions(-)

-- 
2.30.0




Re: [PATCH v2 0/6] cli commands for checks and agent

2021-02-10 Thread William Dauchy
On Wed, Feb 10, 2021 at 4:21 PM Christopher Faulet  wrote:
> To conclude, I may merge the first 4 patches if you want, but I guess you
> will have to adapt the first ones to produce better warnings. So I will
> wait your confirmation to proceed. However, I will merge the bug fix.
> There is no reason to not take it.

thanks for the review.
don't worry I will come back with a v3.

-- 
William



Re: [PATCH v2 3/6] BUG/MEDIUM: server: re-align state file fields number

2021-02-08 Thread William Dauchy
Hello Christopher,

On Mon, Feb 8, 2021 at 11:53 PM William Dauchy  wrote:
> Since commit 3169471964fdc49963e63f68c1fd88686821a0c4 ("MINOR: Add
> server port field to server state file.") max_fields was not increased
> on version number 1. So this patch aims to fix it. This should be
> backported as far as v1.8, but the numbering should be adpated depending
> on the version: simply increase the field by 1.

this one is simply a mistake where I changed the subject line from
MEDIUM to MINOR. So ignore this one.
-- 
William



[PATCH v2 4/6] MEDIUM: server: add server-states version 2

2021-02-08 Thread William Dauchy
Even if it is possibly too much work for the current usage, it makes
sure we don't break states file from v2.3 to v2.4; indeed, since v2.3,
we introduced two new fields, so we put them aside to guarantee we can
easily reload from a version 1.
The diff seems huge but there is no specific change apart from:
- introduce v2 where it is needed (parsing, update)
- move away from switch/case in update to be able to reuse code
- move srv lock to the whole function to make it easier

this patch confirm how painful it is to maintain this functionality.

Signed-off-by: William Dauchy 
---
 include/haproxy/server-t.h |   7 +-
 src/server.c   | 742 ++---
 2 files changed, 367 insertions(+), 382 deletions(-)

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index af52b6a25..89d8ecc06 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -101,9 +101,9 @@ enum srv_initaddr {
 } __attribute__((packed));
 
 /* server-state-file version */
-#define SRV_STATE_FILE_VERSION 1
+#define SRV_STATE_FILE_VERSION 2
 #define SRV_STATE_FILE_VERSION_MIN 1
-#define SRV_STATE_FILE_VERSION_MAX 1
+#define SRV_STATE_FILE_VERSION_MAX 2
 #define SRV_STATE_FILE_FIELD_NAMES \
 "be_id "  \
 "be_name "\
@@ -129,7 +129,8 @@ enum srv_initaddr {
 "srv_check_port"
 
 #define SRV_STATE_FILE_MAX_FIELDS 22
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 22
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
diff --git a/src/server.c b/src/server.c
index a865fb4dd..84b283567 100644
--- a/src/server.c
+++ b/src/server.c
@@ -2629,391 +2629,383 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
fqdn = NULL;
port_svc = port_check = 0;
msg = get_trash_chunk();
-   switch (version) {
-   case 1:
-   /*
-* now we can proceed with server's state update:
-* srv_addr: params[0]
-* srv_op_state: params[1]
-* srv_admin_state:  params[2]
-* srv_uweight:  params[3]
-* srv_iweight:  params[4]
-* srv_last_time_change: params[5]
-* srv_check_status: params[6]
-* srv_check_result: params[7]
-* srv_check_health: params[8]
-* srv_check_state:  params[9]
-* srv_agent_state:  params[10]
-* bk_f_forced_id:   params[11]
-* srv_f_forced_id:  params[12]
-* srv_fqdn: params[13]
-* srv_port: params[14]
-* srvrecord:params[15]
-* srv_use_ssl:  params[16]
-* srv_check_port:   params[17]
-*/
+   HA_SPIN_LOCK(SERVER_LOCK, >lock);
+
+   if (version >= 1) {
+   /* srv_addr: params[0]
+* srv_op_state: params[1]
+* srv_admin_state:  params[2]
+* srv_uweight:  params[3]
+* srv_iweight:  params[4]
+* srv_last_time_change: params[5]
+* srv_check_status: params[6]
+* srv_check_result: params[7]
+* srv_check_health: params[8]
+* srv_check_state:  params[9]
+* srv_agent_state:  params[10]
+* bk_f_forced_id:   params[11]
+* srv_f_forced_id:  params[12]
+* srv_fqdn: params[13]
+* srv_port: params[14]
+* srvrecord:params[15]
+*/
 
-   /* validating srv_op_state */
-   p = NULL;
-   errno = 0;
-   srv_op_state = strtol(params[1], , 10);
-   if ((p == params[1]) || errno == EINVAL || errno == 
ERANGE ||
-   (srv_op_state != SRV_ST_STOPPED &&
-srv_op_state != SRV_ST_STARTING &&
-srv_op_state != SRV_ST_RUNNING &&
-srv_op_state != SRV_ST_STOPPING)) {
-   chunk_appendf(msg, ", invalid srv_op_state 
value '%s'", params[1]);
-   }
+   /* validating srv_op_state */
+   p = NULL;
+   errno = 0;
+   srv_op_state = strtol(params[1], , 10);
+   

[PATCH v2 5/6] MEDIUM: server: support {check,agent}_addr, agent_port in server state

2021-02-08 Thread William Dauchy
logical followup from cli commands addition, so that the state server
file stays compatible with the changes made at runtime; use previously
added helper to load server attributes.

also alloc a specific chunk to avoid mixing with other called functions
using it

Signed-off-by: William Dauchy 
---
 doc/management.txt|  5 ++-
 include/haproxy/server-t.h|  9 ++--
 .../checks/1be_40srv_odd_health_checks.vtc|  2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|  2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |  6 +--
 src/proxy.c   | 41 +++
 src/server.c  | 40 +++---
 7 files changed, 66 insertions(+), 39 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index 423c614b2..60e25c7e1 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -2455,7 +2455,10 @@ show servers state []
  srv_port:Server port.
  srvrecord:   DNS SRV record associated to this SRV.
  srv_use_ssl: use ssl for server connections.
- srv_check_port:  Server check port.
+ srv_check_port:  Server health check port.
+ srv_check_addr:  Server health check address.
+ srv_agent_addr:  Server health agent address.
+ srv_agent_port:  Server health agent port.
 
 show sess
   Dump all known sessions. Avoid doing this on slow connections as this can
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index 89d8ecc06..7f1356024 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -126,11 +126,14 @@ enum srv_initaddr {
 "srv_port "   \
 "srvrecord "  \
 "srv_use_ssl "\
-"srv_check_port"
+"srv_check_port " \
+"srv_check_addr " \
+"srv_agent_addr " \
+"srv_agent_port"
 
-#define SRV_STATE_FILE_MAX_FIELDS 22
+#define SRV_STATE_FILE_MAX_FIELDS 25
 #define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 22
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_2 25
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc 
b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
index f01205295..c279972aa 100644
--- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc
+++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
@@ -112,6 +112,6 @@ syslog S -wait
 
 haproxy h1 -cli {
 send "show servers state"
-expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state 
srv_admin_state srv_uweight srv_iweight srv_time_since_last_change 
srv_check_status srv_check_result srv_check_health srv_check_state 
srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord 
srv_use_ssl srv_check_port\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 
0 0 0 0 - ${s0_port} - 0 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s1_port} - 0 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s2_port} - 0 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0 0\n2 be1 5 srv4 
${s4_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0 0\n2 be1 6 srv5 
${s5_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0 0\n2 
be1 7 srv6 ${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0 0\n2 
be1 8 srv7 ${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - 
${s7_port} - 0 0\n2 be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s8_port} - 0 0\n2 be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ 
){3}0 0 0 - ${s9_port} - 0 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 
0 1 0 0 0 0 - ${s10_port} - 0 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0 0\n2 be1 13 srv12 
${s12_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0 0\n2 be1 14 
srv13 ${s13_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} 
- 0 0\n2 be1 15 srv14 ${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s14_port} - 0 0\n2 be1 16 srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s15_port} - 0 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s16_port} - 0 0\n2 be1 18 srv17 ${s17_addr} 2 0 
1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0 0\n2 be1 19 srv18 
${s18_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0 0\n2 be1 20 
srv19 ${s19_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} 
- 0 0\n2 be1 21 srv20 ${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s20_port} - 0 0\n2 be1 22 srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s21_port} - 0 

[PATCH v2 6/6] CLEANUP: server: add missing space in server-state error output

2021-02-08 Thread William Dauchy
a space was missing in the output to make it more readable.

Signed-off-by: William Dauchy 
---
 src/server.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/server.c b/src/server.c
index 6b360291d..673844dd7 100644
--- a/src/server.c
+++ b/src/server.c
@@ -3015,7 +3015,7 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
HA_SPIN_UNLOCK(SERVER_LOCK, >lock);
if (msg->data) {
chunk_appendf(msg, "\n");
-   ha_warning("server-state application failed for server 
'%s/%s'%s",
+   ha_warning("server-state application failed for server '%s/%s' 
%s",
   srv->proxy->id, srv->id, msg->area);
}
  end:
-- 
2.30.0




[PATCH v2 3/6] BUG/MINOR: server: re-align state file fields number

2021-02-08 Thread William Dauchy
Since commit 3169471964fdc49963e63f68c1fd88686821a0c4 ("MINOR: Add
server port field to server state file.") max_fields was not increased
on version number 1. So this patch aims to fix it. This should be
backported as far as v1.8, but the numbering should be adpated depending
on the version: simply increase the field by 1.

Signed-off-by: William Dauchy 
---
 include/haproxy/server-t.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index 32697a9c4..af52b6a25 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -129,7 +129,7 @@ enum srv_initaddr {
 "srv_check_port"
 
 #define SRV_STATE_FILE_MAX_FIELDS 22
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
-- 
2.30.0




[PATCH v2 2/6] MEDIUM: cli: add agent-port command

2021-02-08 Thread William Dauchy
this patch allows to set agent port at runtime. In order to align with
both `addr` and `check-addr` commands, also add the possibility to
optionnaly set port on `agent-addr` command. This led to a small
refactor in order to use the same function for both `agent-addr` and
`agent-port` commands.

Signed-off-by: William Dauchy 
---
 doc/management.txt |  6 +++-
 src/server.c   | 76 --
 2 files changed, 72 insertions(+), 10 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index bff770e4e..423c614b2 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -1828,10 +1828,14 @@ set server / agent [ up | down ]
   switch a server's state regardless of some slow agent checks for example.
   Note that the change is propagated to tracking servers if any.
 
-set server / agent-addr 
+set server / agent-addr  [port ]
   Change addr for servers agent checks. Allows to migrate agent-checks to
   another address at runtime. You can specify both IP and hostname, it will be
   resolved.
+  Optionally, change the port agent.
+
+set server / agent-port 
+  Change the port used for agent checks.
 
 set server / agent-send 
   Change agent string sent to agent check target. Allows to update string while
diff --git a/src/server.c b/src/server.c
index 58347d215..a865fb4dd 100644
--- a/src/server.c
+++ b/src/server.c
@@ -56,6 +56,8 @@ static int srv_state_get_version(FILE *f);
 static void srv_cleanup_connections(struct server *srv);
 static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
 const char *port);
+static const char *update_server_agent_addr_port(struct server *s, const char 
*addr,
+const char *port);
 
 /* List head of all known server keywords */
 static struct srv_kw_list srv_keywords = {
@@ -3573,6 +3575,46 @@ int update_server_addr(struct server *s, void *ip, int 
ip_sin_family, const char
return 0;
 }
 
+/* update agent health check address and port
+ * addr can be ip4/ip6 or a hostname
+ * must be called with the server lock held.
+ */
+static const char *update_server_agent_addr_port(struct server *s, const char 
*addr,
+const char *port)
+{
+   struct sockaddr_storage sk;
+   struct buffer *msg;
+   int new_port;
+
+   msg = get_trash_chunk();
+
+   if (!(s->agent.state & CHK_ST_ENABLED)) {
+   chunk_appendf(msg, "agent checks are not enabled on this 
server.\n");
+   goto out;
+   }
+   if (addr) {
+   memset(, 0, sizeof(struct sockaddr_storage));
+   if (str2ip(addr, ) == NULL) {
+   chunk_appendf(msg, "invalid addr '%s'\n", addr);
+   goto out;
+   }
+   set_srv_agent_addr(s, );
+   }
+   if (port) {
+   if (strl2irc(port, strlen(port), _port) != 0) {
+   chunk_appendf(msg, "provided port is not an integer\n");
+   goto out;
+   }
+   if (new_port < 0 || new_port > 65535) {
+   chunk_appendf(msg, "provided port is invalid\n");
+   goto out;
+   }
+   set_srv_agent_port(s, new_port);
+   }
+out:
+   return msg->area;
+}
+
 /* update server health check address and port
  * addr must be ip4 or ip6, it won't be resolved
  * must be called with the server lock held.
@@ -4432,15 +4474,31 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
cli_err(appctx, "'set server  agent' expects 'up' 
or 'down'.\n");
}
else if (strcmp(args[3], "agent-addr") == 0) {
-   struct sockaddr_storage sk;
-
-   memset(, 0, sizeof(sk));
-   if (!(sv->agent.state & CHK_ST_ENABLED))
-   cli_err(appctx, "agent checks are not enabled on this 
server.\n");
-   else if (str2ip(args[4], ))
-   set_srv_agent_addr(sv, );
-   else
-   cli_err(appctx, "incorrect addr address given for 
agent.\n");
+   char *addr = NULL;
+   char *port = NULL;
+   if (strlen(args[4]) == 0) {
+   cli_err(appctx, "set server / agent-addr requires"
+   " an address and optionally a port.\n");
+   goto out_unlock;
+   }
+   addr = args[4];
+   if (strcmp(args[5], "port") == 0)
+   port = args[6];
+   warning = update_server_agent_addr_port(sv, addr, port);
+   if (warning)
+   cli_

[PATCH v2 1/6] MEDIUM: cli: add check-addr command

2021-02-08 Thread William Dauchy
this patch allows to set server health check address at runtime. In
order to align with `addr` command, also allow to set port optionnaly.
This led to a small refactor in order to use the same function for both
`check-addr` and `check-port` commands.
for `check-port`, we however don't permit the change anymore if checks
are not enabled on the server.

This command becomes more and more useful for people having a consul
like architecture:
- the backend server is located on a container with its own IP
- the health checks are done the consul instance located on the host
  with the host IP

Signed-off-by: William Dauchy 
---
 doc/management.txt |  4 +++
 src/server.c   | 87 ++
 2 files changed, 76 insertions(+), 15 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index b74aba769..bff770e4e 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -1842,6 +1842,10 @@ set server / health [ up | stopping | 
down ]
   switch a server's state regardless of some slow health checks for example.
   Note that the change is propagated to tracking servers if any.
 
+set server / check-addr  [port ]
+  Change the IP address used for server health checks.
+  Optionally, change the port used for server health checks.
+
 set server / check-port 
   Change the port used for health checking to 
 
diff --git a/src/server.c b/src/server.c
index da2325e9a..58347d215 100644
--- a/src/server.c
+++ b/src/server.c
@@ -54,6 +54,8 @@ static int srv_set_fqdn(struct server *srv, const char *fqdn, 
int dns_locked);
 static void srv_state_parse_line(char *buf, const int version, char **params, 
char **srv_params);
 static int srv_state_get_version(FILE *f);
 static void srv_cleanup_connections(struct server *srv);
+static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
+const char *port);
 
 /* List head of all known server keywords */
 static struct srv_kw_list srv_keywords = {
@@ -3571,6 +3573,51 @@ int update_server_addr(struct server *s, void *ip, int 
ip_sin_family, const char
return 0;
 }
 
+/* update server health check address and port
+ * addr must be ip4 or ip6, it won't be resolved
+ * must be called with the server lock held.
+ */
+static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
+const char *port)
+{
+   struct sockaddr_storage sk;
+   struct buffer *msg;
+   int new_port;
+
+   msg = get_trash_chunk();
+
+   if (!(s->check.state & CHK_ST_ENABLED)) {
+   chunk_appendf(msg, "health checks are not enabled on this 
server.\n");
+   goto out;
+   }
+   if (addr) {
+   memset(, 0, sizeof(struct sockaddr_storage));
+   if (str2ip2(addr, , 0) == NULL) {
+   chunk_appendf(msg, "invalid addr '%s'\n", addr);
+   goto out;
+   }
+   s->check.addr = sk;
+   }
+   if (port) {
+   if (strl2irc(port, strlen(port), _port) != 0) {
+   chunk_appendf(msg, "provided port is not an integer\n");
+   goto out;
+   }
+   if (new_port < 0 || new_port > 65535) {
+   chunk_appendf(msg, "provided port is invalid\n");
+   goto out;
+   }
+   /* prevent the update of port to 0 if MAPPORTS are in use */
+   if ((s->flags & SRV_F_MAPPORTS) && new_port == 0) {
+   chunk_appendf(msg, "can't unset 'port' since MAPPORTS 
is in use\n");
+   goto out;
+   }
+   s->check.port = new_port;
+   }
+out:
+   return msg->area;
+}
+
 /*
  * This function update a server's addr and port only for AF_INET and AF_INET6 
families.
  *
@@ -4403,23 +4450,32 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
cli_err(appctx, "cannot allocate memory for new 
string.\n");
}
}
-   else if (strcmp(args[3], "check-port") == 0) {
-   int i = 0;
-   if (strl2irc(args[4], strlen(args[4]), ) != 0) {
-   cli_err(appctx, "'set server  check-port' expects 
an integer as argument.\n");
-   goto out_unlock;
-   }
-   if ((i < 0) || (i > 65535)) {
-   cli_err(appctx, "provided port is not valid.\n");
+   else if (strcmp(args[3], "check-addr") == 0) {
+   char *addr = NULL;
+   char *port = NULL;
+   if (strlen(args[4]) == 0) {
+   cli_err(appctx, "set se

[PATCH v2 3/6] BUG/MEDIUM: server: re-align state file fields number

2021-02-08 Thread William Dauchy
Since commit 3169471964fdc49963e63f68c1fd88686821a0c4 ("MINOR: Add
server port field to server state file.") max_fields was not increased
on version number 1. So this patch aims to fix it. This should be
backported as far as v1.8, but the numbering should be adpated depending
on the version: simply increase the field by 1.

Signed-off-by: William Dauchy 
---
 include/haproxy/server-t.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index 32697a9c4..af52b6a25 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -129,7 +129,7 @@ enum srv_initaddr {
 "srv_check_port"
 
 #define SRV_STATE_FILE_MAX_FIELDS 22
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
-- 
2.30.0




[PATCH v2 0/6] cli commands for checks and agent

2021-02-08 Thread William Dauchy
Hello Christopher,

Here is the v2 addressing the points raised yesterday.
The patch 4/6 clearly looks scary but I made sure to not change anything
crazy apart from adding support for a version 2. I will probably start
to dream about a server-state-file burning every night.
I hope this will be good enough, but otherwise, don't hesitate to put
more comments, I will come back with a v3.

William Dauchy (6):
  MEDIUM: cli: add check-addr command
  MEDIUM: cli: add agent-port command
  BUG/MEDIUM: server: re-align state file fields number
  MEDIUM: server: add server-states version 2
  MEDIUM: server: support {check,agent}_addr, agent_port in server state
  CLEANUP: server: add missing space in server-state error output

 doc/management.txt|  15 +-
 include/haproxy/server-t.h|  16 +-
 .../checks/1be_40srv_odd_health_checks.vtc|   2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|   2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |   6 +-
 src/proxy.c   |  41 +-
 src/server.c  | 929 ++
 7 files changed, 573 insertions(+), 438 deletions(-)

-- 
2.30.0




Re: [PATCH 0/6] cli commands coherency

2021-02-08 Thread William Dauchy
Hi Christopher,

On Mon, Feb 8, 2021 at 12:21 PM Christopher Faulet  wrote:
> First, there is a test to be sure the agent-check is enabled before updating 
> the
> agent address and/or port. Do you think it should also be done for the
> health-check? Because, for now, it is possible to set an address on a disabled
> (or even not configured) health-check. But it is not possible for an
> agent-check. Given that it is not possible to enable/disable an agent-check or
> an health-check from the CLI, I guess the warning for both makes sense.

yes ok.

> Then, there are some problems when the server-state file is loaded. The trash
> chunk used by srv_update_state() is overwritten when
> update_server_agent_addr_port() is called. It is not obvious, but the
> get_trash_chunk() function returns cyclically one of two trash chunks.
> srv_update_state() uses a trash chunk. When called,
> update_server_check_addr_port() uses the other one. So,
> update_server_agent_addr_port() re-uses the first trash chunk, the same than
> srv_update_state(). A solution may be to allocate a dedicated chunk in
> srv_update_state(), using alloc_trash_chunk().

ah ok, I thought it was ok as long as there was no reset, but I indeed
overlooked the second chunk. I will fix that.

> A more annoying problem happens when an old state file is loaded (prior these
> changes). Last parameters are not defined, params[18] (check-addr), params[19]
> (agent-addr) and params[20] (agent-port) are set to NULL. Thus, HAProxy 
> crashes
> when the health-check address is compared to "-". In fact, this comes from the
> SRV_STATE_FILE_NB_FIELDS_VERSION_1 macro. It should be set to 24. You added 3
> new fields but just incremented it by 1. Hum... No, there is also another
> problem. In srv_state_parse_line(). SRV_STATE_FILE_NB_FIELDS_VERSION_1 must be
> equal to SRV_STATE_FILE_MAX_FIELDS. Or better, it should really reflect the
> argument number of the version 1, so 21. And it must be compared to srv_arg
> instead of arg. It is a bug since the 1.8 (commit 316947196).

ah true I overlooked SRV_STATE_FILE_NB_FIELDS_VERSION_1 meaning.
let me fix that as well.

> However, this means with these changes, a cold restart is required. In the 
> 2.4,
> 5 new fields were added to the server-state file. Is it expected to perform a
> cold restart when upgrading to the 2.4 from a prior major version ? If so, I'm
> ok. But, it may be a good idea to change the state file version then to not
> silently ignore the state file. Otherwise, I guess it is possible to make 
> these
> 5 new fields optional, isn't it ?

good idea. We never did it in the past, but that's probably a good thing to do.

> William, I'm sorry if I'm not really clear but the subject is a bit foggy for 
> me
> :) Do you feel confident to handle all the changes ?

Thanks for the review. I will come back with a v2.

--
William



Re: [ANNOUNCE] haproxy-2.4-dev7

2021-02-07 Thread William Dauchy
On Fri, Feb 5, 2021 at 4:14 PM Willy Tarreau  wrote:
> HAProxy 2.4-dev7 was released on 2021/02/05. It added 153 new commits
> after version 2.4-dev6.
>   - Some significant lifting was done to the Prometheus exporter, including
> new fields, better descriptions and some filtering. I've seen quite a
> bunch pass in front of me but do not well understand what it does, all
> that interests me is that some users are happy with these changes so I
> guess they were long awaited :-)

about that, please note two breaking changes:

- objects' status are no longer a gauge value which you need to
translate manually; instead the state is a label with a proper string
value. The value of the metric simply informs whether the state is
active or not.
so we went from:
  haproxy_server_status{proxy="be_foo",server="srv0"} 2
(which meant MAINT)
to:
  haproxy_server_status{proxy="be_foo",server="srv0",state="DOWN"} 0
  haproxy_server_status{proxy="be_foo",server="srv0",state="UP"} 0
  haproxy_server_status{proxy="be_foo",server="srv0",state="MAINT"} 1
  haproxy_server_status{proxy="be_foo",server="srv0",state="DRAIN"} 0
  haproxy_server_status{proxy="be_foo",server="srv0",state="NOLB"} 0

This change is valid for frontend, backend, server with different label values.

- similar change with health checks where we put a state label:

haproxy_server_check_status{proxy="be_foo",server="srv0",state="HANA"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="SOCKERR"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L4OK"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L4TOUT"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L4CON"} 1
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L6OK"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L6TOUT"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L6RSP"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7TOUT"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7RSP"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7OK"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7OKC"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7STS"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="PROCERR"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="PROCTOUT"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="PROCOK"} 0


It means:
* a lot more metrics for large setup (but you can still filter as
explained in the doc)
* easier use on prometheus side: you will be able to group per state
very easily now.

Generally speaking I'm very interested in feedback regarding this
change. This was motivated by the usage I saw in several companies,
where people were struggling making use of those metrics.

Willy: it is probably a wise idea to keep it for the 2.4 final release
notes, some people might want to know that during their update; a lot
of people have their production alerts on those metrics.

Thanks,
-- 
William



[PATCH 2/2] MEDIUM: contrib/prometheus-exporter: export base stick table stats

2021-02-07 Thread William Dauchy
I saw some people falling back to unix socket to collect some data they
could not find in prometheus exporter. One of them is base info from
stick tables (used/size).
I do not plan to extend it more for now; keys are quite a mess to
handle.

This should resolve github issue #1008.

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/README|  10 ++
 .../prometheus-exporter/service-prometheus.c  | 148 +++---
 reg-tests/contrib/prometheus.vtc  |   4 +
 3 files changed, 142 insertions(+), 20 deletions(-)

diff --git a/contrib/prometheus-exporter/README 
b/contrib/prometheus-exporter/README
index a85981597..d882b092f 100644
--- a/contrib/prometheus-exporter/README
+++ b/contrib/prometheus-exporter/README
@@ -72,6 +72,7 @@ exported. Here are examples:
   /metrics?scope=frontend=backend # ==> Frontend and backend metrics 
will be exported
   /metrics?scope=*=   # ==> no metrics will be exported
   /metrics?scope==global  # ==> global metrics will be exported
+  /metrics?scope=sticktable # ==> stick tables metrics will be 
exported
 
 * How do I prevent my prometheus instance to explode?
 
@@ -320,3 +321,12 @@ See prometheus export for the description of each field.
 | haproxy_server_need_connections_current|
 | haproxy_server_uweight |
 ++
+
+* Stick table metrics
+
+++
+|Metric name |
+++
+| haproxy_sticktable_size|
+| haproxy_sticktable_used|
+++
diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 769389735..521fe1056 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -47,28 +47,33 @@ enum {
 
 /* Prometheus exporter dumper states (appctx->st1) */
 enum {
-PROMEX_DUMPER_INIT = 0, /* initialized */
-PROMEX_DUMPER_GLOBAL,   /* dump metrics of globals */
-PROMEX_DUMPER_FRONT,/* dump metrics of frontend proxies */
-PROMEX_DUMPER_BACK, /* dump metrics of backend proxies */
-PROMEX_DUMPER_LI,   /* dump metrics of listeners */
-PROMEX_DUMPER_SRV,  /* dump metrics of servers */
-   PROMEX_DUMPER_DONE, /* finished */
+   PROMEX_DUMPER_INIT = 0,   /* initialized */
+   PROMEX_DUMPER_GLOBAL, /* dump metrics of globals */
+   PROMEX_DUMPER_FRONT,  /* dump metrics of frontend proxies */
+   PROMEX_DUMPER_BACK,   /* dump metrics of backend proxies */
+   PROMEX_DUMPER_LI, /* dump metrics of listeners */
+   PROMEX_DUMPER_SRV,/* dump metrics of servers */
+   PROMEX_DUMPER_STICKTABLE, /* dump metrics of stick tables */
+   PROMEX_DUMPER_DONE,   /* finished */
 };
 
 /* Prometheus exporter flags (appctx->ctx.stats.flags) */
-#define PROMEX_FL_METRIC_HDR0x0001
-#define PROMEX_FL_INFO_METRIC   0x0002
-#define PROMEX_FL_FRONT_METRIC  0x0004
-#define PROMEX_FL_BACK_METRIC   0x0008
-#define PROMEX_FL_SRV_METRIC0x0010
-#define PROMEX_FL_SCOPE_GLOBAL  0x0020
-#define PROMEX_FL_SCOPE_FRONT   0x0040
-#define PROMEX_FL_SCOPE_BACK0x0080
-#define PROMEX_FL_SCOPE_SERVER  0x0100
-#define PROMEX_FL_NO_MAINT_SRV  0x0200
-
-#define PROMEX_FL_SCOPE_ALL 
(PROMEX_FL_SCOPE_GLOBAL|PROMEX_FL_SCOPE_FRONT|PROMEX_FL_SCOPE_BACK|PROMEX_FL_SCOPE_SERVER)
+#define PROMEX_FL_METRIC_HDR0x0001
+#define PROMEX_FL_INFO_METRIC   0x0002
+#define PROMEX_FL_FRONT_METRIC  0x0004
+#define PROMEX_FL_BACK_METRIC   0x0008
+#define PROMEX_FL_SRV_METRIC0x0010
+#define PROMEX_FL_SCOPE_GLOBAL  0x0020
+#define PROMEX_FL_SCOPE_FRONT   0x0040
+#define PROMEX_FL_SCOPE_BACK0x0080
+#define PROMEX_FL_SCOPE_SERVER  0x0100
+#define PROMEX_FL_NO_MAINT_SRV  0x0200
+#define PROMEX_FL_STICKTABLE_METRIC 0x0400
+#define PROMEX_FL_SCOPE_STICKTABLE  0x0800
+
+#define PROMEX_FL_SCOPE_ALL (PROMEX_FL_SCOPE_GLOBAL | PROMEX_FL_SCOPE_FRONT | \
+PROMEX_FL_SCOPE_BACK | PROMEX_FL_SCOPE_SERVER | \
+PROMEX_FL_SCOPE_STICKTABLE)
 
 /* Promtheus metric type (gauge or counter) */
 enum promex_mt_type {
@@ -298,6 +303,25 @@ const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] 
= {
[ST_F_TT_MAX] = IST("Maximum observed total request+response 
time (request+queue+connect+response+processing)"),
 };
 
+/* stick table base fields */
+enum sticktable_field {
+   STICKTABLE_SIZE = 0,
+   STICKTABLE_USED,
+   /* must always be the last one */
+   STICKTABLE_TOTA

[PATCH 1/2] MINOR: contrib/prometheus-exporter: use stats desc when possible followup

2021-02-07 Thread William Dauchy
Remove remaining descrition which are common to stats.c.

This patch is a followup of commit
82b2ce2f967d967139adb7afab064416fadad615 ("MINOR:
contrib/prometheus-exporter: use stats desc when possible"). I probably
messed up with one of my rebase because I'm pretty sure I removed them
at some point, but who knows what happened.

Signed-off-by: William Dauchy 
---
 .../prometheus-exporter/service-prometheus.c  | 35 ---
 1 file changed, 35 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 126962f5e..769389735 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -284,42 +284,7 @@ const struct promex_metric 
promex_st_metrics[ST_F_TOTAL_FIELDS] = {
 
 /* Description of overriden stats fields */
 const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = {
-   [ST_F_PXNAME] = IST("The proxy name."),
-   [ST_F_SVNAME] = IST("The service name (FRONTEND for frontend, 
BACKEND for backend, any name for server/listener)."),
-   [ST_F_QCUR]   = IST("Current number of queued requests."),
-   [ST_F_QMAX]   = IST("Maximum observed number of queued 
requests."),
-   [ST_F_SCUR]   = IST("Current number of active sessions."),
-   [ST_F_SMAX]   = IST("Maximum observed number of active 
sessions."),
-   [ST_F_SLIM]   = IST("Configured session limit."),
-   [ST_F_STOT]   = IST("Total number of sessions."),
-   [ST_F_BIN]= IST("Current total of incoming bytes."),
-   [ST_F_BOUT]   = IST("Current total of outgoing bytes."),
-   [ST_F_DREQ]   = IST("Total number of denied requests."),
-   [ST_F_DRESP]  = IST("Total number of denied responses."),
-   [ST_F_EREQ]   = IST("Total number of request errors."),
-   [ST_F_ECON]   = IST("Total number of connection errors."),
-   [ST_F_ERESP]  = IST("Total number of response errors."),
-   [ST_F_WRETR]  = IST("Total number of retry warnings."),
-   [ST_F_WREDIS] = IST("Total number of redispatch warnings."),
[ST_F_STATUS] = IST("Current status of the service, per state 
label value."),
-   [ST_F_WEIGHT] = IST("Service weight."),
-   [ST_F_ACT]= IST("Current number of active servers."),
-   [ST_F_BCK]= IST("Current number of backup servers."),
-   [ST_F_CHKFAIL]= IST("Total number of failed check (Only counts 
checks failed when the server is up)."),
-   [ST_F_CHKDOWN]= IST("Total number of UP->DOWN transitions."),
-   [ST_F_LASTCHG]= IST("Number of seconds since the last UP<->DOWN 
transition."),
-   [ST_F_DOWNTIME]   = IST("Total downtime (in seconds) for the 
service."),
-   [ST_F_QLIMIT] = IST("Configured maxqueue for the server (0 
meaning no limit)."),
-   [ST_F_PID]= IST("Process id (0 for first instance, 1 for 
second, ...)"),
-   [ST_F_IID]= IST("Unique proxy id."),
-   [ST_F_SID]= IST("Server id (unique inside a proxy)."),
-   [ST_F_THROTTLE]   = IST("Current throttle percentage for the 
server, when slowstart is active, or no value if not in slowstart."),
-   [ST_F_LBTOT]  = IST("Total number of times a service was 
selected, either for new sessions, or when redispatching."),
-   [ST_F_TRACKED]= IST("Id of proxy/server if tracking is 
enabled."),
-   [ST_F_TYPE]   = IST("Service type (0=frontend, 1=backend, 
2=server, 3=socket/listener)."),
-   [ST_F_RATE]   = IST("Current number of sessions per second over 
last elapsed second."),
-   [ST_F_RATE_LIM]   = IST("Configured limit on new sessions per 
second."),
-   [ST_F_RATE_MAX]   = IST("Maximum observed number of sessions per 
second."),
[ST_F_CHECK_STATUS]   = IST("Status of last health check, per state 
label value."),
[ST_F_CHECK_CODE] = IST("layer5-7 code, if available of the last 
health check."),
[ST_F_CHECK_DURATION] = IST("Total duration of the latest server health 
check, in seconds."),
-- 
2.30.0




[PATCH 2/6] CLEANUP: tools: typo in `strl2irc` mention

2021-02-06 Thread William Dauchy
`str2irc` does not exist

Signed-off-by: William Dauchy 
---
 src/tools.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/tools.c b/src/tools.c
index 8fef15b4d..2d40d8910 100644
--- a/src/tools.c
+++ b/src/tools.c
@@ -2178,7 +2178,7 @@ int strl2irc(const char *s, int len, int *ret)
  * applications designed for hostile environments. It returns zero when the
  * number has successfully been converted, non-zero otherwise. When an error
  * is returned, the  value is left untouched. It is about 3 times slower
- * than str2irc().
+ * than strl2irc().
  */
 
 int strl2llrc(const char *s, int len, long long *ret)
-- 
2.30.0




[PATCH 5/6] MEDIUM: server: support {check,agent}_addr, agent_port in server state

2021-02-06 Thread William Dauchy
logical followup from cli commands addition, so that the state server
file stays compatible with the changes made at runtime; use previously
added helper to load server attributes.

Signed-off-by: William Dauchy 
---
 doc/management.txt|  5 ++-
 include/haproxy/server-t.h|  9 ++--
 .../checks/1be_40srv_odd_health_checks.vtc|  2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|  2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |  6 +--
 src/proxy.c   | 41 +++
 src/server.c  | 30 --
 7 files changed, 57 insertions(+), 38 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index 423c614b2..60e25c7e1 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -2455,7 +2455,10 @@ show servers state []
  srv_port:Server port.
  srvrecord:   DNS SRV record associated to this SRV.
  srv_use_ssl: use ssl for server connections.
- srv_check_port:  Server check port.
+ srv_check_port:  Server health check port.
+ srv_check_addr:  Server health check address.
+ srv_agent_addr:  Server health agent address.
+ srv_agent_port:  Server health agent port.
 
 show sess
   Dump all known sessions. Avoid doing this on slow connections as this can
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index 32697a9c4..102eb4483 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -126,10 +126,13 @@ enum srv_initaddr {
 "srv_port "   \
 "srvrecord "  \
 "srv_use_ssl "\
-"srv_check_port"
+"srv_check_port " \
+"srv_check_addr " \
+"srv_agent_addr " \
+"srv_agent_port"
 
-#define SRV_STATE_FILE_MAX_FIELDS 22
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21
+#define SRV_STATE_FILE_MAX_FIELDS 25
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 22
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc 
b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
index f01205295..c279972aa 100644
--- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc
+++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
@@ -112,6 +112,6 @@ syslog S -wait
 
 haproxy h1 -cli {
 send "show servers state"
-expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state 
srv_admin_state srv_uweight srv_iweight srv_time_since_last_change 
srv_check_status srv_check_result srv_check_health srv_check_state 
srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord 
srv_use_ssl srv_check_port\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 
0 0 0 0 - ${s0_port} - 0 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s1_port} - 0 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s2_port} - 0 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0 0\n2 be1 5 srv4 
${s4_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0 0\n2 be1 6 srv5 
${s5_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0 0\n2 
be1 7 srv6 ${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0 0\n2 
be1 8 srv7 ${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - 
${s7_port} - 0 0\n2 be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s8_port} - 0 0\n2 be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ 
){3}0 0 0 - ${s9_port} - 0 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 
0 1 0 0 0 0 - ${s10_port} - 0 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0 0\n2 be1 13 srv12 
${s12_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0 0\n2 be1 14 
srv13 ${s13_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} 
- 0 0\n2 be1 15 srv14 ${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s14_port} - 0 0\n2 be1 16 srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s15_port} - 0 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s16_port} - 0 0\n2 be1 18 srv17 ${s17_addr} 2 0 
1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0 0\n2 be1 19 srv18 
${s18_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0 0\n2 be1 20 
srv19 ${s19_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} 
- 0 0\n2 be1 21 srv20 ${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s20_port} - 0 0\n2 be1 22 srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s21_port} - 0 0\n2 be1 23 srv22 ${s22_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s22_port} - 0 0\n2 be1 24 srv23 ${s23_addr} 2 0 
1 1 [[:digit:

[PATCH 4/6] MEDIUM: cli: add agent-port command

2021-02-06 Thread William Dauchy
this patch allows to set agent port at runtime. In order to align with
both `addr` and `check-addr` commands, also add the possibility to
optionnaly set port on `agent-addr` command. This led to a small
refactor in order to use the same function for both `agent-addr` and
`agent-port` commands.

Signed-off-by: William Dauchy 
---
 doc/management.txt |  6 +++-
 src/server.c   | 77 --
 2 files changed, 73 insertions(+), 10 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index bff770e4e..423c614b2 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -1828,10 +1828,14 @@ set server / agent [ up | down ]
   switch a server's state regardless of some slow agent checks for example.
   Note that the change is propagated to tracking servers if any.
 
-set server / agent-addr 
+set server / agent-addr  [port ]
   Change addr for servers agent checks. Allows to migrate agent-checks to
   another address at runtime. You can specify both IP and hostname, it will be
   resolved.
+  Optionally, change the port agent.
+
+set server / agent-port 
+  Change the port used for agent checks.
 
 set server / agent-send 
   Change agent string sent to agent check target. Allows to update string while
diff --git a/src/server.c b/src/server.c
index 533755f1e..a983d5d68 100644
--- a/src/server.c
+++ b/src/server.c
@@ -56,6 +56,8 @@ static int srv_state_get_version(FILE *f);
 static void srv_cleanup_connections(struct server *srv);
 static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
 const char *port);
+static const char *update_server_agent_addr_port(struct server *s, const char 
*addr,
+const char *port);
 
 /* List head of all known server keywords */
 static struct srv_kw_list srv_keywords = {
@@ -3573,6 +3575,47 @@ int update_server_addr(struct server *s, void *ip, int 
ip_sin_family, const char
return 0;
 }
 
+/* update agent health check address and port
+ * addr can be ip4/ip6 or a hostname
+ * must be called with the server lock held.
+ */
+static const char *update_server_agent_addr_port(struct server *s, const char 
*addr,
+const char *port)
+{
+   struct sockaddr_storage sk;
+   struct buffer *msg;
+   int new_port;
+
+   msg = get_trash_chunk();
+
+   if (!(s->agent.state & CHK_ST_ENABLED)) {
+   chunk_appendf(msg, "agent checks are not enabled on this 
server.\n");
+   goto out;
+   }
+
+   if (addr) {
+   memset(, 0, sizeof(struct sockaddr_storage));
+   if (str2ip(addr, ) == NULL) {
+   chunk_appendf(msg, "invalid addr '%s'\n", addr);
+   goto out;
+   }
+   set_srv_agent_addr(s, );
+   }
+   if (port) {
+   if (strl2irc(port, strlen(port), _port) != 0) {
+   chunk_appendf(msg, "provided port is not an integer\n");
+   goto out;
+   }
+   if (new_port < 0 || new_port > 65535) {
+   chunk_appendf(msg, "provided port is invalid\n");
+   goto out;
+   }
+   set_srv_agent_port(s, new_port);
+   }
+out:
+   return msg->area;
+}
+
 /* update server health check address and port
  * addr must be ip4 or ip6, it won't be resolved
  * must be called with the server lock held.
@@ -4428,15 +4471,31 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
cli_err(appctx, "'set server  agent' expects 'up' 
or 'down'.\n");
}
else if (strcmp(args[3], "agent-addr") == 0) {
-   struct sockaddr_storage sk;
-
-   memset(, 0, sizeof(sk));
-   if (!(sv->agent.state & CHK_ST_ENABLED))
-   cli_err(appctx, "agent checks are not enabled on this 
server.\n");
-   else if (str2ip(args[4], ))
-   set_srv_agent_addr(sv, );
-   else
-   cli_err(appctx, "incorrect addr address given for 
agent.\n");
+   char *addr = NULL;
+   char *port = NULL;
+   if (strlen(args[4]) == 0) {
+   cli_err(appctx, "set server / agent-addr requires"
+   " an address and optionally a port.\n");
+   goto out_unlock;
+   }
+   addr = args[4];
+   if (strcmp(args[5], "port") == 0)
+   port = args[6];
+   warning = update_server_agent_addr_port(sv, addr, port);
+   if (warning)
+   cli_

[PATCH 0/6] cli commands coherency

2021-02-06 Thread William Dauchy
Hello,

This is a followup from last week cleaning regarding check and agent
check. This patch series brings some more coherency on the CLI side. I
also put some minor cleaning.

William Dauchy (6):
  CLEANUP: check: fix some typo in comments
  CLEANUP: tools: typo in `strl2irc` mention
  MEDIUM: cli: add check-addr command
  MEDIUM: cli: add agent-port command
  MEDIUM: server: support {check,agent}_addr, agent_port in server state
  CLEANUP: server: add missing space in server-state error output

 doc/management.txt|  15 +-
 include/haproxy/server-t.h|   9 +-
 .../checks/1be_40srv_odd_health_checks.vtc|   2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|   2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |   6 +-
 src/check.c   |  18 +-
 src/proxy.c   |  41 ++--
 src/server.c  | 192 ++
 src/tools.c   |   2 +-
 9 files changed, 213 insertions(+), 74 deletions(-)

-- 
2.30.0




[PATCH 1/6] CLEANUP: check: fix some typo in comments

2021-02-06 Thread William Dauchy
a few obvious english typo in comments, some of which introduced by
myself quite recently

Signed-off-by: William Dauchy 
---
 src/check.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/check.c b/src/check.c
index edb2ac29f..5de867d7f 100644
--- a/src/check.c
+++ b/src/check.c
@@ -1004,7 +1004,7 @@ int check_buf_available(void *target)
 }
 
 /*
- * Allocate a buffer. If if fails, it adds the check in buffer wait queue.
+ * Allocate a buffer. If it fails, it adds the check in buffer wait queue.
  */
 struct buffer *check_get_buf(struct check *check, struct buffer *bptr)
 {
@@ -1211,10 +1211,10 @@ static int start_checks()
 
srand((unsigned)time(NULL));
 
-   /*
-* 2- start them as far as possible from each others. For this, we will
-* start them after their interval set to the min interval divided by
-* the number of servers, weighted by the server's position in the list.
+   /* 2- start them as far as possible from each other. For this, we will
+* start them after their interval is set to the min interval divided
+* by the number of servers, weighted by the server's position in the
+* list.
 */
for (px = proxies_list; px; px = px->next) {
if ((px->options2 & PR_O2_CHK_ANY) == PR_O2_EXT_CHK) {
@@ -1261,7 +1261,7 @@ static int srv_check_healthcheck_port(struct check *chk)
 
srv = chk->server;
 
-   /* by default, we use the health check port ocnfigured */
+   /* by default, we use the health check port configured */
if (chk->port > 0)
return chk->port;
 
@@ -1734,14 +1734,14 @@ int set_srv_agent_send(struct server *srv, const char 
*send)
return 0;
 }
 
-/* set agent addr and apprropriate flag */
+/* set agent addr and appropriate flag */
 inline void set_srv_agent_addr(struct server *srv, struct sockaddr_storage *sk)
 {
srv->agent.addr = *sk;
srv->flags |= SRV_F_AGENTADDR;
 }
 
-/* set agent port and apprropriate flag */
+/* set agent port and appropriate flag */
 inline void set_srv_agent_port(struct server *srv, int port)
 {
srv->agent.port = port;
@@ -2092,7 +2092,7 @@ static struct srv_kw_list srv_kws = { "CHK", { }, {
{ "check-via-socks4",srv_parse_check_via_socks4,0,  1 }, /* 
Enable socks4 proxy for health checks */
{ "no-agent-check",  srv_parse_no_agent_check,  0,  1 }, /* Do 
not enable any auxiliary agent check */
{ "no-check",srv_parse_no_check,0,  1 }, /* 
Disable health checks */
-   { "no-check-send-proxy", srv_parse_no_check_send_proxy, 0,  1 }, /* 
Disable PROXY protol for health checks */
+   { "no-check-send-proxy", srv_parse_no_check_send_proxy, 0,  1 }, /* 
Disable PROXY protocol for health checks */
{ "rise",srv_parse_check_rise,  1,  1 }, /* Set 
rise value for health checks */
{ "fall",srv_parse_check_fall,  1,  1 }, /* Set 
fall value for health checks */
{ "inter",   srv_parse_check_inter, 1,  1 }, /* Set 
inter value for health checks */
-- 
2.30.0




[PATCH 6/6] CLEANUP: server: add missing space in server-state error output

2021-02-06 Thread William Dauchy
a space was missing in the output to make it more readable.

Signed-off-by: William Dauchy 
---
 src/server.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/server.c b/src/server.c
index 42191eda5..33375e638 100644
--- a/src/server.c
+++ b/src/server.c
@@ -3017,7 +3017,7 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
  out:
if (msg->data) {
chunk_appendf(msg, "\n");
-   ha_warning("server-state application failed for server 
'%s/%s'%s",
+   ha_warning("server-state application failed for server '%s/%s' 
%s",
   srv->proxy->id, srv->id, msg->area);
}
 }
-- 
2.30.0




[PATCH 3/6] MEDIUM: cli: add check-addr command

2021-02-06 Thread William Dauchy
this patch allows to set server health check address at runtime. In
order to align with `addr` command, also allow to set port optionnaly.
This led to a small refactor in order to use the same function for both
`check-addr` and `check-port` commands.

This command becomes more and more useful for people having a consul
like architecture:
- the backend server is located on a container with its own IP
- the health checks are done the consul instance located on the host
  with the host IP

Signed-off-by: William Dauchy 
---
 doc/management.txt |  4 +++
 src/server.c   | 83 +-
 2 files changed, 72 insertions(+), 15 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index b74aba769..bff770e4e 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -1842,6 +1842,10 @@ set server / health [ up | stopping | 
down ]
   switch a server's state regardless of some slow health checks for example.
   Note that the change is propagated to tracking servers if any.
 
+set server / check-addr  [port ]
+  Change the IP address used for server health checks.
+  Optionally, change the port used for server health checks.
+
 set server / check-port 
   Change the port used for health checking to 
 
diff --git a/src/server.c b/src/server.c
index da2325e9a..533755f1e 100644
--- a/src/server.c
+++ b/src/server.c
@@ -54,6 +54,8 @@ static int srv_set_fqdn(struct server *srv, const char *fqdn, 
int dns_locked);
 static void srv_state_parse_line(char *buf, const int version, char **params, 
char **srv_params);
 static int srv_state_get_version(FILE *f);
 static void srv_cleanup_connections(struct server *srv);
+static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
+const char *port);
 
 /* List head of all known server keywords */
 static struct srv_kw_list srv_keywords = {
@@ -3571,6 +3573,47 @@ int update_server_addr(struct server *s, void *ip, int 
ip_sin_family, const char
return 0;
 }
 
+/* update server health check address and port
+ * addr must be ip4 or ip6, it won't be resolved
+ * must be called with the server lock held.
+ */
+static const char *update_server_check_addr_port(struct server *s, const char 
*addr,
+const char *port)
+{
+   struct sockaddr_storage sk;
+   struct buffer *msg;
+   int new_port;
+
+   msg = get_trash_chunk();
+
+   if (addr) {
+   memset(, 0, sizeof(struct sockaddr_storage));
+   if (str2ip2(addr, , 0) == NULL) {
+   chunk_appendf(msg, "invalid addr '%s'\n", addr);
+   goto out;
+   }
+   s->check.addr = sk;
+   }
+   if (port) {
+   if (strl2irc(port, strlen(port), _port) != 0) {
+   chunk_appendf(msg, "provided port is not an integer\n");
+   goto out;
+   }
+   if (new_port < 0 || new_port > 65535) {
+   chunk_appendf(msg, "provided port is invalid\n");
+   goto out;
+   }
+   /* prevent the update of port to 0 if MAPPORTS are in use */
+   if ((s->flags & SRV_F_MAPPORTS) && new_port == 0) {
+   chunk_appendf(msg, "can't unset 'port' since MAPPORTS 
is in use\n");
+   goto out;
+   }
+   s->check.port = new_port;
+   }
+out:
+   return msg->area;
+}
+
 /*
  * This function update a server's addr and port only for AF_INET and AF_INET6 
families.
  *
@@ -4403,23 +4446,32 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
cli_err(appctx, "cannot allocate memory for new 
string.\n");
}
}
-   else if (strcmp(args[3], "check-port") == 0) {
-   int i = 0;
-   if (strl2irc(args[4], strlen(args[4]), ) != 0) {
-   cli_err(appctx, "'set server  check-port' expects 
an integer as argument.\n");
-   goto out_unlock;
-   }
-   if ((i < 0) || (i > 65535)) {
-   cli_err(appctx, "provided port is not valid.\n");
+   else if (strcmp(args[3], "check-addr") == 0) {
+   char *addr = NULL;
+   char *port = NULL;
+   if (strlen(args[4]) == 0) {
+   cli_err(appctx, "set server / check-addr requires"
+   " an address and optionally a port.\n");
goto out_unlock;
}
-   /* prevent the update of port to 0 if MAPPORTS are in use */
-   if ((sv->flags & SRV_F_MAPPORTS) && (

Re: [PATCH v3 0/5] fix check port/addr consistency

2021-02-04 Thread William Dauchy
Hello Christopher,

Thanks for your continuous support on this review :)

On Thu, Feb 4, 2021 at 10:09 AM Christopher Faulet  wrote:
> Thanks William, it seems good. I've just a question (sorry :). When the server
> state file is loaded, why the check port is set only if there is a DNS
> resolution ? I know it was done this way before the support of check_port
> parameter in the server state. But I suppose that now we support it, it should
> always be set, isn't it ?

yes you are totally right.

> And is there any usage to add "agent-addr" in the server state file? Because 
> it
> can also be set on the CLI. And later, same question will appear for
> "check-addr" and "agent-port".

the general rule is indeed, anything which can be changed through the
CLI should be loaded through server states. Truth is server states
becomes a nightmare to manage; I don't remember if you were in the
discussions a few months ago, but we concluded we should change that
long term by https://github.com/haproxy/haproxy/issues/953
So while waiting for a proper solution, we are indeed supposed to keep
server states up to date.

> I don't want to bother you again. So, I propose you to merge the first patch 
> and
> to add a new one to not set the check port when the server state file is 
> loaded.
> Then I can merge the third patch and amend the second one to move the check 
> port
> assignment before merging it. And finally I can merge the fourth and fifth 
> patches.

Don't worry, I can send you a v4.
--
William



[PATCH v3 5/5] MEDIUM: check: align agentaddr and agentport behaviour

2021-02-03 Thread William Dauchy
in the same manner of agentaddr, we now:
- permit to set agentport through `port` keyword, like it is the case
  for agentaddr through `addr`
- set the priority on `agent-port` keyword when used
- add a flag to be able to test when the value is set like for agentaddr

it makes the behaviour between `addr` and `port` more consistent.

Signed-off-by: William Dauchy 
---
 doc/configuration.txt  | 10 +-
 include/haproxy/check.h|  1 +
 include/haproxy/server-t.h |  1 +
 src/check.c| 12 +++-
 4 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 2abd684fa..eb685785d 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -14061,11 +14061,11 @@ pool-purge-delay 
 
 port 
   Using the "port" parameter, it becomes possible to use a different port to
-  send health-checks. On some servers, it may be desirable to dedicate a port
-  to a specific component able to perform complex tests which are more suitable
-  to health-checks than the application. It is common to run a simple script in
-  inetd for instance. This parameter is ignored if the "check" parameter is not
-  set. See also the "addr" parameter.
+  send health-checks or to probe the agent-check. On some servers, it may be
+  desirable to dedicate a port to a specific component able to perform complex
+  tests which are more suitable to health-checks than the application. It is
+  common to run a simple script in inetd for instance. This parameter is
+  ignored if the "check" parameter is not set. See also the "addr" parameter.
 
 proto 
   Forces the multiplexer's protocol to use for the outgoing connections to this
diff --git a/include/haproxy/check.h b/include/haproxy/check.h
index ed8124470..ffeef4e22 100644
--- a/include/haproxy/check.h
+++ b/include/haproxy/check.h
@@ -53,6 +53,7 @@ int spoe_handle_healthcheck_response(char *frame, size_t 
size, char *err, int er
 
 int set_srv_agent_send(struct server *srv, const char *send);
 void set_srv_agent_addr(struct server *srv, struct sockaddr_storage *sk);
+void set_srv_agent_port(struct server *srv, int port);
 
 /* Use this one only. This inline version only ensures that we don't
  * call the function when the observe mode is disabled.
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index 45f41959c..32697a9c4 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -138,6 +138,7 @@ enum srv_initaddr {
 #define SRV_F_NON_STICK0x0004/* never add connections allocated to 
this server to a stick table */
 #define SRV_F_USE_NS_FROM_PP 0x0008  /* use namespace associated with 
connection if present */
 #define SRV_F_FORCED_ID0x0010/* server's ID was forced in the 
configuration */
+#define SRV_F_AGENTPORT0x0040/* this server has a agent port 
configured */
 #define SRV_F_AGENTADDR0x0080/* this server has a agent addr 
configured */
 #define SRV_F_COOKIESET0x0100/* this server has a cookie 
configured, so don't generate dynamic cookies */
 #define SRV_F_FASTOPEN 0x0200/* Use TCP Fast Open to connect to 
server */
diff --git a/src/check.c b/src/check.c
index 31f108bff..194f3b4a6 100644
--- a/src/check.c
+++ b/src/check.c
@@ -1697,7 +1697,7 @@ static int srv_parse_agent_port(char **args, int 
*cur_arg, struct proxy *curpx,
}
 
global.maxsock++;
-   srv->agent.port = atol(args[*cur_arg+1]);
+   set_srv_agent_port(srv, atol(args[*cur_arg + 1]));
 
   out:
return err_code;
@@ -1741,6 +1741,13 @@ inline void set_srv_agent_addr(struct server *srv, 
struct sockaddr_storage *sk)
srv->flags |= SRV_F_AGENTADDR;
 }
 
+/* set agent port and apprropriate flag */
+inline void set_srv_agent_port(struct server *srv, int port)
+{
+   srv->agent.port = port;
+   srv->flags |= SRV_F_AGENTPORT;
+}
+
 /* Parse the "agent-send" server keyword */
 static int srv_parse_agent_send(char **args, int *cur_arg, struct proxy 
*curpx, struct server *srv,
char **errmsg)
@@ -2060,6 +2067,9 @@ static int srv_parse_check_port(char **args, int 
*cur_arg, struct proxy *curpx,
 
global.maxsock++;
srv->check.port = atol(args[*cur_arg+1]);
+   /* if agentport was never set, we can use port */
+   if (!(srv->flags & SRV_F_AGENTPORT))
+   set_srv_agent_port(srv, srv->check.port);
 
   out:
return err_code;
-- 
2.30.0




[PATCH v3 3/5] MEDIUM: check: remove checkport checkaddr flag

2021-02-03 Thread William Dauchy
While trying to fix some consistency problem with the config file/cli
(e.g. check-port cli command does not set the flag), we realised
checkport flag was not necessarily needed. Indeed tcpcheck uses service
port as the last choice if check.port is zero. So we can assume if
check.port is zero, it means it was never set by the user, regardless if
it is by the cli or config file.  In the longterm this will avoid to
introduce a new consistency issue if we forget to set the flag.

in the same manner of checkport flag, we don't really need checkaddr
flag. We can assume if checkaddr is not set, it means it was never set
by the user or config.

Signed-off-by: William Dauchy 
---
 include/haproxy/server-t.h | 2 --
 src/check.c| 2 --
 src/dns.c  | 3 +--
 src/server.c   | 7 ++-
 4 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index e42b1c7ed..45f41959c 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -138,8 +138,6 @@ enum srv_initaddr {
 #define SRV_F_NON_STICK0x0004/* never add connections allocated to 
this server to a stick table */
 #define SRV_F_USE_NS_FROM_PP 0x0008  /* use namespace associated with 
connection if present */
 #define SRV_F_FORCED_ID0x0010/* server's ID was forced in the 
configuration */
-#define SRV_F_CHECKADDR0x0020/* this server has a check addr 
configured */
-#define SRV_F_CHECKPORT0x0040/* this server has a check port 
configured */
 #define SRV_F_AGENTADDR0x0080/* this server has a agent addr 
configured */
 #define SRV_F_COOKIESET0x0100/* this server has a cookie 
configured, so don't generate dynamic cookies */
 #define SRV_F_FASTOPEN 0x0200/* Use TCP Fast Open to connect to 
server */
diff --git a/src/check.c b/src/check.c
index 879fe84ce..e24050ff2 100644
--- a/src/check.c
+++ b/src/check.c
@@ -1527,7 +1527,6 @@ static int srv_parse_addr(char **args, int *cur_arg, 
struct proxy *curpx, struct
}
 
srv->check.addr = srv->agent.addr = *sk;
-   srv->flags |= SRV_F_CHECKADDR;
srv->flags |= SRV_F_AGENTADDR;
 
   out:
@@ -2050,7 +2049,6 @@ static int srv_parse_check_port(char **args, int 
*cur_arg, struct proxy *curpx,
 
global.maxsock++;
srv->check.port = atol(args[*cur_arg+1]);
-   srv->flags |= SRV_F_CHECKPORT;
 
   out:
return err_code;
diff --git a/src/dns.c b/src/dns.c
index 8c97df46b..8fc9089e7 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -712,8 +712,7 @@ static void dns_check_dns_response(struct dns_resolution 
*res)
 
srv->svc_port = item->port;
srv->flags   &= ~SRV_F_MAPPORTS;
-   if ((srv->check.state & CHK_ST_CONFIGURED) &&
-   !(srv->flags & SRV_F_CHECKPORT))
+   if (!srv->check.port)
srv->check.port = item->port;
 
if (!srv->dns_opts.ignore_weight) {
diff --git a/src/server.c b/src/server.c
index d8216058f..1fd71e403 100644
--- a/src/server.c
+++ b/src/server.c
@@ -1660,8 +1660,6 @@ static void srv_settings_cpy(struct server *srv, struct 
server *src, int srv_tmp
srv->flags   |= src->flags;
srv->do_check = src->do_check;
srv->do_agent = src->do_agent;
-   if (srv->check.port)
-   srv->flags |= SRV_F_CHECKPORT;
srv->check.inter  = src->check.inter;
srv->check.fastinter  = src->check.fastinter;
srv->check.downinter  = src->check.downinter;
@@ -2985,8 +2983,7 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
}
 
/* configure check.port accordingly */
-   if ((srv->check.state & CHK_ST_CONFIGURED) &&
-   !(srv->flags & SRV_F_CHECKPORT))
+   if (!srv->check.port)
srv->check.port = port_check;
 
/* Unset SRV_F_MAPPORTS for SRV records.
@@ -3673,7 +3670,7 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
 * we're switching from a fixed port to a 
SRV_F_MAPPORTS (mapped) port
 * prevent PORT change if check doesn't have 
it's dedicated port while switching
 * to port mapping */
-   if ((s->check.state & CHK_ST_CONFIGURED) && 
!(s->flags & SRV_F_CHECKPORT)) {
+  

[PATCH v3 0/5] fix check port/addr consistency

2021-02-03 Thread William Dauchy
Hello Christopher,

Here is my last update on port/addr consistency. I addressed all the
point you mentioned. I hope I did not forgot anything.  I will come back
with `check-addr` and `agent-port` on the cli once those patches are
accepted.

William Dauchy (5):
  BUG/MINOR: cli: fix set server addr/port coherency with health checks
  MEDIUM: server: adding support for check_port in server state
  MEDIUM: check: remove checkport checkaddr flag
  BUG/MINOR: check: consitent way to set agentaddr
  MEDIUM: check: align agentaddr and agentport behaviour

 doc/configuration.txt | 10 +--
 doc/management.txt|  1 +
 include/haproxy/check.h   |  2 +
 include/haproxy/server-t.h| 10 +--
 .../checks/1be_40srv_odd_health_checks.vtc|  2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|  2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |  6 +-
 src/check.c   | 33 --
 src/dns.c |  3 +-
 src/proxy.c   |  4 +-
 src/server.c  | 64 +--
 11 files changed, 78 insertions(+), 59 deletions(-)

-- 
2.30.0




[PATCH v3 1/5] BUG/MINOR: cli: fix set server addr/port coherency with health checks

2021-02-03 Thread William Dauchy
while reading `update_server_addr_port` I found out some things which
can be seen as incoherency. I hope I did not overlooked anything:

- one comment is stating check's address should be updated if it uses
  the server one; however the condition checks if `SRV_F_CHECKADDR` is
  set; this flag is set when a check address is set; result is that we
  override the check address where I was not expecting it. In fact we
  don't need to update anything here as server addr is used when check
  addr is not set.
- same goes for check agent addr
- for port, it is a bit different, we update the check port if it is
  unset. This is harmless because we also use server port if check port
  is unset. However it creates some incoherency before/after using this
  command, as check port should stay unset througout the life of the
  process unless it is is set by `set server check-port` command.

quite hard to locate the origin of this this issue but the function was
introduced in commit d458adcc52b74608e2fe6a2a95f09ce5e94932b7 ("MINOR:
new update_server_addr_port() function to change both server's ADDR and
service PORT"). I was however not able to determine whether this is due
to a change of behavior along the years. So this patch can potentially
be backported up to v1.8 but we must be careful while doing so, as the
code has changed a lot. That being said, the bug being not very
impacting I would be fine keeping it for 2.4 only.

Signed-off-by: William Dauchy 
---
 src/server.c | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/src/server.c b/src/server.c
index 10f528640..99b7e9181 100644
--- a/src/server.c
+++ b/src/server.c
@@ -3625,16 +3625,6 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
ipcpy(, >addr);
changed = 1;
 
-   /* we also need to update check's ADDR only if it uses the 
server's one */
-   if ((s->check.state & CHK_ST_CONFIGURED) && (s->flags & 
SRV_F_CHECKADDR)) {
-   ipcpy(, >check.addr);
-   }
-
-   /* we also need to update agent ADDR only if it use the 
server's one */
-   if ((s->agent.state & CHK_ST_CONFIGURED) && (s->flags & 
SRV_F_AGENTADDR)) {
-   ipcpy(, >agent.addr);
-   }
-
/* update report for caller */
chunk_printf(msg, "IP changed from '%s' to '%s'", current_addr, 
addr);
}
@@ -3714,11 +3704,6 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
}
 
chunk_appendf(msg, "%d'", new_port);
-
-   /* we also need to update health checks port only if it 
uses server's realport */
-   if ((s->check.state & CHK_ST_CONFIGURED) && !(s->flags 
& SRV_F_CHECKPORT)) {
-   s->check.port = new_port;
-   }
}
else {
chunk_appendf(msg, "no need to change the port");
-- 
2.30.0




[PATCH v3 4/5] BUG/MINOR: check: consitent way to set agentaddr

2021-02-03 Thread William Dauchy
small consistency problem with `addr` and `agent-addr` options:
for the both options, the last one parsed is always used to set the
agent-check addr.  Thus these two lines don't have the same behavior:

  server ... addr  agent-addr 
  server ... agent-addr  addr 

After this patch `agent-addr` will always be the priority option over
`addr`. It means we test the flag before setting agentaddr.
We also fix all the places where we did not set the flag to be coherent
everywhere.

I was not really able to determine where this issue is coming from. So
it is probable we may backport it to all stable version where the agent
is supported.

Signed-off-by: William Dauchy 
---
 include/haproxy/check.h |  1 +
 src/check.c | 19 +++
 src/server.c|  7 ++-
 3 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/include/haproxy/check.h b/include/haproxy/check.h
index 8d70040a3..ed8124470 100644
--- a/include/haproxy/check.h
+++ b/include/haproxy/check.h
@@ -52,6 +52,7 @@ int spoe_prepare_healthcheck_request(char **req, int *len);
 int spoe_handle_healthcheck_response(char *frame, size_t size, char *err, int 
errlen);
 
 int set_srv_agent_send(struct server *srv, const char *send);
+void set_srv_agent_addr(struct server *srv, struct sockaddr_storage *sk);
 
 /* Use this one only. This inline version only ensures that we don't
  * call the function when the observe mode is disabled.
diff --git a/src/check.c b/src/check.c
index e24050ff2..31f108bff 100644
--- a/src/check.c
+++ b/src/check.c
@@ -1526,8 +1526,10 @@ static int srv_parse_addr(char **args, int *cur_arg, 
struct proxy *curpx, struct
goto error;
}
 
-   srv->check.addr = srv->agent.addr = *sk;
-   srv->flags |= SRV_F_AGENTADDR;
+   srv->check.addr = *sk;
+   /* if agentaddr was never set, we can use addr */
+   if (!(srv->flags & SRV_F_AGENTADDR))
+   set_srv_agent_addr(srv, sk);
 
   out:
return err_code;
@@ -1537,21 +1539,23 @@ static int srv_parse_addr(char **args, int *cur_arg, 
struct proxy *curpx, struct
goto out;
 }
 
-
 /* Parse the "agent-addr" server keyword */
 static int srv_parse_agent_addr(char **args, int *cur_arg, struct proxy 
*curpx, struct server *srv,
char **errmsg)
 {
+   struct sockaddr_storage sk;
int err_code = 0;
 
if (!*(args[*cur_arg+1])) {
memprintf(errmsg, "'%s' expects an address as argument.", 
args[*cur_arg]);
goto error;
}
-   if(str2ip(args[*cur_arg+1], >agent.addr) == NULL) {
+   memset(, 0, sizeof(sk));
+   if (str2ip(args[*cur_arg + 1], ) == NULL) {
memprintf(errmsg, "parsing agent-addr failed. Check if '%s' is 
correct address.", args[*cur_arg+1]);
goto error;
}
+   set_srv_agent_addr(srv, );
 
   out:
return err_code;
@@ -1730,6 +1734,13 @@ int set_srv_agent_send(struct server *srv, const char 
*send)
return 0;
 }
 
+/* set agent addr and apprropriate flag */
+inline void set_srv_agent_addr(struct server *srv, struct sockaddr_storage *sk)
+{
+   srv->agent.addr = *sk;
+   srv->flags |= SRV_F_AGENTADDR;
+}
+
 /* Parse the "agent-send" server keyword */
 static int srv_parse_agent_send(char **args, int *cur_arg, struct proxy 
*curpx, struct server *srv,
char **errmsg)
diff --git a/src/server.c b/src/server.c
index 1fd71e403..b6b3b0893 100644
--- a/src/server.c
+++ b/src/server.c
@@ -4384,9 +4384,14 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
cli_err(appctx, "'set server  agent' expects 'up' 
or 'down'.\n");
}
else if (strcmp(args[3], "agent-addr") == 0) {
+   struct sockaddr_storage sk;
+
+   memset(, 0, sizeof(sk));
if (!(sv->agent.state & CHK_ST_ENABLED))
cli_err(appctx, "agent checks are not enabled on this 
server.\n");
-   else if (str2ip(args[4], >agent.addr) == NULL)
+   else if (str2ip(args[4], ))
+   set_srv_agent_addr(sv, );
+   else
cli_err(appctx, "incorrect addr address given for 
agent.\n");
}
else if (strcmp(args[3], "agent-send") == 0) {
-- 
2.30.0




[PATCH v3 2/5] MEDIUM: server: adding support for check_port in server state

2021-02-03 Thread William Dauchy
We can currently change the check-port using the cli command `set server
check-port` but there is a consistency issue when using server state.
This patch aims to fix this problem but will be also a good preparation
work to get rid of checkport flag, so we are able to know when checkport
was set by config.

I am fully aware this is not making github #953 moving forward, I
however think this might be acceptable while waiting for a proper
solution and resolve consistency problem faced with port settings.

Signed-off-by: William Dauchy 
---
 doc/management.txt|  1 +
 include/haproxy/server-t.h|  7 ++--
 .../checks/1be_40srv_odd_health_checks.vtc|  2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|  2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |  6 ++--
 src/proxy.c   |  4 +--
 src/server.c  | 35 ---
 7 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index de3fbf607..b74aba769 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -2447,6 +2447,7 @@ show servers state []
  srv_port:Server port.
  srvrecord:   DNS SRV record associated to this SRV.
  srv_use_ssl: use ssl for server connections.
+ srv_check_port:  Server check port.
 
 show sess
   Dump all known sessions. Avoid doing this on slow connections as this can
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index b29c75c0b..e42b1c7ed 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -125,10 +125,11 @@ enum srv_initaddr {
 "srv_fqdn "   \
 "srv_port "   \
 "srvrecord "  \
-"srv_use_ssl"
+"srv_use_ssl "\
+"srv_check_port"
 
-#define SRV_STATE_FILE_MAX_FIELDS 21
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20
+#define SRV_STATE_FILE_MAX_FIELDS 22
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc 
b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
index bd07d8840..f01205295 100644
--- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc
+++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
@@ -112,6 +112,6 @@ syslog S -wait
 
 haproxy h1 -cli {
 send "show servers state"
-expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state 
srv_admin_state srv_uweight srv_iweight srv_time_since_last_change 
srv_check_status srv_check_result srv_check_health srv_check_state 
srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord 
srv_use_ssl\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s0_port} - 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ 
){3}0 0 0 - ${s1_port} - 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 
0 0 0 0 - ${s2_port} - 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0\n2 be1 5 srv4 ${s4_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0\n2 be1 6 srv5 ${s5_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0\n2 be1 7 srv6 
${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0\n2 be1 8 srv7 
${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s7_port} - 0\n2 
be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s8_port} - 0\n2 
be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - 
${s9_port} - 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s10_port} - 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0\n2 be1 13 srv12 ${s12_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0\n2 be1 14 srv13 ${s13_addr} 2 0 1 
1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} - 0\n2 be1 15 srv14 
${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s14_port} - 0\n2 be1 16 
srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s15_port} 
- 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s16_port} - 0\n2 be1 18 srv17 ${s17_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0\n2 be1 19 srv18 ${s18_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0\n2 be1 20 srv19 ${s19_addr} 2 0 1 
1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} - 0\n2 be1 21 srv20 
${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s20_port} - 0\n2 be1 22 
srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s21_port} 
- 0\n2 be1 23 srv22 ${s22_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s22_port} - 0\n2 be1 24 srv23 ${s23_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s23_port} - 0\n2 be1 25 srv24 ${s24_addr} 2 0 1 1 
[[:digit:]]+ 1 

Re: [PATCH v2 0/5] fix check port/addr consistency

2021-02-03 Thread William Dauchy
On Wed, Feb 3, 2021 at 9:59 AM Christopher Faulet  wrote:
> At first glance, I'm just a bit annoyed with the patch 5. In the 
> documentation,
> it is stated that "addr" option will be used for agent-check too. And there is
> no info about interactions between "addr" and "agent-addr" options when both 
> are
> configured. For me, for an agent-check, the "agent-addr" option must be used 
> in
> priority, regardless the options order. If not defined, the "addr" option must
> be used, if defined. And at the end, we use the server address by default if
> none is specified.

ok I missed that part. it is a bit sad honestly, it makes things less explicit.

> There is another problem with "port" and "agent-port" options. It is mentioned
> in the documentation that "agent-port" is required to perform agent-checks. 
> And
> "port" option is not used for agent-check. It is not really consistent. I
> propose to deal with port/agent-port options in the same way than
> addr/agent-addr ones. And we keep the test to be sure to have a dedicated port
> for the agent-check to not use the server's one. This way, we keep backward
> compatibility and improve consistency.
> Thus, if you agree with these changes, I guess we should keep SRV_F_AGENTADDR
> flag and add SRV_F_AGENTPORT.

ok I will come back with a v3. I honestly don't like to have port/addr
used for agent, as stated previously because it is creating some non
explicit things. But indeed as we need to keep backward compat, I will
fix that.

> About the CLI. I agree, the "check-addr" parameter on the "set server" command
> must be added. And the "agent-port" parameter must bee added too. For me, it 
> is
> a consistency matter. But I understand it is also mandatory for dynamic
> environments.

thanks, let me come back with a v3, so we can move forward for the
next improvements.
-- 
William



Re: [PATCH v2 4/5] MEDIUM: get rid of checkaddr and agentaddr flag

2021-02-02 Thread William Dauchy
On Tue, Feb 2, 2021 at 10:57 PM William Dauchy  wrote:
> in the same manner of checkport flag, we found out some consistency
> problem. For instance, SRV_F_AGENTADDR flag is set when the "addr"
> option is set on a server line but not for the "agent-addr" option.
> The issue is similar in the cli where we never set the flag.
> A consequence of this patch is that agentaddr is no longer used, at
> least for now. We might reintroduce it later if we need it.

sorry for that, I forgot "server:" in the subject. will send v3 if
there are other feedbacks
-- 
William



[PATCH v2 4/5] MEDIUM: get rid of checkaddr and agentaddr flag

2021-02-02 Thread William Dauchy
in the same manner of checkport flag, we found out some consistency
problem. For instance, SRV_F_AGENTADDR flag is set when the "addr"
option is set on a server line but not for the "agent-addr" option.
The issue is similar in the cli where we never set the flag.
A consequence of this patch is that agentaddr is no longer used, at
least for now. We might reintroduced it later if we need it.

Signed-off-by: William Dauchy 
---
 include/haproxy/server-t.h | 2 --
 src/check.c| 2 --
 2 files changed, 4 deletions(-)

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index 131b97cb6..382caa218 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -138,8 +138,6 @@ enum srv_initaddr {
 #define SRV_F_NON_STICK0x0004/* never add connections allocated to 
this server to a stick table */
 #define SRV_F_USE_NS_FROM_PP 0x0008  /* use namespace associated with 
connection if present */
 #define SRV_F_FORCED_ID0x0010/* server's ID was forced in the 
configuration */
-#define SRV_F_CHECKADDR0x0020/* this server has a check addr 
configured */
-#define SRV_F_AGENTADDR0x0080/* this server has a agent addr 
configured */
 #define SRV_F_COOKIESET0x0100/* this server has a cookie 
configured, so don't generate dynamic cookies */
 #define SRV_F_FASTOPEN 0x0200/* Use TCP Fast Open to connect to 
server */
 #define SRV_F_SOCKS4_PROXY 0x0400/* this server uses SOCKS4 proxy */
diff --git a/src/check.c b/src/check.c
index f4b11fc46..f4d5bd452 100644
--- a/src/check.c
+++ b/src/check.c
@@ -1527,8 +1527,6 @@ static int srv_parse_addr(char **args, int *cur_arg, 
struct proxy *curpx, struct
}
 
srv->check.addr = srv->agent.addr = *sk;
-   srv->flags |= SRV_F_CHECKADDR;
-   srv->flags |= SRV_F_AGENTADDR;
 
   out:
return err_code;
-- 
2.29.2




[PATCH v2 2/5] MEDIUM: server: adding support for check_port in server state

2021-02-02 Thread William Dauchy
We can currently change the check-port using the cli command `set server
check-port` but there is a consistency issue when using server state.
This patch aims to fix this problem but will be also a good preparation
work to get rid of checkport flag, so we are able to know when checkport
was set by config.

I am fully aware this is not making github #953 moving forward, I
however think this might be acceptable while waiting for a proper
solution and resolve consistency problem faced with port settings.

Signed-off-by: William Dauchy 
---
 doc/management.txt|  1 +
 include/haproxy/server-t.h|  7 ++--
 .../checks/1be_40srv_odd_health_checks.vtc|  2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|  2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |  6 ++--
 src/proxy.c   |  4 +--
 src/server.c  | 35 ---
 7 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index de3fbf607..b74aba769 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -2447,6 +2447,7 @@ show servers state []
  srv_port:Server port.
  srvrecord:   DNS SRV record associated to this SRV.
  srv_use_ssl: use ssl for server connections.
+ srv_check_port:  Server check port.
 
 show sess
   Dump all known sessions. Avoid doing this on slow connections as this can
diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index b29c75c0b..e42b1c7ed 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -125,10 +125,11 @@ enum srv_initaddr {
 "srv_fqdn "   \
 "srv_port "   \
 "srvrecord "  \
-"srv_use_ssl"
+"srv_use_ssl "\
+"srv_check_port"
 
-#define SRV_STATE_FILE_MAX_FIELDS 21
-#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 20
+#define SRV_STATE_FILE_MAX_FIELDS 22
+#define SRV_STATE_FILE_NB_FIELDS_VERSION_1 21
 #define SRV_STATE_LINE_MAXLEN 512
 
 /* server flags -- 32 bits */
diff --git a/reg-tests/checks/1be_40srv_odd_health_checks.vtc 
b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
index bd07d8840..f01205295 100644
--- a/reg-tests/checks/1be_40srv_odd_health_checks.vtc
+++ b/reg-tests/checks/1be_40srv_odd_health_checks.vtc
@@ -112,6 +112,6 @@ syslog S -wait
 
 haproxy h1 -cli {
 send "show servers state"
-expect ~ "# be_id be_name srv_id srv_name srv_addr srv_op_state 
srv_admin_state srv_uweight srv_iweight srv_time_since_last_change 
srv_check_status srv_check_result srv_check_health srv_check_state 
srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord 
srv_use_ssl\n2 be1 1 srv0 ${s0_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s0_port} - 0\n2 be1 2 srv1 ${s1_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ 
){3}0 0 0 - ${s1_port} - 0\n2 be1 3 srv2 ${s2_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 
0 0 0 0 - ${s2_port} - 0\n2 be1 4 srv3 ${s3_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s3_port} - 0\n2 be1 5 srv4 ${s4_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s4_port} - 0\n2 be1 6 srv5 ${s5_addr} 2 0 1 1 
[[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s5_port} - 0\n2 be1 7 srv6 
${s6_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s6_port} - 0\n2 be1 8 srv7 
${s7_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s7_port} - 0\n2 
be1 9 srv8 ${s8_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s8_port} - 0\n2 
be1 10 srv9 ${s9_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - 
${s9_port} - 0\n2 be1 11 srv10 ${s10_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s10_port} - 0\n2 be1 12 srv11 ${s11_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s11_port} - 0\n2 be1 13 srv12 ${s12_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s12_port} - 0\n2 be1 14 srv13 ${s13_addr} 2 0 1 
1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s13_port} - 0\n2 be1 15 srv14 
${s14_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s14_port} - 0\n2 be1 16 
srv15 ${s15_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s15_port} 
- 0\n2 be1 17 srv16 ${s16_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s16_port} - 0\n2 be1 18 srv17 ${s17_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s17_port} - 0\n2 be1 19 srv18 ${s18_addr} 2 0 1 1 
[[:digit:]]+ 1 0 1 0 0 0 0 - ${s18_port} - 0\n2 be1 20 srv19 ${s19_addr} 2 0 1 
1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s19_port} - 0\n2 be1 21 srv20 
${s20_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - ${s20_port} - 0\n2 be1 22 
srv21 ${s21_addr} 2 0 1 1 [[:digit:]]+ 6 ([[:digit:]]+ ){3}0 0 0 - ${s21_port} 
- 0\n2 be1 23 srv22 ${s22_addr} 2 0 1 1 [[:digit:]]+ 1 0 1 0 0 0 0 - 
${s22_port} - 0\n2 be1 24 srv23 ${s23_addr} 2 0 1 1 [[:digit:]]+ 6 
([[:digit:]]+ ){3}0 0 0 - ${s23_port} - 0\n2 be1 25 srv24 ${s24_addr} 2 0 1 1 
[[:digit:]]+ 1 

[PATCH v2 3/5] MEDIUM: server: get rid of checkport flag

2021-02-02 Thread William Dauchy
While trying to fix some consistency problem with the config file/cli
(e.g. check-port cli command does not set the flag), we realised
checkport flag was not necessarily needed. Indeed tcpcheck uses service
port as the last choice if check.port is zero. So we can assume if
check.port is zero, it means it was never set by the user, regardless if
it is by the cli or config file.  In the longterm this will avoid to
introduce a new consistency issue if we forget to set the flag.

Signed-off-by: William Dauchy 
---
 include/haproxy/server-t.h | 1 -
 src/check.c| 1 -
 src/dns.c  | 3 +--
 src/server.c   | 7 ++-
 4 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h
index e42b1c7ed..131b97cb6 100644
--- a/include/haproxy/server-t.h
+++ b/include/haproxy/server-t.h
@@ -139,7 +139,6 @@ enum srv_initaddr {
 #define SRV_F_USE_NS_FROM_PP 0x0008  /* use namespace associated with 
connection if present */
 #define SRV_F_FORCED_ID0x0010/* server's ID was forced in the 
configuration */
 #define SRV_F_CHECKADDR0x0020/* this server has a check addr 
configured */
-#define SRV_F_CHECKPORT0x0040/* this server has a check port 
configured */
 #define SRV_F_AGENTADDR0x0080/* this server has a agent addr 
configured */
 #define SRV_F_COOKIESET0x0100/* this server has a cookie 
configured, so don't generate dynamic cookies */
 #define SRV_F_FASTOPEN 0x0200/* Use TCP Fast Open to connect to 
server */
diff --git a/src/check.c b/src/check.c
index 879fe84ce..f4b11fc46 100644
--- a/src/check.c
+++ b/src/check.c
@@ -2050,7 +2050,6 @@ static int srv_parse_check_port(char **args, int 
*cur_arg, struct proxy *curpx,
 
global.maxsock++;
srv->check.port = atol(args[*cur_arg+1]);
-   srv->flags |= SRV_F_CHECKPORT;
 
   out:
return err_code;
diff --git a/src/dns.c b/src/dns.c
index 8c97df46b..8fc9089e7 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -712,8 +712,7 @@ static void dns_check_dns_response(struct dns_resolution 
*res)
 
srv->svc_port = item->port;
srv->flags   &= ~SRV_F_MAPPORTS;
-   if ((srv->check.state & CHK_ST_CONFIGURED) &&
-   !(srv->flags & SRV_F_CHECKPORT))
+   if (!srv->check.port)
srv->check.port = item->port;
 
if (!srv->dns_opts.ignore_weight) {
diff --git a/src/server.c b/src/server.c
index d8216058f..1fd71e403 100644
--- a/src/server.c
+++ b/src/server.c
@@ -1660,8 +1660,6 @@ static void srv_settings_cpy(struct server *srv, struct 
server *src, int srv_tmp
srv->flags   |= src->flags;
srv->do_check = src->do_check;
srv->do_agent = src->do_agent;
-   if (srv->check.port)
-   srv->flags |= SRV_F_CHECKPORT;
srv->check.inter  = src->check.inter;
srv->check.fastinter  = src->check.fastinter;
srv->check.downinter  = src->check.downinter;
@@ -2985,8 +2983,7 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
}
 
/* configure check.port accordingly */
-   if ((srv->check.state & CHK_ST_CONFIGURED) &&
-   !(srv->flags & SRV_F_CHECKPORT))
+   if (!srv->check.port)
srv->check.port = port_check;
 
/* Unset SRV_F_MAPPORTS for SRV records.
@@ -3673,7 +3670,7 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
 * we're switching from a fixed port to a 
SRV_F_MAPPORTS (mapped) port
 * prevent PORT change if check doesn't have 
it's dedicated port while switching
 * to port mapping */
-   if ((s->check.state & CHK_ST_CONFIGURED) && 
!(s->flags & SRV_F_CHECKPORT)) {
+   if (!s->check.port) {
chunk_appendf(msg, "can't change  
to port map because it is incompatible with current health check port 
configuration (use 'port' statement from the 'server' directive.");
goto out;
}
-- 
2.29.2




[PATCH v2 0/5] fix check port/addr consistency

2021-02-02 Thread William Dauchy
Hello Christopher,

As discussed, I revisited my previous series regarding check addr and
port consistency. I don't think I missed anything.

I won't hide my aim here, I would like to add support to set
`check.addr` on the cli like it is possible for the service port. More
and more people have a setup with containers having their own IP, but
with a health check on the host (e.g. on the consul side for example).
So it becomes more and more needed.

That's mostly why I wanted to clarify the situation first with the
different things I've seen while revisting this part of the code.

William Dauchy (5):
  BUG/MINOR: cli: fix set server addr/port coherency with health checks
  MEDIUM: server: adding support for check_port in server state
  MEDIUM: server: get rid of checkport flag
  MEDIUM: get rid of checkaddr and agentaddr flag
  BUG/MINOR: server: don't set agent addr for addr parsing

 doc/management.txt|  1 +
 include/haproxy/server-t.h| 10 ++--
 .../checks/1be_40srv_odd_health_checks.vtc|  2 +-
 .../checks/40be_2srv_odd_health_checks.vtc|  2 +-
 reg-tests/checks/4be_1srv_health_checks.vtc   |  6 +-
 src/check.c   |  5 +-
 src/dns.c |  3 +-
 src/proxy.c   |  4 +-
 src/server.c  | 57 ---
 9 files changed, 39 insertions(+), 51 deletions(-)

-- 
2.29.2




[PATCH v2 5/5] BUG/MINOR: server: don't set agent addr for addr parsing

2021-02-02 Thread William Dauchy
small consistency problem with `addr` and `agent-addr` options:
for the both options, the last one parsed is always used to set the
agent-check addr.  Thus these two lines don't have the same behavior:

  server ... addr  agent-addr 
  server ... agent-addr  addr 

I was not really able to determine where this issue is coming from. So
it is probable we may backport it to all stable version where the agent
is supported.

Signed-off-by: William Dauchy 
---
 src/check.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/check.c b/src/check.c
index f4d5bd452..aab86dc8c 100644
--- a/src/check.c
+++ b/src/check.c
@@ -1526,7 +1526,7 @@ static int srv_parse_addr(char **args, int *cur_arg, 
struct proxy *curpx, struct
goto error;
}
 
-   srv->check.addr = srv->agent.addr = *sk;
+   srv->check.addr = *sk;
 
   out:
return err_code;
-- 
2.29.2




[PATCH v2 1/5] BUG/MINOR: cli: fix set server addr/port coherency with health checks

2021-02-02 Thread William Dauchy
while reading `update_server_addr_port` I found out some things which
can be seen as incoherency. I hope I did not overlooked anything:

- one comment is stating check's address should be updated if it uses
  the server one; however the condition checks if `SRV_F_CHECKADDR` is
  set; this flag is set when a check address is set; result is that we
  override the check address where I was not expecting it. In fact we
  don't need to update anything here as server addr is used when check
  addr is not set.
- same goes for check agent addr
- for port, it is a bit different, we update the check port if it is
  unset. This is harmless because we also use server port if check port
  is unset. However it creates some incoherency before/after using this
  command, as check port should stay unset througout the life of the
  process unless it is is set by `set server check-port` command.

quite hard to locate the origin of this this issue but the function was
introduced in commit d458adcc52b74608e2fe6a2a95f09ce5e94932b7 ("MINOR:
new update_server_addr_port() function to change both server's ADDR and
service PORT"). I was however not able to determine whether this is due
to a change of behavior along the years. So this patch can potentially
be backported up to v1.8 but we must be careful while doing so, as the
code has changed a lot. That being said, the bug being not very
impacting I would be fine keeping it for 2.4 only.

Signed-off-by: William Dauchy 
---
 src/server.c | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/src/server.c b/src/server.c
index 10f528640..99b7e9181 100644
--- a/src/server.c
+++ b/src/server.c
@@ -3625,16 +3625,6 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
ipcpy(, >addr);
changed = 1;
 
-   /* we also need to update check's ADDR only if it uses the 
server's one */
-   if ((s->check.state & CHK_ST_CONFIGURED) && (s->flags & 
SRV_F_CHECKADDR)) {
-   ipcpy(, >check.addr);
-   }
-
-   /* we also need to update agent ADDR only if it use the 
server's one */
-   if ((s->agent.state & CHK_ST_CONFIGURED) && (s->flags & 
SRV_F_AGENTADDR)) {
-   ipcpy(, >agent.addr);
-   }
-
/* update report for caller */
chunk_printf(msg, "IP changed from '%s' to '%s'", current_addr, 
addr);
}
@@ -3714,11 +3704,6 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
}
 
chunk_appendf(msg, "%d'", new_port);
-
-   /* we also need to update health checks port only if it 
uses server's realport */
-   if ((s->check.state & CHK_ST_CONFIGURED) && !(s->flags 
& SRV_F_CHECKPORT)) {
-   s->check.port = new_port;
-   }
}
else {
chunk_appendf(msg, "no need to change the port");
-- 
2.29.2




Re: [PATCH 1/2] BUG/MINOR: cli: fix set server addr/port coherency with health checks

2021-02-02 Thread William Dauchy
Hi Christopher,

Thanks for the review.

On Tue, Feb 2, 2021 at 10:21 AM Christopher Faulet  wrote:
> So, it may be good to take a global look at this stuff. I may missed 
> something.
> And be carefull for the backports because the health-checks were refactored in
> the 2.2.

ok I will come back with a more global look and see whether I can get
rid of those flags.

Thanks,
-- 
William



Re: [PATCH 1/9] MAJOR: contrib/prometheus-exporter: move health check status to labels

2021-02-01 Thread William Dauchy
On Mon, Feb 1, 2021 at 12:05 PM Pierre Cheynier  wrote:
> In addition to this update, I would add some recommendations about the user
> setup in the README ("how do I prevent my prometheus instance to explode
> when scrapping this endpoint?").
> For server-template users:
> - 
>   params:
> no-maint:
> - empty
>
> Generally speaking, to prevent all server metrics to be saved, except this 
> one:
> - 
>metric_relabel_configs:
>- source_labels: ['__name__']
>   regex: 'haproxy_(process_|frontend_|backend_|server_check_status).*'
>   action: keep

agreed taken into account in v2

> "Status of last health check, per state value" ?

updated in v2.
-- 
William



Re: [PATCH 0/9] prometheus: health check as labels + cleanup

2021-02-01 Thread William Dauchy
On Mon, Feb 01, 2021 at 11:30:18AM +0100, Christopher Faulet wrote:
> Don't be sorry to send patches ! If you think the number of labels for the
> check_status metric is not a problem, even for huge configurations, I trust 
> you.
> And I guess we can reduce this number to 16 status only, removing all status
> with an unknown check result (HCHK_STATUS_UNKNOWN, HCHK_STATUS_INI,
> HCHK_STATUS_START, HCHK_STATUS_L57DATA) and the status regarding the 
> agent-check
> only (HCHK_STATUS_CHECKED).
> 
> For info, HCHK_STATUS_UNKNOWN is only used for uninitialized health-check
> (during configuration parsing). HCHK_STATUS_INI is used for initialized
> health-check and before the first run. HCHK_STATUS_START and 
> HCHK_STATUS_L57DATA
> are dummy status and never assigned to a health-check. And as said,
> HCHK_STATUS_CHECKED is only used for agent-check.
> 
> It is a slight improvement, but still better than nothing. To do so, I propose
> you to add a function in check.c to get the corresponding check result of a
> status (the attached patch). This way, in your first patch, we can filter on
> check result. I may amend your first patch this way if you're ok:

definitely worth it indeed. I let you do the amend

> --- a/contrib/prometheus-exporter/service-prometheus.c
> +++ b/contrib/prometheus-exporter/service-prometheus.c
> @@ -880,6 +880,8 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
> struct htx *htx)
> goto next_sv;
> for (i = 0; i < 
> HCHK_STATUS_SIZE; i++) {
> +   if 
> (get_check_status_result(i) < CHK_RES_FAILED)
> +   continue;
> val = 
> mkf_u32(FO_STATUS, sv->check.status == i);
> check_state = 
> get_check_status_info(i);
> labels[2].name = 
> ist("state");
> 
> No comments about the other patches. Except the README is now outdated and
> does not reflect recent changes. It could be good to keep it up-to-date as far
> as possible.

ah true, I will send an update about it soon.

-- 
William



[PATCH 1/2] BUG/MINOR: cli: fix set server addr/port coherency with health checks

2021-02-01 Thread William Dauchy
while reading `update_server_addr_port` I found out some things which
can be seen as incoherency. I hope I did not overlooked anything:

- one comment is stating check's address should be updated if it uses
  the server one; however the condition checks if `SRV_F_CHECKADDR` is
  set; this flag is set when a check address is set; result is that we
  override the check address where I was not expecting it. In fact we
  don't need to update anything here as server addr is used when check
  addr is not set.
- same goes for check agent addr
- for port, it is a bit different, we update the check port if it is
  unset. This is harmless because we also use server port if check port
  is unset. However it creates some incoherency before/after using this
  command, as check port should stay unset througout the life of the
  process unless it is is set by `set server check-port` command.

quite hard to locate the origin of this this issue but the function was
introduced in commit d458adcc52b74608e2fe6a2a95f09ce5e94932b7 ("MINOR:
new update_server_addr_port() function to change both server's ADDR and
service PORT"). I was however not able to determine whether this is due
to a change of behavior along the years. So this patch can potentially
be backported up to v1.8.

Signed-off-by: William Dauchy 
---
 src/server.c | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/src/server.c b/src/server.c
index 10f528640..99b7e9181 100644
--- a/src/server.c
+++ b/src/server.c
@@ -3625,16 +3625,6 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
ipcpy(, >addr);
changed = 1;
 
-   /* we also need to update check's ADDR only if it uses the 
server's one */
-   if ((s->check.state & CHK_ST_CONFIGURED) && (s->flags & 
SRV_F_CHECKADDR)) {
-   ipcpy(, >check.addr);
-   }
-
-   /* we also need to update agent ADDR only if it use the 
server's one */
-   if ((s->agent.state & CHK_ST_CONFIGURED) && (s->flags & 
SRV_F_AGENTADDR)) {
-   ipcpy(, >agent.addr);
-   }
-
/* update report for caller */
chunk_printf(msg, "IP changed from '%s' to '%s'", current_addr, 
addr);
}
@@ -3714,11 +3704,6 @@ const char *update_server_addr_port(struct server *s, 
const char *addr, const ch
}
 
chunk_appendf(msg, "%d'", new_port);
-
-   /* we also need to update health checks port only if it 
uses server's realport */
-   if ((s->check.state & CHK_ST_CONFIGURED) && !(s->flags 
& SRV_F_CHECKPORT)) {
-   s->check.port = new_port;
-   }
}
else {
chunk_appendf(msg, "no need to change the port");
-- 
2.29.2




[PATCH 2/2] BUG/MINOR: cli: set server check-port missing flag

2021-02-01 Thread William Dauchy
when we set check port flag, we need to set a flag on server.

this is missing since the origin of the feature: commit
5094656a67fa1b56f30cd2316adb675ca9d2a79a ("MINOR: cli: change a server
health check port through the stats socket"). So it can potentially be
backport up to v1.8.

Signed-off-by: William Dauchy 
---
 src/server.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/server.c b/src/server.c
index 99b7e9181..ded3d47ff 100644
--- a/src/server.c
+++ b/src/server.c
@@ -4405,6 +4405,7 @@ static int cli_parse_set_server(char **args, char 
*payload, struct appctx *appct
goto out_unlock;
}
sv->check.port = i;
+   sv->flags |= SRV_F_CHECKPORT;
cli_msg(appctx, LOG_NOTICE, "health check port updated.\n");
}
else if (strcmp(args[3], "addr") == 0) {
-- 
2.29.2




[PATCH 9/9] CLEANUP: contrib/prometheus-exporter: align and reorder fields

2021-01-30 Thread William Dauchy
- align safe_idle_connections_current field
  fix minor typo added in commit
  37286a5ac595069a491c3e8a7a7e4faf3d9ea8ad ("MEDIUM:
  contrib/prometheus-exporter: Rework matrices defining Promex metrics")
- reorder info fields to be able to compare them easily
- add missing ignored info fields as comment

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 98f6fca62..29524cd91 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -108,7 +108,6 @@ const struct promex_metric 
promex_global_metrics[INF_TOTAL_FIELDS] = {
//[INF_NAME]   ignored
//[INF_VERSION],   ignored
//[INF_RELEASE_DATE]   ignored
-   [INF_BUILD_INFO] = { .n = IST("build_info"),
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
[INF_NBTHREAD]   = { .n = IST("nbthread"),  
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
[INF_NBPROC] = { .n = IST("nbproc"),
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
[INF_PROCESS_NUM]= { .n = 
IST("relative_process_id"),   .type = PROMEX_MT_GAUGE,   .flags = 
PROMEX_FL_INFO_METRIC },
@@ -116,8 +115,11 @@ const struct promex_metric 
promex_global_metrics[INF_TOTAL_FIELDS] = {
//[INF_UPTIME] ignored
[INF_UPTIME_SEC] = { .n = IST("uptime_seconds"),
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
[INF_START_TIME_SEC] = { .n = 
IST("start_time_seconds"),.type = PROMEX_MT_GAUGE,   .flags = 
PROMEX_FL_INFO_METRIC },
+   //[INF_MEMMAX_MB]  ignored
[INF_MEMMAX_BYTES]   = { .n = IST("max_memory_bytes"),  
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
+   //[INF_POOL_ALLOC_MB]  ignored
[INF_POOL_ALLOC_BYTES]   = { .n = 
IST("pool_allocated_bytes"),  .type = PROMEX_MT_GAUGE,   .flags = 
PROMEX_FL_INFO_METRIC },
+   //[INF_POOL_USED_MB]   ignored
[INF_POOL_USED_BYTES]= { .n = IST("pool_used_bytes"),   
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
[INF_POOL_FAILED]= { .n = 
IST("pool_failures_total"),   .type = PROMEX_MT_COUNTER, .flags = 
PROMEX_FL_INFO_METRIC },
[INF_ULIMIT_N]   = { .n = IST("max_fds"),   
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
@@ -173,6 +175,7 @@ const struct promex_metric 
promex_global_metrics[INF_TOTAL_FIELDS] = {
[INF_BYTES_OUT_RATE] = { .n = IST("bytes_out_rate"),
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
//[INF_DEBUG_COMMANDS_ISSUED]  ignored
[INF_CUM_LOG_MSGS]   = { .n = IST("recv_logs_total"),   
.type = PROMEX_MT_COUNTER, .flags = PROMEX_FL_INFO_METRIC },
+   [INF_BUILD_INFO] = { .n = IST("build_info"),
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
 };
 
 /* frontend/backend/server fields */
@@ -273,7 +276,7 @@ const struct promex_metric 
promex_st_metrics[ST_F_TOTAL_FIELDS] = {
[ST_F_TT_MAX] = { .n = IST("max_total_time_seconds"),   
.type = PROMEX_MT_GAUGE,.flags = ( 
PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
[ST_F_EINT]   = { .n = IST("internal_errors_total"),
.type = PROMEX_MT_COUNTER,  .flags = (PROMEX_FL_FRONT_METRIC | 
PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
[ST_F_IDLE_CONN_CUR]  = { .n = IST("unsafe_idle_connections_current"),  
.type = PROMEX_MT_GAUGE,.flags = (  
   PROMEX_FL_SRV_METRIC) },
-   [ST_F_SAFE_CONN_CUR]=   { .n = IST("safe_idle_connections_current"),
.type = PROMEX_MT_GAUGE,.flags = (  
   PROMEX_FL_SRV_METRIC) },
+   [ST_F_SAFE_CONN_CUR]  = { .n = IST("safe_idle_connections_current"),
.type = PROMEX_MT_GAUGE,.flags = (  
   PROMEX_FL_SRV_METRIC) },
[ST_F_USED_CONN_CUR]  = { .n = IST("used_connections_current"), 
.type =

[PATCH 7/9] MINOR: contrib/prometheus-exporter: add recv logs_logs_total field

2021-01-30 Thread William Dauchy
this field was added by commit 45c457a62941a7c4a86ce4327d7755edcd4b230e
("MINOR: log: adds counters on received syslog messages.")

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 0db75b6ae..6c5072b16 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -177,6 +177,7 @@ const struct promex_metric 
promex_global_metrics[INF_TOTAL_FIELDS] = {
[INF_TOTAL_SPLICED_BYTES_OUT]= { .n = 
IST("spliced_bytes_out_total"),   .type = PROMEX_MT_COUNTER, .flags = 
PROMEX_FL_INFO_METRIC },
[INF_BYTES_OUT_RATE] = { .n = IST("bytes_out_rate"),
.type = PROMEX_MT_GAUGE,   .flags = PROMEX_FL_INFO_METRIC },
//[INF_DEBUG_COMMANDS_ISSUED]  ignored
+   [INF_CUM_LOG_MSGS]   = { .n = IST("recv_logs_total"),   
.type = PROMEX_MT_COUNTER, .flags = PROMEX_FL_INFO_METRIC },
 };
 
 /* frontend/backend/server fields */
-- 
2.29.2




[PATCH 8/9] CLEANUP: contrib/prometheus-exporter: remove unused includes

2021-01-30 Thread William Dauchy
unless I'm wrong, those includes are no longer needed.  The only recent
one I remember is ssl-sock include since commit
5d9b8f3c9347a1a10b86f81d70b22c3cab0e6925 ("MINOR:
contrib/prometheus-exporter: use fill_info for process dump") where we
make use of the code from stats.c

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 6c5072b16..98f6fca62 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -19,8 +19,6 @@
 #include 
 #include 
 #include 
-#include 
-#include 
 #include 
 #include 
 #include 
@@ -29,12 +27,9 @@
 #include 
 #include 
 #include 
-#include 
-#include 
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
-- 
2.29.2




[PATCH 5/9] MINOR: contrib/prometheus-exporter: use stats desc when possible

2021-01-30 Thread William Dauchy
It is a followup work of commit a191b77e54c26a97064cb42ab4927d4f5c95b896
("MINOR: contrib/prometheus-exporter: merge info description from
stats") but for all other stats fields; we however keep a way to
override them when needed (e.g. units, specific cases)

this is another step which will avoid duplicating work between stats.c
and prometheus.

Signed-off-by: William Dauchy 
---
 .../prometheus-exporter/service-prometheus.c  | 107 ++
 1 file changed, 10 insertions(+), 97 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 8baef82d7..f10e78f49 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -282,107 +282,20 @@ const struct promex_metric 
promex_st_metrics[ST_F_TOTAL_FIELDS] = {
[ST_F_NEED_CONN_EST]  = { .n = IST("need_connections_current"), 
.type = PROMEX_MT_GAUGE,.flags = (  
   PROMEX_FL_SRV_METRIC) },
 };
 
-/* Description of all stats fields */
+/* Description of overriden stats fields */
 const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = {
-   [ST_F_PXNAME] = IST("The proxy name."),
-   [ST_F_SVNAME] = IST("The service name (FRONTEND for frontend, 
BACKEND for backend, any name for server/listener)."),
-   [ST_F_QCUR]   = IST("Current number of queued requests."),
-   [ST_F_QMAX]   = IST("Maximum observed number of queued 
requests."),
-   [ST_F_SCUR]   = IST("Current number of active sessions."),
-   [ST_F_SMAX]   = IST("Maximum observed number of active 
sessions."),
-   [ST_F_SLIM]   = IST("Configured session limit."),
-   [ST_F_STOT]   = IST("Total number of sessions."),
-   [ST_F_BIN]= IST("Current total of incoming bytes."),
-   [ST_F_BOUT]   = IST("Current total of outgoing bytes."),
-   [ST_F_DREQ]   = IST("Total number of denied requests."),
-   [ST_F_DRESP]  = IST("Total number of denied responses."),
-   [ST_F_EREQ]   = IST("Total number of request errors."),
-   [ST_F_ECON]   = IST("Total number of connection errors."),
-   [ST_F_ERESP]  = IST("Total number of response errors."),
-   [ST_F_WRETR]  = IST("Total number of retry warnings."),
-   [ST_F_WREDIS] = IST("Total number of redispatch warnings."),
[ST_F_STATUS] = IST("Current status of the service (0/1 
depending on current `state` label value)."),
-   [ST_F_WEIGHT] = IST("Service weight."),
-   [ST_F_ACT]= IST("Current number of active servers."),
-   [ST_F_BCK]= IST("Current number of backup servers."),
-   [ST_F_CHKFAIL]= IST("Total number of failed check (Only counts 
checks failed when the server is up)."),
-   [ST_F_CHKDOWN]= IST("Total number of UP->DOWN transitions."),
-   [ST_F_LASTCHG]= IST("Number of seconds since the last UP<->DOWN 
transition."),
-   [ST_F_DOWNTIME]   = IST("Total downtime (in seconds) for the 
service."),
-   [ST_F_QLIMIT] = IST("Configured maxqueue for the server (0 
meaning no limit)."),
-   [ST_F_PID]= IST("Process id (0 for first instance, 1 for 
second, ...)"),
-   [ST_F_IID]= IST("Unique proxy id."),
-   [ST_F_SID]= IST("Server id (unique inside a proxy)."),
-   [ST_F_THROTTLE]   = IST("Current throttle percentage for the 
server, when slowstart is active, or no value if not in slowstart."),
-   [ST_F_LBTOT]  = IST("Total number of times a service was 
selected, either for new sessions, or when redispatching."),
-   [ST_F_TRACKED]= IST("Id of proxy/server if tracking is 
enabled."),
-   [ST_F_TYPE]   = IST("Service type (0=frontend, 1=backend, 
2=server, 3=socket/listener)."),
-   [ST_F_RATE]   = IST("Current number of sessions per second over 
last elapsed second."),
-   [ST_F_RATE_LIM]   = IST("Configured limit on new sessions per 
second."),
-   [ST_F_RATE_MAX]   = IST("Maximum observed number of sessions per 
second."),
[ST_F_CHECK_STATUS]   = IST("Status of last health check (0/1 depending 
on current `state` label value)."),
[ST_F_CHECK_CODE] = IST("layer5-7 code, if available of the last 
health check."),
[ST_F_CHECK_DURATION] = IST("Total duration of the latest server health 
che

[PATCH 6/9] MINOR: contrib/prometheus-exporter: add uweight field

2021-01-30 Thread William Dauchy
this field was added in commit bd7151002437af1a034a9fdbb582b3cbef5a78d1
("MINOR: stats: report server's user-configured weight next to effective
weight")

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index f10e78f49..0db75b6ae 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -280,6 +280,7 @@ const struct promex_metric 
promex_st_metrics[ST_F_TOTAL_FIELDS] = {
[ST_F_SAFE_CONN_CUR]=   { .n = IST("safe_idle_connections_current"),
.type = PROMEX_MT_GAUGE,.flags = (  
   PROMEX_FL_SRV_METRIC) },
[ST_F_USED_CONN_CUR]  = { .n = IST("used_connections_current"), 
.type = PROMEX_MT_GAUGE,.flags = (  
   PROMEX_FL_SRV_METRIC) },
[ST_F_NEED_CONN_EST]  = { .n = IST("need_connections_current"), 
.type = PROMEX_MT_GAUGE,.flags = (  
   PROMEX_FL_SRV_METRIC) },
+   [ST_F_UWEIGHT]= { .n = IST("uweight"),  
.type = PROMEX_MT_GAUGE,.flags = ( 
PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
 };
 
 /* Description of overriden stats fields */
-- 
2.29.2




[PATCH 4/9] MINOR: stats: improve max stats descriptions

2021-01-30 Thread William Dauchy
In order to unify prometheus and stats description, we need to remove
some field reference which are specific to stats implementation:
- `scur` in max current sessions (also reword current session)
- `rate` in max sessions
- `req_rate` in max requests
- `conn_rate` in max connections

Signed-off-by: William Dauchy 
---
 src/stats.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/stats.c b/src/stats.c
index fdb8e9adf..6fa59a9ef 100644
--- a/src/stats.c
+++ b/src/stats.c
@@ -160,8 +160,8 @@ const struct name_desc stat_fields[ST_F_TOTAL_FIELDS] = {
[ST_F_SVNAME]= { .name = "svname",  
.desc = "Server name" },
[ST_F_QCUR]  = { .name = "qcur",
.desc = "Number of current queued connections" },
[ST_F_QMAX]  = { .name = "qmax",
.desc = "Highest value of queued connections encountered since process 
started" },
-   [ST_F_SCUR]  = { .name = "scur",
.desc = "Current number of sessions on the frontend, backend or server" 
},
-   [ST_F_SMAX]  = { .name = "smax",
.desc = "Highest value of scur encountered since process started" },
+   [ST_F_SCUR]  = { .name = "scur",
.desc = "Number of current sessions on the frontend, backend or server" 
},
+   [ST_F_SMAX]  = { .name = "smax",
.desc = "Highest value of current sessions encountered since process 
started" },
[ST_F_SLIM]  = { .name = "slim",
.desc = "Frontend/listener/server's maxconn, backend's fullconn" },
[ST_F_STOT]  = { .name = "stot",
.desc = "Total number of sessions since process started" },
[ST_F_BIN]   = { .name = "bin", 
.desc = "Total number of request bytes since process started" },
@@ -191,7 +191,7 @@ const struct name_desc stat_fields[ST_F_TOTAL_FIELDS] = {
[ST_F_TYPE]  = { .name = "type",
.desc = "Type of the object (Listener, Frontend, Backend, Server)" },
[ST_F_RATE]  = { .name = "rate",
.desc = "Total number of sessions processed by this object over the 
last second (sessions for listeners/frontends, requests for backends/servers)" 
},
[ST_F_RATE_LIM]  = { .name = "rate_lim",
.desc = "Limit on the number of sessions accepted in a second (frontend 
only, 'rate-limit sessions' setting)" },
-   [ST_F_RATE_MAX]  = { .name = "rate_max",
.desc = "Highest value of 'rate' observed since the worker process 
started" },
+   [ST_F_RATE_MAX]  = { .name = "rate_max",
.desc = "Highest value of sessions per second observed since the worker 
process started" },
[ST_F_CHECK_STATUS]  = { .name = "check_status",
.desc = "Status report of the server's latest health check, prefixed 
with '*' if a check is currently in progress" },
[ST_F_CHECK_CODE]= { .name = "check_code",  
.desc = "HTTP/SMTP/LDAP status code reported by the latest server 
health check" },
[ST_F_CHECK_DURATION]= { .name = "check_duration",  
.desc = "Total duration of the latest server health check, in 
milliseconds" },
@@ -203,7 +203,7 @@ const struct name_desc stat_fields[ST_F_TOTAL_FIELDS] = {
[ST_F_HRSP_OTHER]= { .name = "hrsp_other",  
.desc = "Total number of HTTP responses with status <100, >599 returned 
by this object since the worker process started (error -1 included)" },
[ST_F_HANAFAIL]  = { .name = "hanafail",
.desc = "Total number of failed checks caused by an 'on-error' 
directive after an 'observe' condition matched" },
[ST_F_REQ_RATE]  = { .name = "req_rate",
.desc = "Number of HTTP requests processed over the last second on this 
object" },
-   [ST_F_REQ_RATE_MAX]  = { .name = "req_rate_max",
.desc = "Highest value of 'req_rate' observed since the worker p

[PATCH 3/9] MINOR: stats: improve pending connections description

2021-01-30 Thread William Dauchy
In order to unify prometheus and stats description, we need to clarify
the description for pending connections.
- remove the BE reference in counters struct, as it is also used in
  servers
- remove reference of `qcur` field in description as it is specific to
  stats implemention
- try to reword cur and max pending connections description

Signed-off-by: William Dauchy 
---
 include/haproxy/counters-t.h | 2 +-
 src/stats.c  | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/haproxy/counters-t.h b/include/haproxy/counters-t.h
index b896c4d36..1ea68dd6c 100644
--- a/include/haproxy/counters-t.h
+++ b/include/haproxy/counters-t.h
@@ -73,7 +73,7 @@ struct be_counters {
 
unsigned int cps_max;   /* maximum of new connections 
received per second */
unsigned int sps_max;   /* maximum of new connections 
accepted per second (sessions) */
-   unsigned int nbpend_max;/* max number of pending 
connections with no server assigned yet (BE only) */
+   unsigned int nbpend_max;/* max number of pending 
connections with no server assigned yet */
unsigned int cur_sess_max;  /* max number of currently 
active sessions */
 
long long bytes_in; /* number of bytes transferred 
from the client to the server */
diff --git a/src/stats.c b/src/stats.c
index 4b8bd89b4..fdb8e9adf 100644
--- a/src/stats.c
+++ b/src/stats.c
@@ -158,8 +158,8 @@ const struct name_desc info_fields[INF_TOTAL_FIELDS] = {
 const struct name_desc stat_fields[ST_F_TOTAL_FIELDS] = {
[ST_F_PXNAME]= { .name = "pxname",  
.desc = "Proxy name" },
[ST_F_SVNAME]= { .name = "svname",  
.desc = "Server name" },
-   [ST_F_QCUR]  = { .name = "qcur",
.desc = "Current number of connections waiting in the server of backend 
queue" },
-   [ST_F_QMAX]  = { .name = "qmax",
.desc = "Highest value of qcur encountered since process started" },
+   [ST_F_QCUR]  = { .name = "qcur",
.desc = "Number of current queued connections" },
+   [ST_F_QMAX]  = { .name = "qmax",
.desc = "Highest value of queued connections encountered since process 
started" },
[ST_F_SCUR]  = { .name = "scur",
.desc = "Current number of sessions on the frontend, backend or server" 
},
[ST_F_SMAX]  = { .name = "smax",
.desc = "Highest value of scur encountered since process started" },
[ST_F_SLIM]  = { .name = "slim",
.desc = "Frontend/listener/server's maxconn, backend's fullconn" },
-- 
2.29.2




[PATCH 1/9] MAJOR: contrib/prometheus-exporter: move health check status to labels

2021-01-30 Thread William Dauchy
this patch is a breaking change between v2.3 and v2.4: we move from
using gauge value for health check states to labels values. The diff is
quite small thanks to the preparation work from Christopher to allow
more flexibility in labels, see commit
5a2f938732126f43bbf4cea5482c01552b0d0314 ("MEDIUM:
contrib/prometheus-exporter: Use dynamic labels instead of static ones")

this is a follow up of commit c6464591a365bfcf509b322bdaa4d608c9395d75
("MAJOR: contrib/prometheus-exporter: move ftd/bkd/srv states to
labels"). The main goal being to be better aligned with prometheus use
cases in terms of queries. More specifically to health checks, Pierre C.
mentioned the possible quirks he had to put in place in order to make
use of those metrics through prometheus:

   by(proxy, check_status) (count_values by(proxy,
  instance) ("check_status", haproxy_server_check_status))

I am perfectly aware this introduces a lot more metrics but I don't see
how we can improve the usability without it. The main issue remains in
the cardinality of the states which are > 20. Prometheus recommends to
stay below a cardinality of 10 for a given metric but I consider our
case very specific, because highly linked to the level of precision
haproxy exposes.

Even before this patch I saw several large production setup (a few
hundreds of MB in output) which are making use of the scope parameter to
simply ignore the server metrics, so that the scrapping can be faster,
and memory consumed on client side not too high. So I believe we should
eventually continue in that direction and offer more granularity of
filtering of the output. That being said it is already possible to
filter out the data on prometheus client side.

this is related to github issue #1029

Signed-off-by: William Dauchy 
---
 .../prometheus-exporter/service-prometheus.c| 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index dbf4c7f39..a3141d39d 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -319,7 +320,7 @@ const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = 
{
[ST_F_RATE]   = IST("Current number of sessions per second over 
last elapsed second."),
[ST_F_RATE_LIM]   = IST("Configured limit on new sessions per 
second."),
[ST_F_RATE_MAX]   = IST("Maximum observed number of sessions per 
second."),
-   [ST_F_CHECK_STATUS]   = IST("Status of last health check (HCHK_STATUS_* 
values)."),
+   [ST_F_CHECK_STATUS]   = IST("Status of last health check (0/1 depending 
on current `state` label value)."),
[ST_F_CHECK_CODE] = IST("layer5-7 code, if available of the last 
health check."),
[ST_F_CHECK_DURATION] = IST("Total duration of the latest server health 
check, in seconds."),
[ST_F_HRSP_1XX]   = IST("Total number of HTTP responses."),
@@ -886,6 +887,7 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
struct htx *htx)
int ret = 1;
double secs;
enum promex_srv_state state;
+   const char *check_state;
int i;
 
for (;appctx->st2 < ST_F_TOTAL_FIELDS; appctx->st2++) {
@@ -963,8 +965,17 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
struct htx *htx)
case ST_F_CHECK_STATUS:
if ((sv->check.state & 
(CHK_ST_ENABLED|CHK_ST_PAUSED)) != CHK_ST_ENABLED)
goto next_sv;
-   val = mkf_u32(FN_OUTPUT, 
sv->check.status);
-   break;
+
+   for (i = 0; i < 
HCHK_STATUS_SIZE; i++) {
+   val = 
mkf_u32(FO_STATUS, sv->check.status == i);
+   check_state = 
get_check_status_info(i);
+   labels[2].name = 
ist("state");
+   labels[2].value = 
ist2(check_state, strlen(check_state));
+   if 
(!promex_dump_metric(appctx, htx, prefix, _st_metrics[appctx->st2],
+   
, labels, , max))
+   goto full;
+   }
+   goto next_sv;

[PATCH 2/9] MINOR: contrib/prometheus-exporter: improve service status description field

2021-01-30 Thread William Dauchy
Since we changed the behaviour of this metric, improve the description
to better explain what is the meaning of the new gauge value; it also
reflects the description we did for health check status.

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index a3141d39d..8baef82d7 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -301,7 +301,7 @@ const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = 
{
[ST_F_ERESP]  = IST("Total number of response errors."),
[ST_F_WRETR]  = IST("Total number of retry warnings."),
[ST_F_WREDIS] = IST("Total number of redispatch warnings."),
-   [ST_F_STATUS] = IST("Current status of the service."),
+   [ST_F_STATUS] = IST("Current status of the service (0/1 
depending on current `state` label value)."),
[ST_F_WEIGHT] = IST("Service weight."),
[ST_F_ACT]= IST("Current number of active servers."),
[ST_F_BCK]= IST("Current number of backup servers."),
-- 
2.29.2




[PATCH 0/9] prometheus: health check as labels + cleanup

2021-01-30 Thread William Dauchy
Hi Christopher,

Sorry for this long series but I believe this one is fairly easy to
review:
- the major change is for health checks on prometheus side, there are
  more details in the commit message; this is something I wanted early
  so people can start testing it, especially on large setup. I however
  think we will need to improve the `scope` filtering. The good point is
  that after this patch, this metric will be way easier to use; the bad
  point is that it is a breaking change between v2.3 and v2.4. But I
  believe this is acceptable following the previous state changes
  already merged.
- next patch is the continuation of the cleaning work:
 * try to improve descriptions so we can use them in both case
 * merge them when possible, override otherwise
- I finished with a bit of minor cleaning which pointed me a few missing
  fields. I know we probably can improve that to avoid forgetting adding
  those fields in the future, but I assume it is ok for now to simply
  come back with a sane result. The postive point is I did not had to
  add description and implementation, as we make use of stats.c

William Dauchy (9):
  MAJOR: contrib/prometheus-exporter: move health check status to labels
  MINOR: contrib/prometheus-exporter: improve service status description
field
  MINOR: stats: improve pending connections description
  MINOR: stats: improve max stats descriptions
  MINOR: contrib/prometheus-exporter: use stats desc when possible
  MINOR: contrib/prometheus-exporter: add uweight field
  MINOR: contrib/prometheus-exporter: add recv logs_logs_total field
  CLEANUP: contrib/prometheus-exporter: remove unused includes
  CLEANUP: contrib/prometheus-exporter: align and reorder fields

 .../prometheus-exporter/service-prometheus.c  | 140 --
 include/haproxy/counters-t.h  |   2 +-
 src/stats.c   |  14 +-
 3 files changed, 40 insertions(+), 116 deletions(-)

-- 
2.29.2




Re: lua function core.get_info() broken in haproxy 2.2.7

2021-01-30 Thread William Dauchy
Hi James,

On Sat, Jan 30, 2021 at 3:07 AM James Brown  wrote:
> Ah, never mind, I see that this was already fixed in master in 
> 3ddec3ee7d344112b4e4fbde317f8886a20d66a0.

yeah, sorry about that. This will be fixed in v2.2.8 and v2.3.4
-- 
William



Re: Makefile, environment variables and REGTESTS_TYPES

2021-01-29 Thread William Dauchy
On Fri, Jan 29, 2021 at 2:46 PM William Lallemand
 wrote:
> According to `make reg-tests-help` the REGTESTS_TYPES parameter must be
> configured as an environment variable and not a make argument.
> However since patch 3bad3d5 ("BUILD: Makefile: exclude broken tests by
> default"), it does not work anymore with an environment variable.
> Looking at the several CI files we have, it is used as a make
> argument everywhere.

wow indeed, I did not realise that; yet another thing I broke ;-)

> I'm going to update the `make reg-tests-help` command with the correct
> syntax if there isn't any complain.

Thanks for looking into this!
-- 
William



[PATCH v2 3/4] MEDIUM: stats: allow to select one field in `stats_fill_sv_stats`

2021-01-25 Thread William Dauchy
prometheus approach requires to output all values for a given metric
name; meaning we iterate through all metrics, and then iterate in the
inner loop on all objects for this metric.
In order to allow more code reuse, adapt the stats API to be able to
select one field or fill them all otherwise.
This patch follows what has already been done on frontend and backend
side.
>From this patch it should be possible to remove most of the duplicate
code on prometheuse side for the server.

A few things to note though:
- state require prior calculation, so I moved that to a sort of helper
  `stats_fill_be_stats_computestate`.
- all ST_F*TIME fields requires some minor compute, so I moved it at te
  beginning of the function under a condition.

Signed-off-by: William Dauchy 
---
 include/haproxy/stats.h |   2 +-
 src/hlua_fcn.c  |   3 +-
 src/stats.c | 585 +---
 3 files changed, 368 insertions(+), 222 deletions(-)

diff --git a/include/haproxy/stats.h b/include/haproxy/stats.h
index 6c04f2761..6115dca8f 100644
--- a/include/haproxy/stats.h
+++ b/include/haproxy/stats.h
@@ -51,7 +51,7 @@ int stats_fill_fe_stats(struct proxy *px, struct field 
*stats, int len,
 int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags,
 struct field *stats, int len);
 int stats_fill_sv_stats(struct proxy *px, struct server *sv, int flags,
-struct field *stats, int len);
+struct field *stats, int len, enum stat_field 
*selected_field);
 int stats_fill_be_stats(struct proxy *px, int flags, struct field *stats, int 
len,
enum stat_field *selected_field);
 
diff --git a/src/hlua_fcn.c b/src/hlua_fcn.c
index d13c5344f..6dd9efb21 100644
--- a/src/hlua_fcn.c
+++ b/src/hlua_fcn.c
@@ -922,7 +922,8 @@ int hlua_server_get_stats(lua_State *L)
return 1;
}
 
-   stats_fill_sv_stats(srv->proxy, srv, STAT_SHLGNDS, stats, STATS_LEN);
+   stats_fill_sv_stats(srv->proxy, srv, STAT_SHLGNDS, stats,
+   STATS_LEN, NULL);
 
lua_newtable(L);
for (i=0; i with the server statistics.  is
- * preallocated array of length . The length of the array
- * must be at least ST_F_TOTAL_FIELDS. If this length is less
- * then this value, the function returns 0, otherwise, it
- * returns 1.  can take the value STAT_SHLGNDS.
+/* Compute server state helper
  */
-int stats_fill_sv_stats(struct proxy *px, struct server *sv, int flags,
-struct field *stats, int len)
+static void stats_fill_sv_stats_computestate(struct server *sv, struct server 
*ref,
+enum srv_stats_state *state)
 {
-   struct server *via, *ref;
-   char str[INET6_ADDRSTRLEN];
-   struct buffer *out = get_trash_chunk();
-   enum srv_stats_state state;
-   char *fld_status;
-   long long srv_samples_counter;
-   unsigned int srv_samples_window = TIME_STATS_SAMPLES;
-
-   if (len < ST_F_TOTAL_FIELDS)
-   return 0;
-
-   /* we have "via" which is the tracked server as described in the 
configuration,
-* and "ref" which is the checked server and the end of the chain.
-*/
-   via = sv->track ? sv->track : sv;
-   ref = via;
-   while (ref->track)
-   ref = ref->track;
-
if (sv->cur_state == SRV_ST_RUNNING || sv->cur_state == 
SRV_ST_STARTING) {
if ((ref->check.state & CHK_ST_ENABLED) &&
(ref->check.health < ref->check.rise + ref->check.fall - 
1)) {
-   state = SRV_STATS_STATE_UP_GOING_DOWN;
+   *state = SRV_STATS_STATE_UP_GOING_DOWN;
} else {
-   state = SRV_STATS_STATE_UP;
+   *state = SRV_STATS_STATE_UP;
}
 
if (sv->cur_admin & SRV_ADMF_DRAIN) {
if (ref->agent.state & CHK_ST_ENABLED)
-   state = SRV_STATS_STATE_DRAIN_AGENT;
-   else if (state == SRV_STATS_STATE_UP_GOING_DOWN)
-   state = SRV_STATS_STATE_DRAIN_GOING_DOWN;
+   *state = SRV_STATS_STATE_DRAIN_AGENT;
+   else if (*state == SRV_STATS_STATE_UP_GOING_DOWN)
+   *state = SRV_STATS_STATE_DRAIN_GOING_DOWN;
else
-   state = SRV_STATS_STATE_DRAIN;
+   *state = SRV_STATS_STATE_DRAIN;
}
 
-   if (state == SRV_STATS_STATE_UP && !(ref->check.state & 
CHK_ST_ENABLED)) {
-   state = SRV_STATS_STATE_NO_CHECK;
+   if (*state == SRV_STATS_STATE_UP && !(ref->check.state & 
CHK_

[PATCH v2 2/4] MINOR: contrib/prometheus-exporter: use fill_be_stats for backend dump

2021-01-25 Thread William Dauchy
use `stats_fill_be_stats` when possible to avoid duplicating code; make
use of field selector to get the needed field only.

the only difference is on `haproxy_backend_downtime_seconds_total` as
stats.c is testing `px->srv`. This behaviour is present since commit
7344f4789321ef8ce2ce17cf73dabd672f7c8c6 ("MINOR: stats: only report
backend's down time if it has servers"). The end result is a NaN instead
of a zero when no server are present.

Signed-off-by: William Dauchy 
---
 .../prometheus-exporter/service-prometheus.c  | 163 ++
 1 file changed, 15 insertions(+), 148 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index bc9ebf3fc..12fe3ee90 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -685,8 +685,8 @@ static int promex_dump_back_metrics(struct appctx *appctx, 
struct htx *htx)
struct channel *chn = si_ic(appctx->owner);
struct ist out = ist2(trash.area, 0);
size_t max = htx_get_max_blksz(htx, channel_htx_recv_max(chn, htx));
+   struct field *stats = stat_l[STATS_DOMAIN_PROXY];
int ret = 1;
-   uint32_t weight;
double secs;
 
for (;appctx->st2 < ST_F_TOTAL_FIELDS; appctx->st2++) {
@@ -700,46 +700,13 @@ static int promex_dump_back_metrics(struct appctx 
*appctx, struct htx *htx)
if (px->disabled || px->uuid <= 0 || !(px->cap & 
PR_CAP_BE))
goto next_px;
 
+   if (!stats_fill_be_stats(px, 0, stats, 
ST_F_TOTAL_FIELDS, &(appctx->st2)))
+   return -1;
+
switch (appctx->st2) {
case ST_F_STATUS:
val = mkf_u32(FO_STATUS, 
(px->lbprm.tot_weight > 0 || !px->srv) ? 1 : 0);
break;
-   case ST_F_SCUR:
-   val = mkf_u32(0, px->beconn);
-   break;
-   case ST_F_SMAX:
-   val = mkf_u32(FN_MAX, 
px->be_counters.conn_max);
-   break;
-   case ST_F_SLIM:
-   val = mkf_u32(FO_CONFIG|FN_LIMIT, 
px->fullconn);
-   break;
-   case ST_F_STOT:
-   val = mkf_u64(FN_COUNTER, 
px->be_counters.cum_conn);
-   break;
-   case ST_F_RATE_MAX:
-   val = mkf_u32(0, 
px->be_counters.sps_max);
-   break;
-   case ST_F_LASTSESS:
-   val = mkf_s32(FN_AGE, 
be_lastsession(px));
-   break;
-   case ST_F_QCUR:
-   val = mkf_u32(0, px->nbpend);
-   break;
-   case ST_F_QMAX:
-   val = mkf_u32(FN_MAX, 
px->be_counters.nbpend_max);
-   break;
-   case ST_F_CONNECT:
-   val = mkf_u64(FN_COUNTER, 
px->be_counters.connect);
-   break;
-   case ST_F_REUSE:
-   val = mkf_u64(FN_COUNTER, 
px->be_counters.reuse);
-   break;
-   case ST_F_BIN:
-   val = mkf_u64(FN_COUNTER, 
px->be_counters.bytes_in);
-   break;
-   case ST_F_BOUT:
-   val = mkf_u64(FN_COUNTER, 
px->be_counters.bytes_out);
-   break;
case ST_F_QTIME:
secs = 
(double)swrate_avg(px->be_counters.q_time, TIME_STATS_SAMPLES) / 1000.0;
val = mkf_flt(FN_AVG, secs);
@@ -772,139 +739,39 @@ static int promex_dump_back_metrics(struct appctx 
*appctx, struct htx *htx)
secs = 
(double)px->be_counters.ttime_max / 1000.0;
val = mkf_flt(FN_MAX, secs);
break;
-   case ST_F_DREQ:
-   val = mkf_u64(FN_COUNTER, 
px->be_counters.denied_req);
-

[PATCH v2 4/4] MINOR: contrib/prometheus-exporter: use fill_sv_stats for server dump

2021-01-25 Thread William Dauchy
use `stats_fill_sv_stats` when possible to avoid duplicating code.

the following metrics have a change of behaviour:

haproxy_server_limit_sessions
haproxy_server_queue_limit
haproxy_server_check_failures_total
haproxy_server_check_up_down_total
haproxy_server_downtime_seconds_total
haproxy_server_current_throttle
haproxy_server_idle_connections_limit

depending on cases, if the limit was not configured or enabled, NaN is
returned instead. It should not be an issue for users, even better than
before as it provides more precise info.

Signed-off-by: William Dauchy 
---
 .../prometheus-exporter/service-prometheus.c  | 143 +-
 1 file changed, 7 insertions(+), 136 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 12fe3ee90..364d4b6d1 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -799,8 +799,8 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
struct htx *htx)
struct channel *chn = si_ic(appctx->owner);
struct ist out = ist2(trash.area, 0);
size_t max = htx_get_max_blksz(htx, channel_htx_recv_max(chn, htx));
+   struct field *stats = stat_l[STATS_DOMAIN_PROXY];
int ret = 1;
-   uint32_t weight;
double secs;
 
for (;appctx->st2 < ST_F_TOTAL_FIELDS; appctx->st2++) {
@@ -817,6 +817,9 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
struct htx *htx)
while (appctx->ctx.stats.obj2) {
sv = appctx->ctx.stats.obj2;
 
+   if (!stats_fill_sv_stats(px, sv, 0, stats, 
ST_F_TOTAL_FIELDS, &(appctx->st2)))
+   return -1;
+
if ((appctx->ctx.stats.flags & 
PROMEX_FL_NO_MAINT_SRV) && (sv->cur_admin & SRV_ADMF_MAINT))
goto next_sv;
 
@@ -824,39 +827,6 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
struct htx *htx)
case ST_F_STATUS:
val = mkf_u32(FO_STATUS, 
promex_srv_status(sv));
break;
-   case ST_F_SCUR:
-   val = mkf_u32(0, sv->cur_sess);
-   break;
-   case ST_F_SMAX:
-   val = mkf_u32(FN_MAX, 
sv->counters.cur_sess_max);
-   break;
-   case ST_F_SLIM:
-   val = 
mkf_u32(FO_CONFIG|FN_LIMIT, sv->maxconn);
-   break;
-   case ST_F_STOT:
-   val = mkf_u64(FN_COUNTER, 
sv->counters.cum_sess);
-   break;
-   case ST_F_RATE_MAX:
-   val = mkf_u32(FN_MAX, 
sv->counters.sps_max);
-   break;
-   case ST_F_LASTSESS:
-   val = mkf_s32(FN_AGE, 
srv_lastsession(sv));
-   break;
-   case ST_F_QCUR:
-   val = mkf_u32(0, sv->nbpend);
-   break;
-   case ST_F_QMAX:
-   val = mkf_u32(FN_MAX, 
sv->counters.nbpend_max);
-   break;
-   case ST_F_QLIMIT:
-   val = 
mkf_u32(FO_CONFIG|FS_SERVICE, sv->maxqueue);
-   break;
-   case ST_F_BIN:
-   val = mkf_u64(FN_COUNTER, 
sv->counters.bytes_in);
-   break;
-   case ST_F_BOUT:
-   val = mkf_u64(FN_COUNTER, 
sv->counters.bytes_out);
-   break;
case ST_F_QTIME:
secs = 
(double)swrate_avg(sv->counters.q_time, TIME_STATS_SAMPLES) / 1000.0;
val = mkf_flt(FN_AVG, secs);
@@ -889,43 +859,6 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
str

[PATCH v2 1/4] MEDIUM: stats: allow to select one field in `stats_fill_be_stats`

2021-01-25 Thread William Dauchy
prometheus approach requires to output all values for a given metric
name; meaning we iterate through all metrics, and then iterate in the
inner loop on all objects for this metric.
In order to allow more code reuse, adapt the stats API to be able to
select one field or fill them all otherwise.
This patch follows what has already been done on frontend side.
>From this patch it should be possible to remove most of the duplicate
code on prometheuse side for the backend

A few things to note though:
- status and uweight field requires prior compute, so I moved that to a
  sort of helper `stats_fill_be_stats_computesrv`.
- all ST_F*TIME fields requires some minor compute, so I moved it at te
  beginning of the function under a condition.

Signed-off-by: William Dauchy 
---
 include/haproxy/stats.h |   3 +-
 src/hlua_fcn.c  |   2 +-
 src/stats.c | 368 
 3 files changed, 266 insertions(+), 107 deletions(-)

diff --git a/include/haproxy/stats.h b/include/haproxy/stats.h
index 8210367ac..6c04f2761 100644
--- a/include/haproxy/stats.h
+++ b/include/haproxy/stats.h
@@ -52,7 +52,8 @@ int stats_fill_li_stats(struct proxy *px, struct listener *l, 
int flags,
 struct field *stats, int len);
 int stats_fill_sv_stats(struct proxy *px, struct server *sv, int flags,
 struct field *stats, int len);
-int stats_fill_be_stats(struct proxy *px, int flags, struct field *stats, int 
len);
+int stats_fill_be_stats(struct proxy *px, int flags, struct field *stats, int 
len,
+   enum stat_field *selected_field);
 
 void stats_io_handler(struct stream_interface *si);
 int stats_emit_raw_data_field(struct buffer *out, const struct field *f);
diff --git a/src/hlua_fcn.c b/src/hlua_fcn.c
index aab864370..d13c5344f 100644
--- a/src/hlua_fcn.c
+++ b/src/hlua_fcn.c
@@ -1341,7 +1341,7 @@ int hlua_proxy_get_stats(lua_State *L)
 
px = hlua_check_proxy(L, 1);
if (px->cap & PR_CAP_BE)
-   stats_fill_be_stats(px, STAT_SHLGNDS, stats, STATS_LEN);
+   stats_fill_be_stats(px, STAT_SHLGNDS, stats, STATS_LEN, NULL);
else
stats_fill_fe_stats(px, stats, STATS_LEN, NULL);
lua_newtable(L);
diff --git a/src/stats.c b/src/stats.c
index dcb99f674..949b8f696 100644
--- a/src/stats.c
+++ b/src/stats.c
@@ -2260,130 +2260,288 @@ static int stats_dump_sv_stats(struct 
stream_interface *si, struct proxy *px, st
return stats_dump_one_line(stats, stats_count, appctx);
 }
 
-/* Fill  with the backend statistics.  is
- * preallocated array of length . The length of the array
- * must be at least ST_F_TOTAL_FIELDS. If this length is less
- * then this value, the function returns 0, otherwise, it
- * returns 1.  can take the value STAT_SHLGNDS.
+/* Helper to compute srv values for a given backend
  */
-int stats_fill_be_stats(struct proxy *px, int flags, struct field *stats, int 
len)
+static void stats_fill_be_stats_computesrv(struct proxy *px, int *nbup, int 
*nbsrv, int *totuw)
 {
-   long long be_samples_counter;
-   unsigned int be_samples_window = TIME_STATS_SAMPLES;
-   struct buffer *out = get_trash_chunk();
+   int nbup_tmp, nbsrv_tmp, totuw_tmp;
const struct server *srv;
-   int nbup, nbsrv;
-   int totuw;
-   char *fld;
-
-   if (len < ST_F_TOTAL_FIELDS)
-   return 0;
 
-   totuw = 0;
-   nbup = nbsrv = 0;
+   nbup_tmp = nbsrv_tmp = totuw_tmp = 0;
for (srv = px->srv; srv; srv = srv->next) {
if (srv->cur_state != SRV_ST_STOPPED) {
-   nbup++;
+   nbup_tmp++;
if (srv_currently_usable(srv) &&
(!px->srv_act ^ !(srv->flags & SRV_F_BACKUP)))
-   totuw += srv->uweight;
+   totuw_tmp += srv->uweight;
}
-   nbsrv++;
+   nbsrv_tmp++;
}
 
HA_RWLOCK_RDLOCK(LBPRM_LOCK, >lbprm.lock);
if (!px->srv_act && px->lbprm.fbck)
-   totuw = px->lbprm.fbck->uweight;
+   totuw_tmp = px->lbprm.fbck->uweight;
HA_RWLOCK_RDUNLOCK(LBPRM_LOCK, >lbprm.lock);
 
-   stats[ST_F_PXNAME]   = mkf_str(FO_KEY|FN_NAME|FS_SERVICE, px->id);
-   stats[ST_F_SVNAME]   = mkf_str(FO_KEY|FN_NAME|FS_SERVICE, "BACKEND");
-   stats[ST_F_MODE] = mkf_str(FO_CONFIG|FS_SERVICE, 
proxy_mode_str(px->mode));
-   stats[ST_F_QCUR] = mkf_u32(0, px->nbpend);
-   stats[ST_F_QMAX] = mkf_u32(FN_MAX, px->be_counters.nbpend_max);
-   stats[ST_F_SCUR] = mkf_u32(0, px->beconn);
-   stats[ST_F_SMAX] = mkf_u32(FN_MAX, px->be_counters.conn_max);
-   stats[ST_F_SLIM] = mkf_u32(FO_CONFIG|FN_LIMIT, px->fullconn);
-   stats[ST_F_STO

Re: [PATCH 5/6] MEDIUM: stats: allow to select one field in `stats_fill_sv_stats`

2021-01-25 Thread William Dauchy
Hello Christopher,

Thanks for the review.

On Mon, Jan 25, 2021 at 4:33 PM Christopher Faulet  wrote:
> The "via" parameter is unused. No reason to pass it.

good catch. surprised gcc did not triggered it
fixed in v2

> Same comment than for stats_fill_b_stats. The metric variable must be 
> initialized.

fixed in v2.

--
William



Re: [PATCH 3/6] MEDIUM: stats: allow to select one field in `stats_fill_be_stats`

2021-01-25 Thread William Dauchy
Hi Christopher,

Thanks for the review.

On Mon, Jan 25, 2021 at 4:26 PM Christopher Faulet  wrote:
> Here, you must take care to increment "*nbup" and "*nbsrv" in the for loop. 
> But
> changing it this way gcc then complains these variables are now unused. Thus 
> to
> make it happy, I must use local variables.

ah yes, I forgot the * indeed.
I did modify the code to use local variables but it somehow looks
uglier than before.
see in v2.

> Take a look at the commit 8596bfbaf ("BUG/MINOR: stats: Init the metric 
> variable
> when frontend stats are filled"). The metric variable must be initialized.

fixed in v2


--
William



[PATCH 6/6] MINOR: contrib/prometheus-exporter: use fill_sv_stats for server dump

2021-01-22 Thread William Dauchy
use `stats_fill_sv_stats` when possible to avoid duplicating code.

the following metrics have a change of behaviour:

haproxy_server_limit_sessions
haproxy_server_queue_limit
haproxy_server_check_failures_total
haproxy_server_check_up_down_total
haproxy_server_downtime_seconds_total
haproxy_server_current_throttle
haproxy_server_idle_connections_limit

depending on cases, if the limit was not configured or enabled, NaN is
returned instead. It should not be an issue for users, even better than
before as it provides more precise info.

Signed-off-by: William Dauchy 
---
 .../prometheus-exporter/service-prometheus.c  | 143 +-
 1 file changed, 7 insertions(+), 136 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 12fe3ee90..364d4b6d1 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -799,8 +799,8 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
struct htx *htx)
struct channel *chn = si_ic(appctx->owner);
struct ist out = ist2(trash.area, 0);
size_t max = htx_get_max_blksz(htx, channel_htx_recv_max(chn, htx));
+   struct field *stats = stat_l[STATS_DOMAIN_PROXY];
int ret = 1;
-   uint32_t weight;
double secs;
 
for (;appctx->st2 < ST_F_TOTAL_FIELDS; appctx->st2++) {
@@ -817,6 +817,9 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
struct htx *htx)
while (appctx->ctx.stats.obj2) {
sv = appctx->ctx.stats.obj2;
 
+   if (!stats_fill_sv_stats(px, sv, 0, stats, 
ST_F_TOTAL_FIELDS, &(appctx->st2)))
+   return -1;
+
if ((appctx->ctx.stats.flags & 
PROMEX_FL_NO_MAINT_SRV) && (sv->cur_admin & SRV_ADMF_MAINT))
goto next_sv;
 
@@ -824,39 +827,6 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
struct htx *htx)
case ST_F_STATUS:
val = mkf_u32(FO_STATUS, 
promex_srv_status(sv));
break;
-   case ST_F_SCUR:
-   val = mkf_u32(0, sv->cur_sess);
-   break;
-   case ST_F_SMAX:
-   val = mkf_u32(FN_MAX, 
sv->counters.cur_sess_max);
-   break;
-   case ST_F_SLIM:
-   val = 
mkf_u32(FO_CONFIG|FN_LIMIT, sv->maxconn);
-   break;
-   case ST_F_STOT:
-   val = mkf_u64(FN_COUNTER, 
sv->counters.cum_sess);
-   break;
-   case ST_F_RATE_MAX:
-   val = mkf_u32(FN_MAX, 
sv->counters.sps_max);
-   break;
-   case ST_F_LASTSESS:
-   val = mkf_s32(FN_AGE, 
srv_lastsession(sv));
-   break;
-   case ST_F_QCUR:
-   val = mkf_u32(0, sv->nbpend);
-   break;
-   case ST_F_QMAX:
-   val = mkf_u32(FN_MAX, 
sv->counters.nbpend_max);
-   break;
-   case ST_F_QLIMIT:
-   val = 
mkf_u32(FO_CONFIG|FS_SERVICE, sv->maxqueue);
-   break;
-   case ST_F_BIN:
-   val = mkf_u64(FN_COUNTER, 
sv->counters.bytes_in);
-   break;
-   case ST_F_BOUT:
-   val = mkf_u64(FN_COUNTER, 
sv->counters.bytes_out);
-   break;
case ST_F_QTIME:
secs = 
(double)swrate_avg(sv->counters.q_time, TIME_STATS_SAMPLES) / 1000.0;
val = mkf_flt(FN_AVG, secs);
@@ -889,43 +859,6 @@ static int promex_dump_srv_metrics(struct appctx *appctx, 
str

[PATCH 5/6] MEDIUM: stats: allow to select one field in `stats_fill_sv_stats`

2021-01-22 Thread William Dauchy
prometheus approach requires to output all values for a given metric
name; meaning we iterate through all metrics, and then iterate in the
inner loop on all objects for this metric.
In order to allow more code reuse, adapt the stats API to be able to
select one field or fill them all otherwise.
This patch follows what has already been done on frontend and backend
side.
>From this patch it should be possible to remove most of the duplicate
code on prometheuse side for the server.

A few things to note though:
- state require prior calculation, so I moved that to a sort of helper
  `stats_fill_be_stats_computestate`.
- all ST_F*TIME fields requires some minor compute, so I moved it at te
  beginning of the function under a condition.

Signed-off-by: William Dauchy 
---
 include/haproxy/stats.h |   2 +-
 src/hlua_fcn.c  |   3 +-
 src/stats.c | 585 +---
 3 files changed, 368 insertions(+), 222 deletions(-)

diff --git a/include/haproxy/stats.h b/include/haproxy/stats.h
index 6c04f2761..6115dca8f 100644
--- a/include/haproxy/stats.h
+++ b/include/haproxy/stats.h
@@ -51,7 +51,7 @@ int stats_fill_fe_stats(struct proxy *px, struct field 
*stats, int len,
 int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags,
 struct field *stats, int len);
 int stats_fill_sv_stats(struct proxy *px, struct server *sv, int flags,
-struct field *stats, int len);
+struct field *stats, int len, enum stat_field 
*selected_field);
 int stats_fill_be_stats(struct proxy *px, int flags, struct field *stats, int 
len,
enum stat_field *selected_field);
 
diff --git a/src/hlua_fcn.c b/src/hlua_fcn.c
index d13c5344f..6dd9efb21 100644
--- a/src/hlua_fcn.c
+++ b/src/hlua_fcn.c
@@ -922,7 +922,8 @@ int hlua_server_get_stats(lua_State *L)
return 1;
}
 
-   stats_fill_sv_stats(srv->proxy, srv, STAT_SHLGNDS, stats, STATS_LEN);
+   stats_fill_sv_stats(srv->proxy, srv, STAT_SHLGNDS, stats,
+   STATS_LEN, NULL);
 
lua_newtable(L);
for (i=0; i with the server statistics.  is
- * preallocated array of length . The length of the array
- * must be at least ST_F_TOTAL_FIELDS. If this length is less
- * then this value, the function returns 0, otherwise, it
- * returns 1.  can take the value STAT_SHLGNDS.
+/* Compute server state helper
  */
-int stats_fill_sv_stats(struct proxy *px, struct server *sv, int flags,
-struct field *stats, int len)
+static void stats_fill_sv_stats_computestate(struct server *sv, struct server 
*via,
+struct server *ref, enum 
srv_stats_state *state)
 {
-   struct server *via, *ref;
-   char str[INET6_ADDRSTRLEN];
-   struct buffer *out = get_trash_chunk();
-   enum srv_stats_state state;
-   char *fld_status;
-   long long srv_samples_counter;
-   unsigned int srv_samples_window = TIME_STATS_SAMPLES;
-
-   if (len < ST_F_TOTAL_FIELDS)
-   return 0;
-
-   /* we have "via" which is the tracked server as described in the 
configuration,
-* and "ref" which is the checked server and the end of the chain.
-*/
-   via = sv->track ? sv->track : sv;
-   ref = via;
-   while (ref->track)
-   ref = ref->track;
-
if (sv->cur_state == SRV_ST_RUNNING || sv->cur_state == 
SRV_ST_STARTING) {
if ((ref->check.state & CHK_ST_ENABLED) &&
(ref->check.health < ref->check.rise + ref->check.fall - 
1)) {
-   state = SRV_STATS_STATE_UP_GOING_DOWN;
+   *state = SRV_STATS_STATE_UP_GOING_DOWN;
} else {
-   state = SRV_STATS_STATE_UP;
+   *state = SRV_STATS_STATE_UP;
}
 
if (sv->cur_admin & SRV_ADMF_DRAIN) {
if (ref->agent.state & CHK_ST_ENABLED)
-   state = SRV_STATS_STATE_DRAIN_AGENT;
-   else if (state == SRV_STATS_STATE_UP_GOING_DOWN)
-   state = SRV_STATS_STATE_DRAIN_GOING_DOWN;
+   *state = SRV_STATS_STATE_DRAIN_AGENT;
+   else if (*state == SRV_STATS_STATE_UP_GOING_DOWN)
+   *state = SRV_STATS_STATE_DRAIN_GOING_DOWN;
else
-   state = SRV_STATS_STATE_DRAIN;
+   *state = SRV_STATS_STATE_DRAIN;
}
 
-   if (state == SRV_STATS_STATE_UP && !(ref->check.state & 
CHK_ST_ENABLED)) {
-   state = SRV_STATS_STATE_NO_CHECK;
+   if (*state == SRV_STATS_STAT

[PATCH 3/6] MEDIUM: stats: allow to select one field in `stats_fill_be_stats`

2021-01-22 Thread William Dauchy
prometheus approach requires to output all values for a given metric
name; meaning we iterate through all metrics, and then iterate in the
inner loop on all objects for this metric.
In order to allow more code reuse, adapt the stats API to be able to
select one field or fill them all otherwise.
This patch follows what has already been done on frontend side.
>From this patch it should be possible to remove most of the duplicate
code on prometheuse side for the backend

A few things to note though:
- status and uweight field requires prior compute, so I moved that to a
  sort of helper `stats_fill_be_stats_computesrv`.
- all ST_F*TIME fields requires some minor compute, so I moved it at te
  beginning of the function under a condition.

Signed-off-by: William Dauchy 
---
 include/haproxy/stats.h |   3 +-
 src/hlua_fcn.c  |   2 +-
 src/stats.c | 357 
 3 files changed, 257 insertions(+), 105 deletions(-)

diff --git a/include/haproxy/stats.h b/include/haproxy/stats.h
index 8210367ac..6c04f2761 100644
--- a/include/haproxy/stats.h
+++ b/include/haproxy/stats.h
@@ -52,7 +52,8 @@ int stats_fill_li_stats(struct proxy *px, struct listener *l, 
int flags,
 struct field *stats, int len);
 int stats_fill_sv_stats(struct proxy *px, struct server *sv, int flags,
 struct field *stats, int len);
-int stats_fill_be_stats(struct proxy *px, int flags, struct field *stats, int 
len);
+int stats_fill_be_stats(struct proxy *px, int flags, struct field *stats, int 
len,
+   enum stat_field *selected_field);
 
 void stats_io_handler(struct stream_interface *si);
 int stats_emit_raw_data_field(struct buffer *out, const struct field *f);
diff --git a/src/hlua_fcn.c b/src/hlua_fcn.c
index aab864370..d13c5344f 100644
--- a/src/hlua_fcn.c
+++ b/src/hlua_fcn.c
@@ -1341,7 +1341,7 @@ int hlua_proxy_get_stats(lua_State *L)
 
px = hlua_check_proxy(L, 1);
if (px->cap & PR_CAP_BE)
-   stats_fill_be_stats(px, STAT_SHLGNDS, stats, STATS_LEN);
+   stats_fill_be_stats(px, STAT_SHLGNDS, stats, STATS_LEN, NULL);
else
stats_fill_fe_stats(px, stats, STATS_LEN, NULL);
lua_newtable(L);
diff --git a/src/stats.c b/src/stats.c
index 161fc6548..81b368663 100644
--- a/src/stats.c
+++ b/src/stats.c
@@ -2259,130 +2259,281 @@ static int stats_dump_sv_stats(struct 
stream_interface *si, struct proxy *px, st
return stats_dump_one_line(stats, stats_count, appctx);
 }
 
-/* Fill  with the backend statistics.  is
- * preallocated array of length . The length of the array
- * must be at least ST_F_TOTAL_FIELDS. If this length is less
- * then this value, the function returns 0, otherwise, it
- * returns 1.  can take the value STAT_SHLGNDS.
+/* Helper to compute srv values for a given backend
  */
-int stats_fill_be_stats(struct proxy *px, int flags, struct field *stats, int 
len)
+static void stats_fill_be_stats_computesrv(struct proxy *px, int *nbup, int 
*nbsrv, int *totuw)
 {
-   long long be_samples_counter;
-   unsigned int be_samples_window = TIME_STATS_SAMPLES;
-   struct buffer *out = get_trash_chunk();
const struct server *srv;
-   int nbup, nbsrv;
-   int totuw;
-   char *fld;
-
-   if (len < ST_F_TOTAL_FIELDS)
-   return 0;
 
-   totuw = 0;
-   nbup = nbsrv = 0;
+   *nbup = *nbsrv = *totuw = 0;
for (srv = px->srv; srv; srv = srv->next) {
if (srv->cur_state != SRV_ST_STOPPED) {
nbup++;
if (srv_currently_usable(srv) &&
(!px->srv_act ^ !(srv->flags & SRV_F_BACKUP)))
-   totuw += srv->uweight;
+   *totuw += srv->uweight;
}
nbsrv++;
}
 
HA_RWLOCK_RDLOCK(LBPRM_LOCK, >lbprm.lock);
if (!px->srv_act && px->lbprm.fbck)
-   totuw = px->lbprm.fbck->uweight;
+   *totuw = px->lbprm.fbck->uweight;
HA_RWLOCK_RDUNLOCK(LBPRM_LOCK, >lbprm.lock);
+}
 
-   stats[ST_F_PXNAME]   = mkf_str(FO_KEY|FN_NAME|FS_SERVICE, px->id);
-   stats[ST_F_SVNAME]   = mkf_str(FO_KEY|FN_NAME|FS_SERVICE, "BACKEND");
-   stats[ST_F_MODE] = mkf_str(FO_CONFIG|FS_SERVICE, 
proxy_mode_str(px->mode));
-   stats[ST_F_QCUR] = mkf_u32(0, px->nbpend);
-   stats[ST_F_QMAX] = mkf_u32(FN_MAX, px->be_counters.nbpend_max);
-   stats[ST_F_SCUR] = mkf_u32(0, px->beconn);
-   stats[ST_F_SMAX] = mkf_u32(FN_MAX, px->be_counters.conn_max);
-   stats[ST_F_SLIM] = mkf_u32(FO_CONFIG|FN_LIMIT, px->fullconn);
-   stats[ST_F_STOT] = mkf_u64(FN_COUNTER, px->be_counters.cum_conn);
-   stats[ST_F_BIN]  = mkf_u64(FN_COUNTER, px->be_counter

[PATCH 2/6] CLEANUP: stats: improve field selection for frontend http fields

2021-01-22 Thread William Dauchy
while working on backend/servers I realised I could have written that in
a better way and avoid one extra break. This is slightly improving
readiness.
also while being here, fix function declaration which was not 100%
accurate.

this patch does not change the behaviour of the code.

Signed-off-by: William Dauchy 
---
 include/haproxy/stats.h |  2 +-
 src/stats.c | 45 +
 2 files changed, 19 insertions(+), 28 deletions(-)

diff --git a/include/haproxy/stats.h b/include/haproxy/stats.h
index eb72446ae..8210367ac 100644
--- a/include/haproxy/stats.h
+++ b/include/haproxy/stats.h
@@ -47,7 +47,7 @@ int stats_dump_one_line(const struct field *stats, size_t 
stats_count, struct ap
 
 int stats_fill_info(struct field *info, int len);
 int stats_fill_fe_stats(struct proxy *px, struct field *stats, int len,
-   enum stat_field *field);
+   enum stat_field *selected_field);
 int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags,
 struct field *stats, int len);
 int stats_fill_sv_stats(struct proxy *px, struct server *sv, int flags,
diff --git a/src/stats.c b/src/stats.c
index e1b350a44..161fc6548 100644
--- a/src/stats.c
+++ b/src/stats.c
@@ -1712,49 +1712,40 @@ int stats_fill_fe_stats(struct proxy *px, struct field 
*stats, int len,
metric = mkf_u64(FN_COUNTER, 
px->fe_counters.internal_errors);
break;
case ST_F_HRSP_1XX:
-   if (px->mode != PR_MODE_HTTP)
-   break;
-   metric = mkf_u64(FN_COUNTER, 
px->fe_counters.p.http.rsp[1]);
+   if (px->mode == PR_MODE_HTTP)
+   metric = mkf_u64(FN_COUNTER, 
px->fe_counters.p.http.rsp[1]);
break;
case ST_F_HRSP_2XX:
-   if (px->mode != PR_MODE_HTTP)
-   break;
-   metric = mkf_u64(FN_COUNTER, 
px->fe_counters.p.http.rsp[2]);
+   if (px->mode == PR_MODE_HTTP)
+   metric = mkf_u64(FN_COUNTER, 
px->fe_counters.p.http.rsp[2]);
break;
case ST_F_HRSP_3XX:
-   if (px->mode != PR_MODE_HTTP)
-   break;
-   metric = mkf_u64(FN_COUNTER, 
px->fe_counters.p.http.rsp[3]);
+   if (px->mode == PR_MODE_HTTP)
+   metric = mkf_u64(FN_COUNTER, 
px->fe_counters.p.http.rsp[3]);
break;
case ST_F_HRSP_4XX:
-   if (px->mode != PR_MODE_HTTP)
-   break;
-   metric = mkf_u64(FN_COUNTER, 
px->fe_counters.p.http.rsp[4]);
+   if (px->mode == PR_MODE_HTTP)
+   metric = mkf_u64(FN_COUNTER, 
px->fe_counters.p.http.rsp[4]);
break;
case ST_F_HRSP_5XX:
-   if (px->mode != PR_MODE_HTTP)
-   break;
-   metric = mkf_u64(FN_COUNTER, 
px->fe_counters.p.http.rsp[5]);
+   if (px->mode == PR_MODE_HTTP)
+   metric = mkf_u64(FN_COUNTER, 
px->fe_counters.p.http.rsp[5]);
break;
case ST_F_HRSP_OTHER:
-   if (px->mode != PR_MODE_HTTP)
-   break;
-   metric = mkf_u64(FN_COUNTER, 
px->fe_counters.p.http.rsp[0]);
+   if (px->mode == PR_MODE_HTTP)
+   metric = mkf_u64(FN_COUNTER, 
px->fe_counters.p.http.rsp[0]);
break;
case ST_F_INTERCEPTED:
-   if (px->mode != PR_MODE_HTTP)
-   break;
-   metric = mkf_u64(FN_COUNTER, 
px->fe_counters.intercepted_req);
+   if (px->mode == PR_MODE_HTTP)
+   metric = mkf_u64(FN_COUNTER, 
px->fe_counters.intercepted_req);
break;
case ST_F_CACHE_LOOKUPS:
-   if (px->mode != PR_MODE_HTTP)
-   break;
-   metric = mkf_u64(FN_COUNTER, 
px->fe

[PATCH 4/6] MINOR: contrib/prometheus-exporter: use fill_be_stats for backend dump

2021-01-22 Thread William Dauchy
use `stats_fill_be_stats` when possible to avoid duplicating code; make
use of field selector to get the needed field only.

the only difference is on `haproxy_backend_downtime_seconds_total` as
stats.c is testing `px->srv`. This behaviour is present since commit
7344f4789321ef8ce2ce17cf73dabd672f7c8c6 ("MINOR: stats: only report
backend's down time if it has servers"). The end result is a NaN instead
of a zero when no server are present.

Signed-off-by: William Dauchy 
---
 .../prometheus-exporter/service-prometheus.c  | 163 ++
 1 file changed, 15 insertions(+), 148 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index bc9ebf3fc..12fe3ee90 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -685,8 +685,8 @@ static int promex_dump_back_metrics(struct appctx *appctx, 
struct htx *htx)
struct channel *chn = si_ic(appctx->owner);
struct ist out = ist2(trash.area, 0);
size_t max = htx_get_max_blksz(htx, channel_htx_recv_max(chn, htx));
+   struct field *stats = stat_l[STATS_DOMAIN_PROXY];
int ret = 1;
-   uint32_t weight;
double secs;
 
for (;appctx->st2 < ST_F_TOTAL_FIELDS; appctx->st2++) {
@@ -700,46 +700,13 @@ static int promex_dump_back_metrics(struct appctx 
*appctx, struct htx *htx)
if (px->disabled || px->uuid <= 0 || !(px->cap & 
PR_CAP_BE))
goto next_px;
 
+   if (!stats_fill_be_stats(px, 0, stats, 
ST_F_TOTAL_FIELDS, &(appctx->st2)))
+   return -1;
+
switch (appctx->st2) {
case ST_F_STATUS:
val = mkf_u32(FO_STATUS, 
(px->lbprm.tot_weight > 0 || !px->srv) ? 1 : 0);
break;
-   case ST_F_SCUR:
-   val = mkf_u32(0, px->beconn);
-   break;
-   case ST_F_SMAX:
-   val = mkf_u32(FN_MAX, 
px->be_counters.conn_max);
-   break;
-   case ST_F_SLIM:
-   val = mkf_u32(FO_CONFIG|FN_LIMIT, 
px->fullconn);
-   break;
-   case ST_F_STOT:
-   val = mkf_u64(FN_COUNTER, 
px->be_counters.cum_conn);
-   break;
-   case ST_F_RATE_MAX:
-   val = mkf_u32(0, 
px->be_counters.sps_max);
-   break;
-   case ST_F_LASTSESS:
-   val = mkf_s32(FN_AGE, 
be_lastsession(px));
-   break;
-   case ST_F_QCUR:
-   val = mkf_u32(0, px->nbpend);
-   break;
-   case ST_F_QMAX:
-   val = mkf_u32(FN_MAX, 
px->be_counters.nbpend_max);
-   break;
-   case ST_F_CONNECT:
-   val = mkf_u64(FN_COUNTER, 
px->be_counters.connect);
-   break;
-   case ST_F_REUSE:
-   val = mkf_u64(FN_COUNTER, 
px->be_counters.reuse);
-   break;
-   case ST_F_BIN:
-   val = mkf_u64(FN_COUNTER, 
px->be_counters.bytes_in);
-   break;
-   case ST_F_BOUT:
-   val = mkf_u64(FN_COUNTER, 
px->be_counters.bytes_out);
-   break;
case ST_F_QTIME:
secs = 
(double)swrate_avg(px->be_counters.q_time, TIME_STATS_SAMPLES) / 1000.0;
val = mkf_flt(FN_AVG, secs);
@@ -772,139 +739,39 @@ static int promex_dump_back_metrics(struct appctx 
*appctx, struct htx *htx)
secs = 
(double)px->be_counters.ttime_max / 1000.0;
val = mkf_flt(FN_MAX, secs);
break;
-   case ST_F_DREQ:
-   val = mkf_u64(FN_COUNTER, 
px->be_counters.denied_req);
-

[PATCH 0/6] backend and server stats reuse

2021-01-22 Thread William Dauchy
Hi Christopher,

This is a new series which follows the work done on frontend side.

It starts with two minor patches which are more comestic; then it
addresses backend and servers. I made sure to not reproduce the same
mistake for the default case you just fixed on frontends.
There is probably a few controversial points where I did not know how we
could handle prior calculation when it is needed for several fields. So
I ended up putting them in a condition testing fields where it is
needed. It is a bit ugly I admit but I'm open for suggestions.

After this patch series, I have a few other things in queue:
- an attempt to better reuse already existing description on stats.c
- first attempt to adapt labels with state and health check
but I prefer merging things along the way to avoid huge reviews.

William Dauchy (6):
  MINOR: contrib/prometheus-exporter: better output of Not-a-Number
  CLEANUP: stats: improve field selection for frontend http fields
  MEDIUM: stats: allow to select one field in `stats_fill_be_stats`
  MINOR: contrib/prometheus-exporter: use fill_be_stats for backend dump
  MEDIUM: stats: allow to select one field in `stats_fill_sv_stats`
  MINOR: contrib/prometheus-exporter: use fill_sv_stats for server dump

 .../prometheus-exporter/service-prometheus.c  | 314 +-
 include/haproxy/stats.h   |   7 +-
 src/hlua_fcn.c|   5 +-
 src/stats.c   | 987 +++---
 4 files changed, 670 insertions(+), 643 deletions(-)

-- 
2.29.2




[PATCH 1/6] MINOR: contrib/prometheus-exporter: better output of Not-a-Number

2021-01-22 Thread William Dauchy
Not necessarily mandatory but I saw a few prometheus client parsing only
`NaN`. Also most librarries do output `NaN`

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/service-prometheus.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index aa72382d9..bc9ebf3fc 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -415,7 +415,7 @@ static int promex_srv_status(struct server *sv)
 
 /* Convert a field to its string representation and write it in , followed
  * by a newline, if there is enough space. non-numeric value are converted in
- * "Nan" because Prometheus only support numerical values (but it is 
unexepceted
+ * "NaN" because Prometheus only support numerical values (but it is 
unexepceted
  * to process this kind of value). It returns 1 on success. Otherwise, it
  * returns 0. The buffer's length must not exceed  value.
  */
@@ -424,14 +424,14 @@ static int promex_metric_to_str(struct buffer *out, 
struct field *f, size_t max)
int ret = 0;
 
switch (field_format(f, 0)) {
-   case FF_EMPTY: ret = chunk_strcat(out,  "Nan\n"); break;
+   case FF_EMPTY: ret = chunk_strcat(out,  "NaN\n"); break;
case FF_S32:   ret = chunk_appendf(out, "%d\n", f->u.s32); 
break;
case FF_U32:   ret = chunk_appendf(out, "%u\n", f->u.u32); 
break;
case FF_S64:   ret = chunk_appendf(out, "%lld\n", (long 
long)f->u.s64); break;
case FF_U64:   ret = chunk_appendf(out, "%llu\n", (unsigned 
long long)f->u.u64); break;
case FF_FLT:   ret = chunk_appendf(out, "%f\n", f->u.flt); 
break;
-   case FF_STR:   ret = chunk_strcat(out,  "Nan\n"); break;
-   default:   ret = chunk_strcat(out,  "Nan\n"); break;
+   case FF_STR:   ret = chunk_strcat(out,  "NaN\n"); break;
+   default:   ret = chunk_strcat(out,  "NaN\n"); break;
}
if (!ret || out->data > max)
return 0;
-- 
2.29.2




Re: [PATCH] Prometheus: Simplify metrics matrices definition

2021-01-21 Thread William Dauchy
Hey Christopher,

On Wed, Jan 20, 2021 at 7:45 PM Willy Tarreau  wrote:
> > I will continue my work once you have merged this series.
> Great, I'll let you guys deal with this together :-)

sorry to be a bit impatient but is it possible to push those things so
I can continue? :-)

Best,
-- 
William



Re: [PATCH 0/3] prometheus info fields description merge

2021-01-21 Thread William Dauchy
On Thu, Jan 21, 2021 at 9:39 AM Willy Tarreau  wrote:
> I didn't understand that it was a question. Well, as I mentioned previously,
> we could have units stored in the stats structure so that the promex code
> automatically performs the conversion. This would make everything more
> homogenous I guess.

Do you mean we could rename the "MB" metrics instead of adding new
ones? I added new ones as I did not know how to not break the existing
`stats.c`, even though some transition could be needed at some point
to avoid having those duplicates.
-- 
William



Re: [PATCH 0/3] prometheus info fields description merge

2021-01-21 Thread William Dauchy
On Wed, Jan 20, 2021 at 4:33 PM William Dauchy  wrote:
> On Wed, Jan 20, 2021 at 4:08 PM Christopher Faulet  
> wrote:
> > Sorry for the delay. I reviewed your patches and that seems good for me. In
> > fact, I was first a bit annoyed by the first one, because it adds 3 new 
> > fields
> > just to change the unit (MB vs bytes). But the unit appears in the metric 
> > name
> > so I guess it is ok this way. We must just be careful to not add too many
> > specific fields. And, at the end, it is a great cleanup.
>
> I was also puzzled but could not find a better way; I think we did
> this workaround in the past but that does not solve the issue in the
> long term. I also wondered whether we could deprecate the MB one but I
> already know the answer I believe ;)

Any thoughts about this Willy?
-- 
William



Re: [PATCH] Prometheus: Simplify metrics matrices definition

2021-01-20 Thread William Dauchy
On Wed, Jan 20, 2021 at 07:35:11PM +0100, William Dauchy wrote:
> I have not tested them but from what I see it should really much change

I meant "it should not" ;)

> the current output, even the order. That being said, it looks definitely
> simpler to me.
> I will continue my work once you have merged this series.
> -- 
> William



  1   2   3   4   >