Re: [PATCH] improve ssl guarding

2021-02-07 Thread William Lallemand
On Sat, Feb 06, 2021 at 09:18:30PM +0500, Илья Шипицин wrote:
> you are right.
> I've fixed it.
>

Thanks, both pushed in master.

-- 
William Lallemand



Re: [ANNOUNCE] haproxy-2.4-dev7

2021-02-07 Thread Willy Tarreau
On Sun, Feb 07, 2021 at 09:08:07PM +0100, William Dauchy wrote:
> Willy: it is probably a wise idea to keep this for the 2.4 final release
> notes; some people will want to know about it when they update, and a
> lot of people have production alerts built on those metrics.

Noted, I hope I won't forget :-)  Thanks for the detailed explanation
William!

Willy



Re: [ANNOUNCE] haproxy-2.4-dev7

2021-02-07 Thread William Dauchy
On Fri, Feb 5, 2021 at 4:14 PM Willy Tarreau  wrote:
> HAProxy 2.4-dev7 was released on 2021/02/05. It added 153 new commits
> after version 2.4-dev6.
>   - Some significant lifting was done to the Prometheus exporter, including
> new fields, better descriptions and some filtering. I've seen quite a
> bunch of these pass in front of me but don't fully understand what they do;
> all that interests me is that some users are happy with these changes, so I
> guess they were long awaited :-)

About that, please note two breaking changes:

- an object's status is no longer a gauge value which you need to
translate manually; instead the state is a label with a proper string
value, and the value of the metric simply indicates whether that state
is the active one or not.
So we went from:
  haproxy_server_status{proxy="be_foo",server="srv0"} 2
(which meant MAINT)
to:
  haproxy_server_status{proxy="be_foo",server="srv0",state="DOWN"} 0
  haproxy_server_status{proxy="be_foo",server="srv0",state="UP"} 0
  haproxy_server_status{proxy="be_foo",server="srv0",state="MAINT"} 1
  haproxy_server_status{proxy="be_foo",server="srv0",state="DRAIN"} 0
  haproxy_server_status{proxy="be_foo",server="srv0",state="NOLB"} 0

This change applies to frontends, backends and servers, each with its own set of label values.

- a similar change was made to health checks, which now carry a state label:

haproxy_server_check_status{proxy="be_foo",server="srv0",state="HANA"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="SOCKERR"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L4OK"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L4TOUT"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L4CON"} 1
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L6OK"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L6TOUT"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L6RSP"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7TOUT"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7RSP"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7OK"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7OKC"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="L7STS"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="PROCERR"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="PROCTOUT"} 0
haproxy_server_check_status{proxy="be_foo",server="srv0",state="PROCOK"} 0


It means:
* a lot more metrics for large setups (but you can still filter as
explained in the doc)
* easier use on the Prometheus side: you can now group per state very
easily (see the sketch below).
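
For instance, counting servers per state becomes a single aggregation on
the new label; in PromQL, using only the metric names shown above, such
queries could look like this:

  # number of servers currently in each state, per backend
  count by (proxy, state) (haproxy_server_status == 1)

  # servers whose last health check timed out at layer 4
  haproxy_server_check_status{state="L4TOUT"} == 1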

Generally speaking I'm very interested in feedback regarding this
change. It was motivated by the usage I saw in several companies,
where people were struggling to make use of those metrics.

Willy: it is probably a wise idea to keep this for the 2.4 final release
notes; some people will want to know about it when they update, and a
lot of people have production alerts built on those metrics.

Thanks,
-- 
William



[PATCH 2/2] MEDIUM: contrib/prometheus-exporter: export base stick table stats

2021-02-07 Thread William Dauchy
I saw some people falling back to the unix socket to collect data they
could not find in the prometheus exporter. One example is the base info
from stick tables (used/size).
I do not plan to extend this further for now; keys are quite a mess to
handle.

This should resolve github issue #1008.

Signed-off-by: William Dauchy 
---
 contrib/prometheus-exporter/README|  10 ++
 .../prometheus-exporter/service-prometheus.c  | 148 +++---
 reg-tests/contrib/prometheus.vtc  |   4 +
 3 files changed, 142 insertions(+), 20 deletions(-)
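
As a usage sketch on the Prometheus side (assuming both metrics end up
exported with the same label set identifying each table), the fill ratio
of a stick table can then be derived with a simple division in PromQL:

  # stick table fill ratio, one series per table
  haproxy_sticktable_used / haproxy_sticktable_size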

diff --git a/contrib/prometheus-exporter/README b/contrib/prometheus-exporter/README
index a85981597..d882b092f 100644
--- a/contrib/prometheus-exporter/README
+++ b/contrib/prometheus-exporter/README
@@ -72,6 +72,7 @@ exported. Here are examples:
   /metrics?scope=frontend&scope=backend  # ==> Frontend and backend metrics will be exported
   /metrics?scope=*&scope=                # ==> no metrics will be exported
   /metrics?scope=&scope=global           # ==> global metrics will be exported
+  /metrics?scope=sticktable              # ==> stick tables metrics will be exported
 
 * How do I prevent my prometheus instance to explode?
 
@@ -320,3 +321,12 @@ See prometheus export for the description of each field.
 | haproxy_server_need_connections_current       |
 | haproxy_server_uweight                        |
 +------------------------------------------------+
+
+* Stick table metrics
+
++------------------------------------------------+
+|    Metric name                                 |
++------------------------------------------------+
+| haproxy_sticktable_size                        |
+| haproxy_sticktable_used                        |
++------------------------------------------------+
diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c
index 769389735..521fe1056 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -47,28 +47,33 @@ enum {
 
 /* Prometheus exporter dumper states (appctx->st1) */
 enum {
-PROMEX_DUMPER_INIT = 0, /* initialized */
-PROMEX_DUMPER_GLOBAL,   /* dump metrics of globals */
-PROMEX_DUMPER_FRONT,/* dump metrics of frontend proxies */
-PROMEX_DUMPER_BACK, /* dump metrics of backend proxies */
-PROMEX_DUMPER_LI,   /* dump metrics of listeners */
-PROMEX_DUMPER_SRV,  /* dump metrics of servers */
-   PROMEX_DUMPER_DONE, /* finished */
+   PROMEX_DUMPER_INIT = 0,   /* initialized */
+   PROMEX_DUMPER_GLOBAL, /* dump metrics of globals */
+   PROMEX_DUMPER_FRONT,  /* dump metrics of frontend proxies */
+   PROMEX_DUMPER_BACK,   /* dump metrics of backend proxies */
+   PROMEX_DUMPER_LI, /* dump metrics of listeners */
+   PROMEX_DUMPER_SRV,/* dump metrics of servers */
+   PROMEX_DUMPER_STICKTABLE, /* dump metrics of stick tables */
+   PROMEX_DUMPER_DONE,   /* finished */
 };
 
 /* Prometheus exporter flags (appctx->ctx.stats.flags) */
-#define PROMEX_FL_METRIC_HDR0x0001
-#define PROMEX_FL_INFO_METRIC   0x0002
-#define PROMEX_FL_FRONT_METRIC  0x0004
-#define PROMEX_FL_BACK_METRIC   0x0008
-#define PROMEX_FL_SRV_METRIC0x0010
-#define PROMEX_FL_SCOPE_GLOBAL  0x0020
-#define PROMEX_FL_SCOPE_FRONT   0x0040
-#define PROMEX_FL_SCOPE_BACK0x0080
-#define PROMEX_FL_SCOPE_SERVER  0x0100
-#define PROMEX_FL_NO_MAINT_SRV  0x0200
-
-#define PROMEX_FL_SCOPE_ALL (PROMEX_FL_SCOPE_GLOBAL|PROMEX_FL_SCOPE_FRONT|PROMEX_FL_SCOPE_BACK|PROMEX_FL_SCOPE_SERVER)
+#define PROMEX_FL_METRIC_HDR0x0001
+#define PROMEX_FL_INFO_METRIC   0x0002
+#define PROMEX_FL_FRONT_METRIC  0x0004
+#define PROMEX_FL_BACK_METRIC   0x0008
+#define PROMEX_FL_SRV_METRIC0x0010
+#define PROMEX_FL_SCOPE_GLOBAL  0x0020
+#define PROMEX_FL_SCOPE_FRONT   0x0040
+#define PROMEX_FL_SCOPE_BACK0x0080
+#define PROMEX_FL_SCOPE_SERVER  0x0100
+#define PROMEX_FL_NO_MAINT_SRV  0x0200
+#define PROMEX_FL_STICKTABLE_METRIC 0x0400
+#define PROMEX_FL_SCOPE_STICKTABLE  0x0800
+
+#define PROMEX_FL_SCOPE_ALL (PROMEX_FL_SCOPE_GLOBAL | PROMEX_FL_SCOPE_FRONT | \
+PROMEX_FL_SCOPE_BACK | PROMEX_FL_SCOPE_SERVER | \
+PROMEX_FL_SCOPE_STICKTABLE)
 
 /* Promtheus metric type (gauge or counter) */
 enum promex_mt_type {
@@ -298,6 +303,25 @@ const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = {
 [ST_F_TT_MAX] = IST("Maximum observed total request+response time (request+queue+connect+response+processing)"),
 };
 
+/* stick table base fields */
+enum sticktable_field {
+   STICKTABLE_SIZE = 0,
+   STICKTABLE_USED,
+   /* must always be the last one */
+   STICKTABLE_TOTAL_FIELDS
+};
+
+const struct 

[PATCH 1/2] MINOR: contrib/prometheus-exporter: use stats desc when possible followup

2021-02-07 Thread William Dauchy
Remove the remaining descriptions which are common to stats.c.

This patch is a followup of commit
82b2ce2f967d967139adb7afab064416fadad615 ("MINOR:
contrib/prometheus-exporter: use stats desc when possible"). I probably
messed up one of my rebases, because I'm pretty sure I removed them
at some point, but who knows what happened.

Signed-off-by: William Dauchy 
---
 .../prometheus-exporter/service-prometheus.c  | 35 ---
 1 file changed, 35 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c
index 126962f5e..769389735 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -284,42 +284,7 @@ const struct promex_metric promex_st_metrics[ST_F_TOTAL_FIELDS] = {
 
 /* Description of overriden stats fields */
 const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = {
-   [ST_F_PXNAME] = IST("The proxy name."),
-   [ST_F_SVNAME] = IST("The service name (FRONTEND for frontend, BACKEND for backend, any name for server/listener)."),
-   [ST_F_QCUR]   = IST("Current number of queued requests."),
-   [ST_F_QMAX]   = IST("Maximum observed number of queued requests."),
-   [ST_F_SCUR]   = IST("Current number of active sessions."),
-   [ST_F_SMAX]   = IST("Maximum observed number of active sessions."),
-   [ST_F_SLIM]   = IST("Configured session limit."),
-   [ST_F_STOT]   = IST("Total number of sessions."),
-   [ST_F_BIN]= IST("Current total of incoming bytes."),
-   [ST_F_BOUT]   = IST("Current total of outgoing bytes."),
-   [ST_F_DREQ]   = IST("Total number of denied requests."),
-   [ST_F_DRESP]  = IST("Total number of denied responses."),
-   [ST_F_EREQ]   = IST("Total number of request errors."),
-   [ST_F_ECON]   = IST("Total number of connection errors."),
-   [ST_F_ERESP]  = IST("Total number of response errors."),
-   [ST_F_WRETR]  = IST("Total number of retry warnings."),
-   [ST_F_WREDIS] = IST("Total number of redispatch warnings."),
[ST_F_STATUS] = IST("Current status of the service, per state label value."),
-   [ST_F_WEIGHT] = IST("Service weight."),
-   [ST_F_ACT]= IST("Current number of active servers."),
-   [ST_F_BCK]= IST("Current number of backup servers."),
-   [ST_F_CHKFAIL]= IST("Total number of failed check (Only counts checks failed when the server is up)."),
-   [ST_F_CHKDOWN]= IST("Total number of UP->DOWN transitions."),
-   [ST_F_LASTCHG]= IST("Number of seconds since the last UP<->DOWN transition."),
-   [ST_F_DOWNTIME]   = IST("Total downtime (in seconds) for the service."),
-   [ST_F_QLIMIT] = IST("Configured maxqueue for the server (0 meaning no limit)."),
-   [ST_F_PID]= IST("Process id (0 for first instance, 1 for second, ...)"),
-   [ST_F_IID]= IST("Unique proxy id."),
-   [ST_F_SID]= IST("Server id (unique inside a proxy)."),
-   [ST_F_THROTTLE]   = IST("Current throttle percentage for the server, when slowstart is active, or no value if not in slowstart."),
-   [ST_F_LBTOT]  = IST("Total number of times a service was selected, either for new sessions, or when redispatching."),
-   [ST_F_TRACKED]= IST("Id of proxy/server if tracking is enabled."),
-   [ST_F_TYPE]   = IST("Service type (0=frontend, 1=backend, 2=server, 3=socket/listener)."),
-   [ST_F_RATE]   = IST("Current number of sessions per second over last elapsed second."),
-   [ST_F_RATE_LIM]   = IST("Configured limit on new sessions per second."),
-   [ST_F_RATE_MAX]   = IST("Maximum observed number of sessions per second."),
[ST_F_CHECK_STATUS]   = IST("Status of last health check, per state label value."),
[ST_F_CHECK_CODE] = IST("layer5-7 code, if available of the last health check."),
[ST_F_CHECK_DURATION] = IST("Total duration of the latest server health check, in seconds."),
-- 
2.30.0




Configure peers on clusters with 20+ instances

2021-02-07 Thread Joao Morais


Hello list. I'm implementing peers in order to share rps and other metrics
between all instances of a haproxy cluster, so that I have a global view of
this data. Here is a snippet of my PoC, which simply does a request count:

global
localpeer h1
...
listen l1
...
http-request track-sc0 int(1) table p/t1
http-request set-var(req.gpc0) sc_inc_gpc0(0)
http-request set-var(req.gpc0) sc_get_gpc0(0,p/t2),add(req.gpc0)
http-request set-var(req.gpc0) sc_get_gpc0(0,p/t3),add(req.gpc0)
http-request return hdr x-out %[var(req.gpc0)]
peers p
bind :9001
log stdout format raw local0
server h1
server h2 127.0.0.1:9002
server h3 127.0.0.1:9003
table t1 type integer size 1 store gpc0
table t2 type integer size 1 store gpc0
table t3 type integer size 1 store gpc0

Our biggest cluster currently has 25 haproxy instances, meaning 25 tables per
instance, and 25 set-var + add() operations per request for each tracked
metric. On top of that, all 25 instances will share their 25 tables with each
of the other 24 instances. Building and maintaining such a configuration isn't
a problem at all because it's automated, but how does it scale? Starting from
how many instances should I change the approach and try, e.g., to elect a
controller that receives everything from everybody and delivers grouped data?
Any advice or best practice will be very much appreciated, thanks!

~jm