RE: Should server crt be considered as crt-list and handled via the runtime API?

2021-02-08 Thread Pierre Cheynier
Hi William!

On Mon, Feb 08 2021 15:49:02 +0100, William Lallemand wrote:
> Thanks to Rémi development we already have the server crt update
> available from the CLI in the 2.4 tree.

Wow, this proves that I haven't been following what's currently happening that closely...
Awesome, thanks!

> I'm not sure why you want this in the crt-list though, I think you meant
> "show ssl cert"? The crt-list are only useful to manage multiple
> certificates and SNIs on a bind line, in the case of a server line you
> only need one certificate.

Yes, that's most probably a misunderstanding on my side. As long as I get access
to the " ssl cert" API in the end, that's perfectly OK.
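
For reference, a minimal sketch of that 2.4 runtime workflow over the stats
socket; the set/commit pair is the documented transaction flow for
certificates, while the certificate and socket paths below are only examples:

# push the renewed PEM into a transaction, then commit it atomically
$ echo -e "set ssl cert /etc/haproxy/tls/srv.pem <<\n$(cat /path/to/renewed.pem)\n" | \
    socat stdio /var/lib/haproxy/stats
$ echo "commit ssl cert /etc/haproxy/tls/srv.pem" | socat stdio /var/lib/haproxy/stats
# check what is now loaded in memory
$ echo "show ssl cert /etc/haproxy/tls/srv.pem" | socat stdio /var/lib/haproxy/stats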

--
Pierre


Should server crt be considered as crt-list and handled via the runtime API?

2021-02-08 Thread Pierre Cheynier
I'm trying to figure out what would be missing to consider server crt-s as 
crt-lists (as in bind lines) so that they could be listed via "show ssl 
crt-list" APIs and also managed (essentially renewed) this way.

Example:
 backend foo-using-client-auth
 default-server check ssl crt /path/to/crt-list ca-file /path/to/my/ca.pem
 server srv0 192.0.2.1:80

I'd then like to manage this using:
  set ssl cert <filename> <payload>

The use-case being the following: when integrating with service mesh solutions 
such as consul-connect, you may want to reduce the disruption occurring when 
certificates are renewed.
And in that kind of solution, certificates are renewed quite often (once every few tens 
of hours).
In this case the memory space is already allocated, etc., so I (naively?) think 
it probably doesn't hurt too much.

What is your point-of-view?

--
Pierre


RE: [PATCH 1/9] MAJOR: contrib/prometheus-exporter: move health check status to labels

2021-02-01 Thread Pierre Cheynier

From: William Dauchy
Sent: Saturday, January 30, 2021, 16:21

> this is a follow up of commit c6464591a365bfcf509b322bdaa4d608c9395d75
> ("MAJOR: contrib/prometheus-exporter: move ftd/bkd/srv states to
> labels"). The main goal being to be better aligned with prometheus use
> cases in terms of queries. More specifically to health checks, Pierre C.
> mentioned the possible quirks he had to put in place in order to make
> use of those metrics through prometheus:
> 
>    by(proxy, check_status) (count_values by(proxy, instance) ("check_status", haproxy_server_check_status))

Indeed, we wanted to aggregate at backend level to produce a view which
represents the "number of servers per health status". Still, this was not ideal
since I'm getting the status code and not the status name.

> I am perfectly aware this introduces a lot more metrics but I don't see
> how we can improve the usability without it. The main issue remains in
> the cardinality of the states which are > 20. Prometheus recommends to
> stay below a cardinality of 10 for a given metric but I consider our
> case very specific, because highly linked to the level of precision
> haproxy exposes.
> 
> Even before this patch I saw several large production setup (a few
> hundreds of MB in output) which are making use of the scope parameter to
> simply ignore the server metrics, so that the scrapping can be faster,
> and memory consumed on client side not too high. So I believe we should
> eventually continue in that direction and offer more granularity of
> filtering of the output. That being said it is already possible to
> filter out the data on prometheus client side.

True, I think this is the right approach to represent such data in Prometheus.
It will create some challenges for people like us, who use server-templates
and create thousands of backends, but we'll have to deal with it.
In addition to this update, I would add some recommendations about the user
setup in the README ("how do I prevent my prometheus instance from exploding
when scraping this endpoint?").
For server-template users:
- 
  params:
    no-maint:
    - empty

Generally speaking, to prevent all server metrics from being saved except this one:
- 
  metric_relabel_configs:
  - source_labels: ['__name__']
    regex: 'haproxy_(process_|frontend_|backend_|server_check_status).*'
    action: keep
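
The same kind of filtering can also be sanity-checked at scrape time through
the exporter's query string, using the scope and no-maint parameters mentioned
above; a rough sketch, where the exporter address, port and path are
assumptions:

$ curl -s 'http://127.0.0.1:8404/metrics' | wc -l
$ curl -s 'http://127.0.0.1:8404/metrics?scope=global&scope=frontend&scope=backend' | wc -l  # no server metrics
$ curl -s 'http://127.0.0.1:8404/metrics?no-maint=empty' | wc -l                             # skip MAINT servers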

A long-term alternative (at least for my use-case) would be to provide this data
at backend-level, as initially suggested here:
https://www.mail-archive.com/haproxy@formilux.org/msg35369.html

> Signed-off-by: William Dauchy 
> ---

> @@ -319,7 +320,7 @@ const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = {
>  [ST_F_RATE]   = IST("Current number of sessions per second over last elapsed second."),
>  [ST_F_RATE_LIM]   = IST("Configured limit on new sessions per second."),
>  [ST_F_RATE_MAX]   = IST("Maximum observed number of sessions per second."),
> -   [ST_F_CHECK_STATUS]   = IST("Status of last health check (HCHK_STATUS_* values)."),
> +   [ST_F_CHECK_STATUS]   = IST("Status of last health check (0/1 depending on current `state` label value)."),

"Status of last health check, per state value" ?

--
Pierre



[PATCH] DOC: Add missing stats fields in the management doc

2020-10-08 Thread Pierre Cheynier
Added latest fields: idle_conn_cur, safe_conn_cur, used_conn_cur, need_conn_est
---
 doc/management.txt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/doc/management.txt b/doc/management.txt
index eef05b0fc..9fd7e6c03 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -1127,6 +1127,10 @@ Here is the list of static fields using the proxy statistics domain:
  92. rtime_max [..BS]: the maximum observed response time in ms (0 for TCP)
  93. ttime_max [..BS]: the maximum observed total session time in ms
  94. eint [LFBS]: cumulative number of internal errors
+ 95. idle_conn_cur [...S]: current number of unsafe idle connections
+ 96. safe_conn_cur [...S]: current number of safe idle connections
+ 97. used_conn_cur [...S]: current number of connections in use
+ 98. need_conn_est [...S]: estimated needed number of connections
 
 For all other statistics domains, the presence or the order of the fields are
 not guaranteed. In this case, the header line should always be used to parse
-- 
2.28.0
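
A quick way to eyeball the four new fields from the CLI; a sketch assuming the
stats socket sits at /var/lib/haproxy/stats and that the 0-based field numbers
above map to 1-based cut columns (95-98 -> 96-99):

$ echo "show stat" | sudo socat stdio /var/lib/haproxy/stats \
    | cut -d, -f1,2,96-99 | column -s, -t | head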




RE: Logging using %HP (path) produces different results with H1 and H2

2020-08-25 Thread Pierre Cheynier
Hi Willy,

On Tue, Aug 25, 2020 at 14:53:05PM +0200, Willy Tarreau wrote:

> Thus an HTTP/2 request effectively "looks like" an HTTP/1 request using
> an absolute URI. What causes the mess in the logs is that such HTTP/1
> requests are rarely used (most only for proxies), but they are perfectly
> valid and given that they are often used to take routing decisions, it's
> mandatory that they are part of the logs. For example if you decide that
> every *url* starting with "/img" has to be routed to the static server
> and the rest to the application, you're forgetting that "https://foo/img/"
> is valid as well and will be routed to the application. That's why I do
> not want to report fake or reconstructed information in the logs.
>
> In 1.8, what happened when we introduced H2 is that it was directly turned
> into HTTP/1.1 before being processed and that given that we didn't support
> server-side H2, the most seamless way to handle it was to just replace
> everything with origin requests (no authority). That remained till 2.0
> since it was not really acceptable to imagine that depending on whether
> you enabled HTX or not you'd get different logs for the exact same request.
> But now we cannot cheat anymore, it had caused too much trouble already

I clearly understood the problem is more complex than it seems at first,
due to the protocol and internal representation changes that occurred
recently.

> What I understand however is that it's possible that we need to rethink
> what we're logging. Maybe instead of logging the URI by default (and missing
> the Host in HTTP/1) we ought to instead log the scheme, authority and path
> parts. These are not always there (scheme or authority in H1) but we can
> log an empty field like "-" in this case.

Clearly that was my point. Especially when you manipulate "high-level variables"
such as %HP %HQ %HU and so on, you probably expect the hard work to be done
for you.

> We cannot realistically do that in the default "httplog" format, but we
> can imagine a new default format which would report such info (htxlog?),
> or maybe renaming httplog to http-legacy-log and changing the httplog's
> default. We should then consider this opportunity to revisit certain
> fields that do not make sense anymore, like the "~" in front of the
> frontend's name for SSL, the various timers that need to report idle
> and probably user-facing time instead of just data, etc.

I'm +1 on this as a tradeoff (even though the ~ in front of frontends is already
handled by using %f vs. %ft; I understand it's more a matter of leveraging this
change in order to remove tech debt).

> So I think it's the right place to open such a discussion (what we should
> log and whether or not it loses info by default or requires to duplicate
> some data while waiting for the response), so that we can reach a better
> and more modern solution. I'm open to proposals.

That's wider than I initially thought :)

--
Pierre


RE: Logging using %HP (path) produces different results with H1 and H2

2020-08-24 Thread Pierre Cheynier
On Fri, Aug 21, 2020 at 8:11 PM William Dauchy  wrote:

So awesome to get the first response from your direct colleague :)

> I believe this is expected; this behaviour has changed since v2.1 though.

Indeed, we haven't used this logging variable for a long time, so I'm not really
able to confirm whether this is new.
Anyway, I understand this is related to handling h2 and its specifics; still, I
think there should be something to fix (one way or the other) to get back to
a consistent/deterministic meaning of %HP (and maybe in other places where this
had an impact).

Willy, any thought about this?

Pierre


Logging using %HP (path) produces different results with H1 and H2

2020-08-21 Thread Pierre Cheynier
Hi list,

We're running HAProxy 2.2.2.
It turns out that logging request paths using the "%HP" variable produces different
results on H1 vs. H2.

H1: /path
H2: https://hostname.domain/path (< I consider this one buggy)

No idea where this comes from exactly; I essentially understand that the txn->uri
structure ends up being completely different between the two code paths.
Anybody here with lightspeed knowledge of src/h* to investigate this?
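
For anyone trying to reproduce, the difference shows up by simply forcing the
protocol on the client side (the URL is a placeholder):

$ curl -sS --http1.1 https://hostname.domain/path -o /dev/null   # %HP logged as /path
$ curl -sS --http2   https://hostname.domain/path -o /dev/null   # %HP logged as https://hostname.domain/path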

Best,

--
Pierre


[PATCH] CLEANUP: contrib/prometheus-exporter: typo fixes for ssl reuse metric

2020-07-07 Thread Pierre Cheynier
A typo I identified while having a look at our metric inventory.

---
 contrib/prometheus-exporter/README   | 2 +-
 contrib/prometheus-exporter/service-prometheus.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/contrib/prometheus-exporter/README b/contrib/prometheus-exporter/README
index 1c5a99241..a1b9e269c 100644
--- a/contrib/prometheus-exporter/README
+++ b/contrib/prometheus-exporter/README
@@ -122,7 +122,7 @@ Exported metrics
 | haproxy_process_max_ssl_rate   | Maximum observed number of SSL sessions per second.   |
 | haproxy_process_current_frontend_ssl_key_rate  | Current frontend SSL Key computation per second over last elapsed second. |
 | haproxy_process_max_frontend_ssl_key_rate  | Maximum observed frontend SSL Key computation per second. |
-| haproxy_process_frontent_ssl_reuse | SSL session reuse ratio (percent).|
+| haproxy_process_frontend_ssl_reuse | SSL session reuse ratio (percent).|
 | haproxy_process_current_backend_ssl_key_rate   | Current backend SSL Key computation per second over last elapsed second.  |
 | haproxy_process_max_backend_ssl_key_rate   | Maximum observed backend SSL Key computation per second.  |
 | haproxy_process_ssl_cache_lookups_total| Total number of SSL session cache lookups.|
diff --git a/contrib/prometheus-exporter/service-prometheus.c b/contrib/prometheus-exporter/service-prometheus.c
index 952558c70..009e817ae 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -485,7 +485,7 @@ const struct ist promex_inf_metric_names[INF_TOTAL_FIELDS] = {
 	[INF_MAX_SSL_RATE]   = IST("max_ssl_rate"),
 	[INF_SSL_FRONTEND_KEY_RATE]  = IST("current_frontend_ssl_key_rate"),
 	[INF_SSL_FRONTEND_MAX_KEY_RATE]  = IST("max_frontend_ssl_key_rate"),
-	[INF_SSL_FRONTEND_SESSION_REUSE_PCT] = IST("frontent_ssl_reuse"),
+	[INF_SSL_FRONTEND_SESSION_REUSE_PCT] = IST("frontend_ssl_reuse"),
 	[INF_SSL_BACKEND_KEY_RATE]   = IST("current_backend_ssl_key_rate"),
 	[INF_SSL_BACKEND_MAX_KEY_RATE]   = IST("max_backend_ssl_key_rate"),
 	[INF_SSL_CACHE_LOOKUPS]  = IST("ssl_cache_lookups_total"),
-- 
2.27.0
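
A quick sanity check that the renamed metric is the one exposed after this
patch; the exporter address and port are assumptions:

$ curl -s http://127.0.0.1:8404/metrics | grep -E 'haproxy_process_front(end|ent)_ssl_reuse'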




RE: native prometheus exporter: retrieving check_status

2019-11-20 Thread Pierre Cheynier
>> My only fear for this point would be to make the code too complicated
>> and harder to maintain.
>>
>
> And slow down the exporter execution. Moreover, everyone will have a 
> different 
> opinion on how to aggregate the stats. My first idea was to sum all servers 
> counters. But Pierre's reply shown me that it's not what he expects.

I agree it's probably too complex and opinionated. Let's see how it goes with
server aggregation only, done on the Prometheus side, since it's a server-related
field initially.
If we identify issues/bottlenecks with output size we'll reopen this thread.

-- 
Pierre


RE: native prometheus exporter: retrieving check_status

2019-11-20 Thread Pierre Cheynier
> Ok, so it is a new kind of metric. I mean, not exposed by HAProxy. It would 
> require an extra loop on all servers for each backend. It is probably doable 
> for 
> the check_status. For the code, I don't know. Because it is not exclusive to 
> HTTP checks. it is also used for SMTP and LDAP checks. At the end, I think a 
> better idea would be to have a way to get specifics metrics in each scope and 
> let Prometheus handling the aggregation. This way, everyone is free to choose 
> how to proceed while limiting the number of metrics exported.

Fair enough; as stated on the other thread with William, we'll see how it goes
doing it this way. If we have issues related to output size we'll start a new
discussion.

Thanks!

-- 
Pierre




RE: native prometheus exporter: retrieving check_status

2019-11-19 Thread Pierre Cheynier
> Hi Pierre,

Hi!,

> I addressed this issue based on a William's idea. I also proposed to add a 
> filter to exclude all servers in maintenance from the export. Let me know if 
> you 
> see a better way to do so. For the moment, from the exporter point of view, 
> it 
> is not really hard to do such filtering.

Yes, that's a great addition and should improve things a lot, but I'm still not
sure it will be sufficient (meaning that we could end up dropping all server
metrics if the endpoint output is still too huge, as we used to do with the old
exporter).

BTW, we also did that since we're using server-templates, and the naming in
templates makes the server-name information useless (since we can't modify the
name at runtime).
So we previously had a sufficient level of info at backend level, thanks to
native aggregations.

>> [ST_F_CHECK_STATUS]   = IST("untyped"),
>> What could be done to be able to retrieve them? (I thought about something 
>> similar to 
>> `HRSP_[1-5]XX`, where the different check status could be defined and 
>> counted).
>> 
>
> Hum, I can add the check status. Mapping all status on integers is possible. 
> However, having a metric per status is probably not the right solution, 
> because 
> it is not a counter but just a state (a boolean). If we do so, all status 
> would 
> be set to 0 except the current status. It is not really handy. But a mapping 
> is 
> possible. We already do this for the frontend/backend/server status 
> (ST_F_STATUS).

Yes, it would work perfectly. In the end, the goal for us would be to be able to
retrieve this state. My idea about a counter was more about backend-level
aggregations, if consistent (I'm not sure it is actually, hence the feedback
request).

>> * also for `check_status`, there is the case of L7STS and its associated 
>> values that are present
>> in another field. Most probably it could benefit from a better 
>> representation in a prometheus
>> output (thanks to labels)?
>>
> We can also export the metrics ST_F_CHECK_CODE. For the use of labels, I have 
> no 
> idea. For now, the labels are static in the exporter. And I don't know if it 
> is 
> pertinent to add dynamic info in labels. If so, what is your idea ? Add a 
> "code" 
> label associated to the check_status metric ?

Here again, my maybe-not-so-good idea was to keep the ability to retrieve all the
underlying details at backend level, such as:
* 100 servers are L7OK
* 1 server is L4TOUT
* 2 servers are L4CON
* 2 servers are L7STS
** 1 due to an HTTP 429
** 1 due to an HTTP 503

But this is maybe overkill in terms of complexity; we could maybe push more on
our ability to retrieve the status of non-MAINT servers.

> It is feasible. But only counters may be aggregated. It may be enabled using 
> a 
> parameter in the query-string. However, it is probably pertinent only when 
> the 
> server metrics are filtered out. Because otherwise, Prometheus can handle the 
> aggregation itself.

Sure, we should rely on this as much as possible.
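
As an illustration of the "let Prometheus aggregate" approach, a sketch of the
kind of query that would rebuild the per-backend view once check_status is
exposed with a status label; the label name and the Prometheus address are
assumptions:

$ curl -sG 'http://127.0.0.1:9090/api/v1/query' \
    --data-urlencode 'query=count by (proxy, state) (haproxy_server_check_status == 1)'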

--
Pierre


native prometheus exporter: retrieving check_status

2019-11-15 Thread Pierre Cheynier
Hi list,

We've recently tried to switch to the native prometheus exporter, but were
quickly stopped in our initiative given the output on one of our preprod servers:
$ wc -l metrics.out 
1478543 metrics.out
$ ls -lh metrics.out 
-rw-r--r-- 1 pierre pierre 130M nov.  15 15:33 metrics.out

This is not only due to a large setup, but essentially related to server lines,
since we extensively use server-templates for server addition/deletion at
runtime.

# backend & servers number
$ echo "show stat -1 2 -1" | sudo socat stdio /var/lib/haproxy/stats | wc -l
1309
$ echo "show stat -1 4 -1" | sudo socat stdio /var/lib/haproxy/stats | wc -l
36360
# But a lot of them are actually "waiting to be provisioned" (especially on this preprod environment)
$ echo "show stat -1 4 -1" | sudo socat stdio /var/lib/haproxy/stats | grep MAINT | wc -l
34113

We'll filter out the server metrics as a quick fix, and will hopefully submit
something to do it natively, but we would also like to get your feedback about
some use-cases we expected to solve with this native exporter.

Ultimately, one of them would be a great added value for us: being able to
count check_status types (and their values in the L7STS case) per backend.

So, there are 3 associated points:
* it's great to have new metrics (such as 
`haproxy_process_current_zlib_memory`), but we also noticed that some very 
useful ones were not present due to their type, example:
[ST_F_CHECK_STATUS]   = IST("untyped"),
What could be done to be able to retrieve them? (I thought about something 
similar to `HRSP_[1-5]XX`, where the different check status could be defined 
and counted).

* also for `check_status`, there is the case of L7STS and its associated values 
that are present in another field. Most probably it could benefit from a better 
representation in a prometheus output (thanks to labels)?

* what about getting some backend-level aggregation of server metrics, such as 
the one that was previously mentioned, to avoid retrieving all the server 
metrics but still be able to get some insights?
I'm thinking about an aggregation of some fields at backend level, which was 
not previously done with the CSV output.
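
For context, a couple of throwaway commands to see how much of that output
comes from server metrics (metrics.out is the dump from above; haproxy_server_
is the prefix used by the exporter for the server scope):

$ grep -c '^haproxy_server_' metrics.out
$ grep -v '^#' metrics.out | cut -d'{' -f1 | awk '{print $1}' | sort | uniq -c | sort -rn | head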

Thanks for your feedback,

Pierre


[PATCH] ssl: ability to set TLS 1.3 ciphers using ssl-default-server-ciphersuites

2019-03-21 Thread Pierre Cheynier
Any attempt to put TLS 1.3 ciphers on servers failed with output 'unable
to set TLS 1.3 cipher suites'.

This was due to usage of SSL_CTX_set_cipher_list instead of
SSL_CTX_set_ciphersuites in the TLS 1.3 block (protected by
OPENSSL_VERSION_NUMBER >= 0x10101000L and so on).

Signed-off-by: Pierre Cheynier 
Reported-by: Damien Claisse 
---
 src/ssl_sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 138b1c58c..47548edc1 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -4785,7 +4785,7 @@ int ssl_sock_prepare_srv_ctx(struct server *srv)
 
 #if (OPENSSL_VERSION_NUMBER >= 0x10101000L && !defined OPENSSL_IS_BORINGSSL && !defined LIBRESSL_VERSION_NUMBER)
 	if (srv->ssl_ctx.ciphersuites &&
-	    !SSL_CTX_set_cipher_list(srv->ssl_ctx.ctx, srv->ssl_ctx.ciphersuites)) {
+	    !SSL_CTX_set_ciphersuites(srv->ssl_ctx.ctx, srv->ssl_ctx.ciphersuites)) {
 		ha_alert("Proxy '%s', server '%s' [%s:%d] : unable to set TLS 1.3 cipher suites to '%s'.\n",
 			 curproxy->id, srv->id,
 			 srv->conf.file, srv->conf.line, srv->ssl_ctx.ciphersuites);
-- 
2.20.1
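
A quick way to exercise the fixed code path with just a configuration check; a
sketch with placeholder addresses and a standard TLS 1.3 suite string (before
this patch the startup alert was "unable to set TLS 1.3 cipher suites"; whether
it shows up at -c time or only at actual startup may depend on the version):

$ cat > /tmp/tls13-srv.cfg <<'EOF'
global
    ssl-default-server-ciphersuites TLS_AES_256_GCM_SHA384:TLS_AES_128_GCM_SHA256

defaults
    mode http
    timeout connect 5s
    timeout client 30s
    timeout server 30s

listen l_tls13
    bind 127.0.0.1:8080
    server srv0 192.0.2.1:443 ssl verify none
EOF
$ haproxy -c -f /tmp/tls13-srv.cfg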


RE: faster than load-server-state-from-file?

2018-10-03 Thread Pierre Cheynier
Hi Willy,

> Not really. Maybe we should see how the state file parser works, because
> multiple seconds to parse only 30K lines seems extremely long.

I would even say multiple minutes :)

> I'm just thinking about a few things. Probably that among these 30K servers,
> most of them are in fact tracking other ones ? In this case it could make
> sense to have an option to only dump servers which are not tracking
> others, as for a reload it can make quite some sense. Is this the case
> for you ?

What do you mean by "tracking other ones"?

What I can tell is that, for historical reasons, we named all servers the same
way in each backend (i.e. srvN) in the configuration template, and are using
"server templates" to add MAINT servers to the pool so that they can be added
at runtime later.

This naming thing can be changed now, but I don't know whether this issue could
be related or not.

What we're basically doing when getting a new event:
* if it requires deleting / updating / adding server(s) in one or multiple pools,
we only use the runtime API and try to reuse free slots (see the sketch after the
template below);
* if a backend/frontend has to be created / updated / deleted, OR if the free
slots for a given backend are exhausted, we reload using a configuration template;
* in Jinja2 this template looks like (simplified):

backend be_foo
  {%- for server in servers %}
  server srv{{loop.index0}} {{server.address}}:{{server.port}} weight {{server.weight}}{%- if server.tls %} ssl{%- endif %} check port 8500
  {%- endfor %}
  # Create 25 free slots, servers are numbered from N to N+25
  server-template srv {{ servers|length }}-{{ servers|length + 25 }} 0.0.0.0:0 check disabled
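
The runtime-API side of that workflow is essentially the following (the slot
number and address are placeholders; be_foo comes from the template above):

# fill a free template slot, then put it back into rotation
$ echo "set server be_foo/srv27 addr 192.0.2.10 port 8080" | socat stdio /var/lib/haproxy/stats
$ echo "set server be_foo/srv27 state ready" | socat stdio /var/lib/haproxy/stats
# and the reverse operation when an instance goes away
$ echo "set server be_foo/srv27 state maint" | socat stdio /var/lib/haproxy/stats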

Doing this I noticed that we have a lot of 'bad reconciliations' triggering 
warning logs, such as:

[WARNING] can't find server 'srv28' with id '29' in backend with id '9' or name 'be_test'
[WARNING] backend name mismatch: from server state file: 'be_foo', from running config 'be_bar'

I don't know if these inconsistencies (that clearly have to be fixed) can cause 
additional delays.

Thanks,

Pierre


RE: h2 + text/event-stream: closed on both sides by FIN/ACK?

2018-09-24 Thread Pierre Cheynier
> Hi Pierre,
Hi Willy, 

> The close on the server side is expected, that's a limitation of the current
> design that we're addressing for 1.9 and which is much harder than initially
>expected. The reason is that streams are independent in H2 while in H1 the
> same stream remains idle and recycled for a new request, allowing us to keep
> the server-side connection alive. Thus in H2 we can't benefit from the
> keep-alive mechanisms we have in H1. But we're currently working on
> addressing this. As a side effect, it should end up considerably simplifying
> the H1 code as well, but for now it's a nightmare, too many changes at once...

OK, I conclude this SSE pattern is not working out-of-the-box when using h2 as
of now. Is it still true even if the user sets the proper connection headers on
the server side?

Thanks,

Pierre


RE: h2 + text/event-stream: closed on both sides by FIN/ACK?

2018-09-24 Thread Pierre Cheynier
> You'll notice that in the HTTP/2 case, the stream is closed as you mentioned
> (DATA len=0 + ES=1) then HAProxy immediately send FIN-ACK to the server.
> Same for the client just after it forwarded the headers. It never wait for 
> any 
> SSE frame.

EDIT: in fact, analyzing my capture, I see that my workstation (curl) may be the
originator, since it sends a close at TLS level (the close-notify).

$ curl --version
curl 7.61.0 (x86_64-pc-linux-gnu) libcurl/7.61.0 OpenSSL/1.1.0h zlib/1.2.11 
libidn2/2.0.5 libpsl/0.20.2 (+libidn2/2.0.4) libssh2/1.8.0 nghttp2/1.32.0 
librtmp/2.3
Release-Date: 2018-07-11
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 
pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp 
Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL 
libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy PSL 

Curl or haproxy issue? What do you think?

Pierre


faster than load-server-state-from-file?

2018-09-21 Thread Pierre Cheynier
I'm extensively using server-templates to avoid reloading too much but still, 
backend creation or deletion has to be done by reloading as far as I know. In 
my specific context, it can happen every 5/10s or so.
As a consequence, I have a lot of servers in the server-state file (>30K lines).

Trying to use load-server-state-from-file to prevent sending traffic to KO
servers and to restore stats numbers, I feel that it slows down the reload a lot
(multiple seconds).

Any known hint or alternative?
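
For context, the sequence in question is the usual dump-then-reload dance
(paths are illustrative, and the config points to the dump through
server-state-file / load-server-state-from-file):

$ echo "show servers state" | socat stdio /var/lib/haproxy/stats > /etc/haproxy/state/global
$ systemctl reload haproxy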

Thanks,

Pierre Cheynier


h2 + text/event-stream: closed on both sides by FIN/ACK?

2018-09-21 Thread Pierre Cheynier
Hi list,

We observed a weird behavior yesterday at introducing h2 in a preproduction 
environment: *the connection is being closed by haproxy both on server and 
client side by immediately sending a FIN/ACK when using SSE 
(text/event-stream)*.

Let me know if you see something obvious here, or if this is candidate to a bug.

We have a service using SSE through text/event-stream content-type.

In HTTP/1.1 we have a normal stream as expected :
< HTTP/1.1 200 OK
< Content-Type: text/event-stream
data: {"a": "b"}

data: {"a": "b"}

data: {"a": "b"}
(...)

HAProxy on its side adds the `Connection: close` header.

When adding 'alpn h2,http/1.1' to the bind directive, we observe the following:
after the first 200 OK, the connection is closed by haproxy both on server and
client side by sending a FIN/ACK.

It's obviously the same pattern as above on the LB<>backend side, since there is
a translation from h2 to http/1.1. On the client side it gives:

$ curl -vv (...)
(...)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: none
(...)
* ALPN, server accepted to use h2
* Server certificate:
(...)
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55d5e9228de0)
> GET /something HTTP/2
> Host: 
> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:62.0) Gecko/20100101 
> Firefox/62.0
> Accept: text/event-stream
> Accept-Language: en-US,en;q=0.5
> Accept-Encoding: gzip, deflate, br
> Referer:  Cookie: jwt=
> Connection: keep-alive
> Pragma: no-cache
> Cache-Control: no-cache
> 
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
< HTTP/2 200 
< content-type: text/event-stream
< 
* Connection #0 to host  left intact

So the connection is abruptly closed.
Here is the config:

$ haproxy -vv
HA-Proxy version 1.8.14-52e4d43 2018/09/20
Copyright 2000-2018 Willy Tarreau 

Build options :
  TARGET  = linux2628
  CPU = generic
  CC  = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv 
-fno-strict-overflow -Wno-unused-label -DTCP_USER_TIMEOUT=18
  OPTIONS = USE_LINUX_TPROXY=1 USE_GETADDRINFO=1 USE_ZLIB=1 USE_REGPARM=1 
USE_OPENSSL=1 USE_SYSTEMD=1 USE_PCRE=1 USE_PCRE_JIT=1 USE_TFO=1 USE_NS=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.1.1  11 Sep 2018
Running on OpenSSL version : OpenSSL 1.1.1  11 Sep 2018
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT 
IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.32 2012-11-30
Running on PCRE version : 8.32 2012-11-30
PCRE library supports JIT : yes
Built with zlib version : 1.2.7
Running on zlib version : 1.2.7
Compression algorithms supported : identity("identity"), deflate("deflate"), 
raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
  epoll : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace

$ sudo cat /etc/haproxy/haproxy.cfg | head -70
global
 (...)
 nbproc 1
 daemon
 stats socket /var/lib/haproxy/stats level admin mode 644 expose-fd 
listeners
 stats timeout 2m
 tune.bufsize 33792
 ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets
 (...)
 hard-stop-after 5400s
 nbthread 6
 cpu-map auto:1/1-6 0-5

defaults
 mode http
 (...)
 timeout connect 10s
 timeout client 180s
 timeout server 180s
 timeout http-keep-alive 10s
 timeout http-request 10s
 timeout queue 1s
 timeout check 5s
 (...)
 option http-keep-alive
 option forwardfor except 127.0.0.0/8
 balance roundrobin
 maxconn 262134
 http-reuse safe
(...)

frontend fe_main
bind *:80 name http_1 process 1/1
bind *:80 name http_2 process 1/2
bind *:80 name http_3 process 1/3
bind *:443 name https_4 ssl crt /etc/haproxy/tls/fe_main process 1/4 alpn http/1.1,h2
bind *:443 name https_5 ssl crt /etc/haproxy/tls/fe_main process 1/5 alpn http/1.1,h2
bind *:443 name https_6 ssl crt /etc/haproxy/tls/fe_main process 1/6 alpn http/1.1,h2
(...)
# Nothing specific in the backend (no override of the aforementioned settings).

Any idea?
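
For what it's worth, the quickest way to flip between the two behaviours is to
force the protocol on the curl side (-N disables curl's buffering so events
show up as they arrive; the URL is a placeholder):

$ curl -sN --http1.1 https://myservice.example/something   # events keep flowing
$ curl -sN --http2   https://myservice.example/something   # closed right after the response headers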

Best regards,

Pierre Cheynier


No enabled listener found and reloads triggered an inconsistent state.

2018-04-04 Thread Pierre Cheynier
Hi there,

We had an issue recently, using 1.8.5. For some reason we ended up entering the
"No enabled listener found" state (I guess the config file was incomplete, being
written at that time, something like that).

Here are the logs:

Apr 03 17:51:49 hostname systemd[1]: Reloaded HAProxy Load Balancer.
Apr 03 17:54:22 hostname haproxy[27090]: [WARNING] 092/175149 (27090) : 
Reexecuting Master process
Apr 03 17:54:22 hostname haproxy[27090]: [ALERT] 092/175422 (27090) : 
[/usr/sbin/haproxy.main()] No enabled listener found (check for 'b
Apr 03 17:54:22 hostname haproxy[27090]: [WARNING] 092/175422 (27090) : 
Reexecuting Master process in waitpid mode
Apr 03 17:54:22 hostname haproxy[27090]: [WARNING] 092/175422 (27090) : 
Reexecuting Master process
Apr 03 17:54:22 hostname systemd[1]: Reloaded HAProxy Load Balancer.

Subsequent reloads were OK.

The issue here is that it leaves HAProxy in an inconsistent state: connections
are almost always in timeout, including the one on the Unix socket, the
healthcheck endpoint set using monitor-uri, etc., until a real stop/start is
done (meaning the parent/worker itself is restarted).
Doing reloads doesn't fix it.

Any hints?
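
A trivial guard against reloading on a half-written config is to validate it
first, which is what the ExecReload pre-check shown elsewhere in this archive
does; paths are illustrative:

$ haproxy -c -q -f /etc/haproxy/haproxy.cfg && systemctl reload haproxy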

Thanks,

Pierre


Re: mworker: seamless reloads broken since 1.8.1

2018-01-24 Thread Pierre Cheynier
On 23/01/2018 19:29, Willy Tarreau wrote:
> Pierre, please give a try to the latest 1.8 branch or the next nightly
> snapshot tomorrow morning. It addresses the aforementionned issue, and
> I hope it's the same you're facing.
>
> Cheers,
> Willy
Willy, I confirm that it works well again running the following version:

$ haproxy -v
HA-Proxy version 1.8.3-945f4cf 2018/01/23

Added nbthread again, reloads are transparent.

Thanks,

Pierre





Different health-check URI per server : how would you do that ?

2018-01-24 Thread Pierre Cheynier
Hi,

We have a use-case in which the health-check URI depends on the
server name (be reassured, only the health-check :) ).

It would be something like:

backend be_test
   mode http
   [...]
   option httpchk get /check HTTP/1.1\r\nHost: test.tld
   default-server inter 3s fall 3 rise 2
   server srv01 10.2.3.4:31432 check port 10345
   server srv02 10.2.3.5:32498 check port 18452

Except that in this case it will use `/check` on every server; I would like it to
be something like /check/srv01 or something highly configurable.

How would you do that? LUA, external checks (will probably cause trouble
regarding perfs), anything else? Is it worth thinking about this (pretty
rare) use-case in HAProxy itself?
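
For the external-check option, a sketch of what the script could look like;
HAPROXY_SERVER_NAME/ADDR/PORT are the environment variables HAProxy passes to
external checks, while the /check/<name> path and Host value are only
illustrative:

#!/bin/sh
# external-check command script: probe a per-server health URI,
# exit 0 (success) only if the request succeeds
exec curl -sf -o /dev/null \
  -H 'Host: test.tld' \
  "http://${HAPROXY_SERVER_ADDR}:${HAPROXY_SERVER_PORT}/check/${HAPROXY_SERVER_NAME}"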

Thanks in advance,

Pierre





Re: mworker: seamless reloads broken since 1.8.1

2018-01-17 Thread Pierre Cheynier
Hi,

On 08/01/2018 14:32, Pierre Cheynier wrote:
> I retried this morning, I confirm that on 1.8.3, using
(...)
> I get RSTs (not seamless reloads) when I introduce the global/nbthread
> X, after a systemctl haproxy restart.

Any news on that ?

I saw one mworker commit ("execvp failure depending on argv[0]") but I
guess it's completely independent.

Thanks,

Pierre





Re: Warnings when using dynamic cookies and server-template

2018-01-17 Thread Pierre Cheynier
On 17/01/2018 15:56, Olivier Houchard wrote:
>
>> So, as a conclusion, I'm just not sure that producing this warning is
>> relevant in case the IP is duplicated for several servers *if they are
>> disabled*...
> Or maybe we should just advocate using 0.0.0.0 when we mean "no IP" :)
Not sure about that, 0.0.0.0 is not valid in the config in this case but
it is in some others (thinking about "bind" for ex.)

Advertising it can be a bit confusing.

> It would be a bit painful, though doable, to don't check if the server is
> disabled, but to add the check when server is enabled. I don't know if
> it's worth it.
>

To me this would be the best (functionally speaking, I don't know about
the performance aspects :) ), since in the context of cookies, you're
probably doing it wrong if you enable 2 servers with the same IP.

But server-templates are a good use-case in which you want to be able to
do what you want with disabled servers.

To be discussed with people on your side? We can clearly live with both
options.

Pierre




Re: Warnings when using dynamic cookies and server-template

2018-01-17 Thread Pierre Cheynier
Hi,

On 16/01/2018 18:48, Olivier Houchard wrote:
>
> Not really :) That's not a case I thought of.
> The attached patch disables the generation of the dynamic cookie if the IP
> is 0.0.0.0 or ::, so that it only gets generated when the server gets a real
> IP. Is it OK with you ?
I'm not sure this will fix the issue.
In this blogpost
https://www.haproxy.com/fr/blog/dynamic-scaling-for-microservices-with-runtime-api/,
the example given is

server-template websrv 1-100 192.168.122.1:8080 check disabled

Which in fact will do exactly the same as one hundred times

server websrvN 192.168.122.1:8080 check disabled

right ?
(I just confirmed that BTW, by switching my 'high-level' templating to
HAProxy's one, and I reproduce the issue).

So, as a conclusion, I'm just not sure that producing this warning is
relevant in case the IP is duplicated for several servers *if they are
disabled*...

Pierre


Re: Warnings when using dynamic cookies and server-template

2018-01-16 Thread Pierre Cheynier
Hi Olivier,


On 16/01/2018 15:43, Olivier Houchard wrote:
> I'm not so sure about this.
> It won't be checked again when server are enabled, so you won't get the
> warning if it's still the case.
> You shouldn't get those warnings unless multiple servers have the same IP,
> though. What does your server-template look like ?
In fact, it's a confusion between 3 levels of templating...
In the end it's not a server template in HAProxy, but raw servers, in a
disabled state:

backend be_service
    server srv0 10.0.0.2:1234
    server srv1 0.0.0.0:0 check disabled
    server srv2 0.0.0.0:0 check disabled
    server srv3 0.0.0.0:0 check disabled
    server srv4 0.0.0.0:0 check disabled

So I guess this warning was intended to cover this case, right ?

Pierre




Warnings when using dynamic cookies and server-template

2018-01-15 Thread Pierre Cheynier
Hello,

We started to use the server-template approach in which you basically
provision servers in backends using a "check disabled" state, then
re-enable them using the Runtime API.

I recently noticed that when used with dynamic cookies, we end up
getting these warnings:

haproxy.c:149

    ha_warning("We generated two equal cookies for
two different servers.\n"
   "Please change the secret key for
'%s'.\n",
   s->proxy->id);

The check seems consistent, but not in the case of disabled servers. What
do you think?

Pierre





Re: Cache & ACLs issue

2018-01-15 Thread Pierre Cheynier
Small update on this one: using a "path_beg" ACL it seems to work...

So, I guess there is something wrong potentially outside of cache.c when
using path_end, but affecting the cache in some way.

I also noticed some suspicious entries in the cache, like hash=0, but
I'm not able to qualify when it happens.

Regards,

Pierre


On 09/01/2018 19:37, Pierre Cheynier wrote:
> I'm experimenting the small objects cache feature in 1.8, maybe I'm
> doing something obviously wrong, but I don't get what...
>
> Here is my setup:
>
> (...)
>
> cache static_assets
>  total-max-size 100
>  max-age 60
>
> (...)
>
> frontend fe_main # HTTP(S) Service
>     bind *:80 name http
>
>     acl cached_service-acl hdr_dom(host) -i cached_service.localdomain
>     use_backend be_cached_service if cached_service
>
> backend be_cached_service
>     acl static_cached_paths path_end -i my/resource/path
>     http-request cache-use static_assets if static_cached_paths
>     http-response cache-store static_assets
>     server srv0 127.0.0.1:8000
>
> In that case I can request on /my/resource/path and I'll have something
> stored in the cache:
>
> $ $ curl -v -L http://127.0.0.1/my/resource/path -H "Host:
> cached_service.localdomain"
> (...)
> < HTTP/1.1 200 OK
> < Content-Type: application/json
> < Date: Tue, 09 Jan 2018 18:32:05 GMT
> < Content-Length: 14
> <
> [
>     "tmp"
> ]
>
> $ echo "show cache static_assets" | sudo socat stdio /var/lib/haproxy/stats
> 0x7fbfa94df03a: static_assets (shctx:0x7fbfa94df000, available
> blocks:102400)
> 0x7fbfa94e11ac hash:3952565486 size:190 (1 blocks), refcount:0, expire:54
>
> But, if I request something else, it currently override the first cached
> asset and will be served then...
>
> $ curl -v -L http://127.0.0.1/my/other/resource/path -H "Host:
> cached_service.localdomain"
> (...)
>
> < HTTP/1.1 200 OK
> < Content-Type: application/json
> < X-Consul-Index: 77
> < X-Consul-Knownleader: true
> < X-Consul-Lastcontact: 0
> < Date: Tue, 09 Jan 2018 18:33:30 GMT
> < Content-Length: 547
> <
> {    "something_more_verbose(...)"   }
>
> $ echo "show cache" | sudo socat stdio /var/lib/haproxy/stats
> 0x7f4c5ea5a03a: static_assets (shctx:0x7f4c5ea5a000, available
> blocks:102400)
> 0x7f4c5ea5a4cc hash:3952565486 size:797 (1 blocks), refcount:0, expire:55
>
> The entry has been flushed and replaced by the new one, independently
> from the expiration state.
>
> In that case it's consul that answer, so it explains these X-Consul
> headers for the 2nd response.
>
> Does it ring a bell to someone ?
>
> Thanks,
>
> Pierre
>
>






Cache & ACLs issue

2018-01-09 Thread Pierre Cheynier
I'm experimenting with the small objects cache feature in 1.8; maybe I'm
doing something obviously wrong, but I don't get what...

Here is my setup:

(...)

cache static_assets
 total-max-size 100
 max-age 60

(...)

frontend fe_main # HTTP(S) Service
    bind *:80 name http

    acl cached_service-acl hdr_dom(host) -i cached_service.localdomain
    use_backend be_cached_service if cached_service

backend be_cached_service
    acl static_cached_paths path_end -i my/resource/path
    http-request cache-use static_assets if static_cached_paths
    http-response cache-store static_assets
    server srv0 127.0.0.1:8000

In that case I can request on /my/resource/path and I'll have something
stored in the cache:

$ curl -v -L http://127.0.0.1/my/resource/path -H "Host:
cached_service.localdomain"
(...)
< HTTP/1.1 200 OK
< Content-Type: application/json
< Date: Tue, 09 Jan 2018 18:32:05 GMT
< Content-Length: 14
<
[
    "tmp"
]

$ echo "show cache static_assets" | sudo socat stdio /var/lib/haproxy/stats
0x7fbfa94df03a: static_assets (shctx:0x7fbfa94df000, available
blocks:102400)
0x7fbfa94e11ac hash:3952565486 size:190 (1 blocks), refcount:0, expire:54

But, if I request something else, it currently overrides the first cached
asset and will be served then...

$ curl -v -L http://127.0.0.1/my/other/resource/path -H "Host:
cached_service.localdomain"
(...)

< HTTP/1.1 200 OK
< Content-Type: application/json
< X-Consul-Index: 77
< X-Consul-Knownleader: true
< X-Consul-Lastcontact: 0
< Date: Tue, 09 Jan 2018 18:33:30 GMT
< Content-Length: 547
<
{    "something_more_verbose(...)"   }

$ echo "show cache" | sudo socat stdio /var/lib/haproxy/stats
0x7f4c5ea5a03a: static_assets (shctx:0x7f4c5ea5a000, available
blocks:102400)
0x7f4c5ea5a4cc hash:3952565486 size:797 (1 blocks), refcount:0, expire:55

The entry has been flushed and replaced by the new one, independently
from the expiration state.

In that case it's consul that answers, which explains these X-Consul
headers for the 2nd response.

Does it ring a bell to someone ?

Thanks,

Pierre






Re: mworker: seamless reloads broken since 1.8.1

2018-01-08 Thread Pierre Cheynier
Hi,

On 08/01/2018 10:24, Lukas Tribus wrote:
>
> FYI there is a report on discourse mentioning this problem, and the
> poster appears to be able to reproduce the problem without nbthread
> paramter as well:
>
> https://discourse.haproxy.org/t/seamless-reloads-dont-work-with-systemd/1954
>
>
> Lukas
I retried this morning, I confirm that on 1.8.3, using

$ haproxy -vv
HA-Proxy version 1.8.3-205f675 2017/12/30
Copyright 2000-2017 Willy Tarreau 

Build options :
  TARGET  = linux2628
  CPU = generic
  CC  = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
-fwrapv -Wno-unused-label -DTCP_USER_TIMEOUT=18
  OPTIONS = USE_LINUX_TPROXY=1 USE_GETADDRINFO=1 USE_ZLIB=1
USE_REGPARM=1 USE_OPENSSL=1 USE_SYSTEMD=1 USE_PCRE=1 USE_PCRE_JIT=1
USE_TFO=1 USE_NS=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

I get RSTs (not seamless reloads) when I introduce the global/nbthread
X, after a systemctl haproxy restart.

Pierre




Re: mworker: seamless reloads broken since 1.8.1

2018-01-05 Thread Pierre Cheynier
On 05/01/2018 16:44, William Lallemand wrote:
> I'm able to reproduce, looks like it happens with the nbthread parameter only,
Exactly, I observe the same.
At least I have a workaround for now to perform the upgrade.
> I'll try to find the problem in the code.
>
Thanks !

Pierre




Re: mworker: seamless reloads broken since 1.8.1

2018-01-05 Thread Pierre Cheynier

>> Hi,
>>
>>> Your systemd configuration is not uptodate.
>>>
>>> Please:
>>> - make sure haproxy is compiled with USE_SYSTEMD=1
>>> - update the unit file: start haproxy with -Ws instead of -W (ExecStart)
>>> - update the unit file: use Type=notify instead of Type=forking
>> In fact that should work with this configuration too.
> OK, I have to admit that we started experiments on 1.8-dev2, at that
> time I had to do that to make it work.
> And true, we build the RPM and so didn't notice there was some updates
> after the 1.8.0 release for the systemd unit file provided in contrib/.
> Currently recompiling, bumping the release on CI / dev environment etc...
>>  
>>> We always ship an uptodate unit file in
>>> contrib/systemd/haproxy.service.in (just make sure you maintain the
>>> $OPTIONS variable, otherwise you are missing the -x call for the
>>> seamless reload).
>> You don't need the -x with -W or -Ws, it's added automaticaly by the master
>> during a reload. 
> Interesting. Is this new ? Because I noticed it was not the case at some
> point.
>>> Run "systemctl daemon-reload" after updating the unit file and
>>> completely stop the old service (don't reload after updating the unit
>>> file), to make sure you have a "clean" situation.
>>>
>>> I don't see how this systemd thing would affect the actual seamless
>>> reload (systemd shouldn't be a requirement), but lets fix it
>>> nonetheless before continuing the troubleshooting. Maybe the
>>> regression only affects non-systemd mode.
>> Shouldn't be a problem, but it's better to use -Ws with systemd.
>>
>> During a reload, if the -x fail, you should have this kind of errors:
>>
>> [WARNING] 004/135908 (12013) : Failed to connect to the old process socket 
>> '/tmp/sock4'
>> [ALERT] 004/135908 (12013) : Failed to get the sockets from the old process!
>>
>> Are you seeing anything like this?
> Yes, in > 1.8.0. If I rollback to 1.8.0 it's fine on this aspect.
>
> I'll give updates after applying Lukas recommendations.
>
> Pierre
>
OK, so now that I've applied all of Lukas' recommendations (I kept the -x option added):

* I don't see any ALERT log anymore, only the WARNs:

Jan 05 14:47:12 hostname systemd[1]: Reloaded HAProxy Load Balancer.
Jan 05 14:47:12 hostname haproxy[59888]: [WARNING] 004/144712 (59888) :
Former worker 61331 exited with code 0
Jan 05 14:47:25 hostname haproxy[59888]: [WARNING] 004/144712 (59888) :
Reexecuting Master process
Jan 05 14:47:26 hostname systemd[1]: Reloaded HAProxy Load Balancer.
Jan 05 14:47:26 hostname haproxy[59888]: [WARNING] 004/144726 (59888) :
Former worker 61355 exited with code 0

* I still observe the same issue (here doing an ab during a rolling upgrade of
my test app, consequently triggering N reloads on HAProxy as the app instances
are created/destroyed).

$ ab -n10  http://test-app.tld/
(..)
Benchmarking test-app.tld (be patient)
apr_socket_recv: Connection reset by peer (104)
Total of 3031 requests completed

Pierre






Re: mworker: seamless reloads broken since 1.8.1

2018-01-05 Thread Pierre Cheynier

> Hi,
>
>>> $ cat /usr/lib/systemd/system/haproxy.service
>>> [Unit]
>>> Description=HAProxy Load Balancer
>>> After=syslog.target network.target
>>>
>>> [Service]
>>> EnvironmentFile=/etc/sysconfig/haproxy
>>> ExecStartPre=/usr/sbin/haproxy -f $CONFIG -c -q
>>> ExecStart=/usr/sbin/haproxy -W -f $CONFIG -p $PIDFILE $OPTIONS
>>> ExecReload=/usr/sbin/haproxy -f $CONFIG -c -q
>>> ExecReload=/bin/kill -USR2 $MAINPID
>>> Type=forking
>>> KillMode=mixed
>>> Restart=always
>> Your systemd configuration is not uptodate.
>>
>> Please:
>> - make sure haproxy is compiled with USE_SYSTEMD=1
>> - update the unit file: start haproxy with -Ws instead of -W (ExecStart)
>> - update the unit file: use Type=notify instead of Type=forking
> In fact that should work with this configuration too.
OK, I have to admit that we started experiments on 1.8-dev2; at that
time I had to do that to make it work.
And true, we build the RPM and so didn't notice there were some updates
after the 1.8.0 release for the systemd unit file provided in contrib/.
Currently recompiling, bumping the release on CI / dev environment, etc...
>  
>> We always ship an uptodate unit file in
>> contrib/systemd/haproxy.service.in (just make sure you maintain the
>> $OPTIONS variable, otherwise you are missing the -x call for the
>> seamless reload).
> You don't need the -x with -W or -Ws, it's added automaticaly by the master
> during a reload. 
Interesting. Is this new ? Because I noticed it was not the case at some
point.
>> Run "systemctl daemon-reload" after updating the unit file and
>> completely stop the old service (don't reload after updating the unit
>> file), to make sure you have a "clean" situation.
>>
>> I don't see how this systemd thing would affect the actual seamless
>> reload (systemd shouldn't be a requirement), but lets fix it
>> nonetheless before continuing the troubleshooting. Maybe the
>> regression only affects non-systemd mode.
> Shouldn't be a problem, but it's better to use -Ws with systemd.
>
> During a reload, if the -x fail, you should have this kind of errors:
>
> [WARNING] 004/135908 (12013) : Failed to connect to the old process socket 
> '/tmp/sock4'
> [ALERT] 004/135908 (12013) : Failed to get the sockets from the old process!
>
> Are you seeing anything like this?
Yes, in > 1.8.0. If I rollback to 1.8.0 it's fine on this aspect.

I'll give updates after applying Lukas recommendations.

Pierre






mworker: seamless reloads broken since 1.8.1

2018-01-05 Thread Pierre Cheynier
Hi list,

We've recently tried to upgrade from 1.8.0 to 1.8.1, then 1.8.2, 1.8.3
on a preprod environment and noticed that the reload is not so seamless
since 1.8.1 (easily getting TCP RSTs while reloading).

Having a quick look at the haproxy-1.8 git remote for the changes
affecting haproxy.c, c2b28144 can be eliminated, so 3 commits remain:

* 3ce53f66 MINOR: threads: Fix pthread_setaffinity_np on FreeBSD.  (5
weeks ago)
* f926969a BUG/MINOR: mworker: detach from tty when in daemon mode  (5
weeks ago)
* 4e612023 BUG/MINOR: mworker: fix validity check for the pipe FDs  (5
weeks ago)

In case it matters: we use threads and did the usual worker setup (which
again works very well in 1.8.0).
Here is a config extract:

$ cat /etc/haproxy/haproxy.cfg:
(...)
user haproxy
group haproxy
nbproc 1
daemon
stats socket /var/lib/haproxy/stats level admin mode 644 expose-fd listeners
stats timeout 2m
nbthread 11
(...)

$ cat /etc/sysconfig/haproxy
(...)
CONFIG="/etc/haproxy/haproxy.cfg"
PIDFILE="/run/haproxy.pid"
OPTIONS="-x /var/lib/haproxy/stats"
(...)

$ cat /usr/lib/systemd/system/haproxy.service
[Unit]
Description=HAProxy Load Balancer
After=syslog.target network.target

[Service]
EnvironmentFile=/etc/sysconfig/haproxy
ExecStartPre=/usr/sbin/haproxy -f $CONFIG -c -q
ExecStart=/usr/sbin/haproxy -W -f $CONFIG -p $PIDFILE $OPTIONS
ExecReload=/usr/sbin/haproxy -f $CONFIG -c -q
ExecReload=/bin/kill -USR2 $MAINPID
Type=forking
KillMode=mixed
Restart=always

Does the observed behavior sound consistent with the changes that
occurred between 1.8.0 and 1.8.1? Before trying to bisect, compile,
test, etc., I'd like to get your feedback.
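
For reference, the way we check whether a reload is seamless is simply to keep
traffic flowing while triggering a few reloads (URL and counts are arbitrary):

$ ab -n 100000 -c 10 http://test-app.tld/ &
$ for i in $(seq 1 20); do systemctl reload haproxy; sleep 1; done; wait
# on 1.8.0 this completes cleanly; on 1.8.1-1.8.3 ab aborts with
# "apr_socket_recv: Connection reset by peer (104)"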

Thanks in advance,

Pierre






RE: [RFC][PATCHES] seamless reload

2017-05-04 Thread Pierre Cheynier
Hi Olivier,

Many thanks for that! As you know, we are very interested in this topic.
We'll test your patches soon for sure.

Pierre


RE: frequently reload haproxy without sleep time result in old haproxy process never dying

2017-02-07 Thread Pierre Cheynier
Hi,

I guess you're using a systemd-based distro.  You should have a look at this 
thread https://www.mail-archive.com/haproxy@formilux.org/msg23867.html.

The patches were applied to 1.7, but apparently backported to 1.6.11 and 1.5.19 
since.

Now I have a clean termination of old processes, no more orphans, even when 
performing a ton of reloads.

Pierre


RE: HAProxy reloads lets old and outdated processes

2016-10-25 Thread Pierre Cheynier
Hi,


I hadn't subscribed to the list and noticed that there were several exchanges on
this thread that I hadn't read so far.


To share a bit more of our context:


* we do not reload every 2ms; this was the setting used to be able to reproduce
easily and in a short period of time. Our reload average is more around 5 to
10s, which seems consistent to me for relatively big setups (I'm talking about a
hundred physical nodes per DC running up to a thousand app instances).


* true, it's something that becomes very common as IaaS/PaaS-style
architectures are adopted. On our side we work with Apache Mesos and schedulers
that add/remove backends as the end-user scales their application or when
nodes/apps fail, are under maintenance, etc.


By the way, I noticed that a lot of these "trending" projects are using HAProxy
as their external load balancing stack (and most of them are also usually run
on systemd-based distros), so it seems to me that this will fix some setups
(that apparently rely on Yelp's approach to 'safely restart' their haproxy,
which induces latencies).


Apart from that, we exchanged off-list with Willy about the submitted patch. It
seems that it fixes the issue. I now have only one instance bound to the TCP
sockets after the reloads; the others are there just to terminate the existing
connections.


Pierre


RE: HAProxy reloads lets old and outdated processes

2016-10-24 Thread Pierre Cheynier
> A solution I use is to delay next reload in systemd unit until a
> reload is in progress.

Unfortunately, even when doing this you can end up in the situation described 
before, because for systemd a reload is basically a SIGUSR2 to send. You do not 
wait for some callback saying "I'm now OK and fully reloaded" (if I'm wrong, I 
could be interested in your systemd setup).

I naively tried such an approach by adding a grace period of 2s (sleep) and
avoiding sending another reload during that period, but at some point you'll
encounter the same issue when upstream contention gets higher (meaning you'll
have a ton of things to reload; you'll then add delay and decrease the real-time
aspect of your solution, etc.).
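
For what it's worth, one way to approximate "wait until the previous reload has
completed" without a fixed sleep is to serialize the callers; a sketch (the
lock path is arbitrary, and it only really pays off once the reload itself
reports completion, e.g. with the master-worker/Type=notify setup discussed
elsewhere in this archive):

$ flock /run/lock/haproxy-reload.lock systemctl reload haproxy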


RE: HAProxy reloads lets old and outdated processes

2016-10-24 Thread Pierre Cheynier
> Same for all of them. Very interesting, SIGUSR2 (12) is set
> in SigIgn :-)  One question is "why", but at least we know we
> have a workaround consisiting in unblocking these signals in
> haproxy-systemd-wrapper, as we did in haproxy.

> Care to retry with the attached patch ?

Same behaviour.

SigIgn is still 1000 (which is probably normal, I assume the goal 
was to ignore that).

Pierre



RE: HAProxy reloads lets old and outdated processes

2016-10-24 Thread Pierre Cheynier
Hi, 

Sorry, wrong order in the answers.

> Yes it has something to do with it because it's the systemd-wrapper which
> delivers the signal to the old processes in this mode, while in the normal
> mode the processes get the signal directly from the new process. Another
> important point is that exactly *all* users having problem with zombie
> processes are systemd users, with no exception. And this problem has never
> existed over the first 15 years where systems were using a sane init
> instead and still do not exist on non-systemd OSes.

Unfortunately, I remember we had the same issue (but less frequently) on 
CentOS6 which is init-based.
I tried to reproduce, but didn't succeed... So let's ignore that for now, it 
was maybe related to something else.

> OK that's interesting. And when this happens, they stay there forever ?

Yes, these processes are never stopped and are still bound to the socket.

> Ah this is getting very interesting. Maybe we should hack systemd-wrapper
> to log the signals it receives and the signals and pids it sends to see
> what is happening here. It may also be that the signal is properly sent
> but never received (but why ?).

Clearly. Apparently I sometimes have wrong information in the pidfile...

Have a look at journald logs: 

Oct 24 12:26:57 haproxys01e02-par haproxy-systemd-wrapper[44319]: 
haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f 
/etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 44941
Oct 24 12:26:57 haproxys01e02-par haproxy-systemd-wrapper[44319]: [WARNING] 
297/122657 (44951) : config : 'option forwardfor' ignored for frontend 
'https-in' as it requires HTTP mode.
Oct 24 12:27:00 haproxys01e02-par haproxy-systemd-wrapper[44319]: 
haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f 
/etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 44952
Oct 24 12:27:00 haproxys01e02-par haproxy-systemd-wrapper[44319]: [WARNING] 
297/122700 (44978) : config : 'option forwardfor' ignored for frontend 
'https-in' as it requires HTTP mode.
Oct 24 12:27:05 haproxys01e02-par haproxy-systemd-wrapper[44319]: 
haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f 
/etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 44983
Oct 24 12:27:05 haproxys01e02-par haproxy-systemd-wrapper[44319]: [WARNING] 
297/122705 (45131) : config : 'option forwardfor' ignored for frontend 
'https-in' as it requires HTTP mode.
Oct 24 12:27:09 haproxys01e02-par haproxy-systemd-wrapper[44319]: 
haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f 
/etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 45132
Oct 24 12:27:09 haproxys01e02-par haproxy-systemd-wrapper[44319]: [WARNING] 
297/122709 (45146) : config : 'option forwardfor' ignored for frontend 
'https-in' as it requires HTTP mode.

Luckily I have an error in my config, which lets me see the PID of the first
child in the logs :).
Here we can see that:
* 44978 references (-sf) 44952 (child of 44951)
* 45131 references 44983, a PID we've never seen in the logs... (so 44978 and
its child will stay alive forever!)
* 45146 references 45132 (child of 45131)
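
For the record, the quick check I do when this happens is simply to compare the
pidfile with what is actually running (nothing clever, just a sketch assuming
the pidfile path used above):

echo "pidfile says:"; cat /run/haproxy.pid
echo "actually running:"; ps -o pid,ppid,args -C haproxy
# any PID present on one side but not the other is a hint that the pidfile
# content and the set of running processes have diverged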

> That's very kind, thank you. However I don't have access to a docker
> machine but I know some people on the list do so I hope we'll quickly
> find the cause and hopefully be able to fix it (unless it's another
> smart invention from systemd to further annoy running deamons).

> Another important point, when you say you restart every 2ms, are you
> certain you have a way to ensure that everything is completely started
> before you issue your signal to kill the old process ? 
> (..)
> So at 2ms I could easily imagine that we're delivering signals to a
> starting process, maybe even before it has the time to register a signal
> handler, and that these signals are lost before the sub-processes are
> started. 

Clearly not; my test is trivial. But since I observe the behaviour on a platform
that operates at a different time scale (a reload every 1 to 10 seconds on
average), the test was just a way to reproduce the issue and to be able to
investigate inside the container, e.g. with gdb.
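
If I wanted the test to be stricter, I could wait for the pidfile to reference a
new master before sending the next reload, along these lines (a sketch, bounded
to roughly 5 seconds):

old=$(cat /run/haproxy.pid 2>/dev/null)
systemctl reload haproxy
# poll until the pidfile references a new process, or give up after ~5s
for i in $(seq 1 100); do
    [ "$(cat /run/haproxy.pid 2>/dev/null)" != "$old" ] && break
    sleep 0.05
done

But as explained above, that only moves the contention problem elsewhere.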

> Regards,
> Willy

Thanks !
Pierre


RE: HAProxy reloads lets old and outdated processes

2016-10-24 Thread Pierre Cheynier
Hi,

> Pierre, could you please issue "grep ^Sig /proc/pid/status" for each
> wrapper and haproxy process ? I'm interested in seeing SigIgn and
> SigBlk particularly.
> 

Sure, here is the output for the following pstree: 

$ ps fauxww | grep haproxy | grep -v grep
root 43135  0.0  0.0  46340  1820 ?    Ss   12:11   0:00 
/usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p 
/run/haproxy.pid
haproxy  43136  0.0  0.0  88988 15732 ?    S    12:11   0:00  \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
haproxy  43137  0.8  0.0  88988 14200 ?    Ss   12:11   0:00  |   \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
haproxy  43190  0.1  0.0  88988 15720 ?    S    12:11   0:00  \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 43163
haproxy  43191  0.6  0.0  88988 14132 ?    Ss   12:11   0:00  |   \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 43163
haproxy  43235  0.3  0.0  88988 15720 ?    S    12:11   0:00  \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 43228
haproxy  43236  1.3  0.0  88988 14096 ?    Ss   12:11   0:00  \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 43228

$ grep ^Sig /proc/43135/status
SigQ:    0/192473
SigPnd:    
SigBlk:    
SigIgn:    1000
SigCgt:    000180004803

$ grep ^Sig /proc/43136/status
SigQ:    0/192473
SigPnd:    
SigBlk:    
SigIgn:    1000
SigCgt:    000180300205

$ grep ^Sig /proc/43137/status
SigQ:    0/192473
SigPnd:    
SigBlk:    
SigIgn:    1000
SigCgt:    000180300205

$ grep ^Sig /proc/43190/status
SigQ:    0/192473
SigPnd:    
SigBlk:    
SigIgn:    1000
SigCgt:    000180300205

$ grep ^Sig /proc/43191/status
SigQ:    0/192473
SigPnd:    
SigBlk:    
SigIgn:    1000
SigCgt:    000180300205
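
To check at a glance whether SIGUSR2 (signal 12, i.e. bit 11 of the mask, value
0x800) ends up in SigIgn or SigBlk for all of them, I use a small loop like this
(a bash sketch relying only on /proc):

sig=12
for pid in $(pgrep -f 'haproxy-systemd-wrapper|/usr/sbin/haproxy'); do
    for field in SigIgn SigBlk; do
        # pick the hex mask for this field out of /proc/<pid>/status
        mask=$(awk -v f="$field" '$1 == f":" {print $2}' /proc/$pid/status)
        [ -n "$mask" ] || continue
        if (( (16#$mask >> (sig - 1)) & 1 )); then
            echo "$pid: $field contains signal $sig"
        fi
    done
done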




RE: HAProxy reloads lets old and outdated processes

2016-10-21 Thread Pierre Cheynier
Hi Willy,

Thanks for your answer and sorry for my delay.

First let's clarify again: we are on a systemd-based OS (CentOS 7), so a reload
is done by sending SIGUSR2 to haproxy-systemd-wrapper.
Theoretically, this has absolutely no relation to our current issue (if I
understand correctly how the old processes are managed).

This happens on servers with live traffic, but with a reasonable number of
connections. I'm also able to reproduce it with no connections at all, but I
have to be a bit more aggressive with the reload frequency (probably because the
children die faster).

For me the problem is not whether we still have connections; it is that in this
case some old processes are never "aware" that they should die, so they continue
to listen for incoming requests, thanks to SO_REUSEPORT.

Consequently, you end up with N processes listening with different configs.

In the pstree I pasted in the previous message, there are 3 minutes between the
first living instance and the last (and as you can see, we are quite aggressive
with long connections):

 timeout client 2s
 timeout server 5s
 timeout connect 200ms
 timeout http-keep-alive 200ms

Here is a Dockerfile that can be used to reproduce the issue (it uses
haproxy-systemd-wrapper; just run it with the default settings, i.e. number of
reloads = 300 and interval between each = 2 ms):

https://github.com/pierrecdn/haproxy-reload-issue

docker build -t haproxy-reload-issue . && docker run --rm -ti 
haproxy-reload-issue

Thanks,

Pierre
    
> Hi Pierre,
>
> (...)
>
> Is this with live traffic or on a test machine ? Could you please check
> whether these instances have one connection attached ? I don't see any
> valid reason for a dying process not to leave once it doesn't have any
> more connection. And during my last attempts at fixing such issues by
> carefully reviewing the code and hammering the systemd-wrapper like mad,
> I couldn't get this behaviour to happen a single time. Thus it would be
> nice to know what these processes are doing there and why they don't
> stop.
> 
> Regards,
> Willy
 


RE: HAProxy reloads lets old and outdated processes

2016-10-18 Thread Pierre Cheynier
Hi,
Any updates/findings on that issue?

Many thanks,

Pierre

> From : Pierre Cheynier
> To: Lukas Tribus; haproxy@formilux.org
> Sent: Friday, October 14, 2016 12:54 PM
> Subject: RE: HAProxy reloads lets old and outdated processes
>     
> Hi Lukas,
> 
> > I did not mean no-reuseport to work around or "solve" the problem
> definitively, but rather to see if the problems can still be triggered,
> since you can reproduce the problem easily.
> 
> This still happens using snapshot 20161005 with no-reuseport set, a bit less 
> probably because reload is faster.
> 
> Here is what I observe after reloading 50 times, waiting 0.1 sec between 
> each: 
> 
> $ ps fauxww | tail -9
> root 50253  0.1  0.0  46340  1820 ?    Ss   10:43   0:00 
> /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p 
> /run/haproxy.pid
> haproxy  51003  0.0  0.0  78256  9144 ?    S    10:44   0:00  \_ 
> /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 
> 51000
> haproxy  51025  0.3  0.0  78256  9208 ?    Ss   10:44   0:00  |   \_ 
> /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 
> 51000
> haproxy  51777  0.0  0.0  78256  9144 ?    S    10:44   0:00  \_ 
> /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 
> 51771
> haproxy  51834  0.3  0.0  78256  9208 ?    Ss   10:44   0:00  |   \_ 
> /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 
> 51771
> haproxy  51800  0.0  0.0  78256  9140 ?    S    10:44   0:00  \_ 
> /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 
> 51785
> haproxy  51819  0.3  0.0  78256  9204 ?    Ss   10:44   0:00  |   \_ 
> /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 
> 51785
> haproxy  52083  0.0  0.0  78256  9144 ?    S    10:47   0:00  \_ 
> /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 
> 52076
> haproxy  52084  0.3  0.0  78256  3308 ?    Ss   10:47   0:00  \_ 
> /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 
> 52076
> 
> $ sudo ss -tanp |grep -i listen | grep 80
> LISTEN 0  128  *:80   *:* 
>   users:(("haproxy",pid=52084,fd=8))
> LISTEN 0  128  *:8080 *:* 
>   users:(("haproxy",pid=52084,fd=6))
> LISTEN 0  128    10.5.6.7:8000 *:*
>    users:(("haproxy",pid=52084,fd=7))
> 
> $ head -12 /etc/haproxy/haproxy.cfg
> global
>  log 127.0.0.1 local0 warning
>  log 127.0.0.1 local1 notice
>  maxconn 262144
>  user haproxy
>  group haproxy
>  nbproc 1
>  chroot /var/lib/haproxy
>  pidfile /var/run/haproxy.pid
>  stats socket /var/lib/haproxy/stats
>  noreuseport
> 
> Definitely, some instances seem to be "lost" (not referenced by another) and
> will never be stopped.
> 
> In that case it will not impact the config consistency as only one is bound 
> to the socket, but the reload is far less transparent from a network point of 
> view.
> 
> Pierre



RE: HAProxy reloads lets old and outdated processes

2016-10-14 Thread Pierre Cheynier
Hi Lukas,

> I did not mean no-reuseport to work around or "solve" the problem
> definitively, but rather to see if the problems can still be triggered,
> since you can reproduce the problem easily.

This still happens using snapshot 20161005 with no-reuseport set, a bit less
often, probably because the reload is faster.

Here is what I observe after reloading 50 times, waiting 0.1 sec between each: 

$ ps fauxww | tail -9
root 50253  0.1  0.0  46340  1820 ?    Ss   10:43   0:00 
/usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p 
/run/haproxy.pid
haproxy  51003  0.0  0.0  78256  9144 ?    S    10:44   0:00  \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 51000
haproxy  51025  0.3  0.0  78256  9208 ?    Ss   10:44   0:00  |   \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 51000
haproxy  51777  0.0  0.0  78256  9144 ?    S    10:44   0:00  \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 51771
haproxy  51834  0.3  0.0  78256  9208 ?    Ss   10:44   0:00  |   \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 51771
haproxy  51800  0.0  0.0  78256  9140 ?    S    10:44   0:00  \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 51785
haproxy  51819  0.3  0.0  78256  9204 ?    Ss   10:44   0:00  |   \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 51785
haproxy  52083  0.0  0.0  78256  9144 ?    S    10:47   0:00  \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 52076
haproxy  52084  0.3  0.0  78256  3308 ?    Ss   10:47   0:00  \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 52076

$ sudo ss -tanp | grep -i listen | grep 80
LISTEN 0  128  *:80           *:*  users:(("haproxy",pid=52084,fd=8))
LISTEN 0  128  *:8080         *:*  users:(("haproxy",pid=52084,fd=6))
LISTEN 0  128  10.5.6.7:8000  *:*  users:(("haproxy",pid=52084,fd=7))

$ head -12 /etc/haproxy/haproxy.cfg
global
 log 127.0.0.1 local0 warning
 log 127.0.0.1 local1 notice
 maxconn 262144
 user haproxy
 group haproxy
 nbproc 1
 chroot /var/lib/haproxy
 pidfile /var/run/haproxy.pid
 stats socket /var/lib/haproxy/stats
 noreuseport

Definitely, some instances seem to be "lost" (not referenced by another) and
will never be stopped.

In that case it will not impact the config consistency as only one is bound to 
the socket, but the reload is far less transparent from a network point of view.

Pierre



HAProxy reloads lets old and outdated processes

2016-10-13 Thread Pierre Cheynier
Hi list,

I'm experiencing the following behaviour: I'm on 1.6.8 (same behaviour in
1.4/1.5), using systemd, and I noticed that when reloads are relatively
frequent, old processes sometimes never die and stay bound to the TCP socket(s),
thanks to SO_REUSEPORT.

Here is an example of process tree: 
root 24115  0.0  0.0  46340  1824 ?    Ss   14:34   0:00 
/usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p 
/run/haproxy.pid 
haproxy  27403  0.2  0.0  89272 20096 ?    S    14:49   0:00  \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 27366 
haproxy  27450  1.2  0.0  89272 14380 ?    Rs   14:49   0:00  |   \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 27366 
haproxy  27410  0.2  0.0  89272 16008 ?    S    14:49   0:00  \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 27366 
haproxy  27458  1.2  0.0  89272 14392 ?    Ss   14:49   0:00  |   \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 27366 
haproxy  27626  0.3  0.0  89272 16008 ?    S    14:49   0:00  \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 27623 
haproxy  27674  1.1  0.0  89272 14380 ?    Ss   14:49   0:00  |   \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 27623 
haproxy  27722  0.2  0.0  89272 16008 ?    S    14:49   0:00  \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 27716 
haproxy  27762  1.0  0.0  89272 14368 ?    Ss   14:49   0:00  \_ 
/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 27716

The problem is easily reproducible: just loop over the reload (systemctl /
SIGUSR2), 50 times without a sleep, for example.
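
For example, this is basically all my test does:

for i in $(seq 1 50); do
    systemctl reload haproxy
done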

It happens when two reloads are performed within a very short amount of time.
As a result, there is no 'back-reference' in the '-sf' of one haproxy instance
to the previous one, and it becomes "disconnected" from the others (see 27450 in
my example, which seems totally alone).
This is also visible in the journalctl output (typically two haproxy instances
have the same PID reference in '-sf', so one of them is lost; see 27366 in my
example).
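
A quick way to see the disconnection is to compare the PIDs referenced by the
'-sf' arguments with the haproxy PIDs actually alive (a rough sketch):

# which old PIDs do the running instances claim to replace?
pgrep -af '/usr/sbin/haproxy' | grep -o -- '-sf [0-9]*'
# which haproxy processes are actually running?
pgrep -x haproxy
# e.g. in the tree above, 27366 shows up twice after '-sf' while 27450 never
# does, so that instance will never be told to stop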

I had a look at haproxy-systemd-wrapper.c and guessed that the PID file is only
read and never written there.
To me it seems that a race condition happens and that several instances do not
reference the previous one, maybe because the PID can be written after X reloads
have already been done.

Restarting the server is very impacting and, to me, this is why there were
approaches like the one used at Yelp
(https://engineeringblog.yelp.com/2015/04/true-zero-downtime-haproxy-reloads.html)
consisting of letting the clients do SYN retries or buffering the SYNs while
doing a full restart.
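
The idea there is roughly the following (a sketch with iptables, port 80 taken
as an example): new connection attempts are silently dropped during the restart,
and the clients' SYN retransmission absorbs the gap.

# block new connections, restart, then unblock; established flows are untouched
iptables -I INPUT -p tcp --dport 80 --syn -j DROP
systemctl restart haproxy
iptables -D INPUT -p tcp --dport 80 --syn -j DROP
# dropped SYNs are retransmitted by the client after about a second, so the
# restart shows up as extra latency rather than as connection errors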

This becomes impossible in a PaaS-like approach where many events occur and may
trigger reloads every second. BTW, the new "no-reuseport" feature does not help
in my case (nor do ip/nftables or tc workarounds) because it introduces latency
spikes potentially every second.

Maybe you have some insights to share before I dig further into that?

Thanks, 

-Pierre