AW: [EXT] Re: error HAproxy with Galera Cluster v4

2024-05-10 Thread Marno Krahmer
Hey,

There actually is some stuff in the haproxy documentation about this:
https://docs.haproxy.org/2.9/configuration.html#4-option%20mysql-check

MySQL will block a client host once it makes more unsuccessful authentication 
requests than the global variable “max_connect_errors” allows.

This can happen when health checks run more frequently than “real” MySQL 
connections come in.

You can change the value of max_connect_errors according to the documentation: 
https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_max_connect_errors

Running “FLUSH HOSTS;” on the affected MySQL node will also solve the 
problem, but only temporarily.

If you don’t want to change that variable, you can either run the health 
check less frequently (i.e. increase the check interval) or use a different 
health check mechanism.

In our company, we run a small script on every MySQL node that exposes 
an HTTP endpoint reporting the MySQL state.
HAProxy then makes an HTTP request for monitoring, which lets us configure the 
expected response code and content.
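
A minimal sketch of what such a health endpoint could look like, using only the Python standard library. This is not our actual script: the function names, the port 9200 and the "Synced" state string are assumptions for illustration, and `probe_mysql_state()` is a hypothetical placeholder that a real script would replace with a query against the local MySQL node.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def probe_mysql_state():
    """Hypothetical placeholder: return a short state string for the local node.

    A real script would query the local MySQL server here (for Galera,
    for example, by reading the wsrep_local_state_comment status variable).
    """
    return "Synced"


def build_response(state):
    """Map a node state to the HTTP status code and body HAProxy will see."""
    if state == "Synced":
        return 200, "MySQL is healthy: Synced\n"
    return 503, "MySQL is not healthy: %s\n" % state


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = build_response(probe_mysql_state())
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=9200):
    """Start the endpoint; port 9200 is an arbitrary choice for this sketch."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

On the HAProxy side this would presumably be paired with `option httpchk GET /`, an `http-check expect status 200` rule and `check port 9200` on each server line, so the checks no longer touch the MySQL port at all.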

Cheers
Marno


From: Willy Tarreau 
Date: Friday, 10 May 2024 at 14:28
To: Iglesias Paz, Jaime 
Cc: haproxy@formilux.org 
Subject: [EXT] Re: error HAproxy with Galera Cluster v4
Hello,

On Fri, May 10, 2024 at 12:00:17PM +, Iglesias Paz, Jaime wrote:
> Hey guys, I have a problem with HAProxy and Galera Cluster v4 MySQL (3 
> nodes). I boot the HAProxy server and it returns the following error:
>
> may 10 13:48:20 phaproxysql1 haproxy[661]: Proxy stats started.
> may 10 13:48:20 phaproxysql1 haproxy[661]: Proxy stats started.
> may 10 13:48:20 phaproxysql1 haproxy[661]: [NOTICE] 130/134820 (661) : New 
> worker #1 (663) forked
> may 10 13:48:20 phaproxysql1 systemd[1]: Started HAProxy Load Balancer.
> may 10 13:48:20 phaproxysql1 haproxy[663]: [WARNING] 130/134820 (663) : 
> Server galeramanagerprd/nodo1prd is DOWN, reason: Layer7 wrong status, code: 
> 1129, info: "Host 'X' is blocked because of many connection errors; 
> unblock>
> may 10 13:48:21 phaproxysql1 haproxy[663]: [WARNING] 130/134821 (663) : 
> Server galeramanagerprd/nodo2prd is DOWN, reason: Layer7 wrong status, code: 
> 1129, info: "Host 'X' is blocked because of many connection errors; 
> unblock>
> may 10 13:48:21 phaproxysql1 haproxy[663]: [WARNING] 130/134821 (663) : 
> Server galeramanagerprd/nodo3prd is DOWN, reason: Layer7 wrong status, code: 
> 1129, info: "Host '' is blocked because of many connection errors; 
> unblock>
> may 10 13:48:21 phaproxysql1 haproxy[663]: [NOTICE] 130/134821 (663) : 
> haproxy version is 2.2.9-2+deb11u6
> may 10 13:48:21 phaproxysql1 haproxy[663]: [NOTICE] 130/134821 (663) : path 
> to executable is /usr/sbin/haproxy
> may 10 13:48:21 phaproxysql1 haproxy[663]: [ALERT] 130/134821 (663) : proxy 
> 'galeramanagerprd' has no server available!
>
> The haproxy.cfg configuration file:
> 
> defaults
> log global
> mode http
> option  httplog
> option  dontlognull
> timeout connect 5000
> timeout client  5
> timeout server  5
> errorfile 400 /etc/haproxy/errors/400.http
> errorfile 403 /etc/haproxy/errors/403.http
> errorfile 408 /etc/haproxy/errors/408.http
> errorfile 500 /etc/haproxy/errors/500.http
> errorfile 502 /etc/haproxy/errors/502.http
> errorfile 503 /etc/haproxy/errors/503.http
> errorfile 504 /etc/haproxy/errors/504.http
>
> listen galeramanagerprd
> bind *:3306
> balance source
> mode tcp
> #option tcplog
> option tcpka
> option mysql-check user haproxy
> server nodo1prd X:3306 check weight 1
> server nodo2prd X:3306 check weight 1
> server nodo3prd X:3306 check weight 1
> 
>
> (*) for security I change the IPs to X
>
> Reviewing the documentation I can't find where the problem may be.

That reminds me of something from a long time ago, where there was a limit on
the number of checks a MySQL server would accept from the same IP address,
and it was necessary to change the setting to unlimited. I don't remember
the details, but there was something to do using some insert commands. I
don't know if this is still needed after all these years, but the error
message strongly suggests something like this.

Willy


Re: How to debug "IH" termination state on HTTP connections?

2023-11-25 Thread Marno Krahmer
Hey Christopher,
thanks a lot for the config snippet.
"Luckily" the issue appeared again shortly after I applied the config.

So here is the output of the ring file:


<0>2023-11-25T14:41:41.050846+00:00 [01|h1|0|mux_h1.c:4377] reporting error to 
the app-layer stream : [F,RUN] [MSG_DONE, MSG_DONE] - req=(.fl=0x1550 
.curr_len=0 .body_len=0)  res=(.fl=0x1515 .curr_len=0 .body_len=0) - 
h1c=0x7f50672668f0(0x0200) conn=0x7f5043245e10(0x801c0300) 
h1s=0x7f506722a2f0(0x00014010) sd=0x7f50672387e0(0x04014001) 
sc=0x7f506722d890(0x00014422)
<0>2023-11-25T14:56:31.907270+00:00 [05|h1|0|mux_h1.c:3129] processing error : 
[B,RUN] [MSG_DONE, MSG_RPBEFORE] - req=(.fl=0x1511 .curr_len=0 
.body_len=2352560)  res=(.fl=0x1404 .curr_len=0 .body_len=0) - 
h1c=0x7f503b255a70(0x8000) conn=0x7f5043c78f10(0x0300) 
h1s=0x7f503b252e40(0x00015040) sd=0x7f503b239f90(0x05010001) 
sc=0x7f503b233260(0x0401)
<0>2023-11-25T14:56:32.998177+00:00 [05|h1|0|mux_h1.c:3129] processing error : 
[B,RUN] [MSG_DONE, MSG_DONE] - req=(.fl=0x1511 .curr_len=0 
.body_len=2501993)  res=(.fl=0x1535 .curr_len=0 .body_len=0) - 
h1c=0x7f503b256620(0x8100) conn=0x7f506722ed10(0x00040300) 
h1s=0x7f503b23afa0(0x00015040) sd=0x7f503b23bf30(0x0101c001) 
sc=0x7f503b234240(0x00040003)
<0>2023-11-25T14:56:32.998187+00:00 [05|h1|0|mux_h1.c:3169] txn done but data 
waiting to be sent, set error on h1c : [B,RUN] [MSG_DONE, MSG_DONE] - 
req=(.fl=0x1511 .curr_len=0 .body_len=2501993)  res=(.fl=0x1535 
.curr_len=0 .body_len=0) - h1c=0x7f503b256620(0x8100) 
conn=0x7f506722ed10(0x00040300) h1s=0x7f503b23afa0(0x00015040) 
sd=0x7f503b23bf30(0x0101c001) sc=0x7f503b234240(0x00040003)
<0>2023-11-25T14:56:33.052598+00:00 [05|h1|0|mux_h1.c:3129] processing error : 
[B,RUN] [MSG_DONE, MSG_RPBEFORE] - req=(.fl=0x1511 .curr_len=0 
.body_len=2551166)  res=(.fl=0x1404 .curr_len=0 .body_len=0) - 
h1c=0x7f503b232eb0(0x8000) conn=0x7f504723e230(0x0300) 
h1s=0x7f503b23b1b0(0x00015040) sd=0x7f503b22ea00(0x05010001) 
sc=0x7f503b2420d0(0x0401)
<0>2023-11-25T14:56:33.651131+00:00 [05|h1|0|mux_h1.c:3129] processing error : 
[B,RUN] [MSG_DONE, MSG_DONE] - req=(.fl=0x1511 .curr_len=0 
.body_len=2618038)  res=(.fl=0x1535 .curr_len=0 .body_len=0) - 
h1c=0x7f503b22a6e0(0x8100) conn=0x7f504b22c3e0(0x00040300) 
h1s=0x7f503b252d70(0x00015040) sd=0x7f503b22b5e0(0x0101c001) 
sc=0x7f503b229720(0x00040003)
<0>2023-11-25T14:56:33.651139+00:00 [05|h1|0|mux_h1.c:3169] txn done but data 
waiting to be sent, set error on h1c : [B,RUN] [MSG_DONE, MSG_DONE] - 
req=(.fl=0x1511 .curr_len=0 .body_len=2618038)  res=(.fl=0x1535 
.curr_len=0 .body_len=0) - h1c=0x7f503b22a6e0(0x8100) 
conn=0x7f504b22c3e0(0x00040300) h1s=0x7f503b252d70(0x00015040) 
sd=0x7f503b22b5e0(0x0101c001) sc=0x7f503b229720(0x00040003)
<0>2023-11-25T14:56:33.807314+00:00 [05|h1|0|mux_h1.c:3129] processing error : 
[B,RUN] [MSG_DONE, MSG_RPBEFORE] - req=(.fl=0x1511 .curr_len=0 
.body_len=2697148)  res=(.fl=0x1404 .curr_len=0 .body_len=0) - 
h1c=0x7f503b22cdb0(0x8000) conn=0x7f5033249120(0x0300) 
h1s=0x7f503b241ae0(0x00015040) sd=0x7f503b238d50(0x05010001) 
sc=0x7f503b238a30(0x0401)
<0>2023-11-25T14:56:34.125475+00:00 [05|h1|0|mux_h1.c:3129] processing error : 
[B,RUN] [MSG_DONE, MSG_RPBEFORE] - req=(.fl=0x1511 .curr_len=0 
.body_len=2777462)  res=(.fl=0x1404 .curr_len=0 .body_len=0) - 
h1c=0x7f503b22ab00(0x8000) conn=0x7f506725c350(0x0300) 
h1s=0x7f503b22a090(0x00015040) sd=0x7f503b232800(0x05010001) 
sc=0x7f503b229640(0x0401)
<0>2023-11-25T14:56:59.867896+00:00 [05|h1|0|mux_h1.c:3129] processing error : 
[B,RUN] [MSG_DONE, MSG_RPBEFORE] - req=(.fl=0x1511 .curr_len=0 
.body_len=31809438)  res=(.fl=0x1404 .curr_len=0 .body_len=0) - 
h1c=0x7f503b240ec0(0x8000) conn=0x7f5043c78da0(0x0300) 
h1s=0x7f503b2571a0(0x00015040) sd=0x7f503b25c4d0(0x05010001) 
sc=0x7f503b25c470(0x0401)
Does that help you to debug further?
In case you need any additional information, feel free to ping me.

Thanks a lot
Marno



On Friday, 24 November 2023 at 22:17:20 CET, Christopher Faulet wrote:

On 20/11/2023 at 20:23, Marno Krahmer wrote:
> Hello,
> 
> since a while I see connection errors in my HAProxy-Logs, looking like this:
> 
> <134>Nov 20 13:19:10 haproxy[8]: :60923 [20/Nov/2023:13:18:41.494] 
> http~ nextcloud/nextcloud 0/0/18/-1/28956 500 208 - - IH-- 19/19/0/0/0 0/0 
> {} "PUT 
> https:///remote.php/dav/uploads//5D56BCEB-AE7E-423A-B424-DCAB3F98C590/3
>  HTTP/2.0"
> 
> According to the documentation, a termination state of "I" should never 
> happen 
> and be reported together with logs.
> 
> Now my Problem is: I don't have any more logs, besides that one line being 
> logged.
> Therefore my question: What can I do to get furthe

How to debug "IH" termination state on HTTP connections?

2023-11-20 Thread Marno Krahmer
Hello,
for a while now I have been seeing connection errors in my HAProxy logs that look like this:

<134>Nov 20 13:19:10 haproxy[8]: :60923 [20/Nov/2023:13:18:41.494] 
http~ nextcloud/nextcloud 0/0/18/-1/28956 500 208 - - IH-- 19/19/0/0/0 0/0 
{} "PUT 
https:///remote.php/dav/uploads//5D56BCEB-AE7E-423A-B424-DCAB3F98C590/3
 HTTP/2.0"

According to the documentation, a termination state of "I" should never happen 
and, if it does, should be reported together with the logs.

Now my problem is: I don't have any more logs besides that one line.
Therefore my question: What can I do to get further information about when/why 
this occurs?
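
For readers who want to pick such lines apart, here is a small sketch that extracts the timers, status code and termination state from an `option httplog` line like the one above. The regular expression and function name are my own illustration, not an official parser, and it assumes the default HTTP log format with the five timers TR/Tw/Tc/Tr/Ta. In the HAProxy documentation the first letter "I" stands for an internal error found during a self-check, and the second letter "H" means the proxy was waiting for complete response headers from the server, which fits the Tr value of -1 here.

```python
import re

# Matches "<timers> <status> <bytes> <req cookie> <resp cookie> <term state> "
# as found in the default HTTP log format, e.g.
#   0/0/18/-1/28956 500 208 - - IH-- 19/19/0/0/0 0/0
HTTPLOG_RE = re.compile(
    r"(?P<timers>-?\d+(?:/-?\d+){4}) "
    r"(?P<status>-?\d+) (?P<bytes>\+?\d+) "
    r"\S+ \S+ (?P<term>\S{4}) "
)

TIMER_NAMES = ("TR", "Tw", "Tc", "Tr", "Ta")


def parse_httplog(line):
    """Return timers, status and termination state, or None if no match."""
    match = HTTPLOG_RE.search(line)
    if match is None:
        return None
    values = [int(v) for v in match.group("timers").split("/")]
    return {
        "timers": dict(zip(TIMER_NAMES, values)),
        "status": int(match.group("status")),
        "termination_state": match.group("term"),
    }
```

Filtering a log file with `parse_httplog(line)["termination_state"].startswith("I")` would then collect every affected request for a bug report.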


To give more background information: I am running HAProxy 2.9-dev10-db09cd6 
(the docker image "haproxytech/haproxy-ubuntu-quic:2.9").
As you can see from the log line, this error occurs on an HTTP/2 SSL 
connection (even though I have enabled HTTP/3).
The backend is a "NextCloud" instance. So far, I have only observed those "IH" 
errors when uploading photos via the smartphone application. I am not able to 
reproduce them on purpose, but once one happens, there is a chance that 
retrying the request will produce the same error again.
"In front" of Nextcloud there actually is an Apache2 web server (which ships 
with the Nextcloud docker container).
I was able to find the request in the Apache logs:
 -  [20/Nov/2023:13:19:39 +] "PUT 
/remote.php/dav/uploads//5D56BCEB-AE7E-423A-B424-DCAB3F98C590/3 
HTTP/1.1" 204 656 "-" "Mozilla/5.0 (iOS) Nextcloud-iOS/4.9.1"
(Don't be surprised that the timestamps don't match perfectly. Apparently the 
clocks on the two machines are not in sync.)
(Also, this request returned a 204 because it was already a retry 
from the client. The initial request got a 201 response, but caused the 
same IH error.)
I tried restarting HAProxy multiple times, but every now and then it happens 
again.
I remember that I also had this issue with older 2.9 builds, but I don't 
remember whether it also happened on 2.8 builds.

If helpful to you, this is my haproxy -vv:


HAProxy version 2.9-dev10-db09cd6 2023/11/18 - https://haproxy.org/
Status: development branch - not safe for use in production.
Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open
Running on: Linux 6.4.9-arch1-1 #1 SMP PREEMPT_DYNAMIC Tue, 08 Aug 2023 22:14:05 + x86_64
Build options :
  TARGET  = linux-glibc
  CPU     = generic
  CC      = cc
  CFLAGS  = -O2 -g -Wall -Wextra -Wundef -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wno-string-plus-int -Wno-atomic-alignment
  OPTIONS = USE_PTHREAD_EMULATION=1 USE_LINUX_TPROXY=1 USE_GETADDRINFO=1 USE_OPENSSL=1 USE_LUA=1 USE_SLZ=1 USE_TFO=1 USE_QUIC=1 USE_PROMEX=1 USE_PCRE2=1 USE_PCRE2_JIT=1
  DEBUG   = -DDEBUG_STRICT -DDEBUG_MEMORY_POOLS

Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE -LIBATOMIC +LIBCRYPT +LINUX_CAP +LINUX_SPLICE +LINUX_TPROXY +LUA +MATH -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL -OPENSSL_AWSLC -OPENSSL_WOLFSSL -OT -PCRE +PCRE2 +PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL +PROMEX +PTHREAD_EMULATION +QUIC -QUIC_OPENSSL_COMPAT +RT +SHM_OPEN +SLZ +SSL -STATIC_PCRE -STATIC_PCRE2 -SYSTEMD +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL -ZLIB

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_TGROUPS=16, MAX_THREADS=256, default=8).
Built with OpenSSL version : OpenSSL 3.1.2+quic 1 Aug 2023
Running on OpenSSL version : OpenSSL 3.1.2+quic 1 Aug 2023
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
OpenSSL providers loaded : default
Built with Lua version : Lua 5.4.4
Built with the Prometheus exporter as a service
Built with network namespace support.
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE2 version : 10.39 2021-10-29
PCRE2 library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with gcc compiler version 11.4.0

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as  cannot be specified using 'proto' keyword)
       quic : mode=HTTP  side=FE     mux=QUIC  flags=HTX|NO_UPG|FRAMED
         h2 : mode=HTTP  side=FE|BE  mux=H2    flags=HTX|HOL_RISK|NO_UPG
       fcgi : mode=HTTP  side=BE     mux=FCGI  

Help wanted: Random delays in https request processing

2023-05-30 Thread Marno Krahmer
Hey,

I noticed that I am experiencing a strange issue with HTTPS requests (both on 
HTTP/1.1 and HTTP/2):

It seems that roughly 1 in 500 to 1 in 1000 requests gets delayed by around 60 
to 90 seconds between the client and HAProxy.
All other requests work fine and are blazingly fast.

What the client application logs:
17:59:20.488006 Starting REST request
17:59:40.492774 REST request: PUT 
https://mydomain.com:443/super_fancy_url/_doc/1313409683%3A58505246%3A2023-05-30T15%3A59%3A20Z?refresh=false
 returned 0 and took 1.01ms(name_lookup_time: 0.09ms, connect_time: 0.09ms)

This error happens because the client does not receive an HTTP response within 
the configured timeout of 20 seconds.

When looking through the HAProxy logs, I find a log line for this request, but 
for whatever reason, the time logged there does not match the request time:

May 30 18:00:40 haproxy[458192]: 10.152.40.11:42054 [30/May/2023:18:00:40.964] 
HTTP_MYDOMAIN~ HTTP_MYDOMAIN/super-server.mydomain.com 0/0/0/9/9 201 494 - - 
 32852/116/0/0/0 0/0 "PUT 
/super_fancy_url/_doc/1313409683%3A58505246%3A2023-05-30T15%3A59%3A20Z?refresh=false
  HTTP/1.1"

I double-checked the system clocks on the client and the HAProxy node and can 
confirm that they are in sync.

As the traffic is SSL encrypted, I don’t think I can do a useful tcpdump here.

Interestingly, the application claims to be able to connect to HAProxy on the 
TCP level in less than a millisecond. (That may well be true, as we have 
sub-millisecond latency in our network.) So it really seems to be the 
HTTP request that is delayed. Yet HAProxy reports that the connection to and 
the response from the backend server took only 9 ms.
To me, that does not explain an exceeded timeout of more than 20 seconds.

When looking at examples (same source and destination) where the request was 
not delayed or timing out, the times logged by HAProxy were correct too.
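
To put a number on the mismatch, here is a small sketch that computes the gap between the client-side start time and the timestamp in brackets in the HAProxy log line. It assumes that both logs use the same day and the same local time zone, which is my reading of the two excerpts above; the function name and the `day` parameter are just for illustration.

```python
from datetime import datetime

HAPROXY_FMT = "%d/%b/%Y:%H:%M:%S.%f"


def gap_seconds(client_start, haproxy_date, day="30/May/2023"):
    """Seconds between the client's request start and HAProxy's logged date.

    client_start is a bare time such as "17:59:20.488006"; the day is
    supplied separately because the client log does not print it.
    """
    start = datetime.strptime(day + ":" + client_start, HAPROXY_FMT)
    seen = datetime.strptime(haproxy_date, HAPROXY_FMT)
    return (seen - start).total_seconds()


gap = gap_seconds("17:59:20.488006", "30/May/2023:18:00:40.964")
print("request showed up in HAProxy %.3f s after the client started it" % gap)
```

For the example above this comes out at roughly 80 seconds, which is in the 60 to 90 second range mentioned earlier and far beyond the 9 ms that HAProxy itself accounts for.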

The hopefully important bits from my config:

global
  # Send logs to syslog
  log 10.12.244.22:1514 local0 notice
  log 10.12.244.22:1514 local1 info

  maxconn 10
  ulimit-n 655360
  nbthread 64

  external-check
  spread-checks 10
  insecure-fork-wanted

  master-worker

  # Hard stop old workers after some time to prevent https://github.com/haproxy/haproxy/issues/1701
  hard-stop-after 576h # 24 days

  stats socket /var/run/haproxy/admin.sock mode 666 expose-fd listeners level admin
  stats socket /var/run/haproxy/haproxy.sock expose-fd listeners mode 666
  stats socket /var/run/haproxy.sock expose-fd listeners mode 666 level admin
  stats timeout 30s
  user haproxy
  group haproxy
  daemon

  # Generated by https://ssl-config.mozilla.org/#server=haproxy=1.8.8=intermediate
  ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
  ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets

  ssl-default-server-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
  ssl-default-server-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets

  tune.ssl.default-dh-param 2048
  tune.bufsize 524288

defaults
  log global
  option  dontlognull
  option  redispatch
  option  log-health-checks
  timeout connect 1s
  timeout client  600s
  timeout server  600s
  timeout client-fin 5s
  timeout server-fin 5s
  timeout check   250
  errorfile 400 /etc/haproxy/errors/400.http
  errorfile 403 /etc/haproxy/errors/403.http
  errorfile 408 /etc/haproxy/errors/408.http
  errorfile 500 /etc/haproxy/errors/500.http
  errorfile 502 /etc/haproxy/errors/502.http
  errorfile 503 /etc/haproxy/errors/503.http
  errorfile 504 /etc/haproxy/errors/504.http
  default-server maxconn 10
  option allbackups

listen HTTP_MYDOMAIN
  mode http

  bind 10.200.4.198:80
  bind 10.200.4.198:443 ssl crt /etc/ssl/haproxy/
  option httpchk GET /_cluster/health?local=true
  option  httplog
  option forwardfor header X-Real-Ip

  http-request replace-value Upgrade (.*) websocket # https://bishopfox.com/blog/h2c-smuggling-request
  http-request del-header X-Forwarded-For
  http-request set-header X-Forwarded-Port 443 if { ssl_fc }
  http-request set-header X-Forwarded-Port 80 if !{ ssl_fc }
  http-request del-header X-Forwarded-Proto
  http-request set-header X-Forwarded-Proto https if { ssl_fc }
  http-request set-header X-Forwarded-Proto http if !{ ssl_fc }


  timeout check 1500ms
  default-server inter 2s

  http-response set-header LB-FQDN "haproxy-1.mydomain.com"
  http-response set-header LB-Backend-Server %s
  # https://serverfault.com/questions/650588/haproxy-timing-connection-diagram
  http-response set-header LB-Times "connect: %Th ms; queue: %Tw ms; 
be-connect: %Tc ms; 

Understanding "show sess" output, especially "age" and "exp"

2022-08-12 Thread Marno Krahmer

Hey,
 (I already sent this mail to this mailing list a few days ago, but could not 
see it in the list archive at 
https://www.mail-archive.com/haproxy@formilux.org/maillist.html, so I assume, 
it never arrived. Therefore I am sending it again from a different mail address 
now):

I have to reload my HAproxy instance quite often, which brings me the problem, 
that I have a lot of old processes hanging around for a long time (until all 
TCP connections are closed).

While debugging which TCP frontend connections are held open for so long, I 
stumbled upon some stuff in “show sess”, that I don’t understand.

 

I am debugging worker number 2282927, which according to “show proc” on the 
HAProxy master became “old” around 8 days ago:

root@haproxy:~# echo "show proc" | socat - UNIX-CONNECT:/var/run/haproxy-master.sock
#
2547034 master  291 [failed: 0] 14d23h05m41s    2.5.5-1ppa1~focal
# workers
71486   worker  0   0d00h18m56s 2.5.5-1ppa1~focal
# old workers
45664   worker  1   0d00h21m50s 2.5.5-1ppa1~focal
4123437 worker  2   0d00h34m45s 2.5.5-1ppa1~focal
4119866 worker  3   0d00h35m06s 2.5.5-1ppa1~focal
2789145 worker  6   0d03h00m59s 2.5.5-1ppa1~focal
2810275 worker  9   0d18h18m56s 2.5.5-1ppa1~focal
2747490 worker  11  0d18h26m00s 2.5.5-1ppa1~focal
2739822 worker  13  0d18h26m48s 2.5.5-1ppa1~focal
1909810 worker  20  0d19h56m47s 2.5.5-1ppa1~focal
920217  worker  21  0d21h45m59s 2.5.5-1ppa1~focal
892916  worker  22  0d21h49m00s 2.5.5-1ppa1~focal
888287  worker  23  0d21h49m32s 2.5.5-1ppa1~focal
883627  worker  24  0d21h50m07s 2.5.5-1ppa1~focal
321016  worker  25  0d22h51m50s 2.5.5-1ppa1~focal
161560  worker  26  0d23h09m16s 2.5.5-1ppa1~focal
3308892 worker  29  1d01h06m47s 2.5.5-1ppa1~focal
2601505 worker  30  1d18h15m59s 2.5.5-1ppa1~focal
1011329 worker  32  1d21h15m59s 2.5.5-1ppa1~focal
215685  worker  40  1d22h45m59s 2.5.5-1ppa1~focal
4011660 worker  44  1d23h30m59s 2.5.5-1ppa1~focal
2010216 worker  62  2d03h15m59s 2.5.5-1ppa1~focal
2515249 worker  69  4d17h30m59s 2.5.5-1ppa1~focal
2089818 worker  97  5d18h01m00s 2.5.5-1ppa1~focal
894259  worker  101 5d20h16m00s 2.5.5-1ppa1~focal
115485  worker  131 6d21h26m16s 2.5.5-1ppa1~focal
2575664 worker  163 8d16h10m28s 2.5.5-1ppa1~focal
2565887 worker  165 8d16h11m26s 2.5.5-1ppa1~focal
2282927 worker  166 8d17h11m59s 2.5.5-1ppa1~focal

 

By executing “echo "@2282927 show sess" | socat - 
UNIX-CONNECT:/var/run/haproxy-master.sock” I am fetching all sessions that are 
still active on that process.
Among others, I stumbled upon and investigated session “0x55d36e82cd70”.



0x55d36e82cd70: proto=tcpv4 src=10.12.64.162:919 fe=TCP_NFS-CLUSTER 
be=TCP_NFS-CLUSTER srv=myhost.example.com ts=00 epoch=0 age=13m23s calls=32 
rate=0 cpu=0 lat=0 rq[f=848000h,i=0,an=00h,rx=26s,wx=,ax=] 
rp[f=80048202h,i=0,an=00h,rx=26s,wx=,ax=] s0=[8,28h,fd=1027,ex=] 
s1=[8,200018h,fd=1211,ex=] exp=26s

 

What confuses me there is “age=13m23s”, which I interpret as: this connection 
was opened 13 minutes ago.
In case that info is important: this is a listen section configured in mode 
“TCP”. (The config is attached at the bottom of the mail.)
First question: Is that assumption correct?
Second question: Why would a worker that should not have received any new 
connections for 8 days have accepted a new connection 13 minutes ago?
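
To compare such values programmatically, here is a rough sketch that pulls the top-level key=value fields out of a "show sess" line and converts durations like "13m23s" or "8d17h11m59s" into seconds. Both helpers are my own illustration and only handle the d/h/m/s units that appear in the output above; they say nothing about how HAProxy computes "age" internally.

```python
import re

UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def parse_duration(text):
    """Convert a duration such as '8d17h11m59s' into seconds."""
    parts = re.findall(r"(\d+)([dhms])", text)
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError("unsupported duration: %r" % text)
    return sum(int(n) * UNIT_SECONDS[u] for n, u in parts)


def parse_sess_fields(line):
    """Return the key=value fields that precede the bracketed rq[...] group."""
    head = line.split(" rq[", 1)[0]
    return dict(tok.split("=", 1) for tok in head.split() if "=" in tok)


line = ("0x55d36e82cd70: proto=tcpv4 src=10.12.64.162:919 fe=TCP_NFS-CLUSTER "
        "be=TCP_NFS-CLUSTER srv=myhost.example.com ts=00 epoch=0 age=13m23s "
        "calls=32 rate=0 cpu=0 lat=0 rq[f=848000h,i=0,an=00h,rx=26s,wx=,ax=]")
session_age = parse_duration(parse_sess_fields(line)["age"])
worker_uptime = parse_duration("8d17h11m59s")
print(session_age, worker_uptime)
```

Running this over every session of an old worker would make it easy to list the ones whose age is much smaller than the time since the worker was replaced.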

As this connection seems to be close to expire anyways (“exp=26s”), I continued 
monitoring it in a while loop:

root@haproxy:~# while true; do date; echo $(echo "@2282927 show sess" | socat - 
UNIX-CONNECT:/var/run/haproxy-master.sock | grep 0x55d36e82cd70); sleep 1; done

Wed 10 Aug 2022 10:58:15 AM CEST

0x55d36e82cd70: proto=tcpv4 src=10.12.64.162:919 fe=TCP_NFS-CLUSTER 
be=TCP_NFS-CLUSTER srv=myhost.example.com ts=00 epoch=0 age=13m23s calls=32 
rate=0 cpu=0 lat=0 rq[f=848000h,i=0,an=00h,rx=26s,wx=,ax=] 
rp[f=80048202h,i=0,an=00h,rx=26s,wx=,ax=] s0=[8,28h,fd=1027,ex=] 
s1=[8,200018h,fd=1211,ex=] exp=26s

Wed 10 Aug 2022 10:58:16 AM CEST

0x55d36e82cd70: proto=tcpv4 

Re: [EXT] FTP Server in passive mode with HAProxy Frontend and Backend nodes

2022-04-15 Thread Marno Krahmer


Hey Roberto,

Yes, there is a misconfiguration in both config snippets that you sent:

frontend Frontend_FTP

   bind *:21
   bind *:2-20010
   mode tcp
   option tcplog
   timeout client 1h
   default_backend HAProxy_BE

backend HAProxy_BE

mode tcp
server HAProxy-Node-2 172.17.17.1:21 check port 21

In your frontend, you are accepting connections on Port 21 and 2-20010

But in your backends, you forward all connections to Port 21, even the data 
connections.
I don’t know if you can configure HAProxy in a way to dynamically use the same 
port to the backend, that was used in the frontend.
But I am not aware of such a feature.

You could explicitly create listeners for all the data ports you use and 
forward each one to the same port on the backend.

It would not be a beautiful config, but it would work.
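
Since writing one listener per port by hand is tedious, here is a sketch of a generator for such a config. The function name, the section naming scheme and the example port numbers are assumptions for illustration only; the health check always targets the control port, as in your snippets. It may also be worth testing a server line without any port: the HAProxy documentation for the `server` address says that when the port is left unset, the same port the client connected to is used.

```python
def ftp_passthrough_config(name, target_ip, control_port, data_ports):
    """Build one 'listen' section per port, each forwarding to the same port."""
    lines = []
    for port in [control_port, *data_ports]:
        lines += [
            f"listen {name}_{port}",
            f"   bind *:{port}",
            "   mode tcp",
            "   option tcplog",
            "   timeout client 1h",
            # Health check goes to the control port for every listener.
            f"   server {name}_{port} {target_ip}:{port} check port {control_port}",
            "",
        ]
    return "\n".join(lines)


# Hypothetical passive port range, chosen only for this example.
print(ftp_passthrough_config("FTP", "172.17.17.1", 21, range(50000, 50003)))
```

The generated text can be pasted into haproxy.cfg or written to a separate file that is loaded with an additional `-f` argument.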

Cheers
Marno

On 15.04.2022 at 02:39, Roberto Carna wrote:


Dear all, I have to set up an FTP server (FileZilla) in my backend 
network, like this:

Internet -- Firewall -- HAProxy Frontend -- HAProxy Backend -- FTP server 
(passive mode)

This is my configuration in my HAProxy FE:

frontend Frontend_FTP

   bind *:21
   bind *:2-20010
   mode tcp
   option tcplog
   timeout client 1h
   default_backend HAProxy_BE

backend HAProxy_BE

mode tcp
server HAProxy-Node-2 172.17.17.1:21 check port 21

This is my configuration in my HAProxy BE:

frontend Backend_FTP

   bind *:21
   bind *:2-20010
   mode tcp
   option tcplog
   timeout client 1h
   default_backend FTP_Server

backend FTP_Server

mode tcp
server HOST-FTP 10.12.1.4:21 check port 21

The FTP control session works OK, but the data session fails.

Is there any error in the HAProxy configuration files from Frontend and Backend?

Special thanks, regards!!!






Re: [EXT] Re: [EXT] Re: Regarding the new dark mode dashboard in 2.5

2022-03-08 Thread Marno Krahmer
Hey Tim,

I added a reference to the GitHub issue to the second line of the commit 
message.

Cheers
Marno
 

On 08.03.22, 14:42, "Tim Düsterhus" wrote:

Marno,

On 3/8/22 14:38, Marno Krahmer wrote:
> Is it enough to send the patch to this mailing list?
> 

Yes, and the patch looks good to me. Just one thing: Please reference 
the issue ID in the commit message like this:

This fixes GitHub issue #1461.

Adding Willy to Cc.

Best regards
Tim Düsterhus



0001-MINOR-stats-Add-dark-mode-support-for-socket-rows.patch
Description: 0001-MINOR-stats-Add-dark-mode-support-for-socket-rows.patch


Re: [EXT] Re: Regarding the new dark mode dashboard in 2.5

2022-03-08 Thread Marno Krahmer
Hey,

I added a patch to the GitHub issue that explicitly sets a color for this case.

I just took the same color as for backend servers that don't have a health 
check assigned.
This definitely improves the situation and makes the information readable again.

It would be nice to have the patch applied and Ciprian on GitHub already asked 
for a backport to 2.5.

Is it enough to send the patch to this mailing list?

Thanks a lot
Marno


On 08.03.22, 13:30, "Tim Düsterhus" wrote:

Ciprian,

On 3/8/22 12:57, Ciprian Craciun wrote:
> I've forgotten the screenshot... :)

see: https://github.com/haproxy/haproxy/issues/1461

Best regards
Tim Düsterhus


From 74bc376bb290d50b5fd140a8f8f8d87f59899322 Mon Sep 17 00:00:00 2001
From: Marno Krahmer 
Date: Tue, 8 Mar 2022 13:45:09 +0100
Subject: [PATCH] MINOR: stats: Add dark mode support for socket rows

In commit e9ed63e548 dark mode support was added to the stats page. The
initial commit does not include dark mode color overwrites for the
.socket CSS class. This commit colors socket rows the same way as
backends that are active but do not have a health check defined.

This fixes an issue where reading information from socket lines became
really hard in dark mode due to suboptimal coloring of the cell
background and the font in it.
---
 src/stats.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/stats.c b/src/stats.c
index d15ce094d..f044588d4 100644
--- a/src/stats.c
+++ b/src/stats.c
@@ -3332,6 +3332,7 @@ static void stats_dump_html_head(struct appctx *appctx, struct uri_auth *uri)
  " .hr { border-color: #8c8273; }\n"
  " .titre { background-color: #1aa6a6; color: #e8e6e3; }\n"
  " .frontend {background: #2f3437;}\n"
+ " .socket {background: #2a2d2f;}\n"
  " .backend {background: #2f3437;}\n"
  " .active_down {background: #76;}\n"
  " .active_going_up {background: #b99200;}\n"
-- 
2.32.0 (Apple Git-132)



[PATCH] MEDIUM: stats: include disabled proxies that hold active sessions to stats

2021-06-28 Thread Marno Krahmer
Hello,

I would like to contribute a patch to HAProxy.
This patch is supposed to change how stats are handled for disabled proxies.

Prior to this patch, when outputting stats information, disabled proxies were 
ignored / skipped.
This was an issue with old processes after a reload of HAProxy.

It caused “the old process”, that was still holding active sessions, to not 
report stats any more.
This made it impossible for any monitoring solution to figure out, how many 
currently active sessions exist.

While this issue might barely be noticeable when using HAProxy for 
HTTP-Traffic, for long-running TCP-Sessions, this can become an issue.

This patch will now not only check, if a proxy is disabled, but also if it 
still holds active sessions. And as long as it does, it will still output 
statistics.
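
As a quick illustration of the new condition, here is the decision written as a small Python model. This is only a sketch of the logic described above, not HAProxy code: the function name is made up and the two capability flag values are illustrative.

```python
PR_CAP_FE = 0x0001  # illustrative flag values for this sketch
PR_CAP_BE = 0x0002


def should_dump_stats(disabled, served, uuid, cap):
    """Model of the patched check: keep disabled proxies that still serve sessions."""
    still_relevant = (not disabled) or served > 0
    networked = bool(cap & (PR_CAP_FE | PR_CAP_BE))
    return still_relevant and uuid > 0 and networked


# An old process after a reload: proxy disabled, but 3 sessions still active.
print(should_dump_stats(disabled=True, served=3, uuid=5, cap=PR_CAP_FE | PR_CAP_BE))
```

With the previous behaviour the first term was simply "not disabled", so the example above would have been skipped.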

Initially I opened the following Issue on GitHub: 
https://github.com/haproxy/haproxy/issues/1307

The patch:

From 0648fc0c148fe463ea9f0c77f34beeb484688eac Mon Sep 17 00:00:00 2001
From: Marno Krahmer 
Date: Thu, 24 Jun 2021 16:51:08 +0200
Subject: [PATCH] MEDIUM: stats: include disabled proxies that hold active
sessions to stats

After reloading HAProxy, the old process may still hold active sessions.
Currently there is no way to gather information, how many sessions such
a process still holds. This patch will not exclude disabled proxies from
stats output when they hold at least one active session. This will allow
sending `!@ show stat` through a master socket to the disabled
process and have it returning its stats data.
---
src/stats.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/stats.c b/src/stats.c
index 3458924b7..a5620577a 100644
--- a/src/stats.c
+++ b/src/stats.c
@@ -3575,8 +3575,11 @@ static int stats_dump_proxies(struct stream_interface *si,
  }

  px = appctx->ctx.stats.obj1;
-  /* skip the disabled proxies, global frontend and non-networked ones */
-  if (!px->disabled && px->uuid > 0 && (px->cap & (PR_CAP_FE | PR_CAP_BE))) {
+ /* skip the global frontend proxies and non-networked ones
+ * also skip disabled proxies unless they are still holding active sessions.
+ * This change allows retrieving stats from "old" proxies after a reload.
+ */
+ if ((!px->disabled || px->served > 0) && px->uuid > 0 && (px->cap & (PR_CAP_FE | PR_CAP_BE))) {
  if (stats_dump_proxy_to_buffer(si, htx, px, uri) == 0)
  return 0;
  }
--
2.17.0


Thanks a lot
Marno