Hello John,

Thank you for your answer.

The client and server timeouts are set to 28801s in the frontend and backend sections, which should override the values set in the defaults section. 28801s is one second more than the wait_timeout and interactive_timeout set in MariaDB. Moreover, the retries I can see in the application log already appear after about 20 minutes.

On 20/10/2022 at 20:52, John Lauro wrote:
That's what, 50s?  You are probably doing pooling and it's using LRU instead of actually cycling through connections.  At least that is what I have seen node typically do.

Instead of 50 seconds, try:
    timeout client          12h
    timeout server          12h

You might want to enable logging on haproxy and general logging on maria.  If you see what I have seen in the past, you will notice that most of the SQL requests come through one connection, then the next highest number through a second, and so on until you get to a connection that is mostly idle.
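
A minimal sketch of the haproxy side of that suggestion, assuming the db3_front frontend from the configuration quoted below. It replaces "no log" with "log global", adds "option tcplog", and negates the "option dontlog-normal" inherited from the defaults section so that sessions which end normally are logged too. With tcplog, each session line carries a termination state: for example "sD" means the server-side timeout expired during the data phase, and "SD" means the connection to the server was aborted during the data phase. This is an illustration, not a tested configuration.

```
frontend db3_front
   bind 127.0.1.1:3306
   mode tcp
   timeout client 28801s
   maxconn 10000
   # replaces "no log": send logs to the targets declared in the global section
   log global
   # TCP log format: timers, byte counts and the termination state
   option tcplog
   # defaults sets "option dontlog-normal"; negate it here so that
   # normally terminated sessions are logged as well
   no option dontlog-normal
   default_backend db3_back
```

Comparing the per-connection byte counts and durations in these log lines with the general log on the MariaDB side should show whether a few pooled connections carry most of the traffic while the others sit idle.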
------------------------------------------------------------------------
*From:* Artur <[email protected]>
*Sent:* Tuesday, October 18, 2022 5:15 AM
*To:* haproxy <[email protected]>
*Subject:* TCP connections resets during backend transfers
Hello,

While renewing node.js servers and a galera cluster (mariadb), I'm
seeing unexpected behaviour on TCP connections between the node.js
application and mariadb.
There are a lot of connection resets during transfers on the backend side.

My previous (working) setup was based on Debian 10, mariadb 10.5,
node.js 16 (and some dependencies) and haproxy 2.6.
I had a server running several node.js processes and a 3-node galera
mariadb cluster.
To provide some HA, I configured haproxy as a TCP proxy for mariadb
connections.
The usual setup is :
node.js -> haproxy -> mariadb
node.js application uses a connection pool to maintain several open
connections to database server that may be idle for a long time.
The timeouts are adjusted in haproxy to avoid disconnecting idle
connections.
This setup worked just fine on old servers.

Then I set up new servers on Debian 11: a new mariadb galera cluster
(10.6), a new node.js application server (no real changes in node.js
software versions there) and haproxy (2.6.6 currently).
The global setup of all of this is quite the same as before but not
exactly the same. I tried however to be as close as possible to the old
setup.
Now, once I started the node.js application, the database connections
are established and after about 20 minutes I start to see application
warnings about lost connections to database.
On the haproxy stats page I can see a lot of 'connection resets during
transfers' on the backend side.
On database side I can see idle processes that stay there even if I
close node.js application or restart haproxy. These have to timeout or
be killed to disappear. As if there was no communication any more
between haproxy and mariadb (on these tcp connections).
At the same moment other database connections are established or
continue to function. Maybe something related to idle connections ?

If it may help : all these servers are VMs in OVH public cloud and
communications between servers are established through a private vlan in
the same datacenter.

If I remove haproxy from workflow (node.js -> mariadb) I cannot see any
error anymore. But I don't understand why it worked fine before and is
working this way right now...
Any help is welcome.

My current haproxy setup :

global
   log /dev/log  local0
   log /dev/log  local1 notice
   chroot /var/lib/haproxy
   stats socket /run/haproxy/admin.sock mode 660 level admin
   stats timeout 30s
   user haproxy
   group haproxy
   daemon

   # Default SSL material locations
   ca-base /etc/ssl/certs
   crt-base /etc/ssl/private

   # See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
   ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
   ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
   ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets

   ssl-dh-param-file /etc/haproxy/ssl/dhparams.pem
   tune.ssl.default-dh-param 2048

   maxconn 50000

   #nosplice

defaults
   log global
   option dontlognull
   option dontlog-normal
   timeout connect 5000
   timeout client  50000
   timeout server  50000

   #option tcpka

   errorfile 400 /etc/haproxy/errors/400.http
   errorfile 403 /etc/haproxy/errors/403.http
   errorfile 408 /etc/haproxy/errors/408.http
   errorfile 500 /etc/haproxy/errors/500.http
   errorfile 502 /etc/haproxy/errors/502.http
   errorfile 503 /etc/haproxy/errors/503.http
   errorfile 504 /etc/haproxy/errors/504.http

   option splice-auto
   option splice-request
   option splice-response

frontend db3_front
   bind 127.0.1.1:3306
   mode tcp
   # haproxy client connection timeout is 1 second longer than the
   # default mariadb wait_timeout which is 28800 seconds
   # this avoids haproxy to close an idle connection with no reason
   timeout client 28801s
   maxconn 10000
   no log
   default_backend db3_back

backend db3_back
   mode tcp
   # haproxy server connection timeout is 1 second longer than the
   # default mariadb wait_timeout which is 28800 seconds
   # this avoids haproxy to close an idle connection with no reason
   timeout server 28801s
   option mysql-check user hacheck post-41
   fullconn 10000
   timeout check 10s
   server db3sbg5 10.140.154.94:3306 maxconn 10000 check on-marked-down shutdown-sessions
   server db3de1  10.140.3.131:3306  maxconn 10000 backup check on-marked-down shutdown-sessions
   server db3gra5 10.140.103.12:3306 maxconn 10000 backup check on-marked-down shutdown-sessions

[similar redis proxy config removed]

listen stats
   bind *:443 interface ens3 ssl crt /etc/haproxy/ssl/server.pem alpn h2,http/1.1
   mode http
   no log
   maxconn 100
   stats enable
   stats uri /...
   stats refresh 5s
   stats show-legends
   stats show-node
   stats admin if TRUE
   ...

I tried some modifications in haproxy config (nosplice or tcpka) but
errors are still there.
I also tried previous haproxy versions (2.6.5, 2.6.4) but that didn't
solve the problem.
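
One point that may be worth checking about the tcpka attempt: "option tcpka" only enables SO_KEEPALIVE on the sockets, and the probe timing then comes from the operating system. On Linux the default net.ipv4.tcp_keepalive_time is 7200 seconds, so the first probe would be sent long after the ~20 minutes at which the errors start. The sketch below sets shorter keepalive timings in haproxy itself for the server side. It assumes that something on the path between haproxy and mariadb drops idle flows after roughly 20 minutes, which is not confirmed anywhere in this thread, and the values 300s, 30s and 3 are arbitrary illustrations.

```
backend db3_back
   mode tcp
   timeout server 28801s
   # enable TCP keepalive on connections from haproxy to mariadb
   option srvtcpka
   # idle time before the first keepalive probe (illustrative value)
   srvtcpka-idle 300s
   # interval between two unanswered probes (illustrative value)
   srvtcpka-intvl 30s
   # unanswered probes before the connection is considered dead
   srvtcpka-cnt 3
   # ... existing "option mysql-check" and "server" lines unchanged ...
```

The same timings can also be set system-wide through the net.ipv4.tcp_keepalive_* sysctls, in which case "option tcpka" alone would pick them up.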

--
Best regards,
Artur



