hey,
first, use "option mysql-check" for better service checking. you'll
have to add a user with access on the database side; the how-to is in
the configuration.txt file
(https://www.haproxy.org/download/2.1/doc/configuration.txt). "option
httpchk" is doing nothing for you because the backend isn't talking
HTTP; the mode is tcp, for mysql.
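a minimal sketch of what that looks like -- the user name
"haproxy_check" here is just a placeholder, pick your own:

```
# mysql side (run once, substitute your haproxy address):
#   CREATE USER 'haproxy_check'@'<haproxy ip>';
backend main_writer
    mode tcp
    # health check logs in as the given user instead of sending HTTP
    option mysql-check user haproxy_check
    server db1 <ip>:3306 check inter 10000
```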
second, look into the proxy protocol: HAProxy can send the client IP
in a header at the start of the TCP connection, similar to the
X-Forwarded-For header in HTTP. you need to add a line like:
proxy-protocol-networks=::1, localhost, <ip.add.re.ss/nm>
to the my.cnf or mariadb-server.cnf file. replace the ip with a
network in cidr notation, without the angle brackets, to specify the
client ranges that are allowed to send the proxy protocol. then add
"send-proxy-v2" to the server line in the HAProxy backend. mine is:
server mariadb1 192.168.88.1:3306 check inter 10000 send-proxy-v2
this will help you better identify the client that is losing the connection.
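for placement, a my.cnf sketch -- the section name depends on your
distro, and the 192.168.88.0/24 network below is just an example
range, use the one your HAProxy box actually sits in:

```
# mariadb-server.cnf / my.cnf
[mariadb]
# networks allowed to send the proxy protocol header
proxy-protocol-networks=::1, localhost, 192.168.88.0/24
```

restart the database after changing it; connections from networks not
listed here that send the header will be rejected.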
if there is a firewall between the client and HAProxy, look at the logs
there. the firewall could be reaping long-running connections if it
hits a threshold, gets busy, or has a policy update pushed to it.
something in between could be the issue.
hope this helps,
brendan
On 8/1/23 8:11 PM, David Greenwald wrote:
Hi all,
Looking for some help with a networking issue we've been debugging for
several days. We use haproxy to TCP load-balance between Kafka
Connectors and a Percona MySQL cluster. In this set-up, the connectors
(i.e., Java JDBC) maintain long-running connections and read the
database binlogs. This includes regular polling.
We have seen connection issues starting with haproxy 2.4 and
persisting through 2.8 which result in the following errors in MySQL:
2023-07-31T17:25:45.745607Z 3364649 [Note] Got an error reading
communication packets
As you can see, this doesn't include a host or user and is happening
early in the connection. The host cache shows handshake errors here
regularly accumulating.
We were unable to see errors on the haproxy side with tcplog on and
have been unable to get useful information from tcpdump, netstat, etc.
We are aware FE/BE connection closure behavior changed in 2.4. The 2.4
option of idle-close-on-response seemed like a possible solution but
isn't compatible with mode tcp, so we're not sure what's happening
here or next steps for debugging. Appreciate any help or guidance here.
We're running haproxy in Kubernetes using the official container, and
are also not seeing any issues with current haproxy versions with our
other (Python) applications.
A simplified version of our config:
global
    daemon
    maxconn 25000

defaults
    balance roundrobin
    option dontlognull
    option redispatch
    timeout http-request 5s
    timeout queue 1m
    timeout connect 4s
    timeout client 50s
    timeout server 30s
    timeout http-keep-alive 10s
    timeout check 10s
    retries 3

frontend main_writer
    bind :3306
    mode tcp
    timeout client 30s
    timeout client-fin 30s
    default_backend main_writer

backend main_writer
    mode tcp
    balance leastconn
    option httpchk
    timeout server 30s
    timeout tunnel 3h
    server db1 <ip>:3306 check port 9200 on-marked-down shutdown-sessions weight 100 inter 3s rise 1 fall 2
    server db2 <ip>:3306 check port 9200 on-marked-down shutdown-sessions weight 100 backup
    server db3 <ip>:3306 check port 9200 on-marked-down shutdown-sessions weight 100 backup
David Greenwald
Senior Site Reliability Engineer
david.greenw...@discogsinc.com