On Thu, Sep 13, 2012 at 5:07 AM, Adrian C. <[email protected]> wrote:
> Hello, I've spent a few days looking into failed connections to mysql,
> redis and sphinx frontends done by a PHP webapplication. Respective PHP
> extension will always report the same thing "read error on connection".
>
> Network operations looked into the network gear and swear can't find any
> problems in their words so my last resort is asking on the mailing list.
>
>
> I've experimented with tcpka, and timeouts, but it's always the same
> outcome, so the configuration I will include is just the 10th iteration,
> from bare bones to defining everything to match server side timeouts.
>
> Since configuration is pretty big I will have to paste it at the end,
> and paste the error logs I obtained with tcplog and log-separate-errors
> first, which I could use a hand interpreting... every one of the failed
> connection has one retry even though retries is set to 3, and every
> session state is always "--".
>
> I appreciate any help, thanks.
>
>
> -------
> Sep 12 21:27:58 haproxy: 10.20.10.110:34681 [12/Sep/2012:21:27:50.221]
> site_db_videos_read cluster_db_videos_read/server057 0/8011/8049 841 --
> 752/7/7/1/1 0/0
> Sep 12 21:28:50 haproxy: 10.20.10.110:36783 [12/Sep/2012:21:28:42.708]
> site_redis_ro cluster_redis_ro/server033 0/8011/8011 0 -- 781/0/0/0/1
> 0/0
> Sep 12 21:29:06 haproxy: 10.20.10.122:45845 [12/Sep/2012:21:28:58.508]
> site_redis_ro cluster_redis_ro/server030 0/8011/8011 0 -- 744/0/0/0/1
> 0/0
> Sep 12 21:31:19 haproxy: 10.20.10.114:56521 [12/Sep/2012:21:31:11.897]
> site_redis_ro cluster_redis_ro/server031 0/8006/8007 0 -- 832/0/0/0/1
> 0/0
> Sep 12 21:33:00 haproxy: 10.20.10.114:33322 [12/Sep/2012:21:32:52.532]
> site_redis_ro cluster_redis_ro/server030 0/8012/8012 0 -- 832/0/0/0/1
> 0/0
> Sep 12 21:33:24 haproxy: 10.20.10.118:36628 [12/Sep/2012:21:33:16.045]
> site_redis_ro cluster_redis_ro/server033 0/8010/8011 0 -- 869/1/1/0/1
> 0/0
> Sep 12 21:33:59 haproxy: 10.20.10.114:36730 [12/Sep/2012:21:33:51.286]
> site_db_videos_read cluster_db_videos_read/server061 0/8010/8059 4254 --
> 823/13/13/1/1 0/0
> Sep 12 21:34:12 haproxy: 10.20.10.122:37885 [12/Sep/2012:21:34:04.138]
> site_db_videos_read cluster_db_videos_read/server059 0/8006/8049 841 --
> 860/6/6/1/1 0/0
> Sep 12 21:34:26 haproxy: 10.20.10.118:63468 [12/Sep/2012:21:34:18.852]
> site_db_videos_read cluster_db_videos_read/server057 0/8011/8022 3608 --
> 847/10/10/1/1 0/0
> Sep 12 21:34:29 haproxy: 10.20.10.110:41509 [12/Sep/2012:21:34:20.994]
> site_redis_ro cluster_redis_ro/server031 0/8014/8014 0 -- 815/1/1/0/1
> 0/0
> Sep 12 21:34:37 haproxy: 10.20.10.122:49418 [12/Sep/2012:21:34:29.168]
> site_db_videos_read cluster_db_videos_read/server061 0/8008/8062 3819 --
> 802/18/18/4/1 0/0
> Sep 12 21:35:00 haproxy: 10.20.10.114:55224 [12/Sep/2012:21:34:52.434]
> site_redis_ro cluster_redis_ro/server032 0/8013/8013 0 -- 819/0/0/0/1
> 0/0
> Sep 12 21:35:18 haproxy: 10.20.10.110:63133 [12/Sep/2012:21:35:10.644]
> site_redis_ro cluster_redis_ro/server033 0/8012/8012 0 -- 778/0/0/0/1
> 0/0
> Sep 12 21:39:48 haproxy: 10.20.10.122:62621 [12/Sep/2012:21:39:40.346]
> site_redis_ro cluster_redis_ro/server033 0/8006/8007 0 -- 809/0/0/0/1
> 0/0
> Sep 12 21:40:17 haproxy: 10.20.10.114:60697 [12/Sep/2012:21:40:09.083]
> site_redis_ro cluster_redis_ro/server031 0/8004/8005 0 -- 835/0/0/0/1
> 0/0
> Sep 12 21:43:05 haproxy: 10.20.10.114:45013 [12/Sep/2012:21:42:57.080]
> site_db_videos_read cluster_db_videos_read/server062 0/8007/8047 844 --
> 788/5/5/0/1 0/0
> Sep 12 21:43:58 haproxy: 10.20.10.118:58582 [12/Sep/2012:21:43:50.075]
> site_redis_ro cluster_redis_ro/server032 0/8005/8005 0 -- 898/2/2/0/1
> 0/0
> Sep 12 21:44:05 haproxy: 10.20.10.114:62927 [12/Sep/2012:21:43:57.680]
> site_redis_ro cluster_redis_ro/server031 0/8015/8015 0 -- 831/0/0/0/1
> 0/0
> -------------
>
>
>
> haproxy.conf
> ------------
>
> global
>  log 127.0.0.1 local3 err
>  uid 65530
>  gid 65530
>  ulimit-n 262144
>  maxconn  100000
>  daemon
>
> defaults
>  log     global
>  mode    http
>  retries 3
>  option  redispatch
>  backlog 65536
>  timeout http-request 9s
>  timeout connect 5000ms
>  timeout client 50000ms
>  timeout server 50000ms
>
> backend cluster_redis_ro
>  mode tcp
>  option srvtcpka
>  balance leastconn
>  timeout connect 8s
>  # Match redis default server timeout
>  timeout server 2m
>  server server030 10.20.10.178:6379 weight 1 maxconn 140 check port 6379
> inter 5s fastinter 2s rise 5 fall 5
>  server server031 10.20.10.130:6379 weight 1 maxconn 140 check port 6379
> inter 5s fastinter 2s rise 5 fall 5
>  server server032 10.20.1.78:6379  weight 1 maxconn 140 check port 6379
> inter 5s fastinter 2s rise 5 fall 5
>  server server033 10.20.1.22:6379  weight 1 maxconn 140 check port 6379
> inter 5s fastinter 2s rise 5 fall 5
>
> backend cluster_db_videos_read
>  mode tcp
>  option srvtcpka
>  option mysql-check user haproxy
>  timeout connect 10s
>  # Allow long mysql queries
>  timeout server 20m
>  balance leastconn
>  server server057 10.20.10.146:3306 weight 10 maxconn 62 check port 3306
> inter 6s fastinter 2s rise 5 fall 5
>  server server059 10.20.10.150:3306 weight 10 maxconn 62 check port 3306
> inter 6s fastinter 2s rise 5 fall 5
>  server server061 10.20.10.154:3306 weight 10 maxconn 62 check port 3306
> inter 6s fastinter 2s rise 5 fall 5
>  server server052 10.20.10.194:3306 weight 10 maxconn 62 check port 3306
> inter 6s fastinter 2s rise 5 fall 5
>
>
> frontend site_redis_ro 10.20.1.49:6379
>  maxconn 1024
>  mode tcp
>  option clitcpka
>  option tcplog
>  option log-separate-errors
>  # Match redis default server timeout
>  timeout client 2m
>  default_backend cluster_redis_ro
>
> frontend site_db_videos_read 10.20.1.49:3306
>  mode tcp
>  option clitcpka
>  option tcplog
>  option log-separate-errors
>  maxconn 512
>  # Allow long mysql queries and match server timeout
>  timeout client 20m
>  default_backend cluster_db_videos_read
> ----------------
>
>
> --
> Adrian C. (anrxc) | anrxc..sysphere.org | PGP ID: D20A0618
> PGP FP: 02A5 628A D8EE 2A93 996E  929F D5CB 31B7 D20A 0618
>


Hi,

It looks like your servers reached a limit on their side, or were not
available when HAProxy tried to connect to them, so HAProxy had to
wait for one "timeout connect" before retrying the connection.
To me, your timeouts are set too long.
Redis: "timeout connect" is currently 8s; it should be at most 4s.
"timeout server" is set to 2m; it should be at most a few seconds, since
Redis is supposed to be very fast!
The same applies to MySQL...
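
If it helps with reading the quoted log lines, here is a rough sketch of
how the timers could be pulled out with a script. It assumes the standard
"option tcplog" field layout (Tw/Tc/Tt, then bytes, session state and the
connection counters ending in the retry count); the names TCPLOG_RE,
parse_tcplog and EXAMPLE are mine, not anything HAProxy provides.

```python
import re

# Field layout of an "option tcplog" line, after any syslog prefix:
#   client [accept_date] frontend backend/server Tw/Tc/Tt bytes state
#   actconn/feconn/beconn/srv_conn/retries srv_queue/backend_queue
TCPLOG_RE = re.compile(
    r"(?P<client>\S+) \[(?P<accept_date>[^\]]+)\] (?P<frontend>\S+) "
    r"(?P<backend>[^/\s]+)/(?P<server>\S+) "
    r"(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tt>\+?\d+) (?P<bytes_read>\+?\d+) "
    r"(?P<state>\S{2}) "
    r"(?P<actconn>\d+)/(?P<feconn>\d+)/(?P<beconn>\d+)/(?P<srv_conn>\d+)/"
    r"(?P<retries>\+?\d+) (?P<srv_queue>\d+)/(?P<backend_queue>\d+)"
)

INT_FIELDS = (
    "tw", "tc", "tt", "bytes_read", "actconn", "feconn", "beconn",
    "srv_conn", "retries", "srv_queue", "backend_queue",
)


def parse_tcplog(line):
    """Return the fields of one tcplog line as a dict, or None if no match."""
    match = TCPLOG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for name in INT_FIELDS:
        fields[name] = int(fields[name])
    return fields


# One of the quoted log entries, re-joined onto a single line.
EXAMPLE = (
    "Sep 12 21:28:50 haproxy: 10.20.10.110:36783 "
    "[12/Sep/2012:21:28:42.708] site_redis_ro "
    "cluster_redis_ro/server033 0/8011/8011 0 -- 781/0/0/0/1 0/0"
)

entry = parse_tcplog(EXAMPLE)
# Tc is the time spent establishing the server connection, in milliseconds.
print(entry["server"], entry["tc"], entry["retries"])  # server033 8011 1
```

For that entry, Tc is 8011 ms with one retry and zero bytes returned,
which is the pattern I am describing above.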


Well, wait, can you tell us your sysctl values for
net.ipv4.tcp_tw_reuse and net.ipv4.ip_local_port_range?


I'm asking because on MySQL the connection also waits for 8s, even
though "timeout connect" is 10s on this backend...
It does not look like a TCP retransmit, since that would be either 3s
or 9s (3 + 6)...
Have you enabled the stats page?
A screenshot could help in this case.
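
To spell out the retransmit arithmetic above, here is a small sketch. It
assumes a 3s initial SYN retransmission timeout that doubles on every
retry, which is where the 3s and 9s figures come from; the function names
and the 0.5s tolerance are made up for illustration.

```python
def syn_retransmit_times(initial_rto=3.0, retransmits=3):
    """Cumulative seconds at which each SYN retransmit is sent.

    Assumes the timeout doubles after every retransmit (3s, then 6s, ...).
    """
    times = []
    elapsed = 0.0
    rto = initial_rto
    for _ in range(retransmits):
        elapsed += rto
        times.append(elapsed)
        rto *= 2
    return times


def looks_like_retransmit(connect_seconds, tolerance=0.5):
    """True if a connect time lands close to one of the retransmit marks."""
    return any(
        abs(connect_seconds - mark) <= tolerance
        for mark in syn_retransmit_times()
    )


print(syn_retransmit_times())        # [3.0, 9.0, 21.0]
print(looks_like_retransmit(8.011))  # False
```

The connect times of about 8s in the logs fall between the 3s and 9s
marks, which is why a plain retransmit does not seem to explain them.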

cheers
