Thanks Bryan,
The problem I'm having is isolated to the first second of the
connection, not the end.
Here is a summary of the TCP traffic. Hopefully it makes the example
clearer.
*client connects to haproxy: (all good)*
38.057127 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [S], seq
2113072542, win 43690, options [mss 65495,sackOK,TS val 82055529 ecr
0,nop,wscale 7], length 0
38.057156 IP 127.0.0.1.9011 > 127.0.0.1.39888: Flags [S.], seq
3284611992, ack 2113072543, win 43690, options [mss 65495,sackOK,TS val
82055529 ecr 82055529,nop,wscale 7], length 0
38.057178 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [.], ack 1, win
342, options [nop,nop,TS val 82055529 ecr 82055529], length 0
*haproxy starts connecting to server (SYN) (good)*
38.057295 IP 10.10.10.10.34289 > 99.99.99.99.8000: Flags [S], seq
333335567, win 29200, options [mss 1460,sackOK,TS val 82055529 ecr
0,nop,wscale 7], length 0
*client sends 198 bytes to initiate the SSL connection and haproxy ACKs
them (good)*
38.060539 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [P.], seq 1:199,
ack 1, win 342, options [nop,nop,TS val 82055530 ecr 82055529], length 198
38.060598 IP 127.0.0.1.9011 > 127.0.0.1.39888: Flags [.], ack 199, win
350, options [nop,nop,TS val 82055530 ecr 82055530], length 0
*haproxy finishes connecting to the server (SYNACK/ACK) (good)*
38.120527 IP 99.99.99.99.8000 > 10.10.10.10.34289: Flags [S.], seq
4125907118, ack 333335568, win 28960, options [mss 1460,sackOK,TS val
662461622 ecr 82055529,nop,wscale 8], length 0
38.120619 IP 10.10.10.10.34289 > 99.99.99.99.8000: Flags [.], ack 1,
win 229, options [nop,nop,TS val 82055545 ecr 662461622], length 0
*And now there is nothing for 5 seconds, when we should have seen a data
packet from haproxy to the server carrying the 198-byte payload.*
*It appears that haproxy never tried to send the data.*
*The server then disconnects 5 seconds into the connection because it
received no data (that is how the server is supposed to behave).*
43.183207 IP 99.99.99.99.8000 > 10.10.10.10.34289: Flags [F.], seq 1,
ack 1, win 114, options [nop,nop,TS val 662466683 ecr 82055545], length 0
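
A rough sketch of how the same check could be automated for other
captures. It is only an illustration: it assumes each tcpdump record has
been rejoined onto a single line, and the names RECORD, to_seconds and
payload_gap are made up for this example. It measures the time between
the proxy's handshake ACK and the server's FIN, and counts the payload
bytes the proxy sent to the server in between.

```python
import re

# One tcpdump record per string: the server-side packets from the summary
# above, with the mail-wrapped lines rejoined.
CAPTURE = [
    "38.057295 IP 10.10.10.10.34289 > 99.99.99.99.8000: Flags [S], seq 333335567, win 29200, options [mss 1460,sackOK,TS val 82055529 ecr 0,nop,wscale 7], length 0",
    "38.120527 IP 99.99.99.99.8000 > 10.10.10.10.34289: Flags [S.], seq 4125907118, ack 333335568, win 28960, options [mss 1460,sackOK,TS val 662461622 ecr 82055529,nop,wscale 8], length 0",
    "38.120619 IP 10.10.10.10.34289 > 99.99.99.99.8000: Flags [.], ack 1, win 229, options [nop,nop,TS val 82055545 ecr 662461622], length 0",
    "43.183207 IP 99.99.99.99.8000 > 10.10.10.10.34289: Flags [F.], seq 1, ack 1, win 114, options [nop,nop,TS val 662466683 ecr 82055545], length 0",
]

# timestamp, source, destination, flags, payload length
RECORD = re.compile(r"^(\d[\d:.]*) IP (\S+) > (\S+): Flags \[([^\]]+)\].*length (\d+)$")


def to_seconds(ts):
    """Convert 'SS.ffffff' or 'HH:MM:SS.ffffff' to seconds."""
    total = 0.0
    for part in ts.split(":"):
        total = total * 60 + float(part)
    return total


def payload_gap(lines, proxy, server):
    """Return (seconds from handshake ACK to server FIN, bytes proxy sent before that FIN).

    The first element is None if either event is missing from the capture.
    """
    handshake_done = None
    server_fin = None
    sent = 0
    for line in lines:
        m = RECORD.match(line)
        if not m:
            continue  # skip anything that is not a complete record
        ts, src, dst, flags, length = m.groups()
        t = to_seconds(ts)
        if src == proxy and dst == server:
            # First bare ACK from the proxy completes the three-way handshake.
            if flags == "." and handshake_done is None:
                handshake_done = t
            # Only count payload sent before the server gave up.
            if server_fin is None:
                sent += int(length)
        elif src == server and dst == proxy:
            if "F" in flags and server_fin is None:
                server_fin = t
    if handshake_done is None or server_fin is None:
        return None, sent
    return server_fin - handshake_done, sent


gap, sent = payload_gap(CAPTURE, "10.10.10.10.34289", "99.99.99.99.8000")
print(f"{gap:.3f}s idle, {sent} payload bytes forwarded")
# prints "5.063s idle, 0 payload bytes forwarded"
```

A result with a multi-second gap and zero bytes forwarded matches the
failure described above; a healthy connection should show the 198-byte
payload going out right after the handshake.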
On Mon, Apr 10, 2017 at 7:58 PM, Bryan Talbot <[email protected]>
wrote:
>
> On Apr 8, 2017, at 2:24 PM, Lincoln Stern <[email protected]>
> wrote:
>
> I'm not sure how to interpret this, but it appears that haproxy is dropping
> client payload intermittently (1/100). I have included tcpdumps and logs
> to show what is happening.
>
> Am I doing something wrong? I have no idea what could be causing this or
> how to go about debugging it. I cannot reproduce it, but I do observe it
> in production ~2 times a day across 20 instances and 2K connections.
>
> Any help or advice would be greatly appreciated.
>
>
>
>
> You’re in TCP mode with 60 second timeouts. So, if the connection is idle
> for that long then the proxy will disconnect. If you need idle connections
> to stick around longer and mix http and tcp traffic then you probably want
> to set “timeout tunnel” to however long you’re willing to let idle tcp
> connections sit around and not impact http timeouts. If you only need
> long-lived tcp “tunnel” connections, then you can instead just increase
> both your “timeout client” and “timeout server” timeouts to cover your
> requirements.
>
> -Bryan
>
>
>
> What I'm trying to accomplish is to provide high availability over two
> routes (i.e. internet providers). One acts as primary, with a "static-rr"
> weight of 256, and the other acts as backup, with a weight of 1. The backup
> should only be used if the primary fails.
>
>
> log:
> Apr 4 18:55:27 app055 haproxy[13666]: 127.0.0.1:42262
> [04/Apr/2017:18:54:41.585] ws-local servers/server1 1/86/45978 4503 5873 --
> 0/0/0/0/0 0/0
> Apr 4 22:46:37 app055 haproxy[13666]: 127.0.0.1:47130
> [04/Apr/2017:22:46:36.931] ws-local servers/server1 1/62/663 7979 517 --
> 0/0/0/0/0 0/0
> Apr 4 22:46:38 app055 haproxy[13666]: 127.0.0.1:32931
> [04/Apr/2017:22:46:37.698] ws-local servers/server1 1/55/405 3062 553 --
> 1/1/1/1/0 0/0
> Apr 4 22:46:43 app055 haproxy[13666]: 127.0.0.1:41748
> [04/Apr/2017:22:46:43.190] ws-local servers/server1 1/115/452 7979 517 --
> 2/2/2/2/0 0/0
> Apr 4 22:46:46 app055 haproxy[13666]: 127.0.0.1:57226
> [04/Apr/2017:22:46:43.576] ws-local servers/server1 1/76/3066 2921 538 --
> 1/1/1/1/0 0/0
> Apr 4 22:46:47 app055 haproxy[13666]: 127.0.0.1:39656
> [04/Apr/2017:22:46:47.072] ws-local servers/server1 1/67/460 8254 528 --
> 1/1/1/1/0 0/0
> Apr 4 22:47:38 app055 haproxy[13666]: 127.0.0.1:39888
> [04/Apr/2017:22:46:38.057] ws-local servers/server1 1/63/60001 0 0 cD
> 0/0/0/0/0 0/0
> Apr 5 08:44:55 app055 haproxy[13666]: 127.0.0.1:42650
> [05/Apr/2017:08:44:05.529] ws-local servers/server1 1/53/49645 4364 4113 --
> 0/0/0/0/0 0/0
>
>
> tcpdump:
> 22:46:38.057127 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [S], seq
> 2113072542, win 43690, options [mss 65495,sackOK,TS val 82055529 ecr
> 0,nop,wscale 7], length 0
> 22:46:38.057156 IP 127.0.0.1.9011 > 127.0.0.1.39888: Flags [S.], seq
> 3284611992, ack 2113072543, win 43690, options [mss 65495,sackOK,TS val
> 82055529 ecr 82055529,nop,wscale 7], length 0
> 22:46:38.057178 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [.], ack 1, win
> 342, options [nop,nop,TS val 82055529 ecr 82055529], length 0
> 22:46:38.057295 IP 10.10.10.10.34289 > 99.99.99.99.8000: Flags [S], seq
> 333335567, win 29200, options [mss 1460,sackOK,TS val 82055529 ecr
> 0,nop,wscale 7], length 0
> 22:46:38.060539 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [P.], seq
> 1:199, ack 1, win 342, options [nop,nop,TS val 82055530 ecr 82055529],
> length 198
> 22:46:38.060598 IP 127.0.0.1.9011 > 127.0.0.1.39888: Flags [.], ack 199,
> win 350, options [nop,nop,TS val 82055530 ecr 82055530], length 0
> ... client payload acked ...
> 22:46:38.120527 IP 99.99.99.99.8000 > 10.10.10.10.34289: Flags [S.], seq
> 4125907118, ack 333335568, win 28960, options [mss 1460,sackOK,TS val
> 662461622 ecr 82055529,nop,wscale 8], length 0
> 22:46:38.120619 IP 10.10.10.10.34289 > 99.99.99.99.8000: Flags [.], ack 1,
> win 229, options [nop,nop,TS val 82055545 ecr 662461622], length 0
> ... idle timeout by server 5 seconds later...
> 22:46:43.183207 IP 99.99.99.99.8000 > 10.10.10.10.34289: Flags [F.], seq
> 1, ack 1, win 114, options [nop,nop,TS val 662466683 ecr 82055545], length 0
> 22:46:43.183387 IP 127.0.0.1.9011 > 127.0.0.1.39888: Flags [F.], seq 1,
> ack 199, win 350, options [nop,nop,TS val 82056810 ecr 82055530], length 0
> 22:46:43.184011 IP 10.10.10.10.34289 > 99.99.99.99.8000: Flags [.], ack 2,
> win 229, options [nop,nop,TS val 82056811 ecr 662466683], length 0
> 22:46:43.184025 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [.], ack 2, win
> 342, options [nop,nop,TS val 82056811 ecr 82056810], length 0
> 22:46:43.184715 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [P.], seq
> 199:206, ack 2, win 342, options [nop,nop,TS val 82056811 ecr 82056810],
> length 7
> 22:46:43.184795 IP 127.0.0.1.9011 > 127.0.0.1.39888: Flags [.], ack 206,
> win 350, options [nop,nop,TS val 82056811 ecr 82056811], length 0
> 22:46:43.184849 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [F.], seq 206,
> ack 2, win 342, options [nop,nop,TS val 82056811 ecr 82056811], length 0
> 22:46:43.184877 IP 127.0.0.1.9011 > 127.0.0.1.39888: Flags [.], ack 207,
> win 350, options [nop,nop,TS val 82056811 ecr 82056811], length 0
> 22:47:38.058683 IP 10.10.10.10.34289 > 99.99.99.99.8000: Flags [F.], seq
> 1, ack 2, win 229, options [nop,nop,TS val 82070529 ecr 662466683], length 0
> 22:47:38.116336 IP 99.99.99.99.8000 > 10.10.10.10.34289: Flags [R], seq
> 4125907120, win 0, length 0
>
>
> config:
> global
>     daemon
>     maxconn 10
>     log /dev/log local0
>     stats socket /dev/shm/haproxy.sock mode 666 level admin
>
> defaults
>     log global
>     option tcplog
>     log-format "%ci:%cp [%t] %ft %b/%s %Tw/%Tc/%Tt %B %U %ts %ac/%fc/%bc/%sc/%rc %sq/%bq"
>     option log-health-checks
>     option redispatch
>     mode tcp
>     retries 3
>     timeout check 900ms
>     timeout connect 500ms
>     timeout queue 2s
>     timeout client 60000ms
>     timeout server 60000ms
>
> resolvers mydns1
>     nameserver dns2 8.8.4.4:53
>     resolve_retries 50000
>     timeout retry 5s
>     hold other 30s
>     hold refused 30s
>     hold nx 30s
>     hold timeout 30s
>     hold valid 10s
>
> resolvers mydns2
>     nameserver dns3 172.31.0.254:53
>     resolve_retries 1000
>     timeout retry 10s
>     hold other 30s
>     hold refused 30s
>     hold nx 30s
>     hold timeout 30s
>     hold valid 10s
>
> frontend ws-local
>     bind *:9011
>     default_backend servers
>
> backend servers
>     balance static-rr
>     default-server rise 1 inter 1h fastinter 10s downinter 10s error-limit 1
>     server server1 ssl.somedomain.com:8000 init-addr 127.0.0.1 check observe layer4 weight 256 resolvers mydns1
>     server server2 ssl.somedomain.com:8000 init-addr 127.0.0.1 check observe layer4 weight 1 resolvers mydns2 source 172.31.0.1
>
>
>
> $ haproxy -vv
> HA-Proxy version 1.7.3-1ppa1~trusty 2017/03/01
> Copyright 2000-2017 Willy Tarreau <[email protected]>
>
> Build options :
> TARGET = linux2628
> CPU = generic
> CC = gcc
> CFLAGS = -g -O2 -fPIE -fstack-protector --param=ssp-buffer-size=4
> -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2
> OPTIONS = USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1
> USE_NS=1
>
> Default settings :
> maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
>
> Encrypted password support via crypt(3): yes
> Built with zlib version : 1.2.8
> Running on zlib version : 1.2.8
> Compression algorithms supported : identity("identity"),
> deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
> Built with OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
> Running on OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
> OpenSSL library supports TLS extensions : yes
> OpenSSL library supports SNI : yes
> OpenSSL library supports prefer-server-ciphers : yes
> Built with PCRE version : 8.31 2012-07-06
> Running on PCRE version : 8.31 2012-07-06
> PCRE library supports JIT : no (USE_PCRE_JIT not set)
> Built with Lua version : Lua 5.3.1
> Built with transparent proxy support using: IP_TRANSPARENT
> IPV6_TRANSPARENT IP_FREEBIND
> Built with network namespace support
>
> Available polling systems :
> epoll : pref=300, test result OK
> poll : pref=200, test result OK
> select : pref=150, test result OK
> Total: 3 (3 usable), will use epoll.
>
> Available filters :
> [COMP] compression
> [TRACE] trace
> [SPOE] spoe
>
>
> --
> lfs
>
>
>