Re: 1.5 dev22 issue on freebsd10-stable
On Wed, Apr 16, 2014 at 01:56:29PM +0800, k simon wrote:

> Hi, Willy,
>
> I'm sorry, but strace only supports i386 on FreeBSD, and I'm working on amd64.

Argh! Without that we'll be fairly limited. You can try with truss but we'll surely miss some information.

On a personal note, I'd say that I consider the support for strace and tcpdump as absolute prerequisite when it comes to any platform going into production, to the point of even reconsidering the platform if it misses them.

Willy
Re: 1.5 dev22 issue on freebsd10-stable
On 16/04/2014 08:39, Willy Tarreau wrote:

> On a personal note, I'd say that I consider the support for strace and tcpdump as absolute prerequisite when it comes to any platform going into production, to the point of even reconsidering the platform if it misses them.
>
> Willy

Well, FreeBSD has dtrace and truss for that, so the same follow-up is possible :)

Regards,
Ghislain.
Re: 1.5 dev22 issue on freebsd10-stable
On 16 April 2014 13:41, Ghislain gad...@aqueos.com wrote:

> On 16/04/2014 08:39, Willy Tarreau wrote:
>
> > On a personal note, I'd say that I consider the support for strace and tcpdump as absolute prerequisite when it comes to any platform going into production, to the point of even reconsidering the platform if it misses them.
> >
> > Willy
>
> Well, FreeBSD has dtrace and truss for that, so the same follow-up is possible :)

ktrace is quite useful too...
Re: 1.5 dev22 issue on freebsd10-stable
On Wed, Apr 16, 2014 at 02:32:03PM +0100, Simon Dick wrote:

> On 16 April 2014 13:41, Ghislain gad...@aqueos.com wrote:
>
> > On 16/04/2014 08:39, Willy Tarreau wrote:
> >
> > > On a personal note, I'd say that I consider the support for strace and tcpdump as absolute prerequisite when it comes to any platform going into production, to the point of even reconsidering the platform if it misses them.
> > >
> > > Willy
> >
> > Well, FreeBSD has dtrace and truss for that, so the same follow-up is possible :)
>
> ktrace is quite useful too...

Sure, but I mean that the level of precision you get with strace is so nice that I'd prefer to run in 32-bit mode to have it than in a blind 64-bit mode.

Willy
Re: 1.5 dev22 issue on freebsd10-stable
On 14-4-16 21:35, Willy Tarreau wrote:

> On Wed, Apr 16, 2014 at 02:32:03PM +0100, Simon Dick wrote:
>
> > On 16 April 2014 13:41, Ghislain gad...@aqueos.com wrote:
> >
> > > On 16/04/2014 08:39, Willy Tarreau wrote:
> > >
> > > > On a personal note, I'd say that I consider the support for strace and tcpdump as absolute prerequisite when it comes to any platform going into production, to the point of even reconsidering the platform if it misses them.
> > > >
> > > > Willy
> > >
> > > Well, FreeBSD has dtrace and truss for that, so the same follow-up is possible :)
> >
> > ktrace is quite useful too...
>
> Sure, but I mean that the level of precision you get with strace is so nice that I'd prefer to run in 32-bit mode to have it than in a blind 64-bit mode.
>
> Willy

OK, I'm not a developer and have never used dtrace or ktrace before. Could some gurus kindly give me some tips on how to use them?

Simon
1.5 dev22 issue on freebsd10-stable
Hi, List,

I have an issue with 1.5-dev22 on FreeBSD 10-STABLE. It reports the errors below: about 2-3 errors per minute when using http-keep-alive, and about 5-8 errors per minute with http-server-close. I tried using source ip:port1-port2 in the server section, but it did not help. Then I stopped it, compiled haproxy 1.4-25 and ran it, and the error messages disappeared. Is it a 1.5 bug?

Regards
Simon

Apr 15 14:56:05 localhost haproxy[17725]: Connect() failed for backend squid3-bulk-keepalive: local address already in use.
Apr 15 14:56:10 localhost haproxy[17725]: Connect() failed for backend squid3-bulk-keepalive: local address already in use.
Apr 15 14:56:12 localhost haproxy[17725]: Connect() failed for backend squid3-bulk-keepalive: local address already in use.
Apr 15 14:56:17 localhost haproxy[17725]: Connect() failed for backend squid3-bulk-keepalive: local address already in use.
Apr 15 14:56:20 localhost haproxy[17725]: Connect() failed for backend squid3-bulk-keepalive: local address already in use.
Apr 15 14:56:24 localhost haproxy[17725]: Connect() failed for backend squid3-bulk-keepalive: local address already in use.
Apr 15 14:56:26 localhost haproxy[17725]: Connect() failed for backend squid3-bulk-keepalive: local address already in use.

net.inet.ip.portrange.lowfirst: 1023
net.inet.ip.portrange.lowlast: 600
net.inet.ip.portrange.first: 12000
net.inet.ip.portrange.last: 65535
net.inet.ip.portrange.hifirst: 12000
net.inet.ip.portrange.hilast: 65535

# sockstat -4 |wc -l
13630

# netstat -an | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a,S[a]}'
LISTEN 3
FIN_WAIT_1 1406
FIN_WAIT_2 41
SYN_SENT 2
LAST_ACK 540
CLOSING 131
CLOSE_WAIT 41
CLOSED 5
SYN_RCVD 53
TIME_WAIT 2183
ESTABLISHED 8557

# haproxy -vv
HA-Proxy version 1.5-dev22-1a34d57 2014/02/03
Copyright 2000-2014 Willy Tarreau w...@1wt.eu

Build options :
  TARGET  = freebsd
  CPU     = generic
  CC      = cc
  CFLAGS  = -pipe -O2 -fno-strict-aliasing -DFREEBSD_PORTS
  OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_OPENSSL=1 USE_PCRE=1 USE_PCRE_JIT=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.8
Compression algorithms supported : identity, deflate, gzip
Built with OpenSSL version : OpenSSL 1.0.1g 7 Apr 2014
Running on OpenSSL version : OpenSSL 1.0.1g 7 Apr 2014
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.34 2013-12-15
PCRE library supports JIT : yes

Available polling systems :
  kqueue : pref=300, test result OK
  poll : pref=200, test result OK
  select : pref=150, test result OK
Total: 3 (3 usable), will use kqueue.
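
For anyone who wants to post-process a saved netstat dump without awk, here is a minimal Python sketch of the same per-state count. The function name `count_tcp_states` and the sample text are illustrative assumptions, not part of the original report; it only assumes that the state is the last whitespace-separated column of each TCP row, as the awk one-liner does.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state from `netstat -an` text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only rows whose first column starts with "tcp" are counted;
        # the state is taken from the last column, like $NF in awk.
        if fields and fields[0].startswith("tcp"):
            states[fields[-1]] += 1
    return states


# Hypothetical sample in the same column layout as the report.
sample = """\
tcp4 0 0 192.168.130.85.16416 192.168.130.33.3004 ESTABLISHED
tcp4 0 0 192.168.130.85.15506 192.168.130.33.3004 ESTABLISHED
tcp4 0 0 114.80.234.108.80 10.100.1.4.2577 TIME_WAIT
udp4 0 0 *.514 *.*
"""

for state, count in count_tcp_states(sample).items():
    print(state, count)
```

Running it on dumps taken under 1.4 and under 1.5 would give two sets of counters that can be compared side by side.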
Re: 1.5 dev22 issue on freebsd10-stable
Hi Simon,

On Tue, Apr 15, 2014 at 04:22:35PM +0800, k simon wrote:

> Hi, List,
>
> I have an issue with 1.5-dev22 on FreeBSD 10-STABLE. It reports the errors below: about 2-3 errors per minute when using http-keep-alive, and about 5-8 errors per minute with http-server-close. I tried using source ip:port1-port2 in the server section, but it did not help. Then I stopped it, compiled haproxy 1.4-25 and ran it, and the error messages disappeared. Is it a 1.5 bug?

I suspect this is caused by the health check bug which doesn't immediately close the connections in raw TCP mode, and which probably marks them in TIME_WAIT state, preventing you from reusing these ports. Please check with latest snapshot if it goes away.

Willy
Re: 1.5 dev22 issue on freebsd10-stable
Hi, Willy,

Do you mean "BUG/MINOR: tcpcheck connect wrong behavior" or "BUG/MEDIUM: checks: immediately report a connection success"?

I have not used tcp-check, only http-check. Does it have the same bug? Also, there are only about 900+ outgoing connections to the server farm; is the TIME_WAIT state really a problem? I have set the portrange from 12000 to 6.

Simon

On 14-4-15 18:15, Willy Tarreau wrote:

> Hi Simon,
>
> On Tue, Apr 15, 2014 at 04:22:35PM +0800, k simon wrote:
>
> > Hi, List,
> >
> > I have an issue with 1.5-dev22 on FreeBSD 10-STABLE. It reports the errors below: about 2-3 errors per minute when using http-keep-alive, and about 5-8 errors per minute with http-server-close. I tried using source ip:port1-port2 in the server section, but it did not help. Then I stopped it, compiled haproxy 1.4-25 and ran it, and the error messages disappeared. Is it a 1.5 bug?
>
> I suspect this is caused by the health check bug which doesn't immediately close the connections in raw TCP mode, and which probably marks them in TIME_WAIT state, preventing you from reusing these ports. Please check with latest snapshot if it goes away.
>
> Willy
Re: 1.5 dev22 issue on freebsd10-stable
On Wed, Apr 16, 2014 at 12:46:40AM +0800, k simon wrote:

> Hi, Willy,
>
> Do you mean "BUG/MINOR: tcpcheck connect wrong behavior" or "BUG/MEDIUM: checks: immediately report a connection success"?

I don't remember, all I can say is that whatever is tagged BUG must be applied.

> I have not used tcp-check, only http-check. Does it have the same bug? Also, there are only about 900+ outgoing connections to the server farm; is the TIME_WAIT state really a problem? I have set the portrange from 12000 to 6.

You must never have timewaits on a client, only on a server. So if on your haproxy box you're seeing timewaits for connections going to the backend servers, there's something wrong. Haproxy deploys great efforts at avoiding them by doing a setsockopt(SO_LINGER) to force the system to close with a reset. If you still get them after upgrading, please run strace on the process so that we find what could be causing them, as it would be abnormal.

Regards,
Willy
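
To illustrate the setsockopt(SO_LINGER) technique mentioned above, here is a minimal sketch in Python rather than HAProxy's C. It is not HAProxy's code, and the helper name `enable_reset_on_close` is hypothetical. It assumes the usual `struct linger` layout of two ints (`l_onoff`, `l_linger`). With lingering enabled and a zero timeout, closing a connected TCP socket aborts the connection with a RST instead of the normal FIN exchange, so the closing side does not leave a TIME_WAIT entry behind.

```python
import socket
import struct


def enable_reset_on_close(sock: socket.socket) -> None:
    """Make a later close() abort the connection instead of lingering."""
    # struct linger { int l_onoff; int l_linger; }
    # l_onoff = 1 enables lingering, l_linger = 0 means a zero timeout.
    linger = struct.pack("ii", 1, 0)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger)


# Usage sketch: set the option on an outgoing socket before closing it.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_reset_on_close(sock)
raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.calcsize("ii"))
l_onoff, l_linger = struct.unpack("ii", raw)
sock.close()
```

A trace of the process should show this call on the backend-side sockets just before they are closed.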
Re: 1.5 dev22 issue on freebsd10-stable
Hi, Willy,

> You must never have timewaits on a client, only on a server. So if on your haproxy box you're seeing timewaits for connections going to the backend servers, there's something wrong. Haproxy deploys great efforts at avoiding them by doing a setsockopt(SO_LINGER) to force the system to close with a reset. If you still get them after upgrading, please run strace on the process so that we find what could be causing them, as it would be abnormal.

It seems that the TIME_WAIT states occur in the client direction.

# netstat -an | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a,S[a]}'
LISTEN 3
FIN_WAIT_1 1558
FIN_WAIT_2 53
SYN_SENT 4
LAST_ACK 780
CLOSING 77
CLOSE_WAIT 52
CLOSED 9
SYN_RCVD 80
TIME_WAIT 7743
ESTABLISHED 7722

My backend interface:

# ifconfig vlan60
vlan60: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=3<RXCSUM,TXCSUM>
        ether 00:1b:21:36:62:1b
        inet 192.168.130.84 netmask 0xff00 broadcast 192.168.130.255
        inet 192.168.130.85 netmask 0x broadcast 192.168.130.85
        inet 192.168.130.86 netmask 0x broadcast 192.168.130.86
        media: Ethernet 1000baseT full-duplex
        status: active
        vlan: 60 parent interface: igb1

# netstat -an |grep 192.168.130 |more
tcp4 0 0 192.168.130.85.16416 192.168.130.33.3004 ESTABLISHED
tcp4 0 0 192.168.130.85.15506 192.168.130.33.3004 ESTABLISHED
tcp4 0 0 192.168.130.85.56697 192.168.130.53.3005 ESTABLISHED
tcp4 0 0 192.168.130.85.19907 192.168.130.34.3005 ESTABLISHED
tcp4 0 0 192.168.130.85.18708 192.168.130.34.3005 ESTABLISHED
tcp4 0 0 192.168.130.85.17137 192.168.130.33.3004 ESTABLISHED
tcp4 0 0 192.168.130.85.17950 192.168.130.33.3004 ESTABLISHED
tcp4 0 0 192.168.130.85.19640 192.168.130.34.3005 ESTABLISHED
tcp4 0 0 192.168.130.85.41590 192.168.130.52.3003 ESTABLISHED
tcp4 0 0 192.168.130.85.22277 192.168.130.35.3006 ESTABLISHED
tcp4 0 0 192.168.130.85.36508 192.168.130.52.3002 ESTABLISHED
tcp4 0 0 192.168.130.85.12990 192.168.130.32.3003 ESTABLISHED
tcp4 0 0 192.168.130.85.26643 192.168.130.40.3003 ESTABLISHED
tcp4 0 0 192.168.130.85.51775 192.168.130.53.3004 ESTABLISHED
tcp4 0 0 192.168.130.85.44149 192.168.130.50.3002 ESTABLISHED
tcp4 0 0 192.168.130.85.57427 192.168.130.53.3006 ESTABLISHED
tcp4 0 0 192.168.130.85.42355 192.168.130.50.3002 ESTABLISHED
tcp4 0 0 192.168.130.85.21283 192.168.130.35.3006 ESTABLISHED
tcp4 0 0 192.168.130.85.24548 192.168.130.40.3003 ESTABLISHED
tcp4 0 0 192.168.130.85.23880 192.168.130.35.3006 ESTABLISHED
tcp4 0 0 192.168.130.85.31224 192.168.130.54.3005 ESTABLISHED
tcp4 0 0 192.168.130.85.13662 192.168.130.32.3003 ESTABLISHED

# netstat -an |grep TIME_WAIT |more
tcp4 0 0 114.80.234.108.80 10.100.1.4.2577 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.2.4149 TIME_WAIT
tcp4 0 0 114.80.234.72.80 10.100.1.3.2331 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.4.2576 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.2.4148 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.4.2575 TIME_WAIT
tcp4 0 0 114.80.234.72.80 10.100.1.3.2330 TIME_WAIT
tcp4 0 0 114.80.234.73.80 10.100.1.2.38769 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.2.4147 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.4.2574 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.2.4146 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.4.2573 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.2.4145 TIME_WAIT
tcp4 0 0 114.80.234.72.80 10.100.1.3.2329 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.2.4144 TIME_WAIT
tcp4 0 0 114.80.234.73.80 10.100.1.2.38768 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.4.2572 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.2.4143 TIME_WAIT
tcp4 0 0 114.80.234.72.80 10.100.1.3.2328 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.2.4142 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.4.2571 TIME_WAIT
tcp4 0 0 114.112.66.220.80 221.234.47.81.53770 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.2.4141 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.4.2570 TIME_WAIT
tcp4 0 0 114.80.234.72.80 10.100.1.3.2327 TIME_WAIT
tcp4 0 0 114.80.234.73.80 10.100.1.2.38767 TIME_WAIT
tcp4 0
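
One way to check on which side the TIME_WAIT sockets sit is to group them by local endpoint. The following is a rough Python sketch under stated assumptions: the helper names are hypothetical, the input is FreeBSD-style `netstat -an` text with six whitespace-separated columns (protocol, Recv-Q, Send-Q, local, foreign, state), and endpoints are written as `address.port`. The sample rows are taken from the listings above.

```python
from collections import Counter
from typing import Tuple


def split_endpoint(endpoint: str) -> Tuple[str, str]:
    """Split a BSD-style 'a.b.c.d.port' endpoint into (address, port)."""
    address, _, port = endpoint.rpartition(".")
    return address, port


def time_wait_by_local_endpoint(netstat_output: str) -> Counter:
    """Count TIME_WAIT sockets per local (address, port) pair."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip anything that is not a complete six-column TCP row.
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] == "TIME_WAIT":
            counts[split_endpoint(fields[3])] += 1
    return counts


sample = """\
tcp4 0 0 192.168.130.85.16416 192.168.130.33.3004 ESTABLISHED
tcp4 0 0 114.80.234.108.80 10.100.1.4.2577 TIME_WAIT
tcp4 0 0 114.80.234.108.80 10.100.1.2.4149 TIME_WAIT
tcp4 0 0 114.80.234.72.80 10.100.1.3.2331 TIME_WAIT
"""

for (address, port), count in time_wait_by_local_endpoint(sample).items():
    print(address, port, count)
```

If every key has a frontend address with port 80 as the local endpoint and none uses a 192.168.130.x source address, the TIME_WAIT entries belong to client-facing connections rather than to connections towards the backend servers.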
Re: 1.5 dev22 issue on freebsd10-stable
Hi Simon,

On Wed, Apr 16, 2014 at 10:25:46AM +0800, k simon wrote:

> Hi, Willy,
>
> > You must never have timewaits on a client, only on a server. So if on your haproxy box you're seeing timewaits for connections going to the backend servers, there's something wrong. Haproxy deploys great efforts at avoiding them by doing a setsockopt(SO_LINGER) to force the system to close with a reset. If you still get them after upgrading, please run strace on the process so that we find what could be causing them, as it would be abnormal.
>
> It seems that the TIME_WAIT states occur in the client direction.
>
> # netstat -an | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a,S[a]}'
> LISTEN 3
> FIN_WAIT_1 1558
> FIN_WAIT_2 53
> SYN_SENT 4
> LAST_ACK 780
> CLOSING 77
> CLOSE_WAIT 52
> CLOSED 9
> SYN_RCVD 80
> TIME_WAIT 7743
> ESTABLISHED 7722

The numbers are not very high. I'm surprised that you have so many FIN_WAIT_1 and LAST_ACK though. You'll need to log some strace output to a file on each process (please log timestamps using strace -tt as well) so that we can compare the behaviour between 1.4 and 1.5.

Regards,
Willy
Re: 1.5 dev22 issue on freebsd10-stable
Hi, Willy,

I'm sorry, but strace only supports i386 on FreeBSD, and I'm working on amd64.

# uname -a
FreeBSD ha-l1-n2 10.0-STABLE FreeBSD 10.0-STABLE #0 r264098: Fri Apr 4 10:57:19 CST 2014 root@ha-l1-n2:/usr/obj/usr/src/sys/10-stable-r264098 amd64

Simon

On 14-4-16 13:40, Willy Tarreau wrote:

> Hi Simon,
>
> On Wed, Apr 16, 2014 at 10:25:46AM +0800, k simon wrote:
>
> > Hi, Willy,
> >
> > > You must never have timewaits on a client, only on a server. So if on your haproxy box you're seeing timewaits for connections going to the backend servers, there's something wrong. Haproxy deploys great efforts at avoiding them by doing a setsockopt(SO_LINGER) to force the system to close with a reset. If you still get them after upgrading, please run strace on the process so that we find what could be causing them, as it would be abnormal.
> >
> > It seems that the TIME_WAIT states occur in the client direction.
> >
> > # netstat -an | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a,S[a]}'
> > LISTEN 3
> > FIN_WAIT_1 1558
> > FIN_WAIT_2 53
> > SYN_SENT 4
> > LAST_ACK 780
> > CLOSING 77
> > CLOSE_WAIT 52
> > CLOSED 9
> > SYN_RCVD 80
> > TIME_WAIT 7743
> > ESTABLISHED 7722
>
> The numbers are not very high. I'm surprised that you have so many FIN_WAIT_1 and LAST_ACK though. You'll need to log some strace output to a file on each process (please log timestamps using strace -tt as well) so that we can compare the behaviour between 1.4 and 1.5.
>
> Regards,
> Willy