Re: Problems with long connect times
driver: tg3
version: 3.98
firmware-version: 5721-v3.55a
bus-info: :03:00.0

Not running bnx2. It doesn't look like a 65536 limit either; I've been graphing it and it's up to 80k sometimes, but it goes up and down. When it fails, it seems to be either 3 seconds or 9 seconds. Would TCP retransmits cause that? I just compiled a kernel with a default retransmit of 1 sec, but I haven't tested it yet.

Here's the output of netstat -s:

IcmpMsg:
    InType0: 18
    InType3: 50818
    InType8: 699
    OutType0: 699
    OutType3: 50841
    OutType8: 18
Tcp:
    2059992268 active connections openings
    1933849278 passive connection openings
    4543998 failed connection attempts
    2093186 connection resets received
    142 connections established
    3547584716 segments received
    3643865881 segments send out
    20003371 segments retransmited
    0 bad segments received.
    6179288 resets sent
UdpLite:
TcpExt:
    4237091 resets received for embryonic SYN_RECV sockets
    1915476798 TCP sockets finished time wait in fast timer
    28901367 time wait sockets recycled by time stamp
    119887 packets rejects in established connections because of timestamp
    2171355337 delayed acks sent
    292818 delayed acks further delayed because of locked socket
    Quick ack mode was activated 697528 times
    15213 times the listen queue of a socket overflowed
    15213 SYNs to LISTEN sockets dropped
    2125065 packets directly queued to recvmsg prequeue.
    18179 bytes directly in process context from backlog
    7564477 bytes directly received in process context from prequeue
    3465788360 packet headers predicted
    7232 packets header predicted and directly queued to user
    2567319929 acknowledgments not containing data payload received
    2718897 predicted acknowledgments
    80328 times recovered from packet loss by selective acknowledgements
    Detected reordering 3118 times using FACK
    Detected reordering 46 times using SACK
    Detected reordering 32513 times using time stamp
    55394 congestion windows fully recovered without slow start
    44249 congestion windows partially recovered using Hoe heuristic
    115 congestion windows recovered without slow start by DSACK
    101091 congestion windows recovered without slow start after partial ack
    4019 TCP data loss events
    TCPLostRetransmit: 17
    11 timeouts after reno fast retransmit
    443124 timeouts after SACK recovery
    266 timeouts in loss state
    83502 fast retransmits
    33980 forward retransmits
    8964 retransmits in slow start
    4227010 other TCP timeouts
    421 SACK retransmits failed
    698471 DSACKs sent for old packets
    118559 DSACKs received
    34 DSACKs for out of order packets received
    868905 connections reset due to unexpected data
    2054320 connections reset due to early user close
    1876779 connections aborted due to timeout
    TCPSACKDiscard: 1820
    TCPDSACKIgnoredOld: 110422
    TCPDSACKIgnoredNoUndo: 4762
    TCPSpuriousRTOs: 18
    TCPSackShifted: 9702
    TCPSackMerged: 59174
    TCPSackShiftFallback: 71815157
IpExt:
    InMcastPkts: 8816
    OutMcastPkts: 3589637
    InBcastPkts: 29338

Thanks again for all your help.

Jonah

On 10/13/09 9:37 PM, Willy Tarreau w...@1wt.eu wrote:

On Tue, Oct 13, 2009 at 12:52:55PM -0700, Jonah Horowitz wrote:

netstat -ant | grep tcp | tr -s ' ' ' ' | awk '{print $6}' | sort | uniq -c
    193 CLOSE_WAIT
    316 CLOSING
    215 ESTABLISHED
    252 FIN_WAIT1
      4 FIN_WAIT2
      1 LAST_ACK
     10 LISTEN
    237 SYN_RECV
  61384 TIME_WAIT

So, clearly there's a time_wait problem.
I've already tuned the kernel to set the time_wait timeout to 20 seconds (down from 60). I'm tempted to crank it down further, although googling around recommends against it. Is it possible to up the number of outstanding time_wait connections? This host looks like it's hitting a 65536 connection limit.

No, TIME_WAIT sockets are not an issue, and are even normal. It's useless to try to reduce them; your proxy can simply re-use them. The only case where that is not possible is when the proxy closed the connection first (e.g. option forceclose), but your config does not have this.

I'm more concerned by the SYN_RECV entries, which indicate that you did not get an ACK from a client. I suspect you have a high packet loss rate. What type of NIC are you running? Wouldn't this be a bnx2 with firmware 1.9.6? (Use ethtool -i eth0.) If so, you must find a firmware on your vendor's site and upgrade it, as this one is very common and very buggy.

Regards,
Willy

--
Jonah Horowitz · Monitoring Manager · jhorow...@looksmart.net
W: 415-348-7694 · F: 415-348-7033 · M: 415-513-7202
LookSmart - Premium and Performance Advertising Solutions
625 Second Street, San Francisco, CA 94107
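The 3-second and 9-second failure times reported in this thread are consistent with SYN retransmission backoff: assuming the classic 3-second initial retransmission timeout that Linux kernels of this era used for SYNs, a connect that needs one retransmitted SYN completes after about 3 s, and one that needs two completes after about 3 + 6 = 9 s. A small sketch of that arithmetic (the 3 s initial RTO is an assumption, not something stated in the thread):

```shell
# Cumulative connect delay after N retransmitted SYNs, assuming a 3 s
# initial RTO that doubles on each retry (3 s, 6 s, 12 s, ...).
total=0
rto=3
for i in 1 2; do
  total=$((total + rto))
  echo "after ${i} retransmission(s): ${total}s"
  rto=$((rto * 2))
done
```

This matches the observed values exactly, which supports the retransmit theory over a resource limit on the proxy itself.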
RE: Problems with long connect times
netstat -ant | grep tcp | tr -s ' ' ' ' | awk '{print $6}' | sort | uniq -c
    193 CLOSE_WAIT
    316 CLOSING
    215 ESTABLISHED
    252 FIN_WAIT1
      4 FIN_WAIT2
      1 LAST_ACK
     10 LISTEN
    237 SYN_RECV
  61384 TIME_WAIT

So, clearly there's a time_wait problem. I've already tuned the kernel to set the time_wait timeout to 20 seconds (down from 60). I'm tempted to crank it down further, although googling around recommends against it. Is it possible to up the number of outstanding time_wait connections? This host looks like it's hitting a 65536 connection limit.

-----Original Message-----
From: Hank A. Paulson [mailto:h...@spamproof.nospammail.net]
Sent: Monday, October 12, 2009 9:14 PM
To: haproxy@formilux.org
Subject: Re: Problems with long connect times

A couple of guesses you might look at - I have found the stats page to show deceptively low numbers at times. You might want to check the http log stats that show the global/frontend/backend queue numbers around the time of those requests. My guess is that in the cases where you are seeing 3-second times, the backends are slow to connect or they have reached maxconn. Also, you might want to double check that the clients are sending the requests in a timely fashion.

netstat -ant | wc -l

Do you have conntrack running, as in the recent situation here on the ml? Any other messages in /var/log/messages? Does netstat -s have any growing stats? I assume you have lots of backends if they are all at only maxconn 20.

On 10/12/09 5:15 PM, Jonah Horowitz wrote:

I'm having a problem where occasionally under load, the time to complete the TCP handshake is taking much longer than it should:

Picture (Device Independent Bitmap)

My suspicion is that the number of connections available to the haproxy server is somehow constrained and it can't answer connections for a moment. I'm not sure how to debug this. Has anyone else seen something like this? According to the haproxy stats page, I've never come close to my connection limit.
I'm using about 1000 concurrent connections and my request rate maxes out at 4400 requests per second. I'm not seeing any messages in dmesg or my /var/log/messages. I'm running 1.4-dev3 on Linux 2.6.30.5. My config is below:

TIA,
Jonah

--- compile options ---

make USE_REGPARM=1 USE_STATIC_PCRE=1 USE_LINUX_SPLICE=1 TARGET=linux26 CPU_CFLAGS='-O2 -march=x86-64 -m64'

--- config ---

global
    maxconn 2000
    pidfile /usr/pkg/haproxy/run/haproxy.pid
    stats socket /usr/pkg/haproxy/run/stats
    log /usr/pkg/haproxy/jail/log daemon
    user daemon
    group daemon

defaults
    timeout queue 3000
    timeout server 3000
    timeout client 3000
    timeout connect 3000
    option splice-auto

frontend stats
    bind :8080
    mode http
    use_backend stats if TRUE

backend stats
    mode http
    stats enable
    stats uri /stats
    stats refresh 5s

frontend query
    log global
    option dontlog-normal
    option httplog
    bind :80
    mode http
    use_backend query if TRUE

backend query
    mode http
    balance roundrobin
    option httpchk GET /r?q=LOOKSMARTKEYWORDLISTINGMONITORisp=DROPus
    option forwardfor
    option httpclose
    server foo1 foo1:8080 weight 150 maxconn 20 check inter 1000 rise 2 fall 1
    server foo2 foo2:8080 weight 150 maxconn 20 check inter 1000 rise 2 fall 1
    server foo3 foo3:8080 weight 150 maxconn 20 check inter 1000 rise 2 fall 1
    ...
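As an aside, the state-counting pipeline used earlier in this thread (grep | tr | awk | sort | uniq) can be collapsed into a single awk pass. A sketch over canned input, assuming the connection state is the last column of plain `netstat -ant` output:

```shell
# One-pass equivalent of: netstat -ant | grep tcp | awk '{print $6}' | sort | uniq -c
# Canned sample lines stand in for real netstat output here.
printf '%s\n' \
  'tcp 0 0 1.2.3.4:80 5.6.7.8:1234 TIME_WAIT' \
  'tcp 0 0 1.2.3.4:80 5.6.7.8:1235 TIME_WAIT' \
  'tcp 0 0 1.2.3.4:80 5.6.7.8:1236 ESTABLISHED' \
  | awk '/^tcp/ {count[$NF]++} END {for (s in count) print count[s], s}' \
  | sort
```

In real use, replace the printf with `netstat -ant`.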
RE: Nbproc question
Here's the output of top on the system:

top - 09:50:36 up 4 days, 18:50, 1 user, load average: 1.31, 1.59, 1.55
Tasks: 117 total, 2 running, 115 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.5%us, 9.9%sy, 0.0%ni, 75.0%id, 0.0%wa, 0.5%hi, 12.1%si, 0.0%st
Mem: 8179536k total, 997748k used, 7181788k free, 139236k buffers
Swap: 9976356k total, 0k used, 9976356k free, 460396k cached

   PID USER   PR NI  VIRT  RES SHR S %CPU %MEM    TIME+ COMMAND
752741 daemon 20  0 34760  24m 860 R  100  0.3 871:15.76 haproxy

It's a quad-core system, but haproxy is taking 100% of one core. We're doing less than 5k req/sec and the box has two 2.6GHz Opterons in it. Do you know how much health checks affect CPU utilization of an haproxy process? We have about 100 backend servers and we're running inter 500 rise 2 fall 1. I haven't tried adjusting that, although when it was set to the default our error rates were much higher.

Thanks,
Jonah

-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: Monday, September 28, 2009 9:50 PM
To: Jonah Horowitz
Cc: haproxy@formilux.org
Subject: Re: Nbproc question

On Mon, Sep 28, 2009 at 06:43:58PM -0700, Jonah Horowitz wrote:

The documentation seems to discourage using the nbproc directive. What's the situation with this? I'm running a server with 8 cores, so I'm tempted to up the nbproc. Is the process normally multithreaded?

No, the process is not multithreaded.

Is nbproc something I can use for performance tuning, or is it just for file handles?

It can bring you small performance gains at the expense of more complex monitoring, since the stats will only reflect the process which receives the stats request. Also, health checks will be performed by each process, causing an increased load on your servers. And the connection limitation will not work anymore, as no process will know that there are other processes already using a server.
It was initially designed to work around per-process file handle limitations on some systems, but it is true that it brings a minor performance advantage. However, considering that you can reach tens of thousands of connections per second with a single process on a cheap Core 2 Duo 2.66 GHz, and that forwarding data at 10 Gbps on this machine consumes only 20% of a core, you can certainly understand why I don't see situations where it would make sense to use nbproc.

Regards,
Willy
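As a back-of-the-envelope check on the health-check load Jonah asks about: with the figures from his message (about 100 servers checked every 500 ms), a single haproxy process generates roughly 200 checks per second, and with nbproc that load is multiplied by the number of processes. A quick sketch of the arithmetic:

```shell
# Approximate health-check rate: servers / (inter in seconds).
# Values taken from the thread; adjust for your own setup.
servers=100
inter_ms=500
echo "$(( servers * 1000 / inter_ms )) checks/sec per haproxy process"
```

That alone is unlikely to saturate a core, but at 200 probes/sec the checks are a non-trivial share of the 5k req/sec the box handles.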
RE: artificial maxconn imposed
I fixed the nf_conntrack problem with this (really just the first one, but the others were good too). For network tuning, add the following to /etc/sysctl.conf:

net.ipv4.netfilter.ip_conntrack_max = 16777216
net.ipv4.tcp_max_tw_buckets = 16777216

# Increase TCP max buffer size settable using setsockopt()
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216

# Increase Linux autotuning TCP buffer limits (min, default, and max
# number of bytes to use). Set max to at least 4MB, or higher if you
# use very high BDP paths.
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216

-jonah

-----Original Message-----
From: David Birdsong [mailto:david.birds...@gmail.com]
Sent: Friday, September 18, 2009 3:06 PM
To: haproxy
Subject: artificial maxconn imposed

I've set ulimit -n 2, maxconn in defaults is 16384, and still, somehow, when I check the stats page maxconn is limited to 1, and sure enough requests start piling up. Any suggestions on where else to look? I'm sure it's an OS thing, so: Fedora 10, x86_64, 16GB of RAM.

This command doesn't turn anything up:

find /proc/sys/net/ipv4 -type f -exec cat {} \; | grep 1

(Also, dmesg shows "nf_conntrack: table full, dropping packet.", which I think is another problem. Might be time to switch to a *BSD.)
Backend Server UP/Down Debugging?
I'm watching my servers on the back end and occasionally they flap. I'm wondering if there is a way to see why they are taken out of service. I'd like to see the actual response, or at least an HTTP status code.

Jonah Horowitz · Monitoring Manager · jhorow...@looksmart.net
w: 415.348.7694 · c: 415.513.7202 · f: 415.348.7020
625 Second Street, San Francisco, CA 94107
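One low-tech way to see why a server was dropped is to point the check at a dedicated URI, so the backend's own access log records every probe and the status code it returned. A hypothetical sketch (the /healthcheck URI and server names are placeholders, not from the original config):

```haproxy
backend query
    # Dedicated check URI: each probe now shows up in the backend
    # server's access log with its response status.
    option httpchk GET /healthcheck
    server foo1 foo1:8080 check inter 1000 rise 2 fall 1
```

Correlating the timestamps of DOWN transitions in haproxy's log with the backend's access log then shows exactly which responses caused the flap.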
Re: realtime switch to another backend if got 5xx error?
I'm trying to figure out how this works. I desperately need a way to monitor servers and either take any server that sends a 5xx error out of rotation or, failing that, at least redirect the query to a different server. The clients that use this web service are SOAP/XML clients, so they're not real web browsers. Also, we don't use any cookies. It looks like this config just tells the client to make a second request. Am I missing something here?

I know I can use httpchk, but I don't want to run inter 1 because then all my traffic is monitoring traffic. Each server is normally doing several hundred requests per second, and our haproxy test setup has a percentage of 500 errors a couple of orders of magnitude higher (10% vs. 0.01%).

Any ideas?

Thanks,
Jonah

On 6/11/09 7:45 AM, Maciej Bogucki macbogu...@gmail.com wrote:

Dawid Sieradzki / Gadu-Gadu S.A. pisze:

Hi. The problem is how to silently switch to another backend in real time after getting a 500 answer from a backend, without the http_client's knowledge. Yes, I know about httpchk, but the error 500 happens about 10 times per hour, and we don't know when or why. So it is a race over who gets the 500 first - httpchk or the http_client. If you don't know what I mean, here is an example config:

frontend (..)
    default_backend back_1

backend back_1
    option httpchk GET /index.php HTTP/1.1\r\nHost:\ test.pl
    mode http
    retries 10
    balance roundrobin
    server chk1 127.0.0.1:81 weight 1 check
    server chk2 127.0.0.1:82 weight 1 check
    server chk3 127.0.0.1:83 weight 1 check backup

http_client - haproxy - (backend1|backend2|backend3)

Let's go inside a request:

A. haproxy received a request from the http_client
B. haproxy sent the request from the http_client to backend1
C. backend1 said 500 internal server error

I want: :-)
D. haproxy sends the request to backend2 (or a backup backend, or another one, or one more time to backend1)

I have: :-(
D. haproxy sends backend1's 500 internal server error to the http_client
E. haproxy will mark backend1 as down if it gets 2 error-500 responses from backend1

Is it possible to do that?
Hello,

Yes, it is possible, but it could be dangerous for some kinds of applications, e.g. a billing system ;) Here is an example of how to do it. I know it is a hack, but it works well ;P

frontend fr1
    default_backend back_1
    rspirep ^HTTP/...\ [23]0..* \0\nSet-Cookie:\ cookiexxx=0;path=/;domain=.yourdomain.com
    rspirep ^(HTTP/...)\ 5[0-9][0-9].* \1\ 202\ Again\ Please\nSet-Cookie:\ cookiexxx=1;path=/;domain=.yourdomain.com\nRefresh:\ 6\nContent-Length:\ Lenght_xxx\nContent-Type:\ text/html\n\nFRAMESET\ cols=100%FRAME\ src=http://www.yourdomain.com/redispatch.pl

backend back_1
    cookie cookiexxx
    server chk1 127.0.0.1:81 weight 1 check
    server chk2 127.0.0.1:82 weight 1 check
    server chk3 127.0.0.1:83 weight 1 check cookie 1 backup

Remember to set Lenght_xxx properly.

Best Regards,
Maciej Bogucki
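Since Jonah's real goal is monitoring the 5xx rate without flooding the servers with checks, one option is to mine the haproxy access logs instead of probing. A sketch that tallies responses per status class, assuming a simplified log format where the HTTP status is the second field (real haproxy httplog lines put the status elsewhere, so the field index would need adjusting):

```shell
# Count responses per status class (2xx, 5xx, ...) from canned log lines.
# Field 2 holds the status here; pick the right field for your log format.
printf '%s\n' 'req1 200' 'req2 500' 'req3 503' 'req4 200' \
  | awk '{class = substr($2, 1, 1) "xx"; n[class]++}
         END {for (c in n) print c, n[c]}' \
  | sort
```

Run periodically against the tail of the live log, this gives a per-server 5xx rate from real traffic, with zero extra load on the backends.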