Re: Problems with long connect times

2009-10-14 Thread Jonah Horowitz

driver: tg3
version: 3.98
firmware-version: 5721-v3.55a
bus-info: :03:00.0
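
Since packet loss toward clients is one of the suspicions in this thread, the
NIC's own error counters are worth checking alongside the driver info above.
A rough sketch (the interface name eth0 is an assumption):

# Dump hardware statistics and keep only counters that look like errors or drops.
ethtool -S eth0 | grep -Ei 'err|drop|discard|fifo'

# Kernel-level per-interface RX/TX drop counters, for comparison.
ip -s link show eth0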

Not running bnx2.  Looks like it's not a 65536 limit either; I've been
graphing it and it's up to 80k sometimes, but it goes up and down.

When it fails, it seems to be either 3 seconds or 9 seconds.  Would TCP
retransmits cause that?  I just compiled a kernel with a default retransmit
timeout of 1 second, but I haven't tested it yet.
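
For reference, 3 and 9 seconds line up with SYN retransmission on a stock
kernel: the initial SYN retransmit timer is 3 s, and the next doubles to 6 s,
landing at 9 s total. A rough way to confirm it on the wire (interface and
port are assumptions, not taken from the thread):

# Capture only SYNs without ACK on port 80; a retransmitted SYN shows up as a
# duplicate with the same source port and sequence number, ~3 s after the first.
tcpdump -ni eth0 'tcp port 80 and tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn'

# How many times the kernel retries an outgoing SYN before giving up.
cat /proc/sys/net/ipv4/tcp_syn_retries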

Here's the output of netstat -s:

IcmpMsg:
InType0: 18
InType3: 50818
InType8: 699
OutType0: 699
OutType3: 50841
OutType8: 18
Tcp:
2059992268 active connections openings
1933849278 passive connection openings
4543998 failed connection attempts
2093186 connection resets received
142 connections established
3547584716 segments received
3643865881 segments send out
20003371 segments retransmited
0 bad segments received.
6179288 resets sent
UdpLite:
TcpExt:
4237091 resets received for embryonic SYN_RECV sockets
1915476798 TCP sockets finished time wait in fast timer
28901367 time wait sockets recycled by time stamp
119887 packets rejects in established connections because of timestamp
2171355337 delayed acks sent
292818 delayed acks further delayed because of locked socket
Quick ack mode was activated 697528 times
15213 times the listen queue of a socket overflowed
15213 SYNs to LISTEN sockets dropped
2125065 packets directly queued to recvmsg prequeue.
18179 bytes directly in process context from backlog
7564477 bytes directly received in process context from prequeue
3465788360 packet headers predicted
7232 packets header predicted and directly queued to user
2567319929 acknowledgments not containing data payload received
2718897 predicted acknowledgments
80328 times recovered from packet loss by selective acknowledgements
Detected reordering 3118 times using FACK
Detected reordering 46 times using SACK
Detected reordering 32513 times using time stamp
55394 congestion windows fully recovered without slow start
44249 congestion windows partially recovered using Hoe heuristic
115 congestion windows recovered without slow start by DSACK
101091 congestion windows recovered without slow start after partial ack
4019 TCP data loss events
TCPLostRetransmit: 17
11 timeouts after reno fast retransmit
443124 timeouts after SACK recovery
266 timeouts in loss state
83502 fast retransmits
33980 forward retransmits
8964 retransmits in slow start
4227010 other TCP timeouts
421 SACK retransmits failed
698471 DSACKs sent for old packets
118559 DSACKs received
34 DSACKs for out of order packets received
868905 connections reset due to unexpected data
2054320 connections reset due to early user close
1876779 connections aborted due to timeout
TCPSACKDiscard: 1820
TCPDSACKIgnoredOld: 110422
TCPDSACKIgnoredNoUndo: 4762
TCPSpuriousRTOs: 18
TCPSackShifted: 9702
TCPSackMerged: 59174
TCPSackShiftFallback: 71815157
IpExt:
InMcastPkts: 8816
OutMcastPkts: 3589637
InBcastPkts: 29338
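
To see which of these counters are actually moving while the problem happens
(as Hank suggested earlier in the thread), diffing two snapshots is usually
enough; a minimal sketch:

# Take two snapshots 10 seconds apart and show only the counters that changed.
netstat -s > /tmp/ns1; sleep 10; netstat -s > /tmp/ns2
diff /tmp/ns1 /tmp/ns2 | grep '^[<>]'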


Thanks again for all your help.

Jonah


On 10/13/09 9:37 PM, Willy Tarreau w...@1wt.eu wrote:

 On Tue, Oct 13, 2009 at 12:52:55PM -0700, Jonah Horowitz wrote:
 netstat -ant | grep tcp | tr -s ' ' ' ' | awk '{print $6}' | sort | uniq -c
193 CLOSE_WAIT
316 CLOSING
215 ESTABLISHED
252 FIN_WAIT1
  4 FIN_WAIT2
  1 LAST_ACK
 10 LISTEN
237 SYN_RECV
  61384 TIME_WAIT
 
 So, clearly there's a time_wait problem.  I've already tuned the kernel
  to set the time_wait timeout to 20 seconds (down from 60).  I'm tempted
 to crank it down further, although googling around recommends against
 it.  Is it possible to up the number of outstanding time_wait
 connections?  This host looks like it's hitting a 65536 connection
 limit.
 
  No, TIME_WAIT sockets are not an issue, and are even normal. It's useless
  to try to reduce them; your proxy can simply re-use them. The only case
  where it is not possible is when the proxy closed the connection first
  (e.g. option forceclose), but your config does not have this.
 
  I'm more concerned by the SYN_RECV entries, which indicate that you did
  not get an ACK from a client. I suspect you have a high packet loss
  rate. What type of NIC are you running? Wouldn't this be a
  bnx2 with firmware 1.9.6? (use ethtool -i eth0). If so, you must
  find a newer firmware on your vendor's site and upgrade, as this one
  is very common and very buggy.
 
 Regards,
 Willy
 

-- 
Jonah Horowitz · Monitoring Manager · jhorow...@looksmart.net
W: 415-348-7694 · F: 415-348-7033 · M: 415-513-7202
LookSmart - Premium and Performance Advertising Solutions
625 Second Street, San Francisco, CA 94107





RE: Problems with long connect times

2009-10-13 Thread Jonah Horowitz
netstat -ant | grep tcp | tr -s ' ' ' ' | awk '{print $6}' | sort | uniq -c
   193 CLOSE_WAIT
   316 CLOSING
   215 ESTABLISHED
   252 FIN_WAIT1
 4 FIN_WAIT2
 1 LAST_ACK
10 LISTEN
   237 SYN_RECV
 61384 TIME_WAIT

So, clearly there's a time_wait problem.  I've already tuned the kernel
to set the time_wait timeout to 20 seconds (down from 60).  I'm tempted
to crank it down further, although googling around recommends against
it.  Is it possible to up the number of outstanding time_wait
connections?  This host looks like it's hitting a 65536 connection
limit.
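
For what it's worth, the number of sockets the kernel keeps in TIME_WAIT is
bounded by tcp_max_tw_buckets rather than by a hard 65536, and reuse of
TIME_WAIT sockets depends on timestamps. A quick sketch for inspecting the
relevant knobs (inspection only, not recommended values):

# Upper bound on simultaneous TIME_WAIT sockets.
cat /proc/sys/net/ipv4/tcp_max_tw_buckets

# Ephemeral port range, which caps concurrent outgoing connections per destination.
cat /proc/sys/net/ipv4/ip_local_port_range

# Timestamps and TIME_WAIT reuse (reuse only applies to outgoing connections).
sysctl net.ipv4.tcp_timestamps net.ipv4.tcp_tw_reuse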



 -Original Message-
 From: Hank A. Paulson [mailto:h...@spamproof.nospammail.net]
 Sent: Monday, October 12, 2009 9:14 PM
 To: haproxy@formilux.org
 Subject: Re: Problems with long connect times
 
 A couple of guesses you might look at -
 I have found the stats page to show deceptively low numbers at times.
 You might want to check the http log stats that show the
 global/frontend/backend queue numbers around the time of those requests.
 My guess is that in the cases where you are seeing 3-second times, the
 backends are slow to connect or have reached maxconn. Also, you might
 want to double check that the clients are sending the requests in a
 timely fashion.
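 
 The per-request timers (Tq/Tw/Tc/Tr/Tt) and the queue counts show up in the
 haproxy httplog lines; a rough way to pull slow connects out of the log (the
 field position and log path are assumptions and depend on your exact log
 format):
 
 # Print requests whose connect time Tc (third value of the timer field)
 # exceeded one second or never completed; adjust $10 to where the timers
 # sit in your format.
 awk '{ split($10, t, "/"); if (t[3]+0 > 1000 || t[3] == "-1") print }' /var/log/haproxy.log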
 
 netstat -ant | wc -l
 
 Do you have conntrack running, as in the recent situation here on the ml?
 Any other messages in /var/log/messages?
 Does netstat -s show any growing stats?
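 
 On the conntrack question, comparing the current entry count against the
 table limit quickly shows whether packets are being dropped there; a sketch
 (the sysctl prefix differs between kernels, so one of the two forms below
 will exist):
 
 sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max
 sysctl net.ipv4.netfilter.ip_conntrack_count net.ipv4.netfilter.ip_conntrack_max
 dmesg | grep -i conntrack | tail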
 
 I assume you have lots of backends if they are all at only maxconn 20
 
 
 On 10/12/09 5:15 PM, Jonah Horowitz wrote:
  I'm having a problem where occasionally under load, the time to
 complete
  the tcp handshake is taking much longer than it should:
 
  [attached image: Picture (Device Independent Bitmap)]
 
  My suspicion is that the number of connections available to the haproxy
  server is somehow constrained and it can't accept connections for a
  moment. I'm not sure how to debug this. Has anyone else seen something
  like this?
 
  According to the haproxy stats page, I've never come close to my
  connection limit. I'm using about 1000 concurrent connections and my
  request rate maxes out at 4400 requests per second. I'm not seeing
 any
  messages in dmesg or my /var/log/messages.
 
  I'm running 1.4-dev3 on Linux 2.6.30.5. My config is below:
 
  TIA,
 
  Jonah
 
  --- compile options ---
 
  make USE_REGPARM=1 USE_STATIC_PCRE=1 USE_LINUX_SPLICE=1 TARGET=linux26 CPU_CFLAGS='-O2 -march=x86-64 -m64'
 
  --- config ---
 
  global
 
  maxconn 2000
 
  pidfile /usr/pkg/haproxy/run/haproxy.pid
 
  stats socket /usr/pkg/haproxy/run/stats
 
  log /usr/pkg/haproxy/jail/log daemon
 
  user daemon
 
  group daemon
 
  defaults
 
  timeout queue 3000
 
  timeout server 3000
 
  timeout client 3000
 
  timeout connect 3000
 
  option splice-auto
 
  frontend stats
 
  bind :8080
 
  mode http
 
  use_backend stats if TRUE
 
  backend stats
 
  mode http
 
  stats enable
 
  stats uri /stats
 
  stats refresh 5s
 
  frontend query
 
  log global
 
  option dontlog-normal
 
  option httplog
 
  bind :80
 
  mode http
 
  use_backend query if TRUE
 
  backend query
 
  mode http
 
  balance roundrobin
 
  option httpchk GET /r?q=LOOKSMARTKEYWORDLISTINGMONITORisp=DROPus
 
  option forwardfor
 
  option httpclose
 
  server foo1 foo1:8080 weight 150 maxconn 20 check inter 1000 rise 2
 fall 1
 
  server foo2 foo2:8080 weight 150 maxconn 20 check inter 1000 rise 2
 fall 1
 
  server foo2 foo3:8080 weight 150 maxconn 20 check inter 1000 rise 2
 fall 1
 
  ...
 




RE: Nbproc question

2009-09-29 Thread Jonah Horowitz
Here's the output of top on the system:

top - 09:50:36 up 4 days, 18:50,  1 user,  load average: 1.31, 1.59, 1.55
Tasks: 117 total,   2 running, 115 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.5%us,  9.9%sy,  0.0%ni, 75.0%id,  0.0%wa,  0.5%hi, 12.1%si,  0.0%st
Mem:   8179536k total,   997748k used,  7181788k free,   139236k buffers
Swap:  9976356k total,0k used,  9976356k free,   460396k cached

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
752741 daemon    20   0 34760  24m  860 R  100  0.3 871:15.76 haproxy

It's a quad core system, but haproxy is taking 100% of one core.

We're doing less than 5k req/sec and the box has two 2.6 GHz Opterons in it.
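
A per-core breakdown helps tell whether the saturated core is spending its time
in haproxy itself or in softirq/network processing; a minimal sketch (mpstat is
from the sysstat package, and eth0 is an assumption):

# Per-CPU utilisation once per second; look for one core pegged in %usr/%sys
# (haproxy) versus %soft (network interrupt handling).
mpstat -P ALL 1

# Which CPUs the NIC's interrupts are landing on.
grep eth0 /proc/interrupts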

Do you know how much health checks affect the CPU utilization of an haproxy process?

We have about 100 backend servers and we're running inter 500 rise 2 fall 1

I haven't tried adjusting that, although when it was set to the default our 
error rates were much higher.
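
As a rough order of magnitude, 100 servers checked with inter 500 means this
single process issues around 200 health checks per second on top of the
regular traffic; a quick sketch of the arithmetic (numbers taken from this
message):

# checks per second = number of servers / check interval in seconds
echo "100 servers / 0.5 s interval = $((100 * 1000 / 500)) checks/sec"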

Thanks,

Jonah


-Original Message-
From: Willy Tarreau [mailto:w...@1wt.eu] 
Sent: Monday, September 28, 2009 9:50 PM
To: Jonah Horowitz
Cc: haproxy@formilux.org
Subject: Re: Nbproc question

On Mon, Sep 28, 2009 at 06:43:58PM -0700, Jonah Horowitz wrote:
 The documentation seems to discourage using the nbproc directive.
 What's the situation with this?  I'm running a server with 8 cores, so I'm
 tempted to up the nbproc.  Is the process normally multithreaded?

No, the process is not multithreaded.

 Is nbproc
 something I can use for performance tuning, or is it just for file handles?

It can bring you small performance gains at the expense of more
complex monitoring, since the stats will still only reflect the
process which receives the stats request. Also, health-checks will
be performed by each process, causing an increased load on your
servers. And the connection limitation will not work anymore, as
each process won't know that there are other processes already
using a server.

It was initially designed to work around per-process file handle
limitations on some systems, but it is true that it brings a minor
performance advantage.

However, considering that you can reach 4 connections per second
with a single process on a cheap core2duo 2.66 GHz, and that forwarding
data at 10 Gbps on this machine consumes only 20% of a core, you can
certainly understand why I don't see the situations where it would
make sense to use nbproc.

Regards,
Willy




RE: artificial maxconn imposed

2009-09-18 Thread Jonah Horowitz
I fixed the nf_contrack problem with this (really just the first one,
but the others were good too).

HAProxy sysctl changes

For network tuning, add the following to /etc/sysctl.conf:

net.ipv4.netfilter.ip_conntrack_max = 16777216
net.ipv4.tcp_max_tw_buckets = 16777216

# increase TCP max buffer size settable using setsockopt()
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216

# increase Linux autotuning TCP buffer limits (min, default, and max number
# of bytes); set max to at least 4MB, or higher if you use very high BDP paths
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
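
To make these take effect and confirm they stuck, something like the following
should do (assuming the settings were added to /etc/sysctl.conf as above):

# Reload sysctl.conf and read the values back.
sysctl -p
sysctl net.ipv4.netfilter.ip_conntrack_max net.ipv4.tcp_max_tw_buckets
sysctl net.core.rmem_max net.core.wmem_max net.ipv4.tcp_rmem net.ipv4.tcp_wmem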

-jonah

-Original Message-
From: David Birdsong [mailto:david.birds...@gmail.com] 
Sent: Friday, September 18, 2009 3:06 PM
To: haproxy
Subject: artificial maxconn imposed

I've set ulimit -n 2

maxconn in defaults is 16384 and still somehow when I check the stats
page, maxconn is limited to 1, sure enough requests start piling
up.

any suggestions on where else to look?  I'm sure it's an OS thing, so:

Fedora 10 x86_64 16GB of RAM

this command doesn't turn anything up
find /proc/sys/net/ipv4 -type f -exec cat {} \; | grep 1
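
Two other places the effective limit can come from are the file-descriptor
limit of the running process and haproxy's own computed maxconn; a sketch for
checking both (the stats socket path is an assumption, and the second command
only works if a stats socket is configured at all):

# fd limit actually applied to the running haproxy process(es); a ulimit set
# in your shell does not help if haproxy was started from an init script.
for p in $(pidof haproxy); do grep 'open files' /proc/$p/limits; done

# haproxy's own view of its limits, via the stats socket if one is set up.
echo "show info" | socat stdio unix-connect:/var/run/haproxy.stat | grep -Ei 'maxconn|maxsock|ulimit'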


(also dmesg shows nf_conntrack: table full, dropping packet.) which I
think is another problem.  Might be time to switch to a *BSD.




Backend Server UP/Down Debugging?

2009-08-26 Thread Jonah Horowitz
I’m watching my servers on the back end and occasionally they flap.  I’m
wondering if there is a way to see why they are taken out of service.  I’d like
to see the actual response, or at least an HTTP status code.
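
Two low-tech ways to get at this: haproxy logs the state change (and, in recent
versions, the check result) when a server goes down, and the configured httpchk
request can be replayed by hand to see the raw status. A sketch, with log path,
host, port and URI as placeholder assumptions:

# State-change / health-check messages from the haproxy log.
grep -Ei 'health check|is DOWN|is UP' /var/log/haproxy.log | tail

# Replay the health-check request by hand and print just the status code.
curl -s -o /dev/null -w '%{http_code}\n' http://backend-host:8080/checked-uri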

 

Jonah Horowitz · Monitoring Manager · jhorow...@looksmart.net 
mailto:jhorow...@looksmart.net 

w: 415.348.7694 · c: 415.513.7202 · f: 415.348.7020

625 Second Street, San Francisco, CA 94107

 



Re: realtime switch to another backend if got 5xx error?

2009-07-30 Thread Jonah Horowitz
I'm trying to figure out how this works.  I desperately need to figure out a
way to monitor servers and either take any server that sends any 5xx error
out of rotation, or failing that, at least redirect the query to a different
server.

The clients that use this web service are SOAP/XML clients, so they're not
real web browsers.  Also, we don't use any cookies.

It looks like this config just tells the client to make a second request.
Am I missing something here?

I know I can use httpchk, but I don't want to run inter 1 because then all
my traffic is monitoring traffic.  Each server is normally doing several
hundred requests per second, and our haproxy test setup has a 500-error rate
a few orders of magnitude higher (10% vs. 0.01%).
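
In the meantime, counting 5xx responses per server from the haproxy access log
at least shows which backends produce them and how often, without adding any
check traffic; a rough sketch (the awk column numbers are assumptions and
depend on your log format):

# Tally 5xx responses per backend/server from an httplog-format log; adjust
# $9 (backend/server) and $11 (status code) to match your format.
awk '$11 ~ /^5/ { c[$9]++ } END { for (s in c) print s, c[s] }' /var/log/haproxy.log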

Any ideas?

Thanks,

Jonah



On 6/11/09 7:45 AM, Maciej Bogucki macbogu...@gmail.com wrote:

 Dawid Sieradzki / Gadu-Gadu S.A. writes:
 Hi.
 
  The problem is how to silently switch to another backend in real time if
  we get a 500 answer from a backend, without the http_client knowing.
  Yes, I know about httpchk, but we only get about 10 error 500s per hour,
  and we don't know when and why.
  So, it is a race over who gets the 500 first - httpchk or the http_client.

  If you don't know what I mean:
 
 example config:
 
 8
 
 frontend
 (..)
 default_backend back_1
 
 backend back_1
option httpchk GET /index.php HTTP/1.1\r\nHost:\ test.pl
mode http
retries 10
balance roundrobin
 
  server chk1 127.0.0.1:81 weight 1 check
  server chk2 127.0.0.1:82 weight 1 check
  server chk3 127.0.0.1:83 weight 1 check backup
 
 8--
 
  http_client -> haproxy -> (backend1|backend2|backend3)
 
 let's go inside request:
 
  A. haproxy received request from http_client
 B. haproxy sent request from http_client to backend1
 C. backend1 said 500 internal server error
 
 I want: :-)
  D. haproxy sent the request from http_client to backend2 (or a backup
  backend, or another one, or one more time to backend1)
 
 I have: :-(
 D. haproxy sent 500 internal server error to http_client from backend1
  E. haproxy will mark backend1 as down if it got 2 error 500 responses from backend1
 
 
  Is it possible to do that?
 
 Hello,
 
  Yes, it is possible, but it could be dangerous for some kinds of
  applications, e.g. a billing system ;)
  Here is an example of how to do it. I know that it is a hack, but it works
  well ;P
 
 frontend fr1
 default_backend back_1
 rspirep ^HTTP/...\ [23]0..* \0\nSet-Cookie:\
 cookiexxx=0;path=/;domain=.yourdomain.com
 rspirep ^(HTTP/...)\ 5[0-9][0-9].* \1\ 202\ Again\
 Please\nSet-Cookie:\
 cookiexxx=1;path=/;domain=.yourdomain.com\nRefresh:\ 6\nContent-Length:\
 Lenght_xxx\nContent-Type:\ text/html\n\nFRAMESET\ cols=100%FRAME\
 src=http://www.yourdomain.com/redispatch.pl;
 
 backend back_1
  cookie  cookiexxx
  server chk1 127.0.0.1:81 weight 1 check
  server chk2 127.0.0.1:82 weight 1 check
  server chk3 127.0.0.1:83 weight 1 check cookie 1 backup
 
 Remember to set Lenght_xxx properly.
 
 Best Regards
 Maciej Bogucki
 
 

-- 
Jonah Horowitz · Monitoring Manager · jhorow...@looksmart.net
W: 415-348-7694 · F: 415-348-7033 · M: 415-513-7202
LookSmart - Premium and Performance Advertising Solutions
625 Second Street, San Francisco, CA 94107