Re: Perfect sysctl

2010-01-01 Thread Willy Tarreau
Hi Marcus,

On Fri, Jan 01, 2010 at 12:09:05PM +0100, Marcus Herou wrote:
 Thanks!
 
 This is an excerpt of the haproxy conf; does it look OK? Will HAProxy set
 the ulimit for the haproxy user?

Yes, it has since 1.3.X (I don't remember exactly which X, but a small one).

 How can I tell if root could actually set the specified ulimit?

root can always set it. The issues generally come from login scripts which
lower the initial limit. That's why haproxy knows how to tweak the parameter
itself: it spares you the complex system-side settings when dealing with
non-root users. What haproxy does when you start it as root is:
  1) set the ulimit to the proper value
  2) change the uid/gid

So your user will have the correct number of FDs. BTW, you don't need to
set the limit by hand; haproxy automatically computes it from the number
of servers, connections, listeners, etc.
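
For illustration only, a minimal global section along those lines could look
like this (the maxconn value is just a placeholder, and ulimit-n is shown
purely to demonstrate the manual override; it can normally be left out):

global
    user    haproxy
    group   haproxy
    daemon
    # started as root, haproxy raises the FD limit itself, then drops to
    # the user/group above
    maxconn 20000
    # optional manual override; normally unnecessary since haproxy derives
    # the limit from maxconn, servers, listeners, etc.
    #ulimit-n 40050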

 
 I have these settings in /etc/sysctl.conf
 
 net.ipv4.tcp_syncookies = 1
 net.ipv4.tcp_max_syn_backlog = 262144
 net.core.somaxconn = 262144
 
 I attach the sysctl.conf for completeness. I am sure it contains lots of
 stupid config rows since it is very much copy-pasted, but I've tried to go
 through each setting to understand what it affects.

262144 is a bit large. It's only when you reach that number that SYN
cookies will take effect. Having to deal with 256k sockets during a
SYN flood can cause high CPU usage (though it works), which is why I
found that lowering it a bit (10-20k instead) gives the best results.
Note that for each of these sockets, multiple SYN-ACK packets will be
emitted, which is another reason not to have too many of them.
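
For example, something in this direction (illustrative values only, to be
adapted to your own traffic):

net.ipv4.tcp_syncookies = 1
# small enough that SYN cookies kick in early during a flood
net.ipv4.tcp_max_syn_backlog = 16384
net.core.somaxconn = 16384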

 About the swap, yeah, the machine ran out of memory because an
 auto-restart script started too many Java processes.

OK.

Regards,
Willy




Re: Perfect sysctl

2009-12-30 Thread Angelo Höngens
On 30-12-2009 14:04, Marcus Herou wrote:
 Hi Willy, thanks for your answer; it got filtered, which is why I missed it
 for two weeks.
 
 Let's start with describing the service.
 
 We are hosting JavaScript files of sizes up to 20K and serve Flash and
 image banners as well, which of course are larger. That is basically it:
 ad serving.
 
 On the LBs we have about 2 MByte/s per LB = 2x2 MByte/s = 4 MByte/s
 ~30 Mbit/s at peak; that is not the issue.
 
 I've created a little script which parses the active connections from
 the HAProxy stats interface and plots them into Cacti; it peaks at 100
 (2x100) connections per machine, which is very little in your world I guess.
 
 I've attached a plot of TCP connections as well. Nothing fancy there
 either, besides that the number of TIME_WAIT sockets is in the 1
 range (log scale).
 
 Here's the problem:
 
 Every other day I receive alarms from Pingdom that the service is not
 available, and if I watch the syslog I see hints about a possible SYN flood
 at about the same times. At the same times we receive emails from sites
 using us saying that our service is damn slow.
 
 What I feel is that we somehow get hiccups on the LBs and that
 requests get queued. If I count the number of rows in the access logs on
 the machines behind the LB, it decreases at the same times and by the
 same factor on each machine (perhaps 10-20%), leading me to think that
 the bottleneck is not on the backend side.


Maybe interesting, maybe not: I had some problems like this as well, and
in my case I think it was caused by the limited number of outgoing
ports from my proxy machines.

I don't use connection keep-alives, and I think my balancers were
reusing ports faster than the backend Windows machines could handle, or
something like that. Anyway, after I widened the range of available
outgoing ports on my FreeBSD boxes, all problems were solved.
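
For what it's worth, if your balancers run Linux instead, I believe the
corresponding knob is the ephemeral port range, something like:

# Linux counterpart of widening the outgoing port range (illustrative)
net.ipv4.ip_local_port_range = 1024 65535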

Here's my sysctl for my FreeBSD 7.2 machines, but as Willy said, this
might not work for everyone.

kern.maxfiles=65535
kern.maxfilesperproc=32767
kern.ipc.maxsockbuf=16777216
kern.ipc.somaxconn=32768
net.inet.tcp.rfc1323=1
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.sendbuf_auto=1
net.inet.tcp.sendbuf_inc=16384
net.inet.tcp.recvbuf_auto=1
net.inet.tcp.recvbuf_inc=524288
net.inet.tcp.inflight.enable=0
net.inet.tcp.hostcache.expire=1
net.inet.ip.portrange.first=1024
net.inet.ip.portrange.last=65535
net.inet.ip.portrange.hifirst=49152
net.inet.ip.portrange.hilast=65535
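
These live in /etc/sysctl.conf so they survive a reboot; individual values
can also be changed at runtime with sysctl(8), for example:

sysctl kern.ipc.somaxconn=32768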

@Willy or someone else, feel free to comment on these settings if you
see something strange.

-- 


With kind regards,


Angelo Höngens
systems administrator

MCSE on Windows 2003
MCSE on Windows 2000
MS Small Business Specialist
--
NetMatch
tourism internet software solutions

Ringbaan Oost 2b
5013 CA Tilburg
+31 (0)13 5811088
+31 (0)13 5821239

a.hong...@netmatch.nl
www.netmatch.nl
--





Re: Perfect sysctl

2009-12-15 Thread Willy Tarreau
Hi Marcus,

On Tue, Dec 15, 2009 at 10:53:31AM +0100, Marcus Herou wrote:
 Hi guys.
 
 I would appreciate it a lot if someone could share a sysctl.conf which is
 known to run smoothly on a HAProxy machine with busy sites behind it.

This is a question I regularly hear.

"Very busy sites" does not mean much. Tuning is a tradeoff between being
good at one job and being good at another. People who run with very
large numbers of concurrent connections will not tune the same way as
people forwarding high data rates, which in turn will not tune the same
way as people experiencing high session setup/teardown rates.

People with large numbers of servers sometimes don't want to wait long
on each request either, while people with small numbers of servers will
prefer to wait longer in order not to sacrifice large parts of their
client base in case something temporarily goes wrong.

You see, it's just a tradeoff. You need to define your workload a little
(bit rate, session rate, session concurrency, number of servers,
response times, etc.). The more info you provide, the finer the tuning.

 There
 are so many variables that one can possibly fuck up, so it is better to start
 from something which is known to work.

Well, I can tell you for sure that among the few people who are *really*
experiencing high loads on busy machines, you won't find two with similar
tuning, and the few common parts alone will not help much.

And I would really recommend against blindly copy-pasting tuning parameters
from another machine, as you may see your system collapse for no apparent
reason (a typical error is to copy tcp_mem settings with the wrong units).
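
To make the units pitfall concrete: tcp_rmem and tcp_wmem are expressed in
bytes, while tcp_mem is expressed in pages (typically 4 kB each), so blindly
copying a byte-sized value into tcp_mem is off by a factor of the page size.
Illustrative values only:

# per-socket buffer sizes, in bytes (min, default, max)
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 65536 4194304
# global TCP memory thresholds, in PAGES (low, pressure, high)
net.ipv4.tcp_mem = 196608 262144 393216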

Regards,
Willy