On Thu, 01 Sep 2011 22:55:48 +0200, Krzysztof Olędzki wrote:
On 2011-09-01 22:15, Julien Vehent wrote:
We do not do any keepalive past lighttpd. Everything is non-keepalive
because we need the x-forwarded-for header on each request for tomcat
(that's yet another story).
With http-server-close you will still get a proper x-forwarded-for
header.
Anyway, the reason why I don't want to put haproxy in front of
everything right now is because we have a shit load of rewrite rules
and it will take me forever to convert all of that into haproxy's ACL
syntax.
Sure, but passing the same traffic through several proxies looks very
suboptimal to me, especially since HAProxy was designed to be a proxy
and lighttpd was not.
Oh, don't get me wrong, the sooner I can move away from lighttpd and
remove a piece of infrastructure, the better. But, hey, you know how it
is. $boss keeps filling up the todo list and I have never had time to
work on that.
Plus, whether it's lighttpd or haproxy, the memory impact on the
kernel is going to be the same.
Kernel - yes, userspace - no. Remember that a proxy needs some memory
for each active connection, and this is much, much more memory than is
required by the kernel.
Right now, I'm considering putting a keepalive timeout at 5 seconds.
Maybe even less.
From my pov 5s for keepalive looks very reasonable.
I settled for that: 5s keepalive.
As a side question: do you know where I can find the information
regarding connection size and conntrack size in the kernel (other than
printf sizeof(struct sk_buff) :p)?
Conntrack has nothing to do with sk_buff. However, you are able to
find this information with for example:
# egrep "(#|connt)" /proc/slabinfo
Very nice! I'll read about slabinfo. Here is the result from the
current lighttpd server:
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
ip_conntrack_expect      0      0    136   28    1 : tunables  120   60    8 : slabdata      0      0      0
ip_conntrack         23508  39767    304   13    1 : tunables   54   27    8 : slabdata   3059   3059     15
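To turn those slabinfo columns into a memory figure, a rough sketch along these lines may help. It only multiplies <num_objs> by <objsize> for each cache, so it ignores per-slab overhead; the helper names (parse_slabinfo, allocated_bytes) are made up for illustration, and the sample text is simply the output pasted above.

```python
# Sketch: estimate memory held by slab caches from /proc/slabinfo-style lines.
# parse_slabinfo and allocated_bytes are hypothetical helper names.

SAMPLE = """\
ip_conntrack_expect 0 0 136 28 1 : tunables 120 60 8 : slabdata 0 0 0
ip_conntrack 23508 39767 304 13 1 : tunables 54 27 8 : slabdata 3059 3059 15
"""


def parse_slabinfo(text):
    """Return {cache_name: {"active_objs", "num_objs", "objsize"}}."""
    caches = {}
    for line in text.splitlines():
        # Skip blank lines and the header lines of /proc/slabinfo.
        if not line.strip() or line.startswith(("#", "slabinfo")):
            continue
        fields = line.split()
        active_objs, num_objs, objsize = (int(f) for f in fields[1:4])
        caches[fields[0]] = {
            "active_objs": active_objs,
            "num_objs": num_objs,
            "objsize": objsize,
        }
    return caches


def allocated_bytes(entry):
    """Bytes taken by all objects in the cache, whether in use or not."""
    return entry["num_objs"] * entry["objsize"]


caches = parse_slabinfo(SAMPLE)
for name, entry in caches.items():
    mib = allocated_bytes(entry) / (1024 * 1024)
    print(f"{name}: {allocated_bytes(entry)} bytes (~{mib:.1f} MiB)")
```

For the ip_conntrack line this gives 39767 * 304 = 12089168 bytes, roughly 11.5 MiB. On a live box the same function could be fed the contents of /proc/slabinfo (readable by root).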
304 bytes? Which kernel version?
2.6.18-238.12.1.el5 x86_64
CentOS release 5.6 (Final)
It should be around 208 bytes on x86 and 264 bytes on x86_64 (pointers
are twice as long), but this is not all. Each conntrack can have some
additional data attached, known as "extends". Currently there are 5
possible extends:
- helper - struct nf_conn_help: 16 bytes (x86) / 24 bytes (x86_64)
- nat - struct nf_conn_nat: 16 bytes (x86) / 24 bytes (x86_64)
- acct - struct nf_conn_counter: 16 bytes
- ecache - struct nf_conntrack_ecache: 16 bytes
- zone - struct nf_conntrack_zone: 2 bytes
So, as you can see, in the worst case there can be 66 / 82 more bytes
allocated with each conntrack, and this goes into a kmalloc-x slab that
rounds it up to 2^n bytes.
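The worst-case figures above can be checked with a few lines of arithmetic. This is only a sketch using the sizes quoted in this message; the base struct sizes are the approximate ones given above, and next_pow2 is a made-up helper that takes the "rounds to 2^n" description at face value.

```python
# Sketch: worst-case per-conntrack "extends" size, using the numbers quoted above.
# Each tuple is (bytes on x86, bytes on x86_64).
EXTENDS = {
    "helper": (16, 24),
    "nat": (16, 24),
    "acct": (16, 16),
    "ecache": (16, 16),
    "zone": (2, 2),
}
BASE = (208, 264)  # approximate size of the conntrack struct itself


def next_pow2(n):
    """Smallest power of two >= n (hypothetical helper for the 2^n rounding)."""
    p = 1
    while p < n:
        p *= 2
    return p


worst_x86 = sum(size[0] for size in EXTENDS.values())
worst_x86_64 = sum(size[1] for size in EXTENDS.values())

print("extends, worst case:", worst_x86, "/", worst_x86_64, "bytes")
print("rounded to 2^n:", next_pow2(worst_x86), "/", next_pow2(worst_x86_64), "bytes")
print("base + extends:", BASE[0] + worst_x86, "/", BASE[1] + worst_x86_64, "bytes")
```

With all five extends attached, the sums are 66 and 82 bytes, and either one would land in a 128-byte allocation if the rounding is strictly to powers of two.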
That's for conntrack only? This is pretty low (346 bytes max per
connection), I thought conntrack would consume more than that.
No, conntracks are rather cheap and decent hardware is able to handle
even 500K without much trouble. However, as they are hashed and
collected into buckets, you may waste much more memory than the number
of conntracks might indicate. Especially since, under heavy load, you
really need to bump nf_conntrack.hashsize so as not to saturate your
CPU.
What about the sk_buff and other structures? I haven't dived into the
networking layer for some time. I don't need an exact number, just an
idea of how much memory we are talking about.
Size of sk_buff depends on MTU, driver used and several other
factors. With MTU=1500 you are typically able to fit into 2048 or 4096
bytes.
Some test results:
I configured conntrack to accept 500k connections (given the memory
footprint, it shouldn't exceed 200MB) and launched an ab test.
ab -n 20000 -c 20000 -k http://website.com/banner.jpg
(ab doesn't support a value of c > 20k. Do you know of an injector that
supports keepalive and can go beyond that?)
From what I saw in tcpdump, ab establishes all connections before doing
the GET. So I get an idea of how much memory is consumed by those 20k
connections.
slabtop returns
  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
 40586  40586 100%    0.30K   3122       13     12488K ip_conntrack
and top
PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  SWAP CODE DATA COMMAND
15   0  862m 786m  780 S 22.9  3.3  1:44.86   76m  172 786m lighttpd
12MB for ip_conntrack and 786MB for lighttpd are OK for my setup. With
your numbers, I need to add another 80MB max (20k*4k) for the network
memory footprint. I can easily manage 4x that.
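For the record, the back-of-the-envelope numbers used here can be reproduced as below. This is a sketch only: it assumes the 346-byte worst case per conntrack entry and one 4096-byte buffer per connection, both taken from figures earlier in this thread, and the function names are made up for illustration.

```python
# Sketch: rough kernel memory budget from the per-connection figures in this thread.
MIB = 1024 * 1024
CONNTRACK_BYTES = 346  # worst case per conntrack entry on x86_64 (from the thread)
SKB_BYTES = 4096       # upper per-connection figure quoted for MTU=1500 (assumed: one per connection)


def conntrack_budget(entries):
    """Upper bound in bytes for the given number of conntrack entries."""
    return entries * CONNTRACK_BYTES


def network_budget(connections):
    """Upper bound in bytes for network buffers, one SKB_BYTES buffer each."""
    return connections * SKB_BYTES


print(f"conntrack at 500k entries: {conntrack_budget(500_000) / MIB:.1f} MiB")
print(f"network at 20k connections: {network_budget(20_000) / MIB:.1f} MiB")
print(f"network at 4x that: {network_budget(80_000) / MIB:.1f} MiB")
```

The 500k conntrack ceiling stays under the 200MB guess, and 20k connections at 4k each is the 80MB quoted above (about 78 MiB). Hash bucket overhead is not included.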
So, keepalive it is, with an idle timeout set to 5 seconds, and
s/lighttpd/haproxy/ ASAP ;)
I'm now looking at some other kernel parameters related to conntrack.
This answer http://1nw.eu/!S7 on serverfault contains interesting
information.
Thanks a lot !
Julien