On Thu, 01 Sep 2011 22:55:48 +0200, Krzysztof Olędzki wrote:
On 2011-09-01 22:15, Julien Vehent wrote:
We do not do any keepalive past lighttpd. Everything is non-keepalive because we need the x-forwarded-for header on each request for tomcat.
(that's yet another story)

With http-server-close you will still get a proper x-forwarded-for header.

Anyway, the reason why I don't want to put haproxy in front of
everything right now is because we have a shit load of rewrite rules and
it will take me forever to convert all of that into haproxy acl's
syntax.

Sure, but passing the same traffic over several proxies looks very
suboptimal to me. Especially since HAProxy was designed to be a proxy
and lighttpd was not.


Oh, don't get me wrong, the sooner I can move away from lighttpd and remove a piece of infrastructure, the better. But, hey, you know how it is. $boss keeps filling up the todo list and I never had time to work on that.

Plus, whether it's lighttpd or haproxy, the memory impact on the kernel
is going to be the same.

Kernel - yes, userspace - no. Remember that a proxy needs some memory
for each active connection. And this is much, much more memory than is
required by the kernel.

Right now, I'm considering putting a keepalive timeout at 5 seconds.
Maybe even less.

From my pov 5s for keepalive looks very reasonable.


I settled for that: 5s keepalive.

As a side question: do you know where I can find the information
regarding connection size and conntrack size in the kernel? (other
than printf size_of(sk_buff) :p).

Conntrack has nothing to do with sk_buff. However, you can find this
information with, for example:
# egrep "(#|connt)" /proc/slabinfo


Very nice ! I'll read about slabinfo. Here is the result from the
current lighttpd server

# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
ip_conntrack_expect 0 0 136 28 1 : tunables 120 60 8 : slabdata 0 0 0
ip_conntrack 23508 39767 304 13 1 : tunables 54 27 8 : slabdata 3059 3059 15
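
For what it's worth, the slabdata columns are enough to turn that output into a memory figure. Here is a minimal Python sketch, assuming the column layout of the header above and a 4 KiB page size (the page size is an assumption, it is not part of the output). The names parse_slab_line, cache_bytes and PAGE_SIZE are only illustrative.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, typical for x86 and x86_64


def parse_slab_line(line):
    """Parse one data line of /proc/slabinfo into a dict of integers."""
    # The line has three sections separated by ':' (stats, tunables, slabdata).
    stats, _tunables, slabdata = [part.split() for part in line.split(":")]
    return {
        "name": stats[0],
        "active_objs": int(stats[1]),
        "num_objs": int(stats[2]),
        "objsize": int(stats[3]),
        "objperslab": int(stats[4]),
        "pagesperslab": int(stats[5]),
        "num_slabs": int(slabdata[2]),  # slabdata[0] is the literal "slabdata"
    }


def cache_bytes(slab):
    """Memory held by the cache: slabs * pages per slab * page size."""
    return slab["num_slabs"] * slab["pagesperslab"] * PAGE_SIZE


line = "ip_conntrack 23508 39767 304 13 1 : tunables 54 27 8 : slabdata 3059 3059 15"
slab = parse_slab_line(line)
print(slab["name"], cache_bytes(slab) // 1024, "KiB")
```

Under those assumptions the ip_conntrack cache above holds roughly 12 MB, whether or not every object in it is active.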

304 bytes? Which kernel version?


2.6.18-238.12.1.el5 x86_64
CentOS release 5.6 (Final)

It should be around 208 bytes on x86 and 264 bytes on x86_64 (2 x
longer pointers), but this is not all. Each conntrack can have some
additional data attached, which is known as "extends". Currently
there are 5 possible extends:
  - helper - struct nf_conn_help: 16 bytes (x86) / 24 bytes (x86_64)
  - nat - struct nf_conn_nat: 16 bytes (x86) / 24 bytes (x86_64)
  - acct - struct nf_conn_counter: 16 bytes
  - ecache - struct nf_conntrack_ecache: 16 bytes
  - zone - struct nf_conntrack_zone: 2 bytes

So, as you can see, in the worst case there can be 66 / 82 more bytes
allocated with each conntrack, and this goes into a kmalloc-x slab that
rounds it up to 2^n bytes.
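
The arithmetic behind those figures can be checked with a short Python sketch. It only uses the sizes quoted in this message and the 2^n rounding described above; real kernels may differ, and the names BASE, EXTENDS, next_pow2 and worst_case_extends are illustrative.

```python
# Sizes in bytes as quoted in this thread, per architecture.
ARCHES = ("x86", "x86_64")
BASE = {"x86": 208, "x86_64": 264}
EXTENDS = {
    "helper": {"x86": 16, "x86_64": 24},
    "nat": {"x86": 16, "x86_64": 24},
    "acct": {"x86": 16, "x86_64": 16},
    "ecache": {"x86": 16, "x86_64": 16},
    "zone": {"x86": 2, "x86_64": 2},
}


def next_pow2(n):
    """Smallest power of two >= n, mirroring the 2^n rounding mentioned above."""
    size = 1
    while size < n:
        size *= 2
    return size


def worst_case_extends(arch):
    """Sum of all five extends, i.e. every extend attached at once."""
    return sum(sizes[arch] for sizes in EXTENDS.values())


for arch in ARCHES:
    ext = worst_case_extends(arch)
    print(arch, "extends:", ext,
          "rounded:", next_pow2(ext),
          "base + extends:", BASE[arch] + ext)
```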


That's for conntrack only ?  This is pretty low (346 bytes max per
connection), I thought conntrack would consume more than that.

No, conntracks are rather cheap and decent hardware is able to handle
even 500K without much trouble. However, as they are hashed and
collected into buckets, you may waste much more memory than the number
of conntracks might indicate. Especially since under heavy load you
really need to bump nf_conntrack.hashsize so as not to saturate your CPU.
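
To illustrate why the hash size matters, here is a rough Python sketch of the average number of entries per bucket, assuming a uniform hash. The hashsize values are hypothetical examples, not recommendations or kernel defaults, and avg_chain_length is an illustrative name.

```python
def avg_chain_length(entries, hashsize):
    """Average entries per bucket, assuming the hash spreads them evenly."""
    return entries / float(hashsize)


entries = 500000  # the 500K conntracks mentioned above

# Hypothetical hash sizes, chosen only to show the trend.
for hashsize in (16384, 65536, 262144):
    print("hashsize", hashsize,
          "-> about", round(avg_chain_length(entries, hashsize), 1),
          "entries per bucket")
```

Longer chains mean more entries to walk on each lookup, which is where the CPU cost comes from.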

What about the sk_buff and other structures? I didn't dive in the
networking layer for some time. I don't need an exact number, but just
an idea of how much memory we are talking about.

Size of sk_buff depends on MTU, driver used and several other
factors. With MTU=1500 you typically are able to fit into 2048 or 4096
bytes.


Some test results: I configured conntrack to accept 500k connections (given the memory footprint, it shouldn't exceed 200MB) and launched an ab test.

ab -n 20000 -c 20000 -k http://website.com/banner.jpg

(ab doesn't support a value of c > 20k. Do you know of an injector that supports keepalive and can go beyond that?)

From what I saw in tcpdump, ab establishes all connections before doing the GET. So I get an idea of how much memory is consumed by those 20k connections.

slabtop returns

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
 40586  40586 100%    0.30K   3122       13     12488K ip_conntrack

and top

 PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  SWAP CODE DATA COMMAND
 15   0  862m 786m  780 S 22.9  3.3   1:44.86  76m  172 786m lighttpd

12MB for ip_conntrack and 786MB for lighttpd are OK for my setup. With your number, I need to add another 80MB max (20k*4k) for the network memory footprint. I can easily manage 4x that.
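
The budget above can be written down as a small Python sketch. It uses only the figures from this thread and assumes one buffer of at most 4096 bytes per connection, which is the 20k*4k estimate and a simplification; the function and variable names are illustrative.

```python
KIB = 1024
MIB = 1024 * KIB


def network_bytes(connections, bytes_per_connection):
    """Rough network memory: one buffer of the given size per connection."""
    return connections * bytes_per_connection


connections = 20000
conntrack_cache = 12488 * KIB   # CACHE SIZE reported by slabtop
lighttpd_res = 786 * MIB        # RES reported by top
network = network_bytes(connections, 4096)  # upper figure for MTU=1500

total = conntrack_cache + lighttpd_res + network
print("network: %.1f MiB" % (network / float(MIB)))
print("total:   %.1f MiB" % (total / float(MIB)))

# Sanity check on the 500k conntrack limit: 346 bytes worst case per entry.
conntrack_500k = 500000 * 346
print("500k conntracks: %.1f MB" % (conntrack_500k / 1e6))
```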

So, keepalive it is, with an idle timeout set to 5 seconds. And s/lighttpd/haproxy/ ASAP ;)

I'm now looking at some other kernel parameters related to conntrack. This answer http://1nw.eu/!S7 on serverfault contains interesting information.


Thanks a lot !

Julien

