Hi Julien,

On Thu, Sep 01, 2011 at 06:51:19PM -0400, Julien Vehent wrote:
> some test results I configured conntrack to accept 500k connections
> (given the memory footprint, it shouldn't exceed 200MB) and launched an
> ab test.
>
> ab -n 20000 -c 20000 -k http://website.com/banner.jpg
>
> (ab doesn't support a value of c > 20k. do you know of a injector that
> support keepalive and can go beyond that ?)
>
> From what I saw in tcpdump, ab establishes all connections before doing
> the GET. So I get an idea of how much memory is consumed by those 20k
> connections.
>
> slabtop returns
>
>    OBJS  ACTIVE  USE  OBJ SIZE  SLABS  OBJ/SLAB  CACHE SIZE  NAME
>   40586   40586 100%     0.30K   3122        13      12488K  ip_conntrack
>
> and top
>
>   PR  NI  VIRT   RES  SHR  S  %CPU  %MEM    TIME+  SWAP  CODE  DATA  COMMAND
>   15   0  862m  786m  780  S  22.9   3.3  1:44.86   76m   172  786m  lighttpd
>
> 12MB for ip_conntrack and 786MB for lighttpd are OK for my setup. with
> your number, I need to add another 80MB max (20k*4k) for the network
> memory footprint. I can easily manage 4x that.
As Krzysztof already said, you can safely ignore the conntrack cost, provided it's correctly tuned to save your CPU. However, you need to clearly distinguish between kernel TCP buffers and user-space TCP buffers. The user-space buffers only depend on the product you're using and are generally stable for a given connection, so you can roughly measure their size and estimate their number for your load.

The kernel buffers are dynamic. Since you're on a proxy, you'll have one send and one receive buffer on each side of the proxy for a given end-to-end connection. If you don't keep connections alive between lighttpd and haproxy, the internal side can be omitted since it will be very ephemeral. So you're still stuck with a high number of connections from the clients.

Those buffers can vary widely in size; check tcp_rmem and tcp_wmem. The left parameter is the smallest possible size, which is used upon memory shortage. The middle one is the default size, assigned to new connections. The right one is the maximum you allow them to grow to if memory permits.

In practice, as long as all your connections fit in the smallest size, you don't have to worry about your system's stability. Performance may be impacted, however, because the higher the RTT, the larger the buffer you need. Your tests with ab were made on the local network, so your window probably never grew beyond 4-5 kB.

If you're really too short on memory, disabling keep-alive on the external side may be better than running with too small windows: you'd then take one extra RTT to set up each connection, but at least it will run with full buffers, while keep-alive with small buffers can add many RTTs to push objects larger than the buffer.

Still, your load seems very manageable. You were talking about 14k clients in 2 minutes, with 6 connections per client. That's 84k connections in 2 minutes, or only 700 connections per second.
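The arithmetic above can be sketched quickly. Note the 4096-byte minimum buffer and counting only the external side are assumptions based on common Linux defaults and the reasoning above, not values measured on your box:

```shell
# Back-of-the-envelope sizing using the figures from this thread.
# Check your real buffer limits with (min / default / max, in bytes):
#   cat /proc/sys/net/ipv4/tcp_rmem
#   cat /proc/sys/net/ipv4/tcp_wmem

clients=14000        # clients seen in the 2-minute window
conns_per_client=6
window=120           # seconds

conns=$((clients * conns_per_client))
rate=$((conns / window))
echo "$conns connections over ${window}s = $rate conns/s"

# Kernel buffer estimate: one rmem + one wmem per connection on the
# external side only (the internal side is ephemeral without keep-alive).
# 4096 is a common minimum (left-hand) tcp_rmem/tcp_wmem value.
min_buf=4096
concurrent=20000     # the ab test above
mem=$((concurrent * min_buf * 2 / 1024 / 1024))
echo "worst case at minimum buffer size: ${mem} MB"
```

Run on this thread's numbers it prints 700 conns/s and about 156 MB, i.e. within the "4x 80MB" margin mentioned above.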
If you keep your connections alive for 5 seconds max, you'll run at 3500 concurrent connections, which is a lot less than what you tested with ab. With some minor tuning, it should be very easy to achieve with 512 or more megs of RAM.

Another point to keep in mind if you intend to enable keep-alive on the internal side: you can't do that with too large a keep-alive timeout, because you'll need a large number of source ports. For instance, if you were running with a keep-alive timeout of 60s, you'd have 42k concurrent connections, while the default settings for ip_local_port_range allow fewer than 30k source ports. This point needs some tuning too, even if you don't intend to enable keep-alive so soon.

Cheers,
Willy
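The source-port math above can be double-checked like this. The 60s timeout is the hypothetical figure from the example, and the 32768-61000 range is the usual Linux default, so verify your own with the sysctl shown:

```shell
# Source-port budget for server-side (internal) keep-alive.
# Inspect the real range with:
#   cat /proc/sys/net/ipv4/ip_local_port_range
# and widen it at runtime with e.g.:
#   sysctl -w net.ipv4.ip_local_port_range="1024 65535"

rate=700             # new connections per second (from the thread)
keepalive=60         # hypothetical keep-alive timeout, in seconds

concurrent=$((rate * keepalive))
echo "at ${keepalive}s keep-alive: $concurrent concurrent connections"

# Typical default range is 32768..61000:
ports=$((61000 - 32768))
echo "default port range offers $ports source ports"
```

With these figures, 42000 concurrent connections would exceed the 28232 default source ports, which is why the range needs widening before enabling internal keep-alive.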

