Hi Julien,

On Thu, Sep 01, 2011 at 06:51:19PM -0400, Julien Vehent wrote:
> Some test results: I configured conntrack to accept 500k connections 
> (given the memory footprint, it shouldn't exceed 200MB) and launched an 
> ab test.
> 
> ab -n 20000 -c 20000 -k http://website.com/banner.jpg
> 
> (ab doesn't support a value of c > 20k. Do you know of an injector that 
> supports keepalive and can go beyond that?)
> 
> From what I saw in tcpdump, ab establishes all connections before doing 
> the GET. So I get an idea of how much memory is consumed by those 20k 
> connections.
> 
> slabtop returns
> 
>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
>  40586  40586 100%    0.30K   3122       13     12488K ip_conntrack
> 
> and top
> 
>  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  SWAP CODE DATA COMMAND
>  15   0  862m 786m  780 S 22.9  3.3   1:44.86  76m  172 786m lighttpd
> 
> 12MB for ip_conntrack and 786MB for lighttpd are OK for my setup. With 
> your numbers, I need to add another 80MB max (20k*4k) for the network 
> memory footprint. I can easily manage 4x that.

As Krzysztof already said, you can safely ignore the conntrack cost,
provided it's correctly tuned so that it doesn't waste your CPU.
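
For reference, here is a minimal sketch of the kind of conntrack tuning
meant here. The values are purely illustrative, and the exact knob depends
on your kernel (older kernels use net.ipv4.netfilter.ip_conntrack_max):

  # raise the maximum number of tracked connections (example value)
  sysctl -w net.netfilter.nf_conntrack_max=500000
  # the hash table size (module parameter "hashsize") should be raised
  # accordingly, e.g. to roughly conntrack_max / 4, so lookups stay cheap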

However you need to clearly distinguish between kernel TCP buffers and
user-space TCP buffers. The user-space buffers only depend on the product
you're using and are generally stable for a given connection, so you can
roughly measure their size and estimate their number for your load.
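
As a rough illustration using the numbers you quoted above (a coarse upper
bound, since the resident size also includes the process's own footprint):

  786 MB RES / 20,000 established connections ~= 40 kB per connection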

The kernel buffers are dynamic. Since you're on a proxy, you'll have
one send and one receive buffer on each side of the proxy for a given
end-to-end connection. If you don't keep the connections alive between
lighttpd and haproxy, the internal side of the connection can be omitted
since it will be very ephemeral. So you're still stuck with the high number
of connections from the clients. Those buffers can vary widely in size;
check tcp_rmem and tcp_wmem. The left value is the smallest possible
size, which is used upon memory shortage. The middle one is the default
size, assigned to new connections. The right one is the maximum you
allow them to grow to if memory permits. In practice, as long as all
your connections fit in the smallest size, you don't have to worry about
your system's stability. Performance may be impacted however, because
the higher the RTT, the larger the buffer you need. Your tests with ab
were made on the local network, so your window probably never grew
beyond 4-5 kB.
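
For the sake of illustration, this is the kind of setting meant here.
The values are only examples, not a recommendation for your setup:

  # min / default / max socket buffer sizes, in bytes
  sysctl -w net.ipv4.tcp_rmem="4096 16384 262144"
  sysctl -w net.ipv4.tcp_wmem="4096 16384 262144"
  # the current values can be checked with:
  sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem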

If you're really too short on memory, disabling keep-alive on the external
side may be better than running with too small windows: you'd then take one
extra RTT to set up each connection, but at least it will run with full
buffers, while keep-alive with small buffers can add many RTTs to push
objects larger than the buffer.
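
If you go that route, here is a minimal sketch of what it could look like
in an haproxy frontend ("www" and "lighttpd_farm" are just example names;
check the options available in your version):

  frontend www
      bind :80
      # add "Connection: close" so that neither side keeps the
      # connection open once the response has been delivered
      option httpclose
      default_backend lighttpd_farm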

Still, your load seems very manageable. You were talking about 14k clients
in 2 minutes, with 6 connections per client. That is indeed 84k connections
in 2 minutes, which is only 700 connections per second. If you keep your
connections alive for 5 seconds at most, you'll run at about 3500 concurrent
connections, which is a lot less than what you tested with ab. With some
minor tuning, it should be very easy to achieve with 512 MB of RAM or more.
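
Spelled out, the arithmetic is simply:

  14,000 clients x 6 connections        =  84,000 connections
  84,000 connections / 120 s            =     700 connections/s
  700 connections/s x 5 s keep-alive    =   3,500 concurrent connections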

Another point to keep in mind if you intend to enable keep-alive on the
internal side: you can't do that with overly large keep-alive timeouts,
because you'll need a large number of source ports. For instance, if
you were running with a keep-alive timeout of 60s, you'd have 42k
concurrent connections (700 conns/s x 60 s). The default setting for the
local port range in ip_local_port_range allows fewer than 30k source
ports. This is a point which needs some tuning too, even if you don't
intend to enable keep-alive this soon.
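
For reference, a sketch of the corresponding tuning. The range below is
only an example; the default on most kernels is 32768-61000, i.e. about
28k usable ports:

  # widen the range of source ports available for outgoing connections
  sysctl -w net.ipv4.ip_local_port_range="1024 65535"
  # check the current value
  sysctl net.ipv4.ip_local_port_range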

Cheers,
Willy

