On 2011-08-31 17:11, Julien Vehent wrote:
Hi List,
Hello,
This might be a tiny bit off the haproxy topic, but I figured the folks
here would be interested in the subject.
My company is launching a new website on Monday, with TV coverage and
potentially large waves of visitors in very short windows (the estimate
is around 14k visitors in a 2-minute window).
So, I'm reviewing our configuration, and my biggest problem right now
is our single node HTTP frontend that uses keep-alive.
(the frontend is running lighttpd but the problem would be the same
with haproxy).
Some assumptions:
* a browser usually opens 6 parallel TCP connections with keep-alive
* the browser will keep the connection open until the timeout is
reached, even if the tab is closed (observed in FF, might not be true on
every browser)
* on the server side, each connection will consume ~150K of memory in
the kernel (I use conntrack and want to keep it, is that estimation
correct ?)
* all of our servers are hosted on the east coast. The RTT from a
server in Las Vegas is around 80ms.
* The home page with keep-alive uses ~25 TCP connections and 1500
packets. Without keep-alive, this number rises to ~210 TCP connections
and over 3200 packets.
So, 6 * 14,000 = 84,000 connections. 84,000 * 150K ~= 12GB of memory.
Here is the problem:
1. I don't have that amount of memory available on the front end.
2. lighttpd 1.4 is not very comfortable with that number of
connections to manage; it hurts the hits/s a lot.
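The arithmetic above can be checked with a quick sketch. It only uses the figures from this message; the ~150K per connection is the unverified assumption stated earlier, not a measured value, and the function name is just illustrative.

```python
# Back-of-the-envelope check of the memory estimate in this message.
CONNS_PER_BROWSER = 6   # parallel keep-alive connections per browser (assumption from the post)
VISITORS = 14_000       # expected visitors in the 2-minute window
KB_PER_CONN = 150       # assumed kernel memory per connection, unverified

def frontend_memory(visitors, conns_per_browser, kb_per_conn):
    """Return (total connections, total memory in GiB)."""
    connections = visitors * conns_per_browser
    total_kb = connections * kb_per_conn
    return connections, total_kb / (1024 * 1024)

connections, gib = frontend_memory(VISITORS, CONNS_PER_BROWSER, KB_PER_CONN)
print(f"{connections} connections, about {gib:.1f} GiB")
```

With these inputs it reports 84000 connections and about 12.0 GiB, so the whole estimate stands or falls with the 150K figure.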
But on the other hand, I'm concerned about the 80ms RTT.
I am going to mitigate some of these issues with a CDN and a secondary
www record with a secondary lighttpd, but the debate concerns the
keep-alive feature. I'd like to turn it off, but I'm worried that the
impact on page load time is going to be high. Once the content
retrieval is done, we have a lot of ajax requests for browsing the site
that usually fit in a single TCP connection. But I'm not certain that
the browser will free the other connections and just keep one open.
This is where I need Haproxy's users wisdom :)
I know there have been a number of discussions about keep-alive
consuming too many resources. I kind of agree with that, but given the
assumptions and the situation (an RTT between 80ms and 100ms for half our
users), do you think it's wise to deactivate it ?
Your concern is valid, and I think this is a case where you should
take advantage of HAProxy, so you came to the right place. ;)
Each active session on HAProxy costs little (much less than on an
HTTP server), so you can use "http-server-close" mode. It provides
keep-alive to clients and only to clients: HTTP requests between the LB
and the HTTP server(s) are handled without keep-alive.
HAProxy can also transparently distribute requests (and load) between
two or more servers, without additional DNS records.
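A minimal sketch of what such a setup could look like in haproxy.cfg, assuming HAProxy 1.4 or later. The section names, server names, addresses, maxconn and timeout values are placeholders chosen for illustration and would need tuning for the real site.

```
global
    maxconn 100000                  # placeholder: must cover the expected client connections

defaults
    mode http
    option http-server-close        # keep-alive towards clients, close towards servers
    timeout http-keep-alive 5s      # placeholder: how long an idle client connection is kept
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend www
    bind :80
    maxconn 100000                  # placeholder, same reasoning as in global
    default_backend web

backend web
    balance roundrobin
    server web1 192.0.2.11:80 check # hypothetical lighttpd backends
    server web2 192.0.2.12:80 check
```

A short "timeout http-keep-alive" limits how long idle client connections stay open, which matters if browsers hold them until the timeout as described in the question.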
As a side question: do you know where I can find the information
regarding connection size and conntrack size in the kernel ? (other than
printf size_of(sk_buff) :p).
Conntrack has nothing to do with sk_buff. However, you can find
this information with, for example:
# egrep "(#|connt)" /proc/slabinfo
It should be around 208 bytes on x86 and 264 bytes on x86_64 (pointers
are twice as long), but this is not all. Each conntrack can have some
additional data attached, known as "extends". Currently there are 5
possible extends:
- helper - struct nf_conn_help: 16 bytes (x86) / 24 bytes (x86_64)
- nat - struct nf_conn_nat: 16 bytes (x86) / 24 bytes (x86_64)
- acct - struct nf_conn_counter: 16 bytes
- ecache - struct nf_conntrack_ecache: 16 bytes
- zone - struct nf_conntrack_zone: 2 bytes
So, as you can see, in the worst case there can be 66 / 82 more bytes
(x86 / x86_64) allocated with each conntrack, and this goes into a
kmalloc-x slab that rounds it up to 2^n bytes.
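The worst case above can be worked through with a short sketch. It uses only the sizes listed in this message and assumes, as stated, that the extends area is rounded up to the next power of two; any extra header on that area and other per-object slab overhead are ignored, so treat the result as a rough figure.

```python
# Worst-case conntrack footprint from the sizes quoted in this message.
BASE = {"x86": 208, "x86_64": 264}   # bytes per conntrack object in its own slab
EXTENDS = {
    "helper": {"x86": 16, "x86_64": 24},
    "nat":    {"x86": 16, "x86_64": 24},
    "acct":   {"x86": 16, "x86_64": 16},
    "ecache": {"x86": 16, "x86_64": 16},
    "zone":   {"x86": 2,  "x86_64": 2},
}

def round_up_pow2(n):
    """Smallest power of two >= n (simplified model of kmalloc rounding)."""
    size = 1
    while size < n:
        size *= 2
    return size

def worst_case_bytes(arch):
    """Return (raw extends bytes, base + rounded extends bytes) for one conntrack."""
    ext = sum(sizes[arch] for sizes in EXTENDS.values())
    return ext, BASE[arch] + round_up_pow2(ext)

for arch in BASE:
    ext, total = worst_case_bytes(arch)
    print(f"{arch}: {ext} bytes of extends, about {total} bytes per conntrack")
```

Under these assumptions a conntrack entry stays well under 1K even with every extend attached, so conntrack itself accounts for only a small part of a ~150K per-connection estimate.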
Best regards,
Krzysztof Olędzki