Re: Stats for backend queue

Willy Tarreau Sat, 12 May 2012 12:07:10 -0700

On Sat, May 12, 2012 at 08:43:43PM +0300, Bar Ziony wrote:
> So session rate is the number of requests per second ? Why is it called
> session then if it's really requests?


You have both. Initially haproxy had no keep-alive, so 1 req = 1 session.
Now the session column shows the session numbers, and if you hover your
mouse over the number, you'll see the requests too.

> And "Sessions" is just plain sessions number, without caring for how much
> of them were happening in 1 sec?

"sessions cur" is the number of concurrent sessions observed at the moment
it is reported. It is not related to any timing, it's what is observed at
one instant. To keep the analogy with the highway, it's how many lanes are
occupied at the precise instant you're taking the snapshot.

> How can I know the average response time of my servers? haproxy provides
> that data somewhere?

Yes, each response time value is recorded in your logs :-)
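
As a rough sketch of how an average could be pulled from those logs, the
Python below assumes the HTTP log format, where each line carries a timers
field of five slash-separated values (Tq/Tw/Tc/Tr/Tt, in milliseconds) and
Tr is the server response time. The sample lines are made up for
illustration, and the helper name `server_response_times_ms` is
hypothetical; adapt the pattern to your own log format.

```python
import re
from statistics import mean

# Five slash-separated integers surrounded by whitespace. The first such
# group on a line is assumed to be the timers field Tq/Tw/Tc/Tr/Tt.
TIMERS = re.compile(r"\s(-?\d+)/(-?\d+)/(-?\d+)/(-?\d+)/(\+?-?\d+)\s")


def server_response_times_ms(lines):
    """Return the Tr value (server response time, ms) of each usable line."""
    times = []
    for line in lines:
        match = TIMERS.search(line)
        if match is None:
            continue  # not an HTTP log line in the assumed format
        tr = int(match.group(4))
        if tr >= 0:  # a negative value is assumed to mean "no response"
            times.append(tr)
    return times


# Made-up sample lines shaped like the assumed HTTP log format.
SAMPLE = [
    'haproxy[1234]: 10.0.1.2:33317 [12/May/2012:12:07:10.655] www '
    'dynamic/srv1 10/0/30/69/109 200 2750 - - ---- 1/1/1/1/0 0/0 '
    '"GET / HTTP/1.1"',
    'haproxy[1234]: 10.0.1.3:40112 [12/May/2012:12:07:11.002] www '
    'dynamic/srv2 5/0/12/131/148 200 5120 - - ---- 2/2/1/1/0 0/0 '
    '"GET /page HTTP/1.1"',
    'haproxy[1234]: 10.0.1.4:51200 [12/May/2012:12:07:12.480] www '
    'dynamic/srv1 7/0/-1/-1/3007 503 212 - - sC-- 2/2/1/1/3 0/0 '
    '"GET /slow HTTP/1.1"',
]

if __name__ == "__main__":
    times = server_response_times_ms(SAMPLE)
    print("samples:", times)
    print("average server response time (ms):", mean(times))
```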

> I have a max of 800 requests in the backend queue (none in the servers
> queue since there is no persistence). Is that a lot ? :|

It depends. If you're doing 800 reqs/s, you know that on average it will
take one second to drain these 800 requests, so it can be a lot. But if
you're facing an exceptional event, you may accept delaying requests
by up to 1s rather than seeing your servers die or stop responding.
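
The arithmetic is simply queue length divided by processing rate. A minimal
Python sketch, where the function name and the 200 reqs/s comparison figure
are assumptions chosen for illustration:

```python
def drain_time_seconds(queued_requests: int, requests_per_second: float) -> float:
    """Average time needed to empty a queue at a steady processing rate."""
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    return queued_requests / requests_per_second


if __name__ == "__main__":
    # 800 queued requests at 800 reqs/s: about one second of added delay.
    print(drain_time_seconds(800, 800))
    # The same queue at a hypothetical 200 reqs/s takes four times longer.
    print(drain_time_seconds(800, 200))
```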

> I also see 3,400 sessions in the frontend, and only ~100 in the dynamic
> backend and 15 in the static backend (in the "cur" column). How is that
> possible? So many requests are not valid, or sessions are kept and are not
> for 1 request only ? :\ I don't understand that..

It's because a connection is only forwarded to the backend once the client
has sent a request. And for some clients, sending a request takes some time
(e.g. large requests, or simply because of poor network connectivity), so
you always have more connections on the frontend than on the backend.
Also, haproxy closes the connection to the server as soon as it has the
last byte of the response, but it still forwards those data to the client
(so it acts as a TCP buffer). During this response buffering, the clients
are still connected to the frontend but the backend is already released.

This behaviour enables some multiplexing of the server connections, because
they never remain idle, even if the clients are slow to read the responses.

> I'm sorry but I didn't quite get what does Concurrency means.

It is the number of parallel sessions you have at one instant. When you do
"netstat -an | grep -c ESTAB", you get a number of concurrent connections.

> Connection/sec * response time ? Why is that = concurrency?

If you're not used to this, you need to draw it on paper to understand.

Imagine a road passing on a bridge. Your bridge is designed to support
100 cars. This is its concurrency limit. The response time is the time
a car spends on the bridge. The cars enter the bridge at a rate of 4
per second. If the cars stay more than 25 seconds on the bridge, you'll
have more than 100 cars on your bridge and it might break. If you make
your cars run faster on the bridge, they will spend less time there and
there will be fewer cars on it. If you are on a holiday season, you'll
get a higher "car rate" at the input of the bridge, and if they don't
run faster, you'll break the limit again.
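
The bridge example boils down to concurrency = arrival rate * time spent.
Here is a small Python sketch of it; the function name and the sample
crossing times of 10 and 30 seconds are made up for illustration, only the
4 cars/s rate, the 25 s threshold and the limit of 100 come from the text
above.

```python
BRIDGE_LIMIT = 100  # cars the bridge is designed to support


def concurrency(arrival_rate_per_s: float, time_spent_s: float) -> float:
    """Average number of items present at one instant (rate * duration)."""
    return arrival_rate_per_s * time_spent_s


if __name__ == "__main__":
    for seconds_on_bridge in (10, 25, 30):
        cars = concurrency(4, seconds_on_bridge)
        status = "over the limit" if cars > BRIDGE_LIMIT else "ok"
        print(f"{seconds_on_bridge:>2}s on the bridge -> {cars:.0f} cars ({status})")
```

Replace cars with connections and the time on the bridge with the response
time, and you get the number of concurrent connections.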

> Here they are:
> net.ipv4.tcp_mem = 24372 32496 48744
> net.ipv4.tcp_wmem = 4096 16384 1039872
> net.ipv4.tcp_rmem = 4096 87380 1039872
> 
> Are those valid?

So you have between 24372*4096 and 48744*4096 bytes, i.e. roughly 100 to
200 MB of RAM assigned to the TCP stack, which is fine considering your
VM size.
Your read and write buffers are correct too (the kernel automatically
adjusts them depending on the available memory).

However, you have to be aware that a socket buffer needs at least 4kB
in each direction (which is why the min limit is 4kB), so 200 MB limits
you to 200/2/4 = 25k sockets, 20k of which can be on the frontend side
and 5k of which may be on the backend side.
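
The same calculation as a Python sketch, assuming 4096-byte pages and the
4kB minimum buffer per direction mentioned above. The function names are
hypothetical. Note that the exact result is a bit under the rounded 25k
figure.

```python
PAGE_SIZE = 4096          # bytes per page, tcp_mem is expressed in pages
MIN_SOCKET_BUFFER = 4096  # minimum buffer per direction, in bytes


def tcp_mem_bytes(pages: int) -> int:
    """Convert a tcp_mem value (in pages) to bytes."""
    return pages * PAGE_SIZE


def max_sockets(total_bytes: int) -> int:
    """Sockets that fit if each needs one minimum buffer per direction."""
    return total_bytes // (2 * MIN_SOCKET_BUFFER)


if __name__ == "__main__":
    low, high = tcp_mem_bytes(24372), tcp_mem_bytes(48744)
    print(f"TCP memory: {low / 1e6:.0f} MB to {high / 1e6:.0f} MB")
    print(f"Upper bound on sockets: {max_sockets(high)}")
```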

> > You can reduce haproxy's memory usage by reducing buffer sizes this
> > way here :
> >    tune.bufsize 8030
> >    tune.maxrewrite 1030
> 
> But would this hurt somehow?

It would only hurt if your average object size is larger than 7kB. And it
would not hurt that much, it would only eat a bit more CPU because haproxy
would have to perform twice the number of recv/send to forward data between
sockets. If you're using mostly large objects (more than twice the buffer
size), you can also enable "option splice-response" which will permit
forwarding between the two sockets without user-land copy. It becomes
insensitive to the buffer size and generally saves some CPU cycles.

> I can increase the RAM if that will solve the
> problem! I just wonder how it is possible that haproxy is using so much
> RAM, when I didn't see so much RAM usage from my old single web server
> (nginx).

As I said, it all depends on how many connections you were having when
you observed the issue. With the 16kB default buffers, 20k conns * (2
buffers + ~1kB for internal structs) will take around 650 MB of RAM.
This plus the 200 MB for the kernel buffers almost reaches your VM
memory. I forgot to say that you also have conntrack to account for.
What is unclear is whether you really reached those 20k conns.
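
For reference, the estimate can be written as a short Python sketch. It
assumes two buffers plus about 1kB of internal structures per connection,
as stated above; the function name is hypothetical and the result is only
an order of magnitude, not an exact accounting of haproxy's memory.

```python
def haproxy_memory_bytes(conns: int, bufsize: int, struct_overhead: int = 1024) -> int:
    """Rough memory estimate: two buffers plus ~1kB of structs per connection."""
    return conns * (2 * bufsize + struct_overhead)


if __name__ == "__main__":
    default_buffers = haproxy_memory_bytes(20000, 16 * 1024)  # 16kB buffers
    reduced_buffers = haproxy_memory_bytes(20000, 8030)       # tune.bufsize 8030
    print(f"16kB buffers: {default_buffers / 1e6:.0f} MB")
    print(f"8030-byte buffers: {reduced_buffers / 1e6:.0f} MB")
```

With the reduced buffers the same 20k connections need roughly half the
memory.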

> I don't want to configure stuff for a low-RAM machine if I actually need
> more RAM. We have no problem paying for a bigger VPS (but unfortunately we
> must stay on this VPS infrastructure).

OK, then let's tune more finely and increase the RAM size at the same
time: use the 8kB buffers in haproxy as I indicated above, and increase
your RAM to 1.5-2GB to be safe.

> Only nginx is running on this machine as well to terminate SSL, but it
> seems like haproxy is the one that consumes all memory. Nothing else is
> running on the machines besides syslog for haproxy and the machine itself,
> regular processes and munin plugins every 5 minutes (which are not causing
> any RAM issues)...

OK, so it should be easy to get the correct numbers once and for all.

Regards,
Willy
