Hello,

On Wed, Jul 29, 2009 at 12:17:47AM -0700, xx yy wrote:
(...)
> But let's do it one more time; this time we will let it run a little longer
> 
> r...@lighttpdubu:~#  httperf --server 127.0.0.1 --port 80 --uri /index.html 
> --rate 250 --num-conn 100000 --num-call 1 --timeout 5
(...)
> Errors: total 39141 client-timo 31987 socket-timo 0 connrefused 0 connreset 0
> Errors: fd-unavail 7154 addrunavail 0 ftab-full 0 other 0
> 
> ====== I had to stop it because it timed out with the same settings ==========
> 
> So this is just a matter of time until it times out, because the TCP ports
> are not reused; this is why I need some advice on tuning the system stack.
> 
> This is what I have until now:
> 
> net.ipv4.tcp_fin_timeout = 1

This one is definitely wrong ! You're not even allowing the other side to
lose the last ACK and retransmit it ! Please increase this value to at
least 30s.

> net.ipv4.ip_local_port_range="1024 65536"

This one is wrong too. Port 65536 does not exist. I don't know what the
system will do with such a range. Maybe it ignores it, maybe it will
randomly fail to bind to the source port, maybe it internally limits
it and will not be affected... Lots of maybes.

Also you need to set net.ipv4.tcp_tw_reuse to 1, so that the client
can reuse a port whose previous socket is still in TIME_WAIT once all
ports have been used.
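Put together, the corrected values above would look like this as a sysctl
fragment (a sketch covering only the three settings discussed here, e.g. in
/etc/sysctl.conf):

```
# Allow the peer to retransmit the last ACK (30s minimum suggested above)
net.ipv4.tcp_fin_timeout = 30
# Valid port range: 65535 is the highest existing port
net.ipv4.ip_local_port_range = 1024 65535
# Let the client reuse source ports still held in TIME_WAIT
net.ipv4.tcp_tw_reuse = 1
```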

> sysctl -w fs.file-max=2000000
> 
> Are my tests wrong or the system can't handle more than 200 requests/second ?

There is no reason for that; I regularly test at above 75000 requests/s
between client and server.

> > You also need to adjust haproxy settings according to your webservers
> > capabilities (maxconn/weight/minconn/inter).
> 
> Can you please exemplify this ?

What Benoit meant is that if your servers are configured for 200 concurrent
connections, you must not exceed that in haproxy's configuration (server's
"maxconn" parameter). If your servers are unbalanced in terms of capacity,
you should reflect this in the "weight" parameter, etc...
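In haproxy's configuration this maps to per-server parameters, for example
(server names, addresses and numbers here are purely illustrative):

```
    # web1 accepts at most 200 concurrent connections and has twice the
    # capacity of web2, so it also gets twice the weight
    server web1 10.0.0.1:80 maxconn 200 weight 100 check inter 2000
    server web2 10.0.0.2:80 maxconn 100 weight 50 check inter 2000
```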

> > A basic apache2 memory imprint is around 10Mb/process, so to answer 100
> > concurrent connections it need 1Go of memory
> 
> This is why I quit Apache. I am looking for a testing methodology to
> determine the maximum requests/second that a server can handle - I know
> it is not fully HAProxy-related, but any suggestions are very welcome.

The number of requests/s is easily checked by requesting a small object,
so that you don't account for the network transfer time.
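For instance, adapting the httperf command from earlier in the thread to a
tiny object with several requests per connection (the URI and numbers are
only illustrative):

```
httperf --server 127.0.0.1 --port 80 --uri /small.html \
        --rate 500 --num-conn 20000 --num-call 10 --timeout 5
```

With --num-call 10, each connection carries 10 requests, so the measured
request rate is not capped by the connection rate.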

> > I would recommend tuning the software stack *before* tuning the system
> > stack.

I agree a lot here ! Too often we see people doing crap on system tunables
instead of fixing the application. The system must only be tuned when you
know *why* you have to tune it.

> I followed your advice and tuned the webservers a little, changing to
> epoll, sendfile and noatime, but Willy Tarreau did not post his sysctl
> parameters and HAProxy settings used in the last test

It's precisely because there is nothing magic here: I have not changed them
from the defaults my system boots with (though I agree that those were
already tuned a bit a long time ago). Basically, they just consist in
increasing the source port range, increasing the TIME-WAIT buckets and
max_orphans, setting tw_reuse to 1, and increasing somaxconn and
max_syn_backlog in order to accept 10-20000 un-acknowledged connections.
When I unpack the machines and boot them again I can give you the specific
values, as I don't have them all in mind, but once again, nothing really
magic here.
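That kind of tuning could be sketched as a sysctl fragment like the one
below; the actual values are not from this mail and are only plausible
guesses matching the description above:

```
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_max_tw_buckets = 262144
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_tw_reuse = 1
# accept on the order of 10-20000 un-acknowledged connections
net.core.somaxconn = 16384
net.ipv4.tcp_max_syn_backlog = 16384
```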

> and I find it hard to believe that a 2.6 kernel can handle 38000 concurrent
> connections without at least increasing the tcp_backlog queue or decreasing
> the fin_timeout.

The backlog and fin_timeout have nothing to do with concurrent connections;
they are related to the connection rate. The source port range will impact
concurrent connections. The max number of sockets (fs.file-max) too. And
BTW I just rechecked, and in this test I reached 38000 connections/s, not
38000 concurrent connections, so yes, backlog and fin_timeout have to be
slightly adjusted. But if haproxy never takes more than 10ms between two
accept() cycles, even a very small backlog of 380 is OK. The fin_timeout
issue is properly handled by the tw_reuse setting and the fact that haproxy
does a setsockopt(SO_REUSEADDR).
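A quick back-of-the-envelope calculation shows why the source port range
caps the sustained connection rate rather than the number of concurrent
connections. This sketch assumes Linux's fixed 60s TIME_WAIT duration and
tcp_tw_reuse left disabled:

```python
# Each closed client connection keeps its source port in TIME_WAIT for
# 60s on Linux, so without tw_reuse the sustainable connection rate is
# bounded by (number of usable source ports) / 60.

TIME_WAIT_SECONDS = 60

def max_conn_rate(port_min, port_max, tw_seconds=TIME_WAIT_SECONDS):
    """Highest sustained connections/s before the client runs out of
    free source ports."""
    ports = port_max - port_min + 1
    return ports / tw_seconds

# Widened range from this thread (with the off-by-one fixed):
print(max_conn_rate(1024, 65535))   # ~1075 conn/s
# Typical default Linux range:
print(max_conn_rate(32768, 61000))  # ~470 conn/s
```

Setting tcp_tw_reuse = 1 lifts this ceiling by letting new outgoing
connections reuse ports still in TIME_WAIT.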

Regards,
Willy

