> When I try with tw_recycle = 0 then I start to get a lot of TIME_WAIT 
> connections and performance degrades quite quickly so I cannot remove it

Having a lot of TIME_WAIT connections shouldn't be a problem; in fact, it's
pretty normal.
With tcp_tw_reuse enabled (and tw_recycle disabled) you should not have any
problems, because the kernel is able to reuse sockets in TIME_WAIT for new
outgoing connections (that's what tcp_tw_reuse does).
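
Just to make sure we are looking at the same settings, this is roughly how you
can check and set them (standard ipv4 sysctls; the values below are only an
example, not a recommendation for your box):

    sysctl net.ipv4.tcp_tw_reuse net.ipv4.tcp_tw_recycle    # check current values
    sysctl -w net.ipv4.tcp_tw_reuse=1                       # allow reuse of TIME_WAIT sockets
    sysctl -w net.ipv4.tcp_tw_recycle=0                     # keep the dangerous recycling off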

tw_recycle on the other hand is quite dangerous with clients behind NAT. I
haven't dug into every detail, but the gist is that tw_recycle makes the
kernel drop SYNs from a source IP whose TCP timestamps appear to go backwards,
which is exactly what happens when several clients share one NAT address. You
do want to avoid it on internet-facing services.

Are you sure, really sure (please triple check), that with tw_recycle = 0 you
actually get degraded performance? It's OK to have (a lot) more sockets in
TIME_WAIT, but did you really measure worse performance, or did you assume it
is worse because you saw more sockets in TIME_WAIT?

I don't believe the performance problem you are seeing is directly related to
the number of sockets in TIME_WAIT.
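
If you want to see whether the TIME_WAIT count even correlates with the
slowdown, watch it while the benchmark is running, e.g. with the same netstat
filter you already used:

    watch -n1 'netstat -an | grep -c TIME_WAIT'   # sockets in TIME_WAIT, refreshed every second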


> Are you sure? I ask you this as netstat -a -n|grep TIME_WAIT|wc -l
> shows only around 4K connections.

In fact, I'm not so sure about the source-port theory anymore. Please give us
the ab results with tw_recycle = 0.

Are you using the -k flag with ab? Try it both with and without -k and post
the results back.
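
For example (hostname, URL, request count and concurrency below are just
placeholders, use whatever you benchmarked with before):

    ab -n 20000 -c 200 http://your-haproxy-host/         # one connection per request
    ab -k -n 20000 -c 200 http://your-haproxy-host/      # with HTTP keep-alive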




>  procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>  1  0      0 235612 105460 157964    0    0     0     0 11545 6428  5 21 74  0
>  0  0      0 235200 105460 157964    0    0     0     0 12896 7497  8 18 74  0
>  1  0      0 234824 105460 157964    0    0     0     0 14740 8578  4 28 68  0
>  1  0      0 235292 105460 157964    0    0     0     0 10976 6423  1 18 81  0
>  1  0      0 235420 105460 157964    0    0     0     0 9668 5898  3 20 77  0
>  1  0      0 234296 105460 157964    0    0     0     0 12969 8001  2 25 73  0
>  1  0      0 234672 105460 157964    0    0     0     0 13888 8529  3 23 74  0
>  0  0      0 235076 105460 157964    0    0     0     0 8081 4717  3 18 79  0
>  1  0      0 235516 105460 157964    0    0     0     0 8465 5026  0 13 87  0
>  0  0      0 235004 105460 157964    0    0     0     0 8770 5223  2 18 80  0
>  0  0      0 235100 105460 157964    0    0     0     0 8635 4921  1 18 81  0
>  0  0      0 234904 105460 157964    0    0     0     0 9532 5805  3 21 76  0
>  0  0      0 234696 105460 157964    0    0     0     0 11013 6468  3 20 77  0
>  0  0      0 235728 105460 157964    0    0     0     0 9707 5185  3 14 83  0

You are clearly not CPU bound (your vmstat output shows roughly 70-85% idle).
And since you only have a single core, CPU/core pinning doesn't make sense
either.



> TARGET  = linux26

You can use the linux2628 target for your kernel; that way haproxy is able to
use more efficient kernel features (like TCP splicing, for example), which
will improve performance. This has nothing to do with the performance problem
you are facing, however.
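
Roughly like this when you rebuild (the USE_OPENSSL flag is just a guess based
on your build output, adjust the USE_* options to whatever you currently use):

    make clean
    make TARGET=linux2628 USE_OPENSSL=1    # linux2628 enables epoll/splice-era features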


> Linux HAproxy 2.6.32-5-amd64 #1 SMP Sun Sep 23 10:07:46 UTC 2012 x86_64 
> GNU/Linux
> [...]
> Built with OpenSSL version : OpenSSL 0.9.8o 01 Jun 2010

Looks like Debian stable. Do you have the same kernel on the other,
better-performing haproxy box?



>> could you try the "option nolinger" in the backend?
> I added it, not much of a difference from what I can see.

Throw it out then, it's bad anyway :)



> HAproxy uses a 100Mb card, the Nginx servers use 1Gb cards

Haproxy has a single 100 Mbit/s NIC handling both incoming and outgoing
traffic? Are you sure the 100 Mbit link is not simply saturated during the
benchmark?

Can you monitor your NIC load with "nload eth" while benchmarking?
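
Keep in mind that 100 Mbit/s is only about 12 MB/s, and since haproxy receives
and forwards every request over the same NIC, the usable end-to-end throughput
is roughly half of that. To check the negotiated link speed (assuming the
interface is eth0, adjust to your actual device):

    ethtool eth0 | grep -i speed      # negotiated speed as reported by the driver
    cat /sys/class/net/eth0/speed     # same information from sysfs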




Lukas
                                          
