Hi!

I've got a stable 70-80k session rate at last :)

I've tried everything: upgraded haproxy to 1.4.15, tried "bind :80
defer-accept", even wrote a script to try all possible CPU affinity
combinations for the IRQs and haproxy (an important note). I've also
removed bond0. Nothing helped.
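
In case it's useful to anyone, the affinity pinning itself boils down
to something like this (the interface name, masks and IRQ numbers are
of course box-specific):

  # route eth0's IRQs to core 0 (mask 0x1), pin haproxy to core 1 (mask 0x2)
  for irq in $(grep eth0 /proc/interrupts | cut -d: -f1); do
      echo 1 > /proc/irq/$irq/smp_affinity
  done
  taskset -p 2 $(pidof haproxy)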

After that, following Hank A. Paulson's advice, I rebooted the server
and disabled hyperthreading ("logical processors", as Dell calls them),
and then, without any other tuning, I got a 40-50k session rate. After
binding the IRQs and haproxy to the first and second cores I saw 70-80k.
So it was just the slow per-core processing power of the Xeon. It seems
impossible to make this 2x hexa-core Xeon @ 2.66 GHz run haproxy faster
than my desktop (a simple Core i5, which showed an 85k session rate
without any tuning at all).

Tomorrow I'm going to try running two haproxy processes and
distributing the IRQs to the second core. I'm also going to try
disabling MSI support when loading bnx2. I have almost no hope of
seeing 100k here, but I'm just curious :)
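
If I read the bnx2 module parameters correctly, disabling MSI should
just be a matter of (not tested on my side yet):

  # /etc/modprobe.d/bnx2.conf
  options bnx2 disable_msi=1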

2011/7/19 Willy Tarreau <w...@1wt.eu>
>
> Hi Dmitriy,
>
> On Tue, Jul 19, 2011 at 04:25:29AM +0400, Dmitriy Samsonov wrote:
> > Using one/two clients with ab2 -c 250 -H 'Accept-Encoding: None' -n
> (...)
> >   98%     11
> >   99%     13
> >  100%   9022 (longest request)
>
> Still some drops. Was this before or after ethtool+somaxconn ?
>
> > Also I've upgraded kernel from 2.6.38-r6 (gentoo) to 2.6.39-r3
> > (gentoo) - nothing changed. At all. haproxy version is 1.4.8.
>
> OK. When you have time, you can also update haproxy. No less than
> 44 bugs were fixed since 1.4.8, and a few improvements were made,
> even though those will not affect your current tests. BTW, I'm
> seeing that your system is 64-bit, I assume you built haproxy for
> 64-bit too ? I'm asking because syscalls are cheaper in 64-bit.
>
> > Altering somaxconn also didn't change anything.
>
> OK. This one could be increased too (e.g. 10x) :
>   net.core.netdev_max_backlog = 1000
>
> Also in your sysctls, I see :
>   net.ipv4.tcp_max_syn_backlog = 1024
>
> I thought it was set to something like 10000. 1024 can be a bit low
> for 50k sess/s.
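>
> For instance, to raise both at runtime (and persist them in sysctl.conf) :
>
>   sysctl -w net.core.netdev_max_backlog=10000
>   sysctl -w net.ipv4.tcp_max_syn_backlog=10000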
>
> One thing you can try for the test, but not for production until the
> kernel is checked for correct processing, is the "defer-accept" option
> of the "bind" line :
>
>   frontend xxx
>        bind :80 defer-accept
>
> It asks the kernel to wake haproxy up only when data are available on
> the port. It can save a few kernel-user switches. But until recently
> it was not reliably usable for production because there was no way to
> tell the kernel to forward the connection after some delay. It has
> recently changed but I don't know in which version. If it noticeably
> improves performance, we can try to find the correct version.
>
> I'm seeing you have the bonding driver loaded. Is it in use, and if so,
> how is the traffic spread over the links ? You should never use round-robin
> nor any non-deterministic solution for high packet rates, it's the best way
> to cause reordering that costs a lot on both the client and the server TCP
> stacks. It also tends to cause them to emit more useless ACKs. If you want
> to use two NICs for double data rate, you'd better bind two frontends each
> to one NIC and have the front router route the traffic to both. If you only
> have a switch, then sometimes you can give them the same IP+MAC and enable
> etherchannel on the switch. This will also bring you the ability to run 2
> procs, each bound to one NIC and use 4 cores total :
>  - 1 for each NIC
>  - 1 for each haproxy
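>
> In the simplest form that can be two independent instances, each with
> its own config bound to one NIC's address and pinned to its own core
> (the core numbers and paths below are just an example) :
>
>   taskset -c 2 haproxy -f /etc/haproxy/nic1.cfg
>   taskset -c 3 haproxy -f /etc/haproxy/nic2.cfg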
>
> > Only changing the IRQ/haproxy affinity affects the system; the maximum
> > rate changes from 25k to 49k. And that's it...
>
> OK. As you see, it's very important to play with this, because by default
> the system will move haproxy to the core processing the traffic, but at
> such rates, both the kernel and haproxy need their own core.
>
> > I'm including sysctl -a output, but I think all this happens because
> > of some trouble with the bnx2 driver - I just don't see any explanation
> > for why 70-80 Mbps is saturating haproxy and IRQ handling (lost packets!).
>
> You should not consider that in Mbps but in connection rates and packet
> rates, as those are just small packets.
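>
> As a rough example, with 64-byte frames and ignoring framing overhead :
>
>   80 Mbps / (64 bytes * 8 bits) ~= 156 kpps
>   156 kpps / ~4 inbound packets per session ~= 39k sess/s
>
> so 70-80 Mbps of tiny packets already represents a serious session rate.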
>
> > I have the option to try a 'High Performance 1000PT Intel Network
> > Card' - could it be any better, or should I try to find a solution for
> > the current configuration?
>
> The Intel NIC generally is the best tradeoff between perf and stability.
> You only have to fix the InterruptThrottleRate in modprobe.conf; I
> usually limit it to between 5000 and 20000, and then it's easy to get very
> nice numbers. Above that, using 10G NICs from Myricom will provide even
> better enhancements, but you need the connectivity too. To give you an
> idea, on a machine where I got about 35K sess/s (forwarded to server)
> with a gig Intel NIC, switching to the Myri improved it to about 45K without
> changing anything else.
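>
> For example (the module name depends on the exact Intel NIC) :
>
>   # /etc/modprobe.d/e1000e.conf
>   options e1000e InterruptThrottleRate=10000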
>
> > My final task is to handle DDoS attacks with a flexible and robust
> > filter available. Haproxy is already helping me to stay alive under
> > ~8-10k DDoS bots (I'm using two servers and DNS RR in production), but
> > attackers are not sleeping and I'm expecting attacks to continue with
> > more bots. I bet they will stop at 20-25k bots.
>
> You can never guess at what level they will stop. It's just a matter of
> money for them. If they're doing that just to annoy you, then probably
> yes at one point they'll think that getting a laugh is costing too much.
> But if they have any business in damaging your site, there is no reason
> for them to stop when the costs increase, until the cost is higher than
> the income they expect from taking your site down.

I just know the market and I can imagine their resources.

>
> Also you should never publicly tell them how you're trying to get rid
> of them nor what your sizing is, as you did right here :-)
>

Thanks for the advice, but they know neither my name nor English.

> > Such a botnet will
> > generate approx. a 500k session rate and ~1Gbps of bandwidth, so I was
> > dreaming of handling it on this one server with two bonded NICs giving
> > me 2Gbps of traffic :)
>
> Two gig NICs will not handle 500k sess/s. The wire limit is 1.4 Mpps for
> short packets. The NIC limits I often encounter are 550 kpps on PCI-X and
> 630 kpps on PCI Express. With two NICs, you'll be able to process 1.2 Mpps.
> This is only enough to deal with 300k session rate if they're sending
> requests :
>
>   - SYN (1)
>   - SYN-ACK (return)
>   - ACK (1)
>   - request (1)
>   - response + FIN (return)
>   - FIN (1)
>   - ACK (return)
>
> => 4 packets incoming per session, so 1.2M/4 = 300k.
>
> You can improve that by immediately resetting the connection when you
> know their IP (use 1.5-dev for that, it has tables with per-IP rates) :
>
>   - SYN (1)
>   - SYN-ACK (return)
>   - ACK (1)
>   - request (1)
>   - RST (return)
>
> => 3 packets or 400k sess/s.
>
> In fact, the reset will happen just after the first ACK but since the
> request is sent at the same time by the client, you'll get the packet
> anyway and a second reset will be sent.
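>
> As an illustration of the 1.5-dev approach (the threshold and table
> size are arbitrary, tune them to your traffic) :
>
>   frontend xxx
>        bind :80
>        stick-table type ip size 200k expire 30s store conn_rate(3s)
>        tcp-request connection track-sc1 src
>        tcp-request connection reject if { sc1_conn_rate gt 100 }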
>
> Also, speaking of 1.5-dev, it is with it that I reached 300k sess/s on
> the core i5. The processing has been layered a bit more, and it is
> possible that blocking at a lower layer is much faster than in 1.4.
>
> Regards,
> Willy
>
