On Fri, Apr 13, 2012 at 06:45:57PM +0200, Baptiste wrote:
> hi,
>
> HAProxy is single-process, event-driven software,
> which means that performance is directly linked to CPU speed
> and not to the number of CPUs available in the machine.
It's also important to check which core the process runs on. Look at your
CPU's internal architecture. You need to pin the network interrupts to a
specific core and run haproxy on a separate core that shares the same L2
cache as the one handling the interrupts. It's important to pin the process
and the interrupts to separate cores, because by default the kernel tends to
put them on the same core, where each gets only half of the CPU. It's common
to see a 2x performance increase when doing this.
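To make this concrete, here is a minimal Python sketch of the pinning described above. It is only an illustration and not something shipped with haproxy. It assumes Linux, root privileges for the writes, a kernel that exposes /proc/irq/<n>/smp_affinity_list, and a NIC whose name shows up in /proc/interrupts. The helper names (irqs_for_device, pin_irq, pin_process) are made up for this example.

```python
import os
import re


def irqs_for_device(interrupts_text, device):
    """Return the IRQ numbers whose /proc/interrupts line names `device`."""
    irqs = []
    for line in interrupts_text.splitlines():
        head, sep, rest = line.partition(":")
        if not sep or not head.strip().isdigit():
            continue  # header line or a named interrupt such as NMI/LOC
        names = re.split(r"[\s,]+", rest.strip())
        # Match "eth0" itself as well as per-queue names like "eth0-TxRx-0".
        if any(n == device or n.startswith(device + "-") for n in names):
            irqs.append(int(head))
    return irqs


def pin_irq(irq, cores):
    """Restrict one IRQ to the given cores (assumption: run as root)."""
    with open(f"/proc/irq/{irq}/smp_affinity_list", "w") as f:
        f.write(",".join(str(c) for c in cores))


def pin_process(pid, cores):
    """Restrict a running process (e.g. haproxy) to the given cores."""
    os.sched_setaffinity(pid, set(cores))
```

For example, with eth0 handled by core 0 and haproxy on core 1 (assuming those two share an L2 cache on your CPU), you would call pin_irq(irq, [0]) for each IRQ returned by irqs_for_device(open("/proc/interrupts").read(), "eth0"), then pin_process(haproxy_pid, [1]). If irqbalance is running it may move the interrupts again, so check for that first.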
With a recent CPU, it's possible to forward between 50k and 100k connections
per second. If your workload mainly consists of large objects, increasing the
buffer size or enabling splicing can sometimes significantly reduce CPU
usage by moving larger amounts of data at once.
> There is a multiprocess (not threaded) way of running HAProxy, check
> the nbproc parameter from the documentation.
Even when doing so, please ensure you never have a process running on
the same core as the NIC's interrupts, and that you never have a process
running on a different CPU socket than the NIC's interrupts. The latter
basically means running without any CPU cache, because inter-CPU latency is
too high for this kind of processing.
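The two rules above can be checked from sysfs. The following Python sketch is one possible way to do it, assuming a Linux system that exposes the usual topology and cache files under /sys/devices/system/cpu. The function names are hypothetical, and the CPU numbers are the kernel's logical CPU numbers, so on a hyper-threaded machine two "CPUs" may in fact be siblings on the same physical core.

```python
import glob

SYSFS_CPU = "/sys/devices/system/cpu"


def parse_cpu_list(text):
    """Turn a kernel CPU list such as "0-1,4" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def package_id(cpu):
    """Physical socket (package) that a logical CPU belongs to."""
    with open(f"{SYSFS_CPU}/cpu{cpu}/topology/physical_package_id") as f:
        return int(f.read())


def l2_siblings(cpu):
    """Logical CPUs sharing this CPU's L2 cache (itself if none is listed)."""
    for index in glob.glob(f"{SYSFS_CPU}/cpu{cpu}/cache/index*"):
        with open(f"{index}/level") as f:
            if f.read().strip() != "2":
                continue
        with open(f"{index}/shared_cpu_list") as f:
            return parse_cpu_list(f.read())
    return {cpu}


def check_placement(irq_cpu, proxy_cpu):
    """Return a list of problems with this interrupt/process core pair."""
    problems = []
    if irq_cpu == proxy_cpu:
        problems.append("process and NIC interrupts share a core")
    if package_id(irq_cpu) != package_id(proxy_cpu):
        problems.append("process and NIC interrupts are on different sockets")
    elif proxy_cpu not in l2_siblings(irq_cpu):
        problems.append("cores do not share an L2 cache")
    return problems
```

An empty list from check_placement(0, 1) would suggest that cores 0 and 1 are a reasonable pair for the NIC interrupts and one haproxy process.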
Some people run multiple instances with multiple NICs on multi-core
machines, which looks like this:
NIC1        haproxy1    NIC2        haproxy2
[ core 0 ]  [ core 1 ]  [ core 2 ]  [ core 3 ]
[   shared L2 cache  ]  [   shared L2 cache  ]
etc...
It's more or less like having multiple load balancers inside the same
machine, but it generally ensures you can get the most out of your
hardware by limiting inter-CPU communication.
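The layout in the diagram above can be written down as a small planning function. This is just a sketch in Python with a made-up name (plan_layout). It assumes you already know which cores share an L2 cache and simply takes the first core of each group for the NIC's interrupts and the second for the haproxy instance.

```python
def plan_layout(l2_groups, nics):
    """Pair each NIC with an IRQ core and a haproxy core from one L2 group.

    l2_groups: list of sets of core numbers that share an L2 cache.
    nics: list of interface names, one per haproxy instance.
    """
    if len(nics) > len(l2_groups):
        raise ValueError("not enough L2 cache groups for the given NICs")
    plan = []
    for nic, group in zip(nics, l2_groups):
        cores = sorted(group)
        if len(cores) < 2:
            raise ValueError(f"group {cores} needs at least two cores")
        # First core takes the interrupts, second one runs haproxy.
        plan.append({"nic": nic, "irq_core": cores[0], "haproxy_core": cores[1]})
    return plan
```

With l2_groups = [{0, 1}, {2, 3}] and nics = ["eth0", "eth1"], this gives eth0 on core 0 with its haproxy on core 1, and eth1 on core 2 with its haproxy on core 3, which matches the diagram.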
Regards,
Willy