On Jul 4, 2014 9:45 AM, "Willy Tarreau" <[email protected]> wrote:
>
> On Fri, Jul 04, 2014 at 08:28:50AM +0200, Maxime Brugidou wrote:
> > I did exactly that in a second test later before going to sleep and went
> > up to 50k session/sec.
>
> Ah great!

Actually I couldn't figure out how to reproduce this consistently. I'm stuck
at 30k session/s now, but the CPU does not seem to be the bottleneck. Using
keep-alive helps a lot though.

> > I realized that too in my next tests and went back to 3. I forgot about
> > the RX buffer which is at 256 by default. I tried raising it to 4096 (the
> > maximum) with some little improvements.
>
> It generally does not help since too large buffers mean larger rings that
> are less efficient to process (i.e. they take more cache space). 128-512
> are generally the best options for TCP, with 256 often being optimal.

OK, I went back to 256 (the default).

> > I will try to do further testing now that I have a better understanding
> > of this. Not sure if there is any HTTP tool like siege that makes it
> > easier to monitor object size and latency?
>
> I'm used to using "inject", which I wrote many years ago. It provides one
> line per second (like vmstat) with some metrics of req/s, data/s, avg resp
> time and standard deviation. It doesn't support SSL nor keep-alive, but I
> find it useful enough not to switch to other tools. Legends are still in
> French but it should not be a problem for you :-)
>
> http://git.formilux.org/?p=people/willy/inject.git
> http://1wt.eu/tools/inject/ (for the doc)

Thanks! I am using it now for my non-keepalive tests; it's simple and
convenient.

> > > OK but you need first to ensure that you *can* max out the bandwidth,
> > > otherwise it definitely indicates a setup problem.
> >
> > OK, I'll try to do that with large objects first. I can also increase
> > the MTU on the backend side, use splicing and maybe LRO; it should max
> > out the bandwidth.
>
> You should never need to increase the MTU at such rates. Even at 40 Gbps
> I'm working with 1500. Splicing tends to be slower than copying with many
> gig NICs, so reserve it for 10G+ NICs unless your tests show that it's
> better.
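For reference, this is how I checked and reset the ring sizes (the interface name eth0 is a placeholder for my actual NIC):

```shell
# Show the current and maximum RX/TX ring sizes for the interface
ethtool -g eth0
# Set the RX ring back to 256 descriptors, per the advice above
ethtool -G eth0 rx 256
```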
> LRO is useless at such low speeds.

OK, I did not try any of these, but with 75kB responses I got above 950 Mbps,
which seems OK to me.

> Whenever you want low latency or high bandwidth, you need to test hardware
> before selecting the one you'll need. It took me 6 months to find hardware
> capable of 10 Gbps in 2009. 10G NICs will provide you much better
> performance even at rates below 1G. I know a few web sites very sensitive
> to response time which have switched to 10G just for this reason. Myricom
> NICs will provide you with a very low latency, but will hardly scale to
> 10G unless you're mostly dealing with huge objects. Intel NICs will reach
> higher bit rates, but come with a higher CPU usage so are not necessarily
> relevant for rates of 1G or less.
>
> > I am still a bit disappointed by the connection speed I reach.
>
> You're too impatient :-)
> After 1 day and 3 mails you doubled your performance!
>
> > I'll update you later today trying out all the solutions and getting
> > better data.
>
> OK.
>
> Willy

I can get acceptable performance now that I have read all the benchmarks and
tests online, especially for 2 GHz CPUs and 1G links. However, 30k
session/sec on one core with 150 kpps Rx + 150 kpps Tx on this Intel I350
NIC is still very low compared to other benchmarks showing this NIC going
above 1 Mpps with Linux and igb.

Three questions:

1. If I set up a more recent kernel (CentOS 7 or Debian), do you think it
   can significantly help?

2. Do you use haproxy with bonding? If we want to add a second NIC and use
   bonding, do you think we can use the second CPU socket with it? I can
   easily steer IRQs with smp_affinity, but if I add haproxy processes on
   the second socket I am not sure they will handle the traffic from the
   second NIC exclusively. Is RPS/XPS the solution here?

3. With haproxy 1.5, am I right that since we use round robin we can't
   benefit from the http-keep-alive option? Do we need to switch to another
   algorithm?
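For question 2, to make my setup concrete: smp_affinity and the rps_cpus/xps_cpus sysfs files all take hex CPU bitmasks, so steering the second NIC to the second socket would just be a matter of writing the right mask. This is the small helper I use to compute them (the assumption that socket 1 holds CPUs 8-15 is specific to my box; lscpu gives the real numbering):

```python
# Build the hex bitmask accepted by /proc/irq/<n>/smp_affinity and
# /sys/class/net/<dev>/queues/rx-<q>/rps_cpus from a list of CPU ids.
def cpu_mask(cpus):
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu  # one bit per CPU id
    return format(mask, "x")

# Assuming socket 1 holds CPUs 8-15 (check with lscpu on your machine):
socket1 = list(range(8, 16))
print(cpu_mask(socket1))  # ff00
# then e.g.: echo ff00 > /sys/class/net/eth1/queues/rx-0/rps_cpus
```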

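Regarding question 3, this is the kind of configuration I am asking about (backend name and server addresses are made up for the example; I am unsure whether round robin defeats server-side reuse here):

```haproxy
defaults
    mode http
    option http-keep-alive        # keep client connections alive
    timeout http-keep-alive 10s

backend web
    balance roundrobin            # does this prevent keep-alive benefits?
    server s1 10.0.0.11:80
    server s2 10.0.0.12:80
```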
