Hi Dmitriy,

On Mon, Jul 18, 2011 at 10:01:47PM +0400, Dmitriy Samsonov wrote:
> defaults
>         mode http
>         maxconn 79500
>         timeout client 20s
>         timeout server 15s
>         timeout queue  60s
>         timeout connect 4s
>         timeout http-request 5s
>         timeout http-keep-alive 250
>         #option httpclose
>         option abortonclose
>         balance roundrobin
>         option forwardfor

OK, by using "option http-server-close", you'll benefit from an active
close towards the servers, which will improve things a lot when using real
servers. It won't change anything in your tests, though.
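For reference, it simply goes into the defaults section; a minimal sketch based on the config quoted above (only the relevant lines shown):

```
defaults
        mode http
        option http-server-close   # actively close the server side, keep client side alive
        option forwardfor
        option abortonclose
        balance roundrobin
```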

>         retries 10
> 
> Also there was conntrack module loaded - friend of mine was playing with
> iptables and did not remove it. Now there is no iptables at all:
> 
> dex9 ipv4 # sysctl -a | grep conntrack | wc -l
> 0
> dex9 ipv4 # lsmod | grep xt_ | wc -l
> 0
> dex9 ipv4 # lsmod | grep nf_ | wc -l
> 0

OK fine.

> I followed your recommendation and set affinity for processes:
> dex9 ipv4 # schedtool 16638 # haproxy's process
> PID 16638: PRIO   0, POLICY N: SCHED_NORMAL, NICE   0, AFFINITY 0x2
> 
> dex9 ipv4 # schedtool 3  # ksoftirqd/0
> PID     3: PRIO   0, POLICY N: SCHED_NORMAL, NICE   0, AFFINITY 0x1

I'm not sure that applying schedtool to kernel threads has any effect.
Normally you should "echo 1 > /proc/irq/XXX/smp_affinity" to force
interrupts to a specific core.
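In case it helps: smp_affinity takes a hexadecimal CPU bitmask, not a core number, so the mask for core N is 1 << N. A quick sketch (XXX stands for the NIC's IRQ number, which you can find in /proc/interrupts):

```shell
# smp_affinity is a hex bitmask of allowed CPUs: bit N set = CPU N allowed.
core=0                                # CPU0, next to ksoftirqd/0
mask=$(printf '%x' $((1 << core)))
echo "$mask"                          # -> 1
# then, as root, for each of the NIC's IRQ numbers (XXX as above):
#   echo "$mask" > /proc/irq/XXX/smp_affinity
```

With haproxy pinned to CPU1 (affinity 0x2) as shown above, sending the NIC interrupts to CPU0 keeps the IRQ work and the proxy on separate cores.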

> Now in top it looks like this:
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 
> 16638 root      20   0  100m  36m  520 R   93  0.0  19:33.18 haproxy
> 
>     3 root      20   0     0    0    0 R   38  0.0   7:11.76 ksoftirqd/0
> 
> 
> 93% percent for haproxy and 38% for ksoftirqd/0

OK. As long as you don't reach 100%, something is perturbing the
tests. Possibly your IRQs are spread over all cores.

> I was lucky enough to reach 66903 session rate (max) and average value for
> Cur is around 40-42k.

Fine, this is a lot better now. Since you're running at 2000 concurrent
connections, the impact on the cache is noticeable: at 32kB per connection
for haproxy, that's 64MB of RAM possibly touched each second (maybe only 16MB
since requests are short and fit in a single page). Could you recheck with
only 250 concurrent connections in total (125 per ab)? This is usually
the optimal point I observe. I'm not saying it should be your target,
but we're chasing the issues :-)
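A quick back-of-the-envelope check of that figure (my arithmetic, using the rough 32kB-per-connection footprint mentioned above):

```shell
conns=2000
per_conn_kb=32                        # approximate per-connection footprint in haproxy
echo "$(( conns * per_conn_kb )) kB"  # -> 64000 kB, i.e. ~64MB possibly touched per second
```

Rerunning the two ab instances with -c 125 each gives the 250 total concurrent connections suggested above.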

> Typical output of one of two ab2 running is:
> 
> Server Software:
> Server Hostname:        nohost
> Server Port:            80
> 
> Document Path:          /
> Document Length:        0 bytes
> 
> Concurrency Level:      1000
> Time taken for tests:   470.484 seconds
> Complete requests:      10000000
> Failed requests:        0
> Write errors:           0
> Total transferred:      0 bytes
> HTML transferred:       0 bytes
> Requests per second:    21254.72 [#/sec] (mean)
> Time per request:       47.048 [ms] (mean)
> Time per request:       0.047 [ms] (mean, across all concurrent requests)
> Transfer rate:          0.00 [Kbytes/sec] received
> 
> Connection Times (ms)
>               min  mean[+/-sd] median   max
> Connect:        0   34 275.9     11   21086

This one means there is packet loss on SYN packets. Some connections
need up to 4 SYNs to pass (with the doubling backoff, retransmits at
3, 9 and 21 seconds, which matches the 21s max above). Clearly something
is wrong, either on the network or more likely net.core.somaxconn.
You have to restart haproxy after you change this default setting.
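To make that retransmit timeline explicit (my sketch, assuming the standard Linux doubling backoff with a 3s initial SYN timeout):

```shell
# Cumulative times at which SYN retransmits are sent: 3s initial, then doubling.
t=0; interval=3
for _ in 1 2 3; do
    t=$(( t + interval ))
    echo "retransmit at ${t}s"
    interval=$(( interval * 2 ))
done
# -> 3s, 9s, 21s: a connect time near 21s means the 4th SYN finally got through
```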

Does "dmesg" say anything on either the clients or the proxy machine ?

> Processing:     0   13  17.8     11     784
> Waiting:        0    0   0.0      0       0
> Total:          2   47 276.9     22   21305
> 
> Percentage of the requests served within a certain time (ms)
>   50%     22
>   66%     26
>   75%     28
>   80%     30
>   90%     37
>   95%     41
>   98%     47
>   99%    266
>  100%  21305 (longest request)
> 
> Typical output of vmstat is:
> dex9 ipv4 # vmstat 1
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>  1  0      0 131771328  46260  64016    0    0     2     1  865  503  1  6 94  0
>  1  0      0 131770688  46260  64024    0    0     0     0 40496 6323  1  9 90  0

OK, so 1% user, 9% system, 90% idle, 0% wait at 40k int/s. Since this is
scaled to 100% for all cores, it means that we're saturating a core in the
system (which is expected with short connections).

I don't remember if I asked you what version of haproxy and what kernel you
are using. Some TCP options might improve things a bit.
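For what it's worth, the kind of TCP-related sysctls that often matter in short-connection benchmarks look like this. These are hypothetical values of my own, not settings named anywhere in this thread, so treat them as assumptions to test rather than recommendations:

```
# /etc/sysctl.conf -- hypothetical tuning sketch, values to be validated
net.ipv4.tcp_max_syn_backlog = 10000       # larger SYN queue for connection bursts
net.ipv4.ip_local_port_range = 1024 65535  # more ephemeral ports on the client side
net.ipv4.tcp_tw_reuse = 1                  # reuse TIME_WAIT sockets for outgoing connections
```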

> Also, I've checked version of NIC's firmware:
> dex9 ipv4 # ethtool -i eth0
> driver: bnx2
> version: 2.0.21
> firmware-version: 6.2.12 bc 5.2.3
> bus-info: 0000:01:00.0

OK, let's hope it's fine. I remember having seen apparently good results
with version 4.4, so this one should be OK.

> Moreover, I've tried launching two ab2 localy:
> dex9 ipv4 # ab2 -c 1000 -H 'Accept-Encoding: None' -n 10000000
> http://localweb/
> This is ApacheBench, Version 2.3 <$Revision: 655654 $>
> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
> Licensed to The Apache Software Foundation, http://www.apache.org/
> 
> Benchmarking inbet.cc (be patient)
> Completed 1000000 requests
> Completed 2000000 requests
> ^C
> 
> Server Software:
> Server Hostname:        localweb
> Server Port:            80
> 
> Document Path:          /
> Document Length:        0 bytes
> 
> Concurrency Level:      1000
> Time taken for tests:   104.583 seconds
> Complete requests:      2141673
> Failed requests:        0
> Write errors:           0
> Total transferred:      0 bytes
> HTML transferred:       0 bytes
> Requests per second:    20478.13 [#/sec] (mean)
> Time per request:       48.833 [ms] (mean)
> Time per request:       0.049 [ms] (mean, across all concurrent requests)
> Transfer rate:          0.00 [Kbytes/sec] received
> 
> Connection Times (ms)
>               min  mean[+/-sd] median   max
> Connect:        0   38 352.7      6   21073
> Processing:     0   10  49.7      7   14919
> Waiting:        0    0   0.0      0       0
> Total:          1   48 365.8     13   21078
> 
> Percentage of the requests served within a certain time (ms)
>   50%     13
>   66%     19
>   75%     26
>   80%     35
>   90%     36
>   95%     37
>   98%     39
>   99%     67
>  100%  21078 (longest request)
> 
> Two such ab2 processes are running both at 100% and saturating haproxy to
> 100%. 'Cur' session rate is also around 40-44k/s.

Fine, so those are the exact same numbers, with the same issue with packet
losses.

> Should I get rid of dell r410 and replace it with Core i5?:)) Being serious,
> is there any other tips or tricks I can try? To see those amazing 100k/s
> session rate?

Two things to test first, as indicated above:
  1) retest with less concurrency from ab to see whether things improve
  2) increase /proc/sys/net/core/somaxconn to 10000 or so
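In shell terms, point 2 amounts to the following (a sketch; run as root):

```
sysctl -w net.core.somaxconn=10000
# then restart haproxy: the listen backlog is set when the socket is
# bound, so a running process keeps the old value
```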

Next, if things don't get any better, please post the output of sysctl -a.

Hmmm, please also note that when reaching 300k connections/s on the Core i5,
it was done with 10Gb NICs, which have extremely low latency and nice stateless
TCP optimizations. I'm used to seeing much better results with them than
with gig NICs, even at sub-gig rates. But anyway, more than 100k is expected
from such a machine.

For instance, I'm attaching a capture of a test I caught one year ago on my
PC (Core 2 Duo 2.66 GHz at that time), which exhibits 212k sess/s. I
think it was a bench of TCP connections, not HTTP sessions, but even
if we double the number of TCP packets exchanged over the wire, we should
still see more than 100k on this machine.

Regards,
Willy

Statistics Report for HAProxy

HAProxy version 1.5-dev0-15, released 2010/05/25

Statistics Report for pid 5339


General process information

pid = 5339 (process #1, nbproc = 1)
uptime = 0d 0h00m26s
system limits: memmax = unlimited; ulimit-n = 1014
maxsock = 1014; maxconn = 500; maxpipes = 0
current conns = 1; current pipes = 0/0
Running tasks: 1/1

[The attached HTML stats tables were flattened in the text conversion;
the recoverable figures are:]

Proxy "echo":
  Frontend:  session rate Cur 212135 / Max 212505, status OPEN
  Backend:   UP for 26s

Proxy "stats" (the stats page itself):
  Frontend:  session rate Cur 1 / Max 1, status OPEN
  Backend:   UP for 26s
