> On Thu, May 21, 2015 at 5:58 PM, Willy Tarreau <[email protected]> wrote:

Hi Willy,

Thank you for your reply.

> I suspect the BW unit is bytes per second above though I could be

That's correct, and the BW is as you had stated: >8 Gbps vs 2.8 Gbps.

> Hmmm, would you be running from multiple load generators connected via

No, I am running a single 'ab' command from 1 node.

> I'm thinking about something else, could you retry with less or more total
> objects in the /128 case (or the 16k case) ? The thing is that "ab" starts

I tried with "-n 1000" but it also hangs at 90%. More details on this below.

> You may want to try openssl-1.0.2a which significantly improved performance

Thank you, I upgraded to 1.0.2a today before testing further.

> You should do this instead to have 3 distinct sockets each in its own
> process (but warning, this requires a kernel >= 3.9) :

Yes, I am running kernel 3.19.6, so I have made this change too, including
for :443. Thanks for the explanation.

> Another thing that can be done is to compare the setup above with 6-process
> per frontend. You can even have everything in the same frontend by the way :

I tried this without any improvement.

> I fail to see how this is possible, the Xeon E5-2670 is 8-core and
> supports 2 CPU configurations max. So that's 16 cores max in total.

It is the v3 processor "Intel Xeon Processor E5-2670 v3". lscpu shows:
    NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46
    NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47
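
The NUMA layout above can be cross-checked against the CPUs that the haproxy processes are pinned to. The following is only a minimal sketch, assuming Python 3: the helper names (`parse_numa_nodes`, `node_of`) are made up for illustration, and the CPU list is copied by hand from the `cpu-map` lines in the configuration at the end of this message.

```python
import re

# NUMA lines as printed by lscpu in this message.
LSCPU_NUMA = """\
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47
"""

# CPUs taken from the cpu-map lines of the configuration (processes 1-6).
HAPROXY_CPUS = [0, 2, 4, 6, 8, 10]


def parse_numa_nodes(text):
    """Return {node_id: set_of_cpu_ids} from 'NUMA nodeN CPU(s):' lines."""
    nodes = {}
    for match in re.finditer(r"NUMA node(\d+) CPU\(s\):\s*([\d,\-]+)", text):
        cpus = set()
        for part in match.group(2).split(","):
            if "-" in part:  # lscpu may also print ranges such as 0-11
                first, last = part.split("-")
                cpus.update(range(int(first), int(last) + 1))
            else:
                cpus.add(int(part))
        nodes[int(match.group(1))] = cpus
    return nodes


def node_of(cpu, nodes):
    """Return the NUMA node that owns the given CPU, or None if unknown."""
    for node, cpus in nodes.items():
        if cpu in cpus:
            return node
    return None


nodes = parse_numa_nodes(LSCPU_NUMA)
nodes_used = {node_of(cpu, nodes) for cpu in HAPROXY_CPUS}
print("haproxy processes run on NUMA node(s):", sorted(nodes_used))
```

With the values from this message, every pinned CPU falls on node0, so the six processes stay on one CPU package.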

> OK. What do you mean by "correct", you mean "the same CPU package as
> the one running haproxy so as not to pass the data over QPI", right ?

Yes, I used David Miller's set_irq_affinity.sh script, which maps irq0->cpu0,
irq1->cpu1, and so on. Following are the interrupt counts on the different
IRQs after a reboot and test:

IRQ                List of CPUs (and number of interrupts on each)
IRQ-0 (36):        0(1652395)
IRQ-1 (37):        1(267916)
IRQ-2 (38):        2(1639163)
IRQ-3 (39):        3(270744)
IRQ-4 (40):        4(1651315)
IRQ-5 (41):        5(270939)
IRQ-6 (42):        6(1637431)
IRQ-7 (43):        7(270505)
IRQ-8 (44):        8(1643712)
IRQ-9 (45):        9(271290)
IRQ-10 (46):       10(1644798)
IRQ-11 (47):       11(269653)
IRQ-12 (48):       12(270003)
IRQ-13 (49):       13(271268)
IRQ-14 (50):       14(270255)
IRQ-15 (51):       15(271206)
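
To make the imbalance in the table easier to see, the rows can be parsed and split into busy and quiet IRQs. This is a minimal sketch, assuming Python 3. The 1,000,000 cut-off is an arbitrary threshold chosen for illustration, and the field names (`index`, `number`) are only a guess at what the two numbers in each row label mean.

```python
import re

# Rows copied from the interrupt table in this message.
IRQ_TABLE = """\
IRQ-0 (36):        0(1652395)
IRQ-1 (37):        1(267916)
IRQ-2 (38):        2(1639163)
IRQ-3 (39):        3(270744)
IRQ-4 (40):        4(1651315)
IRQ-5 (41):        5(270939)
IRQ-6 (42):        6(1637431)
IRQ-7 (43):        7(270505)
IRQ-8 (44):        8(1643712)
IRQ-9 (45):        9(271290)
IRQ-10 (46):       10(1644798)
IRQ-11 (47):       11(269653)
IRQ-12 (48):       12(270003)
IRQ-13 (49):       13(271268)
IRQ-14 (50):       14(270255)
IRQ-15 (51):       15(271206)
"""

ROW = re.compile(r"IRQ-(\d+) \((\d+)\):\s+(\d+)\((\d+)\)")


def parse_irq_table(text):
    """Return a list of (index, number, cpu, count) integer tuples."""
    return [tuple(int(g) for g in m.groups()) for m in ROW.finditer(text)]


rows = parse_irq_table(IRQ_TABLE)

# Assumed threshold separating busy IRQs from quiet ones.
BUSY_THRESHOLD = 1_000_000
busy = [index for index, _number, _cpu, count in rows if count > BUSY_THRESHOLD]

# The affinity script should leave IRQ i on CPU i.
pinned_one_to_one = all(index == cpu for index, _number, cpu, _count in rows)

print("busy IRQ indexes:", busy)
print("IRQ i handled on CPU i:", pinned_one_to_one)
```

On this data the busy set is the even IRQs 0 through 10, which are the same CPUs that the haproxy processes are pinned to.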

When 'ab' on the client uses -k, interrupts are generated only on the even
CPUs 0-10 on the haproxy node. Without -k, interrupts are generated on all
CPUs 0-10, including the odd ones. That explains why the odd IRQs above have
counts too, though they are smaller because the testing mixed runs with and
without -k.

> This certainly is a side effect of the imbalance above combined with ab which
> keeps the same connection from the beginning to the end of the test.

With the new configuration file (below), I was able to get some more
information on what is going on:

1. Without the -k option to 'ab', the SSL test works and finishes for all I/O
    sizes. With the following configuration file (1-3:80 ; 4-6:443), 3 haproxy
    processes run and finish the work.
2. With the -k option to 'ab', 3 haproxy processes start responding and run
    for about 1 second (as seen in 'top'), then 2 stop handling work while
    only 1 continues. After 90%, the sole remaining haproxy also stops, and
    the client soon prints the "70007" error. Sometimes the sole working
    haproxy stops immediately, and I get an error before 10% is done.
    This happens only for large I/O, like 128K. With 128 bytes, all haproxy
    processes run until 'ab' completes successfully. Similarly, it works for
    I/O of 7000 bytes, but fails at >= 8000.
3. 'ab' to the backend, with or without -k, works without issues for any
    size.
#2 above seems very suspicious, and happens every time. With your suggestion
above to have a single frontend, I saw that all 6 start, 5 stop at about 1
second, and the test finally hangs. Without -k, all 6 run and 'ab' finishes.
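
Since the keep-alive test works at 7000 bytes and fails at 8000, the exact size where it starts failing could be narrowed down by bisection. The following is a minimal sketch, assuming Python 3 and assuming that every size above the threshold fails. Both function names and the URL are hypothetical. The `works` callback is left to the tester (for example, run the command from `ab_command` against an object of that size and decide whether the run completed).

```python
def ab_command(url, requests=1000, concurrency=10, keepalive=True):
    """Build an 'ab' command line; -n, -c and -k are standard ab options."""
    cmd = ["ab", "-n", str(requests), "-c", str(concurrency)]
    if keepalive:
        cmd.append("-k")
    cmd.append(url)
    return cmd


def first_failing_size(works, lo, hi):
    """Return the smallest size in (lo, hi] for which works(size) is False.

    Assumes works(lo) is True, works(hi) is False, and that all sizes
    above the threshold fail (the failure is monotonic in size).
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if works(mid):
            lo = mid  # still fine, the threshold is above mid
        else:
            hi = mid  # already failing, the threshold is at or below mid
    return hi


# Stand-in probe for demonstration only: pretends the test breaks at 7500.
def fake_probe(size):
    return size < 7500


print(ab_command("https://HAPROXY/object"))  # placeholder URL, not from the test
print(first_failing_size(fake_probe, 7000, 8000))
```

The search needs about 10 probe runs to cover the 7000-8000 range, and the resulting size can then be compared with buffer-related settings.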

Regards,
- Krishna Kumar

Configuration file (I have tried "bind-process 1 2 3" and "bind-process 4 5 6"
in the two backends below; there was no difference in the above behavior):

global
    daemon
    quiet
    nbproc 6
    cpu-map 1 0
    cpu-map 2 2
    cpu-map 3 4
    cpu-map 4 6
    cpu-map 5 8
    cpu-map 6 10
    user haproxy
    group haproxy
    stats socket /var/run/haproxy.sock mode 600 level admin
    stats timeout 2m
    tune.bufsize 32768

userlist stats-auth
    group admin    users admin
    user  admin    insecure-password admin

defaults
    mode http
    retries 3
    option forwardfor
    option redispatch
    option prefer-last-server
    option splice-auto

frontend www-http
    bind *:80 process 1
    bind *:80 process 2
    bind *:80 process 3
    stats uri /stats
    stats enable
    acl AUTH http_auth(stats-auth)
    acl AUTH_ADMIN http_auth(stats-auth) admin
    stats http-request auth unless AUTH
    default_backend www-backend

frontend www-https
    bind *:443 process 4 ssl crt /etc/ssl/private/haproxy.pem
    bind *:443 process 5 ssl crt /etc/ssl/private/haproxy.pem
    bind *:443 process 6 ssl crt /etc/ssl/private/haproxy.pem
    reqadd X-Forwarded-Proto:\ https
    default_backend www-backend-ssl

backend www-backend
    mode http
    balance roundrobin
    cookie FKSID prefix indirect nocache
    server nginx-1 NGINX1:80 check
    server nginx-2 NGINX2:80 check

backend www-backend-ssl
    mode http
    balance roundrobin
    cookie FKSID prefix indirect nocache
    server nginx-3 NGINX1:80 check
    server nginx-4 NGINX2:80 check
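
The process layout in the configuration above (which process runs on which CPU and listens on which port) can be summarised mechanically. This is a minimal sketch, assuming Python 3. The `process_layout` helper is made up for illustration and only understands the simple `cpu-map <process> <cpu>` and `bind <addr>:<port> process <n>` forms used in this file, not the full haproxy syntax.

```python
# Relevant lines copied from the configuration in this message.
CONFIG = """\
global
    nbproc 6
    cpu-map 1 0
    cpu-map 2 2
    cpu-map 3 4
    cpu-map 4 6
    cpu-map 5 8
    cpu-map 6 10

frontend www-http
    bind *:80 process 1
    bind *:80 process 2
    bind *:80 process 3

frontend www-https
    bind *:443 process 4 ssl crt /etc/ssl/private/haproxy.pem
    bind *:443 process 5 ssl crt /etc/ssl/private/haproxy.pem
    bind *:443 process 6 ssl crt /etc/ssl/private/haproxy.pem
"""


def process_layout(config):
    """Return {process: (cpu, [ports])} for the simple forms shown above."""
    cpu_of = {}
    ports_of = {}
    for line in config.splitlines():
        words = line.split()
        if len(words) == 3 and words[0] == "cpu-map":
            cpu_of[int(words[1])] = int(words[2])
        elif len(words) >= 4 and words[0] == "bind" and words[2] == "process":
            port = int(words[1].rsplit(":", 1)[1])
            ports_of.setdefault(int(words[3]), []).append(port)
    return {p: (cpu_of[p], ports_of.get(p, [])) for p in sorted(cpu_of)}


layout = process_layout(CONFIG)
for process, (cpu, ports) in layout.items():
    print(f"process {process}: cpu {cpu}, ports {ports}")
```

A process that shows an empty port list here would have a CPU assigned but no listener bound to it.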
