haproxy config, nginx config, non-default sysctl (if any). As a side note, can you have a look at the "dmesg" output? Do you have nf_conntrack enabled? What are its limits?
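
As a quick reference for the conntrack part of that question, something along these lines should show whether the module is loaded and how close it is to its limit (the sysctl names are the standard ones; that conntrack is in use at all on this box is an assumption):

    lsmod | grep nf_conntrack                                            # is the module loaded at all?
    sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max
    dmesg -T | grep -iE 'conntrack|table full'                           # drops show up as "table full, dropping packet"
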
Thu, 4 Oct 2018 at 9:59, Krishna Kumar (Engineering) <[email protected]>:

> Sure.
>
> 1. Client: Use one of the following two setups:
>    - a single baremetal (48 core, 40g) system.
>      Run: "wrk -c 4800 -t 48 -d 30s http://<IP>:80/128", or,
>    - 100 2-core VMs.
>      Run "wrk -c 16 -t 2 -d 30s http://<IP>:80/128" from each VM and
>      summarize the results using some parallel-ssh setup.
>
> 2. HAProxy running on a single baremetal (same system config as the
>    client - 48 core, 40g, 4.17.13 kernel, irq tuned to use different
>    cores of the same NUMA node for each irq, kill irqbalance), with the
>    haproxy configuration file as given in my first mail. Around 60
>    backend servers are configured in haproxy.
>
> 3. Backend servers are 2-core VMs running nginx and serving a file
>    called "/128", which is 128 bytes in size.
>
> Let me know if you need more information.
>
> Thanks,
> - Krishna
>
>
> On Thu, Oct 4, 2018 at 10:21 AM Илья Шипицин <[email protected]> wrote:
>
>> Load testing is somewhat good.
>> Can you describe the overall setup? (I want to reproduce and play with it.)
>>
>> Thu, 4 Oct 2018 at 8:16, Krishna Kumar (Engineering) <[email protected]>:
>>
>>> Re-sending in case this mail was missed. To summarise the 3 issues seen:
>>>
>>> 1. Performance drops 18x with a higher number of nbthread as compared to
>>>    nbproc.
>>> 2. CPU utilisation remains at 100% for 30 seconds after wrk finishes
>>>    (for 1.9-dev3, for both nbproc and nbthread).
>>> 3. Sockets on the client remain in FIN-WAIT-2, while on HAProxy they remain
>>>    in either CLOSE-WAIT (towards the clients) or ESTAB (towards the backend
>>>    servers), till the server/client timeout expires.
>>>
>>> The tests for threads and processes were done on the same systems, so
>>> there is no difference in system parameters.
>>>
>>> Thanks,
>>> - Krishna
>>>
>>>
>>> On Tue, Oct 2, 2018 at 9:18 PM Krishna Kumar (Engineering) <[email protected]> wrote:
>>>
>>>> Hi Willy, and community developers,
>>>>
>>>> I am not sure if I am doing something wrong, but wanted to report
>>>> some issues that I am seeing. Please let me know if this is a problem.
>>>>
>>>> 1. HAProxy system:
>>>>    Kernel: 4.17.13
>>>>    CPU: 48 core E5-2670 v3
>>>>    Memory: 128GB
>>>>    NIC: Mellanox 40g with IRQ pinning
>>>>
>>>> 2. Client: 48 cores, similar to the server. Test command line:
>>>>    wrk -c 4800 -t 48 -d 30s http://<IP:80>/128
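
As a side note on the IRQ pinning mentioned for the Mellanox NIC above, a minimal sketch of that kind of tuning might look like the following (the "mlx5" match in /proc/interrupts and the 12-core range on the NIC's NUMA node are assumptions, not taken from the mail):

    systemctl stop irqbalance                  # so manual affinities are not rewritten
    core=0
    for irq in $(grep mlx5 /proc/interrupts | cut -d: -f1 | tr -d ' '); do
        echo $core > /proc/irq/$irq/smp_affinity_list    # one NIC queue IRQ per core
        core=$(( (core + 1) % 12 ))
    done
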
>>>> 3. HAProxy version: I am testing both 1.8.14 and 1.9-dev3 (git checkout
>>>>    as of Oct 2nd).
>>>>
>>>> # haproxy-git -vv
>>>> HA-Proxy version 1.9-dev3 2018/09/29
>>>> Copyright 2000-2018 Willy Tarreau <[email protected]>
>>>>
>>>> Build options :
>>>>   TARGET  = linux2628
>>>>   CPU     = generic
>>>>   CC      = gcc
>>>>   CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
>>>>             -fwrapv -fno-strict-overflow -Wno-unused-label -Wno-sign-compare
>>>>             -Wno-unused-parameter -Wno-old-style-declaration -Wno-ignored-qualifiers
>>>>             -Wno-clobbered -Wno-missing-field-initializers -Wtype-limits
>>>>   OPTIONS = USE_ZLIB=yes USE_OPENSSL=1 USE_PCRE=1
>>>>
>>>> Default settings :
>>>>   maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
>>>>
>>>> Built with OpenSSL version : OpenSSL 1.0.2g 1 Mar 2016
>>>> Running on OpenSSL version : OpenSSL 1.0.2g 1 Mar 2016
>>>> OpenSSL library supports TLS extensions : yes
>>>> OpenSSL library supports SNI : yes
>>>> OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2
>>>> Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
>>>> Encrypted password support via crypt(3): yes
>>>> Built with multi-threading support.
>>>> Built with PCRE version : 8.38 2015-11-23
>>>> Running on PCRE version : 8.38 2015-11-23
>>>> PCRE library supports JIT : no (USE_PCRE_JIT not set)
>>>> Built with zlib version : 1.2.8
>>>> Running on zlib version : 1.2.8
>>>> Compression algorithms supported : identity("identity"), deflate("deflate"),
>>>>     raw-deflate("deflate"), gzip("gzip")
>>>> Built with network namespace support.
>>>>
>>>> Available polling systems :
>>>>       epoll : pref=300, test result OK
>>>>        poll : pref=200, test result OK
>>>>      select : pref=150, test result OK
>>>> Total: 3 (3 usable), will use epoll.
>>>>
>>>> Available multiplexer protocols :
>>>> (protocols markes as <default> cannot be specified using 'proto' keyword)
>>>>               h2 : mode=HTTP side=FE
>>>>        <default> : mode=TCP|HTTP side=FE|BE
>>>>
>>>> Available filters :
>>>>         [SPOE] spoe
>>>>         [COMP] compression
>>>>         [TRACE] trace
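
For anyone wanting to reproduce this, a build matching the options shown in the -vv output above would presumably be along these lines (the git URL is, as far as I know, haproxy's public tree; the exact Oct 2nd commit is not given in the mail):

    git clone http://git.haproxy.org/git/haproxy.git/
    cd haproxy
    make -j TARGET=linux2628 CPU=generic USE_OPENSSL=1 USE_ZLIB=1 USE_PCRE=1
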
>>>> 4. HAProxy results for #processes and #threads:
>>>>     #   Threads-RPS   Procs-RPS
>>>>     1       20903        19280
>>>>     2       46400        51045
>>>>     4       96587       142801
>>>>     8      172224       254720
>>>>    12      210451       437488
>>>>    16      173034       437375
>>>>    24       79069       519367
>>>>    32       55607       586367
>>>>    48       31739       596148
>>>>
>>>> 5. Lock stats for 1.9-dev3: Some write locks on average took a lot more
>>>>    time to acquire, e.g. "POOL" and "TASK_WQ". For 48 threads, I get
>>>>    (read lock/unlock counts and read wait times were 0 for every lock):
>>>>
>>>>    Lock       write lock   write unlock      wait for write   wait for write/lock
>>>>    FD          143933900   143933895 (-5)    11370.245 msec        78.996 nsec
>>>>    TASK_RQ       2062874     2062875 (1)      7820.234 msec      3790.941 nsec
>>>>    TASK_WQ       2601227     2601227 (0)      5019.811 msec      1929.786 nsec
>>>>    POOL          2823393     2823393 (0)     11984.706 msec      4244.788 nsec
>>>>    LISTENER          184         184 (0)         0.011 msec        60.554 nsec
>>>>    PROXY          291557      291557 (0)       109.694 msec       376.235 nsec
>>>>    SERVER        1188511     1188511 (0)       854.171 msec       718.690 nsec
>>>>    LBPRM         1184709     1184709 (0)       778.947 msec       657.501 nsec
>>>>    BUF_WQ         669247      669247 (0)       252.265 msec       376.939 nsec
>>>>    STRMS            9335        9335 (0)         0.910 msec        97.492 nsec
>>>>    VARS           901947      901947 (0)       299.224 msec       331.753 nsec
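
If I remember correctly, these per-lock statistics are only produced by a build with lock debugging compiled in (DEBUG_THREAD), so reproducing the table above would need something like the following on top of the build line earlier; the exact define is from memory, not from the mail:

    make -j TARGET=linux2628 CPU=generic USE_OPENSSL=1 USE_ZLIB=1 USE_PCRE=1 DEBUG="-DDEBUG_THREAD"
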
>>>> 6. CPU utilization after the test for processes/threads: haproxy-1.9-dev3
>>>>    runs at 4800% (48 cpus) for 30 seconds after the test is done. For
>>>>    1.8.14, this behavior was not seen. Ran the following command for both:
>>>>        ss -tnp | awk '{print $1}' | sort | uniq -c | sort -n
>>>>
>>>>    1.8.14 during test:
>>>>        451 SYN-SENT
>>>>       9166 ESTAB
>>>>    1.8.14 after test:
>>>>          2 ESTAB
>>>>
>>>>    1.9-dev3 during test:
>>>>        109 SYN-SENT
>>>>       9400 ESTAB
>>>>    1.9-dev3 after test:
>>>>       2185 CLOSE-WAIT
>>>>       2187 ESTAB
>>>>
>>>>    All connections that were in CLOSE-WAIT were from the client, while all
>>>>    connections in the ESTAB state were to the server. This lasted for 30
>>>>    seconds. On the client system, all sockets were in the FIN-WAIT-2 state:
>>>>       2186 FIN-WAIT-2
>>>>    This (2185/2186) seems to imply that the client closed the connection but
>>>>    haproxy did not close the socket for 30 seconds. This also results in high
>>>>    CPU utilization on haproxy for some reason (100% for each process for 30
>>>>    seconds), which is also unexpected as the remote side has closed the
>>>>    socket.
>>>>
>>>> 7. Configuration file for process mode:
>>>>
>>>>    global
>>>>        daemon
>>>>        maxconn 26000
>>>>        nbproc 48
>>>>        stats socket /var/run/ha-1-admin.sock mode 600 level admin process 1
>>>>        # (and so on for 48 processes).
>>>>
>>>>    defaults
>>>>        option http-keep-alive
>>>>        balance leastconn
>>>>        retries 2
>>>>        option redispatch
>>>>        maxconn 25000
>>>>        option splice-response
>>>>        option tcp-smart-accept
>>>>        option tcp-smart-connect
>>>>        option splice-auto
>>>>        timeout connect 5000ms
>>>>        timeout client 30000ms
>>>>        timeout server 30000ms
>>>>        timeout client-fin 30000ms
>>>>        timeout http-request 10000ms
>>>>        timeout http-keep-alive 75000ms
>>>>        timeout queue 10000ms
>>>>        timeout tarpit 15000ms
>>>>
>>>>    frontend fk-fe-upgrade-80
>>>>        mode http
>>>>        default_backend fk-be-upgrade
>>>>        bind <VIP>:80 process 1
>>>>        # (and so on for 48 processes).
>>>>
>>>>    backend fk-be-upgrade
>>>>        mode http
>>>>        default-server maxconn 2000 slowstart
>>>>        # 58 server lines follow, e.g.: "server <name> <ip:80>"
>>>>
>>>> 8. Configuration file for thread mode:
>>>>
>>>>    global
>>>>        daemon
>>>>        maxconn 26000
>>>>        stats socket /var/run/ha-1-admin.sock mode 600 level admin
>>>>        nbproc 1
>>>>        nbthread 48
>>>>        # cpu-map auto:1/1-48 0-39
>>>>
>>>>    defaults
>>>>        option http-keep-alive
>>>>        balance leastconn
>>>>        retries 2
>>>>        option redispatch
>>>>        maxconn 25000
>>>>        option splice-response
>>>>        option tcp-smart-accept
>>>>        option tcp-smart-connect
>>>>        option splice-auto
>>>>        timeout connect 5000ms
>>>>        timeout client 30000ms
>>>>        timeout server 30000ms
>>>>        timeout client-fin 30000ms
>>>>        timeout http-request 10000ms
>>>>        timeout http-keep-alive 75000ms
>>>>        timeout queue 10000ms
>>>>        timeout tarpit 15000ms
>>>>
>>>>    frontend fk-fe-upgrade-80
>>>>        mode http
>>>>        bind <VIP>:80 process 1/1-48
>>>>        default_backend fk-be-upgrade
>>>>
>>>>    backend fk-be-upgrade
>>>>        mode http
>>>>        default-server maxconn 2000 slowstart
>>>>        # 58 server lines follow, e.g.: "server <name> <ip:80>"
>>>>
>>>> I had also captured 'perf' output for the system for threads vs processes,
>>>> and can send it later if required.
>>>>
>>>> Thanks,
>>>> - Krishna
>>>>
>>>
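
A small aside on the socket-state counts in point 6: ss can also filter by state and by port directly, which makes it easier to separate the client-facing port-80 sockets from the ones towards the backends without the awk/sort pipeline (port 80 on both sides is taken from the configs above):

    ss -tn state close-wait '( sport = :80 )'    # frontend sockets the client has already closed
    ss -tn state established '( dport = :80 )'   # sockets from haproxy towards the backends
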

