Re: Idle HAProxy 1.8 spins at 100% in user space

2018-03-12 Thread Cyril Bonté

On 12/03/2018 at 14:11, Krishna Kumar (Engineering) wrote:

> Hi Cyril,
>
> Thanks, this patch fixes it, it is now back to 0%. Confirmed it a few
> times, and undid the patch, back to 100%, and re-added the patch,
> back to 0%. Fixes perfectly.


Thanks for confirming, I'm preparing the patch with a reasonable commit 
message. It should be ready in a few minutes ;-)



--
Cyril Bonté



Re: Idle HAProxy 1.8 spins at 100% in user space

2018-03-12 Thread Krishna Kumar (Engineering)
Hi Cyril,

Thanks, this patch fixes it, it is now back to 0%. Confirmed it a few
times, and undid the patch, back to 100%, and re-added the patch,
back to 0%. Fixes perfectly.

Thanks,
- Krishna


On Mon, Mar 12, 2018 at 5:23 PM, Willy Tarreau wrote:

> On Mon, Mar 12, 2018 at 12:36:05PM +0100, Cyril Bonté wrote:
> > I confirm I can reproduce the issue once 32 (or more) threads are used:
> > the main process enters an endless loop.
> > I think the same issue may occur with nbproc on FreeBSD (the same code
> > in an #ifdef __FreeBSD__).
> >
> > Can you try the attached patch? I'll send a clean one later.
>
> Ah good catch, I'm pretty sure you nailed it down indeed! The fun thing
> is that the initial purpose of that patch was precisely to avoid this
> kind of annoying stuff in the first place!
>
> Cheers,
> Willy
>


Re: Idle HAProxy 1.8 spins at 100% in user space

2018-03-12 Thread Willy Tarreau
On Mon, Mar 12, 2018 at 12:36:05PM +0100, Cyril Bonté wrote:
> I confirm I can reproduce the issue once 32 (or more) threads are used: the
> main process enters an endless loop.
> I think the same issue may occur with nbproc on FreeBSD (the same code in an
> #ifdef __FreeBSD__).
>
> Can you try the attached patch? I'll send a clean one later.

Ah good catch, I'm pretty sure you nailed it down indeed! The fun thing
is that the initial purpose of that patch was precisely to avoid this
kind of annoying stuff in the first place!

Cheers,
Willy



Re: Idle HAProxy 1.8 spins at 100% in user space

2018-03-12 Thread Cyril Bonté
Hi Krishna and Willy, 

- Original message -
> From: "Krishna Kumar (Engineering)" <krishna...@flipkart.com>
> To: "HAProxy" <haproxy@formilux.org>
> Sent: Monday, 12 March 2018 07:48:50
> Subject: Idle HAProxy 1.8 spins at 100% in user space
> 
> As an aside, could someone also post a simple configuration file to
> enable 40 listeners (thread)?
> 
> I get 100% CPU utilization when running a high number of threads
> (>30, on a 48-core system). I have tried both of these versions:
> 
> HA-Proxy version 1.8.4-1ppa1~xenial 2018/02/10: Installed via .deb
> file
> HA-Proxy version 1.8.4-1deb90d 2018/02/08: Built from source
> http://www.haproxy.org/download/1.8/src/haproxy-1.8.4.tar.gz
> 
> 1. Distro/kernel: Ubuntu 16.04.1 LTS, 4.4.0-36-generic
> 
> 
> 2. Top:
> # top -d 1 -b | head -12
> top - 11:59:06 up 4 days, 41 min, 1 user, load average: 1.00, 1.00, 2.14
> Tasks: 492 total, 2 running, 464 sleeping, 0 stopped, 26 zombie
> %Cpu(s): 0.5 us, 0.2 sy, 0.0 ni, 99.2 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
> KiB Mem : 13191999+total, 9520 free, 1222684 used, 52917792 buff/cache
> KiB Swap: 0 total, 0 free, 0 used. 12986652+avail Mem
> 
> 
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 87994 haproxy 20 0 896624 14996 1468 R 100.0 0.0 3:09.60 haproxy
> 1 root 20 0 38856 7088 4132 S 0.0 0.0 0:08.69 systemd
> 2 root 20 0 0 0 0 S 0.0 0.0 0:00.08 kthreadd
> 3 root 20 0 0 0 0 S 0.0 0.0 4:05.79 ksoftirqd/0
> 5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
> 
> 
> 3. As to what it is doing:
> %Cpu0 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu1 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu2 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> 
> 
> 4. Minimal configuration file to reproduce this (using this blog:
> https://www.haproxy.com/blog/multithreading-in-haproxy/ ):
> 
> 
> 
> global
> daemon
> nbproc 1
> nbthread 40
> cpu-map auto:1/1-40 0-39

I confirm I can reproduce the issue once 32 (or more) threads are used: the
main process enters an endless loop.
I think the same issue may occur with nbproc on FreeBSD (the same code in an
#ifdef __FreeBSD__).

Can you try the attached patch? I'll send a clean one later.

> 
> 
> frontend test-fe
> mode http
> bind 10.33.110.118:80 process all/all
> use_backend test-be
> 
> 
> backend test-be
> mode http
> server 10.33.5.62 10.33.5.62:80 weight 255
> 
> 
> 5. Problem disappears when "cpu-map auto:1/1-40 0-39" is commented out.
> Same strace output, so it is in user space as shown by 'top' above.
> 
> 
> 6. Version/build (gcc version 5.4.0 20160609 (Ubuntu
> 5.4.0-6ubuntu1~16.04.2))
> 
> 
> 
> # haproxy -vv
> HA-Proxy version 1.8.4-1deb90d 2018/02/08
> Copyright 2000-2018 Willy Tarreau < wi...@haproxy.org >
> 
> 
> Build options :
> TARGET = linux2628
> CPU = generic
> CC = gcc
> CFLAGS = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
> -fwrapv -Wno-unused-label
> OPTIONS = USE_ZLIB=yes USE_OPENSSL=1 USE_PCRE=1
> 
> 
> Default settings :
> maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents =
> 200
> 
> 
> Built with OpenSSL version : OpenSSL 1.0.2g 1 Mar 2016
> Running on OpenSSL version : OpenSSL 1.0.2g 1 Mar 2016
> OpenSSL library supports TLS extensions : yes
> OpenSSL library supports SNI : yes
> OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2
> Built with transparent proxy support using: IP_TRANSPARENT
> IPV6_TRANSPARENT IP_FREEBIND
> Encrypted password support via crypt(3): yes
> Built with multi-threading support.
> Built with PCRE version : 8.38 2015-11-23
> Running on PCRE version : 8.38 2015-11-23
> PCRE library supports JIT : no (USE_PCRE_JIT not set)
> Built with zlib version : 1.2.8
> Running on zlib version : 1.2.8
> Compression algorithms supported : identity("identity"),
> deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
> Built with network namespace support.
> 
> 
> Available polling systems :
> epoll : pref=300, test result OK
> poll : pref=200, test result OK
> select : pref=150, test result OK
> Total: 3 (3 usable), will use epoll.
> 
> 
> Available filters :
> [SPOE] spoe
> [COMP] compression
> [TRACE] trace
> 
> 
> 7. Strace of the process:
> 88033 11:57:18.946030 <... epoll_wait resumed> [], 200, 1000) = 0
> <1.001144>
> 88032 11:57:18.946046 <... epoll_wait resumed> [], 200, 1000) = 0
> <1.001149>
> 88033 11:57:18.946078 epoll_wait(47, <unfinished ...>
> 88034 11:57:18.946092 epoll_wait(48, <unfinished ...>
> 88

Idle HAProxy 1.8 spins at 100% in user space

2018-03-12 Thread Krishna Kumar (Engineering)
As an aside, could someone also post a simple configuration file to
enable 40 listeners (thread)?

I get 100% CPU utilization when running a high number of threads
(>30, on a 48-core system). I have tried both of these versions:

HA-Proxy version 1.8.4-1ppa1~xenial 2018/02/10: Installed via .deb file
HA-Proxy version 1.8.4-1deb90d 2018/02/08: Built from source
 http://www.haproxy.org/download/1.8/src/haproxy-1.8.4.tar.gz


1. Distro/kernel: Ubuntu 16.04.1 LTS, 4.4.0-36-generic

2. Top:
# top -d 1 -b  | head -12
top - 11:59:06 up 4 days, 41 min,  1 user,  load average: 1.00, 1.00, 2.14
Tasks: 492 total,   2 running, 464 sleeping,   0 stopped,  26 zombie
%Cpu(s):  0.5 us,  0.2 sy,  0.0 ni, 99.2 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem : 13191999+total,     9520 free,  1222684 used, 52917792 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 12986652+avail Mem

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 87994 haproxy   20   0  896624  14996   1468 R 100.0  0.0   3:09.60 haproxy
     1 root      20   0   38856   7088   4132 S   0.0  0.0   0:08.69 systemd
     2 root      20   0       0      0      0 S   0.0  0.0   0:00.08 kthreadd
     3 root      20   0       0      0      0 S   0.0  0.0   4:05.79 ksoftirqd/0
     5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H

3.  As to what it is doing:
%Cpu0  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

4. Minimal configuration file to reproduce this (using this blog:
   https://www.haproxy.com/blog/multithreading-in-haproxy/):

global
daemon
nbproc 1
nbthread 40
cpu-map auto:1/1-40 0-39

frontend test-fe
mode http
bind 10.33.110.118:80 process all/all
use_backend test-be

backend test-be
mode http
server 10.33.5.62 10.33.5.62:80 weight 255

5. Problem disappears when "cpu-map auto:1/1-40 0-39" is commented out.
Same strace output, so it is in user space as shown by 'top' above.

6. Version/build (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.2))

# haproxy -vv
HA-Proxy version 1.8.4-1deb90d 2018/02/08
Copyright 2000-2018 Willy Tarreau 

Build options :
  TARGET  = linux2628
  CPU = generic
  CC  = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -Wno-unused-label
  OPTIONS = USE_ZLIB=yes USE_OPENSSL=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.2g  1 Mar 2016
Running on OpenSSL version : OpenSSL 1.0.2g  1 Mar 2016
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.38 2015-11-23
Running on PCRE version : 8.38 2015-11-23
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with zlib version : 1.2.8
Running on zlib version : 1.2.8
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
  epoll : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace

7. Strace of the process:
88033 11:57:18.946030 <... epoll_wait resumed> [], 200, 1000) = 0 <1.001144>
88032 11:57:18.946046 <... epoll_wait resumed> [], 200, 1000) = 0 <1.001149>
88033 11:57:18.946078 epoll_wait(47, <unfinished ...>
88034 11:57:18.946092 epoll_wait(48, <unfinished ...>
88032 11:57:18.946104 epoll_wait(46, <unfinished ...>
88031 11:57:18.946115 <... epoll_wait resumed> [], 200, 1000) = 0 <1.001153>
88030 11:57:18.946128 <... epoll_wait resumed> [], 200, 1000) = 0 <1.001154>
88029 11:57:18.946140 <... epoll_wait resumed> [], 200, 1000) = 0 <1.001155>
88028 11:57:18.946152 <... epoll_wait resumed> [], 200, 1000) = 0 <1.001216>
88031 11:57:18.946169 epoll_wait(44, <unfinished ...>
88027 11:57:18.946181 <... epoll_wait resumed> [], 200, 1000) = 0 <1.001183>
88030 11:57:18.946196 epoll_wait(43, <unfinished ...>
88029 11:57:18.946208 epoll_wait(40, <unfinished ...>
88028 11:57:18.946219 epoll_wait(39, <unfinished ...>
88027 11:57:18.946231 epoll_wait(38, <unfinished ...>
88026 11:57:18.946244 <... epoll_wait resumed> [], 200, 1000) = 0 <1.001296>
88025 11:57:18.946257 <... epoll_wait resumed> [], 200, 1000) = 0 <1.001248>
88024 11:57:18.946269 <... epoll_wait resumed> [], 200, 1000) = 0 <1.001226>
88023 11:57:18.946282 <... epoll_wait resumed> [], 200, 1000) = 0 <1.001210>
88026 11:57:18.946293 epoll_wait(37, <unfinished ...>
88022 11:57:18.946307 <... epoll_wait resumed> [], 200, 1000) = 0 <1.001224>
88025 11:57:18.946320 epoll_wait(36,