Re: haproxy 2.2-dev8-7867525 - 100% cpu usage on 1 core after config 'reload'

2020-05-29 Thread PiBa-NL

Hi Christopher,

On 29-5-2020 at 09:00, Christopher Faulet wrote:
> On 29/05/2020 at 00:45, PiBa-NL wrote:
>> Hi List,
>>
>> I noticed an issue with 2.2-dev8-release, and with 2.2-dev8-7867525 the
>> issue is still there: when a reload is 'requested' it fails to stop
>> the old worker..
>
> Hi Pieter,
>
> I was able to reproduce the bug. Thanks for the reproducer. I've fixed
> it. It should be ok now.



Thanks for the quick fix! It works for me.

Regards,
PiBa-NL (Pieter)




Re: haproxy 2.2-dev8-7867525 - 100% cpu usage on 1 core after config 'reload'

2020-05-29 Thread Christopher Faulet

On 29/05/2020 at 11:39, William Dauchy wrote:
> Hello Christopher,
>
> On Fri, May 29, 2020 at 9:02 AM Christopher Faulet wrote:
>> I was able to reproduce the bug. Thanks for the reproducer. I've fixed it. It
>> should be ok now.
>
> I believe you are referring to
> https://git.haproxy.org/?p=haproxy.git;a=commit;h=56192cc60b786f2c82925411d8b2ccd7d9f97d84
> right?
> I'm trying to hunt another loop issue in v2.2, that's why I prefer to
> remove unrelated noise.


Yes, it is this one.

--
Christopher Faulet



Re: haproxy 2.2-dev8-7867525 - 100% cpu usage on 1 core after config 'reload'

2020-05-29 Thread William Dauchy
Hello Christopher,

On Fri, May 29, 2020 at 9:02 AM Christopher Faulet wrote:
> I was able to reproduce the bug. Thanks for the reproducer. I've fixed it. It
> should be ok now.

I believe you are referring to
https://git.haproxy.org/?p=haproxy.git;a=commit;h=56192cc60b786f2c82925411d8b2ccd7d9f97d84
right?
I'm trying to hunt another loop issue in v2.2, that's why I prefer to
remove unrelated noise.

Thanks,
-- 
William



Re: haproxy 2.2-dev8-7867525 - 100% cpu usage on 1 core after config 'reload'

2020-05-29 Thread Christopher Faulet

On 29/05/2020 at 00:45, PiBa-NL wrote:
> Hi List,
>
> I noticed an issue with 2.2-dev8-release, and with 2.2-dev8-7867525 the
> issue is still there: when a reload is 'requested' it fails to stop
> the old worker.. The old worker shuts down most of its threads, but 1
> thread starts running at 100% cpu usage of a core. Not sure yet 'when'
> the issue was introduced exactly.. I've skipped quite a few dev releases
> and didn't have time to bisect it to a specific version/commit yet. I'll
> try and do that during the weekend if no one does it earlier ;)..
>
> Normally I don't use -W but am 'manually' restarting haproxy with -sf
> parameters.. but this seemed like the easier reproduction..
> Also I 'think' I noticed once that, despite the -W parameter and logging
> output that a worker was spawned, there was only 1 process running,
> but I couldn't reproduce that one again so far... Also I haven't tried to
> see if and how I can connect through the master to the old worker process
> yet... perhaps also something I can try later..
> I 'suspect' it has something to do with the healthchecks though... (and
> their refactoring, as I think happened.?.)
>
> Anyhow perhaps this is already enough for someone to take a closer look.?
> If more info is needed I'll try and provide :).



Hi Pieter,

I was able to reproduce the bug. Thanks for the reproducer. I've fixed it. It 
should be ok now.


--
Christopher Faulet



Re: haproxy 2.2-dev8-7867525 - 100% cpu usage on 1 core after config 'reload'

2020-05-28 Thread Tim Düsterhus
Pieter,

On 29.05.20 at 00:45, PiBa-NL wrote:
> I 'suspect' it has something to do with the healthchecks though... (and
> their refactoring, as I think happened.?.)

This appears to be correct.

> Anyhow perhaps this is already enough for someone to take a closer look.?
> If more info is needed I'll try and provide :).
> 
> Regards,
> PiBa-NL (Pieter)
> 
> *Reproduction (works 99% of the time..):*
>   haproxy -W -f /var/etc/haproxy-2020/haproxy.cfg
>   kill -s USR2 17683
> 
> *haproxy.cfg*
> frontend www
>     bind            127.0.0.1:81
>     mode            http
> backend testVPS_ipv4
>     mode            http
>     retries            3
>     option            httpchk OPTIONS /Test HTTP/1.1\r\nHost:\ test.test.nl
>     server            vps2a 192.168.30.10:80 id 10109 check inter 15000
> backend O365mailrelay
>     mode            tcp
>     option            smtpchk HELO
>     no option log-health-checks
>     server-template            O365smtp 2 test.mail.protection.outlook.com:25 id 122 check inter 1
> 

I can reproduce the issue with your configuration on

> HA-Proxy version 2.2-dev8-fa9d78-30 2020/05/28 - https://haproxy.org/
> Status: development branch - not safe for use in production.
> Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open
> Running on: Linux 4.4.0-179-generic #209-Ubuntu SMP Fri Apr 24 17:48:44 UTC 
> 2020 x86_64


Backtrace is as follows:

> (gdb) bt full
> #0  eb_next (node=0x19e2e60) at ebtree/ebtree.h:571
> t = 0x19e2e61
> #1  ebpt_next (ebpt=0x19e2e60) at ebtree/ebpttree.h:77
> No locals.
> #2  deinit_tcpchecks () at src/checks.c:5523
> rs = <optimized out>
> r = <optimized out>
> rb = <optimized out>
> node = 0x19e2e60
> #3  0x004d1da3 in deinit () at src/haproxy.c:2762

strace does not show any further activity.
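
A backtrace like the one above can be captured without an interactive gdb session. The sketch below is an assumption about workflow, not what was actually run in this thread: `busiest` and `dump_backtrace` are hypothetical helper names, and the `ps -L` call in the usage note is specific to Linux procps.

```shell
# Read "id cpu%" lines on stdin and print the id with the highest CPU value.
# Header lines and input where every value is 0 produce no output.
busiest() {
    awk 'BEGIN { max = 0 }
         ($2 + 0) > max { max = $2 + 0; id = $1 }
         END { if (id != "") print id }'
}

# Attach gdb to the given PID, print a full backtrace of every thread,
# then detach again (-batch exits after running the -ex command).
dump_backtrace() {
    gdb -p "$1" -batch -ex "thread apply all bt full"
}
```

On Linux, something like `ps -L -o tid,pcpu -p "$worker_pid" | busiest` would print the ID of the thread using the most CPU, and `dump_backtrace "$worker_pid"` would print the stacks; `$worker_pid` stands for the PID of the old worker that refuses to stop.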

Best regards
Tim Düsterhus



haproxy 2.2-dev8-7867525 - 100% cpu usage on 1 core after config 'reload'

2020-05-28 Thread PiBa-NL

Hi List,

I noticed an issue with 2.2-dev8-release, and with 2.2-dev8-7867525 the
issue is still there: when a reload is 'requested' it fails to stop
the old worker.. The old worker shuts down most of its threads, but 1
thread starts running at 100% cpu usage of a core. Not sure yet 'when'
the issue was introduced exactly.. I've skipped quite a few dev releases
and didn't have time to bisect it to a specific version/commit yet. I'll
try and do that during the weekend if no one does it earlier ;)..

Normally I don't use -W but am 'manually' restarting haproxy with -sf
parameters.. but this seemed like the easier reproduction..
Also I 'think' I noticed once that, despite the -W parameter and logging
output that a worker was spawned, there was only 1 process running,
but I couldn't reproduce that one again so far... Also I haven't tried to
see if and how I can connect through the master to the old worker process
yet... perhaps also something I can try later..
I 'suspect' it has something to do with the healthchecks though... (and
their refactoring, as I think happened.?.)

Anyhow perhaps this is already enough for someone to take a closer look.?
If more info is needed I'll try and provide :).

Regards,
PiBa-NL (Pieter)

*Reproduction (works 99% of the time..):*
  haproxy -W -f /var/etc/haproxy-2020/haproxy.cfg
  kill -s USR2 17683

*haproxy.cfg*
frontend www
    bind            127.0.0.1:81
    mode            http
backend testVPS_ipv4
    mode            http
    retries            3
    option            httpchk OPTIONS /Test HTTP/1.1\r\nHost:\ test.test.nl
    server            vps2a 192.168.30.10:80 id 10109 check inter 15000
backend O365mailrelay
    mode            tcp
    option            smtpchk HELO
    no option log-health-checks
    server-template            O365smtp 2 test.mail.protection.outlook.com:25 id 122 check inter 1
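
For anyone who wants to script the reproduction above, here is a minimal sketch in POSIX shell. It assumes the master PID is known to the caller, that the workers are direct children of the master (so `pgrep -P` finds them), and that 30 seconds is long enough for an idle worker to stop. The function names are made up for this example.

```shell
# Return 0 once process $1 no longer exists, 1 if it is still alive
# after $2 seconds. Polls once per second with "kill -0" (no signal sent).
wait_for_exit() {
    elapsed=0
    while kill -0 "$1" 2>/dev/null; do
        if [ "$elapsed" -ge "$2" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Ask the master ($1) to reload via USR2, then report on every worker that
# existed before the reload. Optional $2 is the per-worker timeout in seconds.
# Assumption: workers are direct children of the master process.
reload_and_check() {
    master=$1
    limit=${2:-30}
    old_workers=$(pgrep -P "$master")
    kill -s USR2 "$master" || return 2
    stuck=0
    for worker in $old_workers; do
        if wait_for_exit "$worker" "$limit"; then
            echo "worker $worker exited"
        else
            echo "worker $worker still running after ${limit}s"
            stuck=1
        fi
    done
    return "$stuck"
}
```

Usage would look like `reload_and_check 17683 30`, with 17683 being the master PID from the reproduction. A worker that is still draining real connections can also outlive the timeout, so a "still running" result only points at this bug when the process is otherwise idle and burning CPU.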


*haproxy -vv*
HA-Proxy version 2.2-dev8-7867525 2020/05/28 - https://haproxy.org/
Status: development branch - not safe for use in production.
Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open
Running on: FreeBSD 11.1-RELEASE FreeBSD 11.1-RELEASE #0 r321309: Fri 
Jul 21 02:08:28 UTC 2017 
r...@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64

Build options :
  TARGET  = freebsd
  CPU = generic
  CC  = cc
  CFLAGS  = -pipe -g -fstack-protector -fno-strict-aliasing 
-fno-strict-aliasing -Wdeclaration-after-statement -fwrapv 
-fno-strict-overflow -Wno-null-dereference -Wno-unused-label 
-Wno-unused-parameter -Wno-sign-compare -Wno-ignored-qualifiers 
-Wno-unused-command-line-argument -Wno-missing-field-initializers 
-Wno-address-of-packed-member -DFREEBSD_PORTS -DFREEBSD_PORTS
  OPTIONS = USE_PCRE=1 USE_PCRE_JIT=1 USE_STATIC_PCRE=1 
USE_GETADDRINFO=1 USE_OPENSSL=1 USE_LUA=1 USE_ACCEPT4=1 USE_ZLIB=1


Feature list : -EPOLL +KQUEUE -NETFILTER +PCRE +PCRE_JIT -PCRE2 
-PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED -BACKTRACE 
+STATIC_PCRE -STATIC_PCRE2 +TPROXY -LINUX_TPROXY -LINUX_SPLICE +LIBCRYPT 
-CRYPT_H +GETADDRINFO +OPENSSL +LUA -FUTEX +ACCEPT4 +ZLIB -SLZ 
+CPU_AFFINITY -TFO -NS -DL -RT -DEVICEATLAS -51DEGREES -WURFL -SYSTEMD 
-OBSOLETE_LINKER -PRCTL -THREAD_DUMP -EVPORTS


Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_THREADS=64, default=16).
Built with OpenSSL version : OpenSSL 1.0.2k-freebsd  26 Jan 2017
Running on OpenSSL version : OpenSSL 1.0.2k-freebsd  26 Jan 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
Built with Lua version : Lua 5.3.4
Built with clang compiler version 4.0.0 (tags/RELEASE_400/final 297347)
Built with transparent proxy support using: IP_BINDANY IPV6_BINDANY
Built with PCRE version : 8.40 2017-01-11
Running on PCRE version : 8.40 2017-01-11
PCRE library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"), 
deflate("deflate"), raw-deflate("deflate"), gzip("gzip")


Available polling systems :
 kqueue : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result OK
Total: 3 (3 usable), will use kqueue.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
            h2 : mode=HTTP   side=FE|BE mux=H2
          fcgi : mode=HTTP   side=BE    mux=FCGI
     <default> : mode=HTTP   side=FE|BE mux=H1
     <default> : mode=TCP    side=FE|BE mux=PASS

Available services : none

Available filters :
    [SPOE] spoe
    [CACHE] cache
    [FCGI] fcgi-app
    [TRACE] trace
    [COMP] compression