Re: dns fails to process response / hold valid? (since commit 2.2-dev0-13a9232)
Hi List, Hereby a little bump. Can someone take a look? (maybe the pcap attachment didn't fly well through spam filters. (or the email formatting..)?) (or because i (wrongly?) chose to include Baptiste specifically in my addressing (he committed the original patch that caused the change in behaviour)..) Anyhow the current '2.2-dev2-a71667c, released 2020/02/17' is still affected. If someone was already planning to, please don't feel 'pushed' by this mail. i'm just trying to make sure this doesn't fall through the cracks :). Regards, PiBa-NL (Pieter) Op 9-2-2020 om 15:35 schreef PiBa-NL: Hi List, Baptiste, After updating haproxy i found that the DNS resolver is no longer working for me. Also i wonder about the exact effect that 'hold valid' should have. I pointed haproxy to a 'Unbound 1.9.4' dns server that does the recursive resolving of the dns request made by haproxy. Before commit '2.2-dev0-13a9232, released 2020/01/22 (use additional records from SRV responses)' i get seemingly proper working resolving of server a name. After this commit all responses are counted as 'invalid' in the socket stats. Attached also a pcap of the dns traffic. Which shows a short capture of a single attempt where 3 retries for both A and records show up. There is a additional record of type 'OPT' is present in the response.. But the exact same keeps repeating every 5 seconds. As for 'hold valid' (tested with the commit before this one) it seems that the stats page of haproxy shows the server in 'resolution' status way before the 3 minute 'hold valid' has passed when i simply disconnect the network of the server running the Unbound-DNS server. Though i guess that is less important that dns working at all in the first place.. If any additional information is needed please let me know :). Can you/someone take a look? Thanks in advance. p.s. 
i think i read something about a 'vtest' that can test the haproxy DNS functionality, if you have a example that does this i would be happy to provide a vtest with a reproduction of the issue though i guess it will be kinda 'slow' if it needs to test for hold valid timings.. Regards, PiBa-NL (Pieter) haproxy config: resolvers globalresolvers nameserver pfs_routerbox 192.168.0.18:53 resolve_retries 3 timeout retry 200 hold valid 3m hold nx 10s hold other 15s hold refused 20s hold timeout 25s hold obsolete 30s timeout resolve 5s frontend nu_nl bind 192.168.0.19:433 name 192.168.0.19:433 ssl crt-list /var/etc/haproxy/nu_nl.crt_list mode http log global option http-keep-alive timeout client 3 use_backend nu.nl_ipvANY backend nu.nl_ipvANY mode http id 2113 log global timeout connect 3 timeout server 3 retries 3 option httpchk GET / HTTP/1.0\r\nHost:\ nu.nl\r\nAccept:\ */* server nu_nl nu.nl:443 id 2114 ssl check inter 1 verify none resolvers globalresolvers check-sni nu.nl resolve-prefer ipv4 haproxy_socket.sh show resolvers Resolvers section globalresolvers nameserver pfs_routerbox: sent: 216 snd_error: 0 valid: 0 update: 0 cname: 0 cname_error: 0 any_err: 108 nx: 0 timeout: 0 refused: 0 other: 0 invalid: 108 too_big: 0 truncated: 0 outdated: 0 haproxy -vv HA-Proxy version 2.2-dev0-13a9232 2020/01/22 - https://haproxy.org/ Status: development branch - not safe for use in production. 
Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open Build options : TARGET = freebsd CPU = generic CC = cc CFLAGS = -pipe -g -fstack-protector -fno-strict-aliasing -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -fno-strict-overflow -Wno-null-dereference -Wno-unused-label -Wno-unused-parameter -Wno-sign-compare -Wno-ignored-qualifiers -Wno-unused-command-line-argument -Wno-missing-field-initializers -Wno-address-of-packed-member -DFREEBSD_PORTS -DFREEBSD_PORTS OPTIONS = USE_PCRE=1 USE_PCRE_JIT=1 USE_REGPARM=1 USE_STATIC_PCRE=1 USE_GETADDRINFO=1 USE_OPENSSL=1 USE_LUA=1 USE_ACCEPT4=1 USE_ZLIB=1 Feature list : -EPOLL +KQUEUE -MY_EPOLL -MY_SPLICE -NETFILTER +PCRE +PCRE_JIT -PCRE2 -PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED +REGPARM +STATIC_PCRE -STATIC_PCRE2 +TPROXY -LINUX_TPROXY -LINUX_SPLICE +LIBCRYPT -CRYPT_H -VSYSCALL +GETADDRINFO +OPENSSL +LUA -FUTEX +ACCEPT4 -MY_ACCEPT4 +ZLIB -SLZ +CPU_AFFINITY -TFO -NS -DL -RT -DEVICEATLAS -51DEGREES -WURFL -SYSTEMD -OBSOLETE_LINKER -PRCTL -THREAD_DUMP -EVPORTS Default settings : bufsize = 16384, maxrewrite = 1024, maxpollevents = 200 Built with mult
Re: dns fails to process response / hold valid? (since commit 2.2-dev0-13a9232)
Hi Baptiste, Op 19-2-2020 om 13:06 schreef Baptiste: Hi, I found a couple of bugs in that part of the code. Can you please try the attached patch? (0001 is useless but I share it too in case of) Works for me, thanks! It will allow parsing of additional records for SRV queries only and when done, will silently ignore any record which are not A or . @maint team, please don't apply the patch yet, I want to test it much more before. When the final patch is ready ill be happy to give it a try as well. Baptiste On a side note. With config below i would expect 2 servers with status 'MAINT(resolving)'. Using this configuration in Unbound (4 server IP's defined.): server: local-data: "_https._tcp.pkg.test.tld 3600 IN SRV 0 100 80 srv1.test.tld" local-data: "_https._tcp.pkg.test.tld 3600 IN SRV 0 100 80 srv2.test.tld" local-data: "srv1.test.tld 3600 IN A 192.168.0.51" local-data: "srv2.test.tld 3600 IN A 192.168.0.52" local-data: "srvX.test.tld 3600 IN A 192.168.0.53" local-data: "srvX.test.tld 3600 IN A 192.168.0.54" And this in a HAProxy backend: server-template PB_SRVrecords 3 ipv4@_https._tcp.pkg.test.tld:77 id 10110 check inter 18 resolvers globalresolvers resolve-prefer ipv4 server-template PB_multipleA 3 i...@srvx.test.tld:78 id 10111 check inter 18 resolvers globalresolvers resolve-prefer ipv4Results in 6 servers, but 1 is This results in 6 servers of which 1 server has 'MAINT(resolution)' status and 1 has an IP of 0.0.0.0 but shows as 'DOWN'. I would have expected 2 servers with status MAINT.? (p.s. none of the IP's actually exist on my network so that the other servers are also shown as down is correct..) 
PB_ipv4,PB_SRVrecords1,0,0,0,0,,0,0,0,,0,,0,0,0,0,DOWN,1,1,0,1,1,124,124,,1,10102,10110,,0,,2,0,,0,L4CON,,74995,0,0,0,0,0,0,0,0,-1,,,0,0,0,0Layer4 connection problem,,2,3,0192.168.0.51:80,,http0,0,0,,,0,,0,0,0,0,0, PB_ipv4,PB_SRVrecords2,0,0,0,0,,0,0,0,,0,,0,0,0,0,DOWN,1,1,0,1,1,94,94,,1,10102,2,,0,,2,0,,0,L4CON,,75029,0,0,0,0,0,0,0,0,-1,,,0,0,0,0Layer4 connection problem,,2,3,0192.168.0.52:80,,http0,0,0,,,0,,0,0,0,0,0, PB_ipv4,PB_SRVrecords3,0,0,0,0,,0,0,0,,0,,0,0,0,0,DOWN,1,1,0,1,1,64,64,,1,10102,3,,0,,2,0,,0,L4CON,,75039,0,0,0,0,0,0,0,0,-1,,,0,0,0,0Layer4 connection problem,,2,3,00.0.0.0:77,,http0,0,0,,,0,,0,0,0,0,0, PB_ipv4,PB_multipleA1,0,0,0,0,,0,0,0,,0,,0,0,0,0,DOWN,1,1,0,1,2,34,34,,1,10102,10111,,0,,2,0,,0,L4CON,,75002,0,0,0,0,0,0,0,0,-1,,,0,0,0,0Layer4 connection problem,,2,3,0192.168.0.53:78,,http0,0,0,,,0,,0,0,0,0,0, PB_ipv4,PB_multipleA2,0,0,0,0,,0,0,0,,0,,0,0,0,0,DOWN,1,1,0,1,2,4,4,,1,10102,5,,0,,2,0,,0,L4CON,,75014,0,0,0,0,0,0,0,0,-1,,,0,0,0,0Layer4 connection problem,,2,3,0192.168.0.54:78,,http0,0,0,,,0,,0,0,0,0,0, PB_ipv4,PB_multipleA3,0,0,0,0,,0,0,0,,0,,0,0,0,0,MAINT (resolution),1,1,0,0,1,199,199,,1,10102,6,,0,,2,0,,00,0,0,0,0,0,0,0,-1,,,0,0,0,00.0.0.0:78,,http0,0,0,,,0,,0,0,0,0,0, If additional info is desired, please let me know :). On Tue, Feb 18, 2020 at 2:03 PM Baptiste <mailto:bed...@gmail.com>> wrote: Hi guys, Thx Tim for investigating. I'll check the PCAP and see why such behavior happens. Baptiste On Tue, Feb 18, 2020 at 12:09 AM Tim Düsterhus mailto:t...@bastelstu.be>> wrote: Pieter, Am 09.02.20 um 15:35 schrieb PiBa-NL: > Before commit '2.2-dev0-13a9232, released 2020/01/22 (use additional > records from SRV responses)' i get seemingly proper working resolving of > server a name. > After this commit all responses are counted as 'invalid' in the socket > stats. I can confirm the issue with the provided configuration. 
The 'if (len == 0) {' check in line 1045 of the commit causes HAProxy to consider the responses 'invalid': Thanks for confirming :). https://github.com/haproxy/haproxy/commit/13a9232ebc63fdf357ffcf4fa7a1a5e77a1eac2b#diff-b2ddf457bc423779995466f7d8b9d147R1045-R1048 Best regards Tim Düsterhus Regards, PiBa-NL (Pieter)
commit 493d9dc makes a SVN-checkout stall..
Hi List, Willy, Today i thought lets give v2.2-dev5 a try for my production environment ;). Soon it turned out to cause SVN-Checkout to stall/disconnect for a repository we run locally in a Collab-SVN server. I tracked it down to this commit: 493d9dc (MEDIUM: mux-h1: do not blindly wake up the tasklet at end of request anymore) causing the problem for the first time. Is there something tricky there that can be suspected to cause the issue.? Perhaps a patch i can try? While 'dissecting' the issue i deleted the whole directory each time and performed a new svn-checkout several times. It doesn't always stall at the exact same point but usually after checking out around +- 20 files something between 0.5 and 2 MB. , the commit before that one allows me to checkout 500+MB through haproxy without issue.. A wireshark seems to show that haproxy is sending several of RST,ACK packets for a 4 different connections to the svn-server at the same milisecond after it was quiet for 2 seconds.. The whole issue happens in a timeframe of start of checkout till when it stalls within 15 seconds. The 'nokqueue' i usually try on my FreeBSD machine doesn't change anything. Hope you have an idea where to look. Providing captures/logs is a bit difficult without some careful scrubbing.. Regards, PiBa-NL (Pieter) ### Complete config (that still reproduces the issue.. things cant get much simpler than this..): frontend InternalSites.8.6-merged bind 192.168.8.67:80 mode http use_backend APP01-JIRA-SVN_ipvANY backend APP01-JIRA-SVN_ipvANY mode http server svn 192.168.104.20:8080 ### uname -a FreeBSD freebsd11 11.1-RELEASE FreeBSD 11.1-RELEASE #0 r321309: Fri Jul 21 02:08:28 UTC 2017 r...@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64 ### haproxy -vv HA-Proxy version 2.2-dev5-3e128fe 2020/03/24 - https://haproxy.org/ Status: development branch - not safe for use in production. 
Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open Build options : TARGET = freebsd CPU = generic CC = cc CFLAGS = -pipe -g -fstack-protector -fno-strict-aliasing -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -fno-strict-overflow -Wno-null-dereference -Wno-unused-label -Wno-unused-parameter -Wno-sign-compare -Wno-ignored-qualifiers -Wno-unused-command-line-argument -Wno-missing-field-initializers -Wno-address-of-packed-member -DFREEBSD_PORTS -DFREEBSD_PORTS OPTIONS = USE_PCRE=1 USE_PCRE_JIT=1 USE_STATIC_PCRE=1 USE_GETADDRINFO=1 USE_OPENSSL=1 USE_LUA=1 USE_ACCEPT4=1 USE_ZLIB=1 Feature list : -EPOLL +KQUEUE -NETFILTER +PCRE +PCRE_JIT -PCRE2 -PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED -BACKTRACE +STATIC_PCRE -STATIC_PCRE2 +TPROXY -LINUX_TPROXY -LINUX_SPLICE +LIBCRYPT -CRYPT_H +GETADDRINFO +OPENSSL +LUA -FUTEX +ACCEPT4 +ZLIB -SLZ +CPU_AFFINITY -TFO -NS -DL -RT -DEVICEATLAS -51DEGREES -WURFL -SYSTEMD -OBSOLETE_LINKER -PRCTL -THREAD_DUMP -EVPORTS Default settings : bufsize = 16384, maxrewrite = 1024, maxpollevents = 200 Built with multi-threading support (MAX_THREADS=64, default=16). 
Built with OpenSSL version : OpenSSL 1.0.2k-freebsd 26 Jan 2017 Running on OpenSSL version : OpenSSL 1.0.2k-freebsd 26 Jan 2017 OpenSSL library supports TLS extensions : yes OpenSSL library supports SNI : yes OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2 Built with Lua version : Lua 5.3.4 Built with transparent proxy support using: IP_BINDANY IPV6_BINDANY Built with PCRE version : 8.40 2017-01-11 Running on PCRE version : 8.40 2017-01-11 PCRE library supports JIT : yes Encrypted password support via crypt(3): yes Built with zlib version : 1.2.11 Running on zlib version : 1.2.11 Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip") Available polling systems : kqueue : pref=300, test result OK poll : pref=200, test result OK select : pref=150, test result OK Total: 3 (3 usable), will use kqueue. Available multiplexer protocols : (protocols marked as cannot be specified using 'proto' keyword) h2 : mode=HTTP side=FE|BE mux=H2 fcgi : mode=HTTP side=BE mux=FCGI : mode=HTTP side=FE|BE mux=H1 : mode=TCP side=FE|BE mux=PASS Available services : none Available filters : [SPOE] spoe [CACHE] cache [FCGI] fcgi-app [TRACE] trace [COMP] compression
Re: commit 493d9dc makes a SVN-checkout stall..
Hi Olivier, Willy, Just to confirm, as expected it (c3500c3) indeed works for me :). Thanks for the quick fix! Regards, PiBa-NL (Pieter) Op 25-3-2020 om 17:16 schreef Willy Tarreau: On Wed, Mar 25, 2020 at 05:08:03PM +0100, Olivier Houchard wrote: That is... interesting, not sure I reached such an outstanding result. Oh I stopped trying to guess long ago :-) This is now fixed, sorry about that ! Confirmed, much better now, thanks! Willy
Re: [PATCHES] dns related
Hi Baptiste, Op 26-3-2020 om 12:46 schreef William Lallemand: On Wed, Mar 25, 2020 at 11:15:37AM +0100, Baptiste wrote: Hi there, A couple of patches here to cleanup and fix some bugs introduced by 13a9232ebc63fdf357ffcf4fa7a1a5e77a1eac2b. Baptiste Thanks Baptiste, merged. Thanks for this one. Question though, are you still working on making a non-existing server template go into 'resolution' state? See below/attached picture with some details.. (or see https://www.mail-archive.com/haproxy@formilux.org/msg36373.html ) Regards, PiBa-NL (Pieter)
server-state application failed for server 'x/y', invalid srv_admin_state value '32'
Hi List, Using 2.2-dev5-c3500c3, I've got both a server and a servertemplate/server that are marked 'down' due to dns not replying with (enough) records. That by itself is alright.. (and likely has been like that for a while so i don't think its a regression.) But when i perform a 'seemless reload' with a serverstates file it causes the warnings below for both server and template.: [WARNING] 095/150909 (74796) : server-state application failed for server 'x/y', invalid srv_admin_state value '32' [WARNING] 095/150909 (74796) : server-state application failed for server 'x2/z3', invalid srv_admin_state value '32' Is there a way to get rid of these warnings, and if 32 is a invalid value, how did it get into the state file at all? ## Severely cut down config snippet..: backend x server y AppSrv:8084 id 161 check inter 1 weight 1 resolvers globalresolvers backend x2 server-template z 3 smtp.company.tld:25 id 167 check inter 1 weight 10 resolvers globalresolvers One could argue that my backend x should have a better dns name configured, if it doesn't exists i apparently messed up something.. For the second x2 backend though isn't it 'normal' to have a template sizing account for 'future growth' of the cluster? And as such always have some extra template-servers available that are in 'MAINT / resolution' state? As such when 2 servers of x2 are up, and the 3rd is in resolution state, it shouldn't warn on a restart imho as its to be expected for most setups.?. Not sure if its a bug or a feature request, but i do think it should be changed :). Can it be added to some todo list? Thanks. Thanks and regards, PiBa-NL (Pieter)
Re: server-state application failed for server 'x/y', invalid srv_admin_state value '32'
Hi Baptiste, Op 6-4-2020 om 11:43 schreef Baptiste: Hi Piba, my answers inline. Using 2.2-dev5-c3500c3, I've got both a server and a servertemplate/server that are marked 'down' due to dns not replying with (enough) records. That by itself is alright.. (and likely has been like that for a while so i don't think its a regression.) You're right, this has always been like that. For the 'regression part' i was thinking about the warnings below which where 'likely' like that before 2.2-dev5 as well, it wasn't about marking servers as down which is totally expected & desired ;) i seem to read from your response here like you thought i thought otherwise...(damn that gets hard to understand, sorry..) And as you have confirmed its already causing the warnings in 1.8 as well.. So not a regression in 2.2-dev5 itself. But when i perform a 'seemless reload' with a serverstates file it causes the warnings below for both server and template.: [WARNING] 095/150909 (74796) : server-state application failed for server 'x/y', invalid srv_admin_state value '32' [WARNING] 095/150909 (74796) : server-state application failed for server 'x2/z3', invalid srv_admin_state value '32' Is there a way to get rid of these warnings, and if 32 is a invalid value, how did it get into the state file at all? I can confirm this is not supposed to happen! And I could reproduce this behavior since HAProxy 1.8. Not sure if its a bug or a feature request, but i do think it should be changed :). Can it be added to some todo list? Thanks. This is a bug from my point of view. I'll check this. Could you please open a github issue and tag me in there? Done: https://github.com/haproxy/haproxy/issues/576 Baptiste Thanks and regards, PiBa-NL (Pieter)
Re: disabling test if ipv6 not supported ?
Hi Ilya, Op 21-5-2020 om 04:57 schreef Илья Шипицин: Hello, seems, freebsd images on cirrus-ci run with no ipv6 support https://cirrus-ci.com/task/6613883307687936 It fails on srv3 configuration, any ideay why doesn't it complain about srv2 as that also seems to me it uses IPv6..? any idea how we can skip such tests ? I think the test or code should get fixed.. not skipped because it fails. Note that this specific test recently got this extra 'addr ::1' server check parameter on srv3. Perhaps that that syntax is written/parsed wrongly? Cheers, Ilya Shipitcin Regards, PiBa-NL (Pieter)
haproxy 2.2-dev8-7867525 - 100% cpu usage on 1 core after config 'reload'
Hi List, I noticed a issue with 2.2-dev8-release and with 2.2-dev8-7867525 the issue is still there that when a reload is 'requested' it fails to stop the old worker.. The old worker shuts down most of its threads, but 1 thread starts running at 100% cpu usage of a core. Not sure yet 'when' the issue was introduced exactly.. Ive skiped quite a few dev releases and didnt have time to disect it to a specific version/commit yet. Ill try and do that during the weekend i noone does it earlier ;).. Normally dont use -W but am 'manually' restarting haproxy with -sf parameters.. but this seemed like the easier reproduction.. Also i 'think' i noticed once that dispite the -W parameter and logging output that a worker was spawned that there was only 1 process running, but couldnt reproduce that one sofar again... Also i havnt tried to see if and how i can connect through the master to the old worker process yet... perhaps also something i can try later.. I 'suspect' it has something to do with the healthchecks though... (and their refactoring as i think happened.?.) Anyhow perhaps this is already enough for someone to take a closer look.? If more info is needed ill try and provide :). Regards, PiBa-NL (Pieter) *Reproduction (works 99% of the time..):* haproxy -W -f /var/etc/haproxy-2020/haproxy.cfg kill -s USR2 17683 *haproxy.cfg* frontend www bind 127.0.0.1:81 mode http backend testVPS_ipv4 mode http retries 3 option httpchk OPTIONS /Test HTTP/1.1\r\nHost:\ test.test.nl server vps2a 192.168.30.10:80 id 10109 check inter 15000 backend O365mailrelay mode tcp option smtpchk HELO no option log-health-checks server-template O365smtp 2 test.mail.protection.outlook.com:25 id 122 check inter 1 *haproxy -vv* HA-Proxy version 2.2-dev8-7867525 2020/05/28 - https://haproxy.org/ Status: development branch - not safe for use in production. 
Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open Running on: FreeBSD 11.1-RELEASE FreeBSD 11.1-RELEASE #0 r321309: Fri Jul 21 02:08:28 UTC 2017 r...@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64 Build options : TARGET = freebsd CPU = generic CC = cc CFLAGS = -pipe -g -fstack-protector -fno-strict-aliasing -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -fno-strict-overflow -Wno-null-dereference -Wno-unused-label -Wno-unused-parameter -Wno-sign-compare -Wno-ignored-qualifiers -Wno-unused-command-line-argument -Wno-missing-field-initializers -Wno-address-of-packed-member -DFREEBSD_PORTS -DFREEBSD_PORTS OPTIONS = USE_PCRE=1 USE_PCRE_JIT=1 USE_STATIC_PCRE=1 USE_GETADDRINFO=1 USE_OPENSSL=1 USE_LUA=1 USE_ACCEPT4=1 USE_ZLIB=1 Feature list : -EPOLL +KQUEUE -NETFILTER +PCRE +PCRE_JIT -PCRE2 -PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED -BACKTRACE +STATIC_PCRE -STATIC_PCRE2 +TPROXY -LINUX_TPROXY -LINUX_SPLICE +LIBCRYPT -CRYPT_H +GETADDRINFO +OPENSSL +LUA -FUTEX +ACCEPT4 +ZLIB -SLZ +CPU_AFFINITY -TFO -NS -DL -RT -DEVICEATLAS -51DEGREES -WURFL -SYSTEMD -OBSOLETE_LINKER -PRCTL -THREAD_DUMP -EVPORTS Default settings : bufsize = 16384, maxrewrite = 1024, maxpollevents = 200 Built with multi-threading support (MAX_THREADS=64, default=16). 
Built with OpenSSL version : OpenSSL 1.0.2k-freebsd 26 Jan 2017 Running on OpenSSL version : OpenSSL 1.0.2k-freebsd 26 Jan 2017 OpenSSL library supports TLS extensions : yes OpenSSL library supports SNI : yes OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2 Built with Lua version : Lua 5.3.4 Built with clang compiler version 4.0.0 (tags/RELEASE_400/final 297347) Built with transparent proxy support using: IP_BINDANY IPV6_BINDANY Built with PCRE version : 8.40 2017-01-11 Running on PCRE version : 8.40 2017-01-11 PCRE library supports JIT : yes Encrypted password support via crypt(3): yes Built with zlib version : 1.2.11 Running on zlib version : 1.2.11 Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip") Available polling systems : kqueue : pref=300, test result OK poll : pref=200, test result OK select : pref=150, test result OK Total: 3 (3 usable), will use kqueue. Available multiplexer protocols : (protocols marked as cannot be specified using 'proto' keyword) h2 : mode=HTTP side=FE|BE mux=H2 fcgi : mode=HTTP side=BE mux=FCGI : mode=HTTP side=FE|BE mux=H1 : mode=TCP side=FE|BE mux=PASS Available services : none Available filters : [SPOE] spoe [CACHE] cache [FCGI] fcgi-app [TRACE] trace [COMP] compression
Re: haproxy 2.2-dev8-7867525 - 100% cpu usage on 1 core after config 'reload'
Hi Christopher, Op 29-5-2020 om 09:00 schreef Christopher Faulet: Le 29/05/2020 à 00:45, PiBa-NL a écrit : Hi List, I noticed a issue with 2.2-dev8-release and with 2.2-dev8-7867525 the issue is still there that when a reload is 'requested' it fails to stop the old worker.. Hi Pieter, I was able to reproduce the bug. Thanks for the reproducer. I've fixed it. It should be ok now. Thanks for the quick fix! It works for me. Regards, PiBa-NL (Pieter)
Re: set ssl ocsp-response working only if we already have an ocsp record
Op 9-2-2017 om 7:58 schreef Willy Tarreau: Hi Olivier, On Mon, Jan 23, 2017 at 08:31:13PM +0100, Olivier Doucet wrote: Hello, I'm actually implementing OCSP stapling on my haproxy instance. It seems we can update ocsp (with set ssl ocsp-response on socket) only if a previous OCSP record exist. For example : Case #1 - start haproxy without any ocsp file - set ssl ocsp-response $(base64 file.ocsp) => OCSP single response: Certificate ID does not match any certificate or issuer. Case #2 - start haproxy with ocsp file - set ssl ocsp-response [ with same OCSP response file ] => "OCSP Response updated!" Is this an expected behaviour ? I'm not surprized since the initial purpose was to update the pre-loaded record. However I don't know if technically speaking there are any such requirements or if we could get rid of this dependency. Maybe you should try to take a look at it. The "ocsp" word appears very rarely in the code, I think should can track all of the sequence without too much difficulties. Willy There is of course the option of starting with a 'empty' .ocsp file, and then later setting the actual ocsp content over the admin socket. Assuming you do know in advance that you will want to use ocsp.. Regards PiBa-NL
Re: dual check
Op 29-3-2017 om 21:47 schreef Aleksandar Lazic: Hi. Am 29-03-2017 12:52, schrieb Antonio Trujillo Carmona: In a haproxy with ssl-nsi (not terminate ssl). I want to check state of VM witch are under other haproxy, so I need "option httpchk GET /healthcheck" (https://www.mail-archive.com/haproxy@formilux.org/msg24823.html)). or "option httpchk GET /testwebwls/check" (https://www.mail-archive.com/haproxy@formilux.org/msg24829.html) But Ineed to use SSLID to keep Session affinity, so I need "option ssl-hello-chk" Can I use double check?, How do I do it?. Thank. Currently the ssl-hello-chk just check the hello in "binary" format. That's the reason why you don't need a ssl libary to check ssl-hello-chk. I'm on the way to create a full blown sslchk where you can make similar requests like in httpchk. No eta for now but I'm working on it. Am i missing something?.. Or is it already possible: check-ssl http://cbonte.github.io/haproxy-dconv/1.8/snapshot/configuration.html#check-ssl Regards Aleks -- *Antonio Trujillo Carmona* *Técnico de redes y sistemas.* *Subdirección de Tecnologías de la Información y Comunicaciones* Servicio Andaluz de Salud. Consejería de Salud de la Junta de Andalucía _antonio.trujillo.sspa@juntadeandalucia.es_ Tel. +34 670947670 747670) Also i dont get how a healthcheck settings would be relevant to get session affinity? Regards, PiBa-NL
Re: ssl & default_backend
Hi Antonio, Op 3-4-2017 om 13:29 schreef Antonio Trujillo Carmona: It's well documented that Windows XP with Internet Explorer don't support sni, so I try to redirect call through "default_backend", but I got ERROR-404, it work fine with all other combination of OS/surfer. If I (only for test purpose) comment the four line with "ssiiprovincial" (witch mean all the traffic must be redirected through default_backend) it don't work with any OS/surfer. frontend Aplicaciones bind *:443 mode tcp log global tcp-request inspect-delay 5s tcp-request content accept if { req_ssl_hello_type 1 } # Parametros para utilizar SNI (Server Name Indication) acl aplicaciones req_ssl_sni -i aplicaciones.gra.sas.junta-andalucia.es acl citrixsf req_ssl_sni -i ssiiprovincial.gra.sas.junta-andalucia.es acl citrixsf req_ssl_sni -i ssiiprovincial01.gra.sas.junta-andalucia.es acl citrixsf req_ssl_sni -i ssiiprovincial.hvn.sas.junta-andalucia.es acl citrixsf req_ssl_sni -i ssiiprovincial01.hvn.sas.junta-andalucia.es use_backend CitrixSF-SSL if citrixsf use_backend SevidoresWeblogic-12c-Balanceador-SSL There is no acl for the backend above? so probably the default_backend below will never be reached. Could it be the above backend returns the 404 your seeing? default_backend CitrixSF-SSL Regards, PiBa-NL
Re: Need to understand logs
Hi Rajesh, Aleksander, Op 11-9-2017 om 10:32 schreef Rajesh Kolli: Hi Aleksandar, Thank you for clarifying about "Layer 4" checks. I am interested in knowing the values of these %d/%d, %s in line 319. Why it is taking only 1/2, 1/3... values? What they are representing? Have you seen rise & fall in the documentation? http://cbonte.github.io/haproxy-dconv/1.8/snapshot/configuration.html#rise http://cbonte.github.io/haproxy-dconv/1.8/snapshot/configuration.html#5.2-fall Basically it takes by default 3 consecutive failed checks to mark a server down, and 2 passed checks to get it back up. So 1/3 is 1 failed check but server status is still 'up'. Then 2/3 failed check, but still marked up. At 3/3 the server would be marked down, and removed from the backend pool. Then after a while when the webserver is working again the following will happen. After the first successful 1/2 check the server is still marked 'down'. And on the second 2/2 successful check it will be marked 'up' and is added back into the backend server pool to take requests. 319 "chunk_appendf(&trash, ", status: %d/%d %s", 320 (check->health >= check->rise) ? check->health - check->rise + 1 : check->health, 321 (check->health >= check->rise) ? check->fall : check->rise, 322 (check->health >= check->rise) ? (s->uweight ? "UP" : "DRAIN") : "DOWN"); 323 Thanks & Regards Rajesh Kolli -Original Message- From: Aleksandar Lazic [mailto:al-hapr...@none.at] Sent: Sunday, September 10, 2017 9:37 PM To: Rajesh Kolli; haproxy@formilux.org Subject: Re: Need to understand logs Hi Rajesh. Rajesh Kolli wrote on 08.09.2017: Hi Aleksandar, Thank you for your response. Yes, I am using "Log-health-checks" in my configuration and here is my HAProxy version information. Thanks. sorry to say that but for know you can only take a look into the source for documenation. http://git.haproxy.org/?p=haproxy-1.7.git&a=search&h=HEAD&st=grep&s=PR_O2_LO GHCHKS for example. 
http://git.haproxy.org/?p=haproxy-1.7.git;a=blob;f=src/checks.c;hb=640d526f8 cdad00f7f5043b51f6a34f3f6ebb49f#l307 We are open for patches also for documentation to add this part to the docs ;-) To answer your question below I think layer 4 checks are 'only' tcp checks which sometimes are answered by some os when a service is listen on the specific port. This does not means that the App works properly. I'm open for any correction when my assumption is wrong. Regards Aleks [root@DS-11-82-R7-CLST-Node1 ~]# haproxy -vv HA-Proxy version 1.7.8 2017/07/07 Copyright 2000-2017 Willy Tarreau Build options : TARGET = linux2628 CPU = generic CC = gcc CFLAGS = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv OPTIONS = Default settings : maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200 Thanks & Regards Rajesh Kolli -Original Message- From: Aleksandar Lazic [mailto:al-hapr...@none.at] Sent: Thursday, September 07, 2017 10:08 PM To: Rajesh Kolli; haproxy@formilux.org Subject: Re: Need to understand logs Hi Rajesh. Rajesh Kolli wrote on 07.09.2017: Hello, I am using HAProxy community version from a month, i need to understand logs of HAProxy for the i need your help. Here is a sample of my logs: Sep 6 17:03:31 localhost haproxy[19389]: Health check for server Netrovert-sites/DS-11-81-R7-CLST-Node2 succeeded, reason: Layer4 check passed, check duration: 0ms, status: 1/2 DOWN. Sep 6 17:03:33 localhost haproxy[19389]: Health check for server Netrovert-sites/DS-11-81-R7-CLST-Node2 succeeded, reason: Layer4 check passed, check duration: 0ms, status: 3/3 UP. Sep 6 17:03:33 localhost haproxy[19389]: Server Netrovert-sites/DS-11-81-R7-CLST-Node2 is UP. 2 active and 0 backup servers online. 0 sessions requeued, 0 total in queue. Here my doubts are, in first line health check is 1/2 DOWN and 2nd line it is 3/3 UP, in both cases Layer4 check passed. How to understand it? what exactly it is checking? what are these 1/2 & 1/3's? 
Finally, is there any document to understand its logging? There is a logging part in the doc but I haven't seen such entries in the document. http://cbonte.github.io/haproxy-dconv/1.7/configuration.html#8 Maybe you have activated http://cbonte.github.io/haproxy-dconv/1.7/configuration.html#4.2-optio n%20log-health-checks in your config. It would be nice to know which haproxy version you use. haproxy -vv -- Best Regards Aleks https://www.me2digital.com/ -- Best Regards Aleks Regards, PiBa-NL
Re: confusion regarding usage of haproxy for large number of connections
Hi, Op 27-10-2017 om 14:58 schreef kushal bhattacharya: Hi, I am confused regarding the readme text ' This is a development version, so it is expected to break from time to time, to add and remove features without prior notification and it should not be used in production' .Here I am testing for 8000 connections being distributed to three virtual mqtt brokers having same ip address but three different ports.I am getting a maximum threshold of 2000 connections being handled in this setup.Haproxy is listeneing to a port for incoming client connections and distributing it to the 3 mqtt brokers with the configuration file given below defaults mode tcp maxconn 8000 timeout connect 5000s timeout client 5000s timeout server 5000s frontend localnodes bind *:9875 log global log 127.0.0.1:514 <http://127.0.0.1:514> local0 info option tcplog default_backend nodes backend nodes mode tcp balance roundrobin server web01 192.168.0.5:9878 <http://192.168.0.5:9878> maxconn 3000 server web02 192.168.0.5:9877 <http://192.168.0.5:9877> maxconn 3000 server web03 192.168.0.5:9876 <http://192.168.0.5:9876> maxconn 2000 With this configuration can i undergo my setup with 8000 connection load distribution or do i have to undergo some changes here Thanks, Kushal Add a 'maxconn 8000' in 'global' section? Regards, PiBa-NL
HAProxy 1.7.9 FreeBSD 100% CPU usage
Hi List, I've experienced an issue where haproxy 1.7.9 on FreeBSD 11.1p3 / pfSense 2.4.2dev is using 100% CPU. There is very little traffic actually hitting this haproxy instance, but it happened for the second time in a few days now. Actually haproxy had been running for a few weeks at 100% and I didn't notice.. it does keep working, it seems.. Anyhow, I thought I would try and capture the next event if it happened again. It did after a few hours.. After the truss output below, the last line keeps repeating fast, lots and lots of times.

kevent(0,0x0,0,{ },7,{ 1.0 }) = 0 (0x0)
kevent(0,0x0,0,{ },7,{ 1.0 }) = 0 (0x0)
kevent(0,0x0,0,{ },7,{ 1.0 }) = 0 (0x0)
kevent(0,0x0,0,{ 1,EVFILT_READ,EV_EOF,0x0,0x0,0x0 },7,{ 0.99400 }) = 1 (0x1)
recvfrom(1,0x8024ed972,16290,0,NULL,0x0) = 0 (0x0)
kevent(0,{ 1,EVFILT_READ,EV_DELETE,0x0,0x0,0x0 },1,0x0,0,0x0) = 0 (0x0)
kevent(0,0x0,0,{ },7,{ 0.0 }) = 0 (0x0)
kevent(0,0x0,0,{ },7,{ 0.0 }) = 0 (0x0)
kevent(0,0x0,0,{ },7,{ 0.0 }) = 0 (0x0)
kevent(0,0x0,0,{ },7,{ 0.0 }) = 0 (0x0)
kevent(0,0x0,0,{ },7,{ 0.0 }) = 0 (0x0)

I tried to gather all possibly relevant info in the attached file. Not using many special configuration options.. but I am using Lua to serve a small simple static response.. I'm not sure if it's a problem related to Lua, or perhaps there is some other issue.?. I've got tcpdump and complete truss output from before and while it happened after a few hours, but actually just a few requests (+- 29).. I would prefer to send these off-list though; Willy, if you desire, shall I send them to your mail address? But maybe I have overlooked it on the mailing list and it's a known issue already..? The last connection, which I think caused/triggered the issue, is in the screenshot (if it attaches right to the mail..): basically just a GET request which gets an ack, followed by a FIN,ACK packet from the client 30 seconds later, again followed by an ack.. 
The LetsEncrypt backend that is part of the configuration never got a single request according to stats.. Is it a known issue? Are tcpdump/truss output desired ..? (where should i send em?) Is there any other output that can try to gather next time? Regards, PiBa-NL HA-Proxy version 1.7.9 2017/08/18 TARGET = freebsd [2.4.2-DEVELOPMENT][admin@pfsense.local]/root: /usr/local/pkg/haproxy/haproxy_socket.sh show sess all show sess all 0x80242b800: [08/Nov/2017:19:40:18.868158] id=15 proto=tcpv4 source=45.76.a.b:53752 flags=0x48a, conn_retries=0, srv_conn=0x0, pend_pos=0x0 frontend=www (id=3 mode=http), listener=37.97.x.y:80 (id=1) addr=37.97.x.y:80 backend= (id=-1 mode=-) server= (id=-1) task=0x80248f380 (state=0x04 nice=0 calls=4 exp= age=4h23m) txn=0x802421800 flags=0x820 meth=1 status=-1 req.st=MSG_BODY rsp.st=MSG_RPBEFORE waiting=0 si[0]=0x80242ba38 (state=EST flags=0x08 endp0=CONN:0x8024ca480 exp=, et=0x000) si[1]=0x80242ba60 (state=EST flags=0x4010 endp1=APPCTX:0x8024ca600 exp=, et=0x000) co0=0x8024ca480 ctrl=tcpv4 xprt=RAW data=STRM target=LISTENER:0x8024ca300 flags=0x0025b300 fd=1 fd.state=22 fd.cache=0 updt=0 app1=0x8024ca600 st0=0 st1=0 st2=0 applet= req=0x80242b810 (f=0x80c020 an=0x0 pipe=0 tofwd=-1 total=94) an_exp= rex= wex= buf=0x8024ed900 data=0x8024ed914 o=94 p=94 req.next=94 i=0 size=16384 res=0x80242b850 (f=0x8040 an=0xa0 pipe=0 tofwd=0 total=0) an_exp= rex= wex= buf=0x783160 data=0x783174 o=0 p=0 rsp.next=0 i=0 size=0 0x80242ac00: [09/Nov/2017:00:04:24.403636] id=31 proto=unix_stream source=unix:1 flags=0x88, conn_retries=0, srv_conn=0x0, pend_pos=0x0 frontend=GLOBAL (id=0 mode=tcp), listener=? (id=1) addr=unix:1 backend= (id=-1 mode=-) server= (id=-1) task=0x80248f4d0 (state=0x0a nice=-64 calls=1 exp=10s age=?) 
si[0]=0x80242ae38 (state=EST flags=0x08 endp0=CONN:0x8024ca900 exp=, et=0x000) si[1]=0x80242ae60 (state=EST flags=0x4018 endp1=APPCTX:0x8024ca780 exp=, et=0x000) co0=0x8024ca900 ctrl=unix_stream xprt=RAW data=STRM target=LISTENER:0x8024ca000 flags=0x0020b306 fd=2 fd.state=25 fd.cache=0 updt=0 app1=0x8024ca780 st0=7 st1=0 st2=3 applet= req=0x80242ac10 (f=0xc08200 an=0x0 pipe=0 tofwd=-1 total=15) an_exp= rex=10s wex= buf=0x8024e7dc0 data=0x8024e7dd4 o=0 p=0 req.next=0 i=0 size=16384 res=0x80242ac50 (f=0x80008002 an=0x0 pipe=0 tofwd=-1 total=1198) an_exp= rex= wex= buf=0x8025603c0 data=0x8025603d4 o=1198 p=1198 rsp.next=0 i=0 size=16384 FreeBSD pfsense.local 11.1-RELEASE-p3 FreeBSD 11.1-RELEASE-p3 #362 r313908+9cf44ec5484(RELENG_2_4): Fri Nov 3 08:23:14 CDT 2017 [2.4.2-DEVELOPMENT][admin@pfsense.local]/root: haproxy -vv HA-Proxy version 1.7.9 2017/08/18 Copyright 2000-2017 Willy Tarreau Build options : TARGET = freebsd CPU = generic CC = cc CFLAGS = -O2 -pipe -fstack
Re: HAProxy 1.7.9 FreeBSD 100% CPU usage
Hi Willy, Op 9-11-2017 om 5:45 schreef Willy Tarreau: Hi Pieter, On Thu, Nov 09, 2017 at 02:28:46AM +0100, PiBa-NL wrote: Actually haproxy has been running for a few weeks with 100% and I didn't notice.. it does keep working, it seems.. Anyhow, I thought I would try and capture the next event if it happened again. It did after a few hours.. After the truss output below, the last line keeps repeating fast, lots and lots of times. kevent(0,0x0,0,{ },7,{ 1.0 }) = 0 (0x0) kevent(0,0x0,0,{ },7,{ 1.0 }) = 0 (0x0) kevent(0,0x0,0,{ },7,{ 1.0 }) = 0 (0x0) kevent(0,0x0,0,{ 1,EVFILT_READ,EV_EOF,0x0,0x0,0x0 },7,{ 0.99400 }) = 1 (0x1) recvfrom(1,0x8024ed972,16290,0,NULL,0x0) = 0 (0x0) kevent(0,{ 1,EVFILT_READ,EV_DELETE,0x0,0x0,0x0 },1,0x0,0,0x0) = 0 (0x0) kevent(0,0x0,0,{ },7,{ 0.0 }) = 0 (0x0) kevent(0,0x0,0,{ },7,{ 0.0 }) = 0 (0x0) kevent(0,0x0,0,{ },7,{ 0.0 }) = 0 (0x0) kevent(0,0x0,0,{ },7,{ 0.0 }) = 0 (0x0) kevent(0,0x0,0,{ },7,{ 0.0 }) = 0 (0x0) We had something similar on Linux in relation with TCP splicing and the fd cache, for which a fix was emitted. But yesterday Christopher explained to me that the fix has an impact on the way applets are scheduled in 1.8, so actually it could mean that the initial bug might possibly cover a larger scope than splicing only, and that recv+send might also be affected. If you're interested in testing, the commit in 1.7 is c040c1f ("BUG/MAJOR: stream-int: don't re-arm recv if send fails") and is present in the latest snapshot (we really need to emit 1.7.10 BTW). I'd be curious to know if it fixes it or not. At least it will tell us if that's related to this fd cache thing or to something completely different such as Lua. I also need to check with Thierry if we could find a way to add some stats about the time spent in Lua to "show info" to help debug such cases where Lua is involved. By the way, thanks for your dump, we'll check the sessions' statuses. There are not that many, and maybe it will give us a useful indication! 
Cheers, Willy Thanks for your time; I didn't think the 'splice' problem mentioned on the mailing list would be relevant for my case, so I didn't see a need to try the latest snapshot. Couldn't find many other recent CPU issues there. But I'll try and compile the latest haproxy 1.7 snapshot, or perhaps just 1.7.9 with this extra patch, and see if it keeps running with low CPU usage for a few days.. I have not compiled haproxy for a while; I'll see what works easiest for me, assuming I can make the build work on a separate FreeBSD machine and have it packaged/copied to the actual 'problem machine' that doesn't have compilation tools on it.. hopefully my built binary will be 'compatible'.. Will report back in a few days.. Thanks, PiBa-NL / Pieter
Re: HAProxy 1.7.9 FreeBSD 100% CPU usage
Hi Willy, List, Is it correct that when I build a haproxy-ss-20171017 snapshot, the version still shows up as: "HAProxy version 1.7.9, released 2017/08/18" on both haproxy -vv and the stats page? Or did I do it wrong? p.s. I changed the Makefile like this: PORTNAME= haproxy-ss PORTVERSION= 20171017 CATEGORIES= net www MASTER_SITES= http://www.haproxy.org/download/1.7/src/snapshot/ And then ran: make clean build install NO_CHECKSUM=yes Which did 'seem' to download the 'intended?' file.. Thanks, PiBa-NL / Pieter
Re: HAProxy 1.7.9 FreeBSD 100% CPU usage
Hi Willy, Op 9-11-2017 om 5:45 schreef Willy Tarreau: Hi Pieter, We had something similar on Linux in relation with TCP splicing and the fd cache, for which a fix was emitted. But yesterday Christopher explained to me that the fix has an impact on the way applets are scheduled in 1.8, so actually it could mean that the initial bug might possibly cover a larger scope than splicing only, and that recv+send might also be affected. If you're interested in testing, the commit in 1.7 is c040c1f ("BUG/MAJOR: stream-int: don't re-arm recv if send fails") and is present in the latest snapshot (we really need to emit 1.7.10 BTW). I'd be curious to know if it fixes it or not. At least it will tell us if that's related to this fd cache thing or to something completely different such as Lua. I also need to check with Thierry if we could find a way to add some stats about the time spent in Lua to "show info" to help debug such cases where Lua is involved. By the way, thanks for your dump, we'll check the sessions' statuses. There are not that many, and maybe it will give us a useful indication! Cheers, Willy Okay, I have been running with haproxy-ss-20171017 for a day now. So far it sticks to <1% CPU usage. Will report if anything changes; I can't tell for sure as I don't have a clear reproduction of the issue, but my issue 'seems' fixed. Regards, PiBa-NL / Pieter
haproxy-1.8-rc4 - FreeBSD 11.1 - build error: undefined reference, plock.h __unsupported_argument_size_for_pl_try_s__
Hi haproxy-list, I'm trying to build 1.8rc4 on FreeBSD 11.1, but it throws a few errors for me.. src/listener.o: In function `listener_accept': /usr/ports/net/haproxy-devel/work/haproxy-1.8-rc4/src/listener.c:455: undefined reference to `__unsupported_argument_size_for_pl_try_s__' src/signal.o: In function `__signal_process_queue': /usr/ports/net/haproxy-devel/work/haproxy-1.8-rc4/src/signal.c:74: undefined reference to `__unsupported_argument_size_for_pl_try_s__' src/fd.o: In function `fd_process_cached_events': /usr/ports/net/haproxy-devel/work/haproxy-1.8-rc4/src/fd.c:248: undefined reference to `__unsupported_argument_size_for_pl_try_s__' cc: error: linker command failed with exit code 1 (use -v to see invocation) Removing lines 119 & 120 from plock.h makes the build succeed.. But I am not sure what, if anything, gets broken by doing so.. With those lines removed I get the result below from haproxy -vv; looks good :) but I didn't actually start it yet with a proper config. Regards, PiBa-NL / Pieter root@:/usr/ports/net/haproxy-devel # haproxy -vv HA-Proxy version 1.8-rc4-cfe1466 2017/11/19 Copyright 2000-2017 Willy Tarreau Build options : TARGET = freebsd CPU = generic CC = cc CFLAGS = -pipe -g -fstack-protector -fno-strict-aliasing -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -Wno-address-of-packed-member -Wno-null-dereference -Wno-unused-label -DFREEBSD_PORTS OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_CPU_AFFINITY=1 USE_ACCEPT4=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_STATIC_PCRE=1 USE_PCRE_JIT=1 Default settings : maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200 Built with multi-threading support. 
Built with PCRE version : 8.40 2017-01-11 Running on PCRE version : 8.40 2017-01-11 PCRE library supports JIT : yes Encrypted password support via crypt(3): yes Built with zlib version : 1.2.11 Running on zlib version : 1.2.11 Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip") Built with network namespace support. Built with transparent proxy support using: IP_BINDANY IPV6_BINDANY Built with Lua version : Lua 5.3.4 Built with OpenSSL version : OpenSSL 1.0.2k-freebsd 26 Jan 2017 Running on OpenSSL version : OpenSSL 1.0.2k-freebsd 26 Jan 2017 OpenSSL library supports TLS extensions : yes OpenSSL library supports SNI : yes OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2 Available polling systems : kqueue : pref=300, test result OK poll : pref=200, test result OK select : pref=150, test result OK Total: 3 (3 usable), will use kqueue. Available filters : [TRACE] trace [COMP] compression [SPOE] spoe root@:/usr/ports/net/haproxy-devel #
Re: haproxy-1.8-rc4 - FreeBSD 11.1 - build error: undefined reference, plock.h __unsupported_argument_size_for_pl_try_s__
Hi Willy, Op 19-11-2017 om 22:54 schreef Willy Tarreau: On Sun, Nov 19, 2017 at 10:43:21PM +0100, Willy Tarreau wrote: Your workaround has disabled locking, so you'll get random bugs if you enable threads. Could you please send the output of "cc -v"? That should help figure out how this situation is possible. In the mean time you can build with "USE_THREAD=" to disable threads. OK I found what happens. For an unknown reason you have no optimization settings, and apparently by default Clang doesn't even perform constant optimizations (ie: when you write "if (0) do_something()" it still wants to emit the code for "do_something()"). So the check on the argument size still works, but the compiler is unhappy with the error message used to stop the build. Does your build command line force the CPU variable or the CPU_CFLAGS variable? I suspect it's this. By not forcing it, you'll stay on "generic" and it will use "-O2" by default, avoiding this problem. We still need to improve this (possibly by detecting clang and lack of optimization if needed), otherwise the question will come back from time to time. Thanks, Willy Sorry, I should have mentioned how I was trying to build haproxy; I didn't even think my build options would matter here.. Was using this: make clean build reinstall NO_CHECKSUM=yes WITH_DEBUG=yes STRIP= So it would have debug symbols if problems arise.. The "WITH_DEBUG=yes" strips the -O2 away and makes it a -g, and thus 'changes' the way the compiler works. Without this flag it indeed builds without issue. But any future gdb session wouldn't have symbols.. Thanks, PiBa-NL / Pieter p.s. Not sure if it still matters, but for the record my cc -v output: root@:/usr/ports/net/haproxy-devel # cc -v FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0) Target: x86_64-unknown-freebsd11.1 Thread model: posix InstalledDir: /usr/bin
Re: haproxy-1.8-rc4 - FreeBSD 11.1 - master-worker daemon mode does not bind.?.
A little bump.. Wondering if the truss attachments maybe got my mail blocked.. (the mail doesn't show on https://www.mail-archive.com/haproxy@formilux.org/maillist.html ) They were 107KB total in 2 attachments.. Now in 0bin links: https://0bin.net/paste/GIcxOfap-GYPrO7H#OnvGNxx2k41SLEK6VxJk9n-mD7vv/vQe/Pj33VRqdju https://0bin.net/paste/sJ955XNt2hE1a9mF#xsMP2tzydlK3BVpxo2nNRl878SRbxZNAUpRw5-YhwdM Op 20-11-2017 om 1:47 schreef PiBa-NL: Hi List, After compiling haproxy 1.8-rc4 (without modifications) on FreeBSD 11.1 I'm trying to run it with the master-worker option. I can run it with the following config from an ssh console:

global
    #daemon
    master-worker
    nbproc 4
listen HAProxyLocalStats
    bind :2200 name localstats
    mode http
    stats enable
    stats refresh 2
    stats admin if TRUE
    stats uri /
    stats show-desc Test2

It then starts 5 haproxy processes and the stats page works, being served from one of the workers. However if I start it with the 'daemon' option enabled or the -D startup parameter, it starts in the background, also starts 4 workers, but then doesn't respond to browser requests.. Sending a 'kill -1' to the master does start new workers; see output below. Truss output attached from the commands below with a few requests to the stats page.. truss -dfHo /root/haproxy-truss.txt -f haproxy -f /root/hap.conf -D truss -dfHo /root/haproxy-truss.txt -f haproxy -f /root/hap.conf truss shows that 'accept4' isn't called when run in daemon mode.. Am I doing something wrong? Or how can I check this further? 
Regards, PiBa-NL / Pieter root@:/ # haproxy -vv HA-Proxy version 1.8-rc4-cfe1466 2017/11/19 Copyright 2000-2017 Willy Tarreau Build options : TARGET = freebsd CPU = generic CC = cc CFLAGS = -O2 -pipe -fstack-protector -fno-strict-aliasing -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -Wno-address-of-packed-member -Wno-null-dereference -Wno-unused-label -DFREEBSD_PORTS OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_CPU_AFFINITY=1 USE_ACCEPT4=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_STATIC_PCRE=1 USE_PCRE_JIT=1 Default settings : maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200 Built with multi-threading support. Built with PCRE version : 8.40 2017-01-11 Running on PCRE version : 8.40 2017-01-11 PCRE library supports JIT : yes Encrypted password support via crypt(3): yes Built with zlib version : 1.2.11 Running on zlib version : 1.2.11 Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip") Built with network namespace support. Built with transparent proxy support using: IP_BINDANY IPV6_BINDANY Built with Lua version : Lua 5.3.4 Built with OpenSSL version : OpenSSL 1.0.2k-freebsd 26 Jan 2017 Running on OpenSSL version : OpenSSL 1.0.2k-freebsd 26 Jan 2017 OpenSSL library supports TLS extensions : yes OpenSSL library supports SNI : yes OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2 Available polling systems : kqueue : pref=300, test result OK poll : pref=200, test result OK select : pref=150, test result OK Total: 3 (3 usable), will use kqueue. Available filters : [TRACE] trace [COMP] compression [SPOE] spoe root@:/ # haproxy -f /root/hap.conf -D [WARNING] 323/005936 (1604) : config : missing timeouts for proxy 'HAProxyLocalStats'. | While not properly invalid, you will certainly encounter various problems | with such a configuration. To fix this, please ensure that all following | timeouts are set to a non-zero value: 'client', 'connect', 'server'. 
[WARNING] 323/005936 (1604) : Proxy 'HAProxyLocalStats': in multi-process mode, stats will be limited to process assigned to the current request. [WARNING] 323/005936 (1604) : Proxy 'HAProxyLocalStats': stats admin will not work correctly in multi-process mode. root@:/ # kill -1 1605 root@:/ # [WARNING] 323/005936 (1605) : Reexecuting Master process [WARNING] 323/005945 (1605) : config : missing timeouts for proxy 'HAProxyLocalStats'. | While not properly invalid, you will certainly encounter various problems | with such a configuration. To fix this, please ensure that all following | timeouts are set to a non-zero value: 'client', 'connect', 'server'. [WARNING] 323/005945 (1605) : Proxy 'HAProxyLocalStats': in multi-process mode, stats will be limited to process assigned to the current request. [WARNING] 323/005945 (1605) : Proxy 'HAProxyLocalStats': stats admin will not work correctly in multi-process mode. [WARNING] 323/005945 (1605) : Former worker 1607 left with exit code 0 [WARNING] 323/005945 (1605) : Former worker 1606 left with exit code 0 [WARNING] 323/005945 (1605) : Former worker 1608 left with exit code 0 [WARNING] 323/005945 (1605) : Former worker 1609 left with exit code 0
Re: haproxy-1.8-rc4 - FreeBSD 11.1 - master-worker daemon mode does not bind.?.
Hi Willy, Op 20-11-2017 om 21:46 schreef Willy Tarreau: Hi Pieter, On Mon, Nov 20, 2017 at 01:47:48AM +0100, PiBa-NL wrote: Hmmm thinking about it there might be something. Could you start with "-dk" to disable kqueue and fall back to poll? kqueue registers a post-fork function to close and reopen the kqueue fd. I wouldn't be surprised if we're having a problem with it not being placed exactly where needed when running in master-worker mode. Or maybe we need to call it twice when forking into the background and one call is missing somewhere. Thanks! Willy With -dk it starts in the background and serves the stats page as expected. So it seems indeed related to the poller used in combination with master-worker. Regards, PiBa-NL
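Since -dk avoids the problem, the same workaround can also be kept in the configuration itself until a fix is released; `nokqueue` in the global section is the config-file equivalent of starting with -dk (a sketch based on the test config quoted earlier in the thread):

```
global
    master-worker
    nbproc 4
    nokqueue    # skip the kqueue poller and fall back to poll, like -dk
```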
Re: haproxy-1.8-rc4 - FreeBSD 11.1 - master-worker daemon mode does not bind.?.
Hi Willy, Op 20-11-2017 om 22:08 schreef Willy Tarreau: OK thank you. I suspect something wrong happens, such as the master killing the same kevent_fd as the other ones are using or something like this. Could you please try the attached patch just in case it fixes anything? I have not tested it and it may even break epoll, but one thing at a time :-) Thanks, Willy Your patch fixes the issue. If you've got a definitive patch let me know. Thanks, PiBa-NL
haproxy-1.8-rc4 - FreeBSD 11.1 - master-worker daemon parent staying alive/process-owner
Hi List, I've got a startup script that essentially looks like the one below #1# (simplified..). When configured with master-worker, the first parent process 2926 as seen in #2# keeps running. Doing the same without master-worker, the daemon properly detaches and the parent exits, returning possible warnings/errors.. When the second php exec line in #1# with "> /dev/null" is used instead, it does succeed. While it's running, the stats page does get served by the workers.. To avoid a possible issue with pollers (see my previous mail thread) I've tried to add the -dk, but still the first started parent process stays alive.. And if terminated with a ctrl+c it stops the other master-worker processes with it.. as can be seen in #3# (which was from a different attempt, so different process IDs). 'truss' output (again with different pids..): https://0bin.net/paste/f2p8uRU1t2ebZjkL#iJOBdPnR8mCmRrtGGkEaqsmQXfbHmQ56vQHdseh1x8U If desired I can gather the htop/truss/console output information from a single run.. Any other info I can provide? Or should I change my script to not expect any console output from haproxy? In my original script the 'exec' is called with 2 extra parameters that return the console output and exit status.. p.s. how should configuration/startup errors be 'handled' when using master-worker? A kill -1 itself won't tell if a newly configured bind cannot find the interface address to bind to, and a -c beforehand won't find such a problem. The end result (nothing running) and the error causing it should however be 'caught' somehow for logging.?. Should haproxy itself log it to syslog? But how will the startup script know to notify the user of a failure? Would it be possible when starting haproxy with -sf that it would tell if the (original?) master was successful in reloading the config / starting new workers, or how should this be done? Currently a whole new set of master-worker processes seems to take over.. Or am I taking the wrong approach here? 
Regards, PiBa-NL / Pieter

#1# Startup script (simplified..) haproxy.sh:

#!/bin/sh
echo "Starting haproxy."
/usr/local/bin/php -q <<ENDOFF
<?php
exec("/usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg -D -dk > /dev/null");
?>
ENDOFF
echo "Started haproxy..."

#2# process list:

PID PPID PGRP SESN TPGID NLWP USER PRI NI VIRT RES S CPU% MEM% TIME+ Command
9203 1 9203 9203 0 1 root 20 0 53492 4492 S 0.0 0.4 0:00.02 `- /usr/sbin/sshd
99097 9203 99097 99097 0 1 root 20 0 78840 7608 S 0.0 0.8 0:01.04 | `- sshd: root@pts/0
99900 99097 99900 99900 2651 1 root 24 0 13084 2808 S 0.0 0.3 0:00.01 | | `- -sh
161 99900 161 99900 2651 1 root 52 0 13084 2688 S 0.0 0.3 0:00.00 | | `- /bin/sh /etc/rc.initial
3486 161 3486 99900 2651 1 root 20 0 13392 3696 S 0.0 0.4 0:00.19 | | `- /bin/tcsh
2651 3486 2651 99900 2651 1 root 21 0 13084 2660 S 0.0 0.3 0:00.00 | | `- /bin/sh /usr/local/etc/rc.d/haproxy.sh start
2801 2651 2651 99900 2651 1 root 27 0 232M 19500 S 0.0 2.0 0:00.07 | | `- /usr/local/bin/php -q
2926 2801 2651 99900 2651 1 root 29 0 0 0 Z 0.0 0.0 0:00.01 | | `- /usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg -D -dk
3061 1 2651 99900 2651 1 root 31 0 28288 7420 S 0.0 0.7 0:00.00 `- /usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg -D -dk
3524 3061 3524 3524 0 1 root 20 0 28288 7436 S 0.0 0.7 0:00.04 | `- /usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg -D -dk
3432 3061 3432 3432 0 1 root 20 0 28288 7436 S 0.0 0.7 0:00.04 | `- /usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg -D -dk
3276 3061 3276 3276 0 1 root 20 0 28288 7436 S 0.0 0.7 0:00.04 | `- /usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg -D -dk
3103 3061 3103 3103 0 1 root 20 0 28288 7436 S 0.0 0.7 0:00.04 | `- /usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg -D -dk

#3# starting script from ssh and terminating it with Ctrl+C:
[2.4.3-DEVELOPMENT][root@pfSe.localdomain]/root: /usr/local/etc/rc.d/haproxy.sh
Starting haproxy. 
[WARNING] 324/010345 (94381) : config : missing timeouts for proxy 'HAProxyLocalStats'. | While not properly invalid, you will certainly encounter various problems | with such a configuration. To fix this, please ensure that all following | timeouts are set to a non-zero value: 'client', 'connect', 'server'. [WARNING] 324/010345 (94381) : Proxy 'HAProxyLocalStats': in multi-process mode, stats will be limited to process assigned to the curren
Re: 4xx statistics made useless through health checks?
Hi Daniel, Op 21-11-2017 om 14:20 schreef Daniel Schneller: On 21. Nov. 2017, at 14:08, Lukas Tribus wrote: [...] Instead of hiding specific error counters, why not send an actual HTTP request that triggers a 200 OK response? So health checking is not exempt from the statistics and only generates error statistics when actual errors occur? Good point. I wanted to avoid, however, having these “high level” health checks from the many many sidecars being routed through to the actual backends. Instead, I considered it enough to “only” check if the central haproxy is available. In case it is, the sidecars rely on it doing the actual health checks of the backends and responding with 503 or similar when all backends for a particular request happen to be down. Maybe monitor-uri, perhaps together with 'monitor fail', could help?: http://cbonte.github.io/haproxy-dconv/1.8/snapshot/configuration.html#4.2-monitor-uri It says it won't log or forward the request.. not sure, but maybe the stats will also skip it. However, your idea and a little more Googling led me to this Github repo https://github.com/jvehent/haproxy-aws#healthchecks-between-elb-and-haproxy where they configure a dedicated “health check frontend” (albeit in their case to work around an AWS/ELB limitation re/ PROXY protocol). I think I will adapt this and configure the sidecars to health check on a dedicated port like this. I’ll let you know how it goes. Thanks a lot for your thoughts so far :) Daniel Regards, PiBa-NL / Pieter
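A hedged sketch of the monitor-uri idea: a dedicated health-check frontend that answers the sidecars directly, and deliberately fails the check once the real backend has no usable servers left. The frontend name, port, URI and backend name below are illustrative, not taken from Daniel's setup:

```
frontend sidecar_health
    bind 127.0.0.1:8888
    mode http
    # true when fewer than 1 server is UP in the real backend
    acl app_backend_dead nbsrv(app_backend) lt 1
    # requests for /healthz are answered directly (200 OK, or 503 when the
    # condition holds) and are neither logged nor forwarded to a backend
    monitor-uri /healthz
    monitor fail if app_backend_dead
```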
Re: haproxy-1.8-rc4 - FreeBSD 11.1 - master-worker daemon parent staying alive/process-owner
Hi William, I was intending to use the new feature to pass open sockets to the next haproxy process, and thought that master-worker is a 'requirement' to make that work, as it would manage the transfer of the sockets. Now I'm thinking that's not actually how it works at all.. I could 'manually' pass the -x /haproxy.socket to the next process and make it take over the sockets that way, I guess? (How does this combine with nbproc>1 and multiple stats sockets bound to separate processes?) Though I can imagine a future where the master would maybe provide some aggregated stats and a management socket to perform server status changes. Perhaps I should step away from using master-worker for the moment. However the -W -D combination doesn't seem to (fully) work as I expected; responses below.. Op 21-11-2017 om 2:59 schreef William Lallemand: the master-worker was designed in a way to replace the systemd-wrapper, and the systemd way to run a daemon is to keep it in the foreground and pipe it to systemd so it can catch the errors on the standard output. However, it was also designed for normal people who want to daemonize, so you can combine -W with -D which will daemonize the master. I'm not sure of getting the issue there, the errors are still displayed upon startup like in any other haproxy mode, there is really no change here. I assume your only problem with your script is the daemonization, which you can achieve by combining -W and -D. I would prefer to both 'catch' startup errors and daemonize haproxy. In my previous mail I'm starting it with -D, and the -W is the equivalent of the global master-worker option in the config, so it 'should' daemonize, right? But it did not (properly?); I've just tried with both startup parameters -D -W and the result is the same. The master with pid 3061 is running under the system /sbin/init pid 1, however the pid 2926 also keeps running; I would want/expect 2926 to exit when startup is complete. 
I just also noted that the 2926 actually becomes a 'zombie'.?. That can't be good, right? A kill -1 itself won't tell if a newly configured bind cannot find the interface address to bind to, and a -c beforehand won't find such a problem. Upon a reload (SIGUSR2 on the master) the master will try to parse the configuration again and start the listeners. If it fails, the master will reexec itself in a wait() mode, and won't kill the previous workers; the parsing/bind error should be displayed on the standard output of the master. I think I saw it exit but cannot reproduce it anymore with the scenario of a wrong IP in the bind.. I might have issued a wrong signal there when I tried (a USR1 instead of a USR2 or something). It seems to work properly the way you describe.. (when properly daemonized..) Sorry for the noise on this part.. The end result that nothing is running and the error causing that should however be 'caught' somehow for logging.?. Should haproxy itself log it to syslog? But how will the startup script know to notify the user of a failure? Well, the master doesn't do syslog, because there might be no syslog in your configuration. I think you should try the systemd way and log the standard output. I don't want to use systemd, but I do want to log standard output, at least during initial startup.. Would it be possible when starting haproxy with -sf that it would tell if the (original?) master was successful in reloading the config / starting new workers, or how should this be done? That may be badly documented but you are not supposed to use -sf with the master-worker; you just have to send the -USR2 signal to the master and it will parse the configuration again, launch new workers and kill the previous ones smoothly. Unfortunately signals are asynchronous, and we don't have a way yet to return a bad exit code upon reload. But we might implement a synchronous configuration notification in the future, using the admin socket for example. 
Being able to signal the master to reload over an admin socket and getting 'feedback' about its results would likely also solve my 'reload feedback' problem. Let's consider that a feature request :). Though maybe I shouldn't be using master-worker at all for the moment.. Currently a whole new set of master-worker processes seems to take over.. Well, I supposed that's because you launched a new master-worker with -sf; it's not supposed to be used that way, but it should work too if you don't mind having a new PID. I kind of expected this to indeed be 'as intended': -sf will fully replace the old processes. Thanks for your reply. Regards, PiBa-NL / Pieter
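For the socket-passing approach without master-worker, the pieces would roughly be a stats socket that exposes the listening FDs plus a new process started with -x and -sf. A sketch assuming HAProxy 1.8 semantics (paths are illustrative; with nbproc > 1 and per-process stats sockets, the question of which socket to pass remains open, as raised above):

```
global
    # "expose-fd listeners" lets another haproxy process retrieve the
    # bound listening sockets over this socket (new in 1.8)
    stats socket /var/run/haproxy.socket mode 600 level admin expose-fd listeners

# A reload would then look something like:
#   haproxy -f /etc/haproxy.cfg -D -sf $(cat /var/run/haproxy.pid) \
#       -x /var/run/haproxy.socket
```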
Re: haproxy-1.8-rc4 - FreeBSD 11.1 - master-worker daemon parent staying alive/process-owner
Hi William, I'm not 100% sure, but I think the stdout and stderr handles should be closed before process exit? It seems to me they are not. At least with the following PHP script it fails to 'read' where the output from haproxy ends, and it keeps waiting. Without the -W it succeeds. Could you check? Regards, PiBa-NL

#!/usr/local/bin/php-cgi -f
<?php
$descriptorspec = array(
    0 => array("pipe", "r"), // stdin is a pipe that the child will read from
    1 => array("pipe", "w"), // stdout is a pipe that the child will write to
    2 => array("pipe", "w")  // stderr is a pipe that the child will write to
);
$cwd = '/root';
$env = array();
$process = proc_open('haproxy -f hap.conf -W -D -dk', $descriptorspec, $pipes, $cwd, $env);
echo "\n START\n";
echo "\n procstatus\n";
print_r(proc_get_status($process));
if (is_resource($process)) {
    echo "\n ERROUT\n";
    while (false !== ($char = fgetc($pipes[2]))) {
        echo "$char";
    }
    echo "\n STDOUT\n";
    while (false !== ($char = fgetc($pipes[1]))) {
        echo "$char";
    }
    echo "\n DONE reading..";
    fclose($pipes[0]);
    fclose($pipes[1]);
    fclose($pipes[2]);
    $return_value = proc_close($process);
    echo "command returned $return_value\n";
} else {
    echo 'FAIL';
}

Op 21-11-2017 om 16:34 schreef PiBa-NL: Hi William, I was intending to use the new feature to pass open sockets to the next haproxy process, and thought that master-worker is a 'requirement' to make that work, as it would manage the transfer of the sockets. Now I'm thinking that's not actually how it works at all.. I could 'manually' pass the -x /haproxy.socket to the next process and make it take over the sockets that way, I guess? (How does this combine with nbproc>1 and multiple stats sockets bound to separate processes?) Though I can imagine a future where the master would maybe provide some aggregated stats and a management socket to perform server status changes. Perhaps I should step away from using master-worker for the moment. However the -W -D combination doesn't seem to (fully) work as I expected; responses below.. 
Op 21-11-2017 om 2:59 schreef William Lallemand:

The master-worker was designed as a replacement for the systemd-wrapper, and the systemd way to run a daemon is to keep it in the foreground and pipe it to systemd so it can catch the errors on the standard output. However, it was also designed for normal people who want to daemonize, so you can combine -W with -D, which will daemonize the master. I'm not sure I get the issue there; the errors are still displayed upon startup like in any other haproxy mode, there is really no change here. I assume your only problem with your script is the daemonization, which you can achieve by combining -W and -D.

I would prefer to both 'catch' startup errors and daemonize haproxy. In my previous mail I'm starting it with -D, and the -W is the equivalent of the global master-worker option in the config, so it 'should' daemonize, right? But it did not (properly?); I've just tried with both startup parameters -D -W and the result is the same. The master with pid 3061 is running under the system /sbin/init (pid 1), however pid 2926 also keeps running; I would want/expect 2926 to exit when startup is complete. I just also noted that 2926 actually becomes a 'zombie'?! That can't be good, right? A kill -1 by itself won't tell if a newly configured bind cannot find the interface address to bind to, and a -c beforehand won't find such a problem.

Upon a reload (SIGUSR2 on the master) the master will try to parse the configuration again and start the listeners. If it fails, the master will reexec itself in a wait() mode and won't kill the previous workers; the parsing/bind error should be displayed on the standard output of the master.

I think I saw it exit, but I cannot reproduce it anymore with the scenario of a wrong IP in the bind. I might have issued a wrong signal there when I tried (a USR1 instead of a USR2 or something). It seems to work properly the way you describe (when properly daemonized). Sorry for the noise on this part.
The end result, that nothing is running, and the error causing it, should however be 'caught' somehow for logging? Should haproxy itself log it to syslog? But how will the startup script know to notify the user of a failure? Well, the master doesn't do syslog, because there might be no syslog in your configuration. I think you should try the systemd way and log the standard output. I don't want to use systemd, but I do want to log standard output, at least during initial startup.. Would it be possible when starting haproxy with -sf it would tell if the (or
Re: haproxy-1.8-rc4 - FreeBSD 11.1 - master-worker daemon parent staying alive/process-owner
Hi William, I found the 'crash?' I was talking about earlier again. Start haproxy like this: haproxy -f /root/hap.conf -W -D -dk -q Then issue a USR2 to the master. (The first parent/zombie is already gone, so that's good imho.) It will temporarily start new workers, and then immediately everything stops running. Anyhow, looking forward to your replies. Regards, PiBa-NL

Op 22-11-2017 om 17:48 schreef PiBa-NL: [earlier message quoted in full; snipped]
Re: haproxy-1.8-rc4 - FreeBSD 11.1 - master-worker daemon mode does not bind.?.
Hi Willy, Op 25-11-2017 om 8:08 schreef Willy Tarreau:

Hi Pieter, we found that it's the first fork_poller() placed after the fork which fixed the issue, but that makes absolutely no sense, since the next thing done is exit(), which also results in closing the kqueue fd! So unless there's something in at_exit() having an effect there, it really doesn't make much sense and could imply something more subtle. Thus my question is: do you have a real use case of master-worker+daemon on FreeBSD, or can we release like this and try to fix it after the release? It's about the last rock in the shoe we have. Thanks, Willy

Personally I do not need master-worker at the present time. But running it with the 'quiet' configuration option or the -q startup parameter is something that imho does need to be checked. It seems to always die when doing so after sending a USR2, also without using daemon mode (I didn't quite specify that fully in the other mail/thread).

-- Some background info --
My use case mainly revolves around running haproxy on pfSense (a FreeBSD firewall distribution); I'm using and maintaining the haproxy 'package' for it. All services there are 'managed' by PHP and shell scripts. When users modify the configuration in the webgui and press 'apply', I need to restart haproxy and show any warnings/alerts that might have been returned. This has worked and still works fine without requiring master-worker. So no problem or missing advantages for myself or users of that package if haproxy 1.8 gets released as it currently is. I'm not sure if other folks are using some service management tool like systemd on FreeBSD... but I guess time will tell; I'm sure 1.8 won't be the last release ever, so fixes will come when required :). Regards, PiBa-NL / Pieter
Re: haproxy-1.8-rc4 - FreeBSD 11.1 - master-worker daemon parent staying alive/process-owner
Hi Willy, Op 25-11-2017 om 8:33 schreef Willy Tarreau: Hi Pieter, On Tue, Nov 21, 2017 at 04:34:16PM +0100, PiBa-NL wrote: Hi William, I was intending to use the new feature to pass open sockets to the next haproxy process, and thought that master-worker is a 'requirement' to make that work, as it would manage the transferral of the sockets. Now I'm thinking that's not actually how it works at all. I could 'manually' pass the -x /haproxy.socket to the next process and make it take over the sockets that way, I guess?

Yes, that's the intent indeed. Master-worker and -x were developed in parallel and then master-worker was taught to be compatible with this, but the primary purpose of -x is to pass FDs without needing MW.

Great, I suppose I'll need to make a few (small) changes to implement this in the package I maintain for pfSense; probably easier than changing it to use master-worker anyhow :).

(How does this combine with nbproc>1 and multiple stats sockets bound to separate processes?)

There's a special case for this. Normally, as you know, listening FDs not used in a process are closed after the fork(). Now by simply setting "expose-fd listeners" on your stats socket, the process running the stats socket will keep *all* listening FDs open in order to pass them upon invocation from the CLI. Thus, even with nbproc>1, sockets split across different processes and a single stats socket, -x will retrieve all listeners at once.

Okay, this will work well then :). I was thinking that if I'm going to do it myself (pass the -x argument), I need to make sure I do it properly.

Though I can imagine a future where the master would maybe provide some aggregated stats and management socket to perform server status changes. Perhaps I should step away from using master-worker for the moment.

I think you don't need it for now, and you're right that we'd all like it to continue to evolve.
Simon had done an amazing amount of work on this subject in the past, making a socket server and things like that, but by then the internal architecture was not ready and we faced many issues, so we had to drop it. But despite this there was already a huge motivation to try to get this to work. This was during 1.5-dev5! Since then, microservices have emerged with the need for more frequent updates, the need to aggregate information has increased, etc. So yes, I think that the reasons that motivated us to try this 7 years ago are still present and have been emphasized over time. Maybe in a few years the master-worker mode will be the only one supported, if it provides extra facilities such as being a CLI gateway for all processes or collecting stats. Let's just not rush, and use it for what it is for now: a replacement for the systemd-wrapper.

Ok, clear, and thanks for the history involved; I'm not using the systemd-wrapper, so no need for me to use its replacement. I just thought it looked fancy to use and maybe 'future proof', though it is too early to really tell. No more 'restarting' of processes but just sending a 'reload' request did seem like a better design (though in the background the same restarting of processes still happens). With the added (wrongful) thought that it was required for socket transferral, I figured: let's give it a try :).

However, the -W -D combination doesn't seem to (fully) work as I expected; responses below.

As mentioned in the other thread, there's an issue with this and kqueue that I have no idea about. I'm suspecting an at_exit() doing nasty stuff somewhere, and some head-scratching will be needed (I hate the principle of at_exit() as it cheats on the stuff you believe when reading it).

Ok, thanks for looking into it. No need to rush, as I can work with rc4 as it is, at least on my test machine.

I would prefer to both 'catch' startup errors and daemonize haproxy.
In my previous mail I'm starting it with -D, and the -W is the equivalent of the global master-worker option in the config, so it 'should' daemonize, right? But it did not (properly?); I've just tried with both startup parameters -D -W and the result is the same. The master with pid 3061 is running under the system /sbin/init (pid 1), however pid 2926 also keeps running; I would want/expect 2926 to exit when startup is complete. I just also noted that 2926 actually becomes a 'zombie'?! That can't be good, right?

It's *possible* that this process either still had a connection and couldn't quit, or that a bug made it believe it still had a connection. Given that you had a very strange behaviour with -D -W, let's consider there's an unknown issue there for now and that it could explain a lot of strange behaviours.

No 'connections' to the process; IIRC it's an isolated test environment, and it seems related to the stdout/stderr output.. I did have one of these
haproxy-1.8.0, sending a email-alert causes 100% cpu usage, FreeBSD 11.1
Hi List, I thought I 'reasonably' tested some of 1.8.0's options. Today I put it into 'production' on my secondary cluster node and noticed it takes 100% CPU... I guess I should have tried such a thing last week. My regular config with 10 frontends and a total of 13 servers seems to start up fine when 'email-alert level' is set to 'emerg'; it doesn't need to send a mail then. Anyhow, below some gdb and console output. The config that reproduces it is pretty simple, no new features used or anything. Though the server is 'down', so it is trying to send a mail for that... which never seems to happen; no mail is received. I tried using nokqueue and nopoll, but that did not result in any improvement. Anything else I can provide? Regards, PiBa-NL / Pieter

haproxy -f /root/hap.conf -V
[WARNING] 330/204605 (14771) : config : missing timeouts for frontend 'TestMailFront'.
   | While not properly invalid, you will certainly encounter various problems
   | with such a configuration. To fix this, please ensure that all following
   | timeouts are set to a non-zero value: 'client', 'connect', 'server'.
[WARNING] 330/204605 (14771) : config : missing timeouts for backend 'TestMailBack'.
   | While not properly invalid, you will certainly encounter various problems
   | with such a configuration. To fix this, please ensure that all following
   | timeouts are set to a non-zero value: 'client', 'connect', 'server'.
Note: setting global.maxconn to 2000.
Available polling systems :
      kqueue : pref=300,  test result OK
        poll : pref=200,  test result OK
      select : pref=150,  test result FAILED
Total: 3 (2 usable), will use kqueue.
Available filters :
        [TRACE] trace
        [COMP] compression
        [SPOE] spoe
Using kqueue() as the polling mechanism.
[WARNING] 330/204608 (14771) : Server TestMailBack/TestServer is DOWN, reason: Layer4 timeout, check duration: 2009ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 330/204608 (14771) : backend 'TestMailBack' has no server available!
Complete configuration that reproduces the issue:

mailers globalmailers
  mailer ex01 192.168.0.40:25

frontend TestMailFront
  bind :88
  default_backend TestMailBack

backend TestMailBack
  server TestServer 192.168.0.250:80 check
  email-alert mailers globalmailers
  email-alert level info
  email-alert from haproxy@me.local
  email-alert to m...@me.tld
  email-alert myhostname pfs

root@:~ # haproxy -vv
HA-Proxy version 1.8.0 2017/11/26
Copyright 2000-2017 Willy Tarreau

Build options :
  TARGET  = freebsd
  CPU     = generic
  CC      = cc
  CFLAGS  = -pipe -g -fstack-protector -fno-strict-aliasing -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -Wno-address-of-packed-member -Wno-null-dereference -Wno-unused-label -DFREEBSD_PORTS
  OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_CPU_AFFINITY=1 USE_ACCEPT4=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_STATIC_PCRE=1 USE_PCRE_JIT=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with network namespace support.
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with PCRE version : 8.40 2017-01-11
Running on PCRE version : 8.40 2017-01-11
PCRE library supports JIT : yes
Built with multi-threading support.
Encrypted password support via crypt(3): yes
Built with transparent proxy support using: IP_BINDANY IPV6_BINDANY
Built with Lua version : Lua 5.3.4
Built with OpenSSL version : OpenSSL 1.0.2k-freebsd 26 Jan 2017
Running on OpenSSL version : OpenSSL 1.0.2k-freebsd 26 Jan 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
Available polling systems :
      kqueue : pref=300,  test result OK
        poll : pref=200,  test result OK
      select : pref=150,  test result OK
Total: 3 (3 usable), will use kqueue.
Available filters :
        [TRACE] trace
        [COMP] compression
        [SPOE] spoe

root@:~ # /usr/local/bin/gdb --pid 14771
GNU gdb (GDB) 8.0.1 [GDB v8.0.1 for FreeBSD]
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd11.1".
Type "show configuration" for configuration details.
For bug reporting instruc
Re: haproxy-1.8.0, sending a email-alert causes 100% cpu usage, FreeBSD 11.1
Hi Christopher / Willy, On Tue, Nov 28, 2017 at 10:28:20AM +0100, Christopher Faulet wrote: Here is a patch that should fix the deadlock. Could you confirm it fixes your bug? Fix confirmed. Thanks, PiBa-NL / Pieter
[PATCH] BUG/MINOR: Check if master-worker pipe getenv succeeded, also allow pipe fd 0 as valid.
Hi List, Willy / William, here is a patch I came up with that might make it a little 'safer' with regard to getenv and its return value, or the possible lack thereof. I'm not sure if it will ever happen, but if it does, it won't fail on a null pointer or an empty-string conversion to a long value. Though an arithmetic conversion error could still happen if the value is present but not a number... but that would be a really odd case. There are a few things I'm not sure about though:
- What would/could possibly break if the mworker_pipe values are left as -1 and the process continues and tries to use them?
- Won't the rd/wr char* values leak memory?
Anyhow, the biggest part of the bug that should be noticed is the sometimes wrongful alert when the fd is actually '0'. If anything needs to be changed, let me know. Regards, PiBa-NL / Pieter

From 486d7c759af03f9193ae3e38005d8325ab069b37 Mon Sep 17 00:00:00 2001
From: PiBa-NL
Date: Tue, 28 Nov 2017 23:22:14 +0100
Subject: [PATCH] BUG/MINOR: Check if master-worker pipe getenv succeeded, also allow pipe fd 0 as valid.

On FreeBSD in quiet mode stdin/stdout/stderr are closed, which lets the mworker_pipe use fd 0 and fd 1.
---
 src/haproxy.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/haproxy.c b/src/haproxy.c
index 891a021..c3c8281 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -2688,9 +2688,15 @@ int main(int argc, char **argv)
 			free(msg);
 		}
 	} else {
-		mworker_pipe[0] = atol(getenv("HAPROXY_MWORKER_PIPE_RD"));
-		mworker_pipe[1] = atol(getenv("HAPROXY_MWORKER_PIPE_WR"));
-		if (mworker_pipe[0] <= 0 || mworker_pipe[1] <= 0) {
+		mworker_pipe[0] = -1;
+		mworker_pipe[1] = -1;
+		char* rd = getenv("HAPROXY_MWORKER_PIPE_RD");
+		char* wr = getenv("HAPROXY_MWORKER_PIPE_WR");
+		if (rd && wr && strlen(rd) > 0 && strlen(wr) > 0) {
+			mworker_pipe[0] = atol(rd);
+			mworker_pipe[1] = atol(wr);
+		}
+		if (mworker_pipe[0] < 0 || mworker_pipe[1] < 0) {
 			ha_warning("[%s.main()] Cannot get master pipe FDs.\n", argv[0]);
 		}
 	}
-- 
2.10.1.windows.1
[PATCH] BUG/MINOR: when master-worker is in daemon mode, detach from tty
Hi List, I made a patch that makes the master-worker detach from the tty when it is also combined with daemon mode. This allows a script to start haproxy in daemon mode; stdout is closed so the calling process knows when to stop reading from it, and the master can properly daemonize. This is intended to solve my previously reported 'issue': https://www.mail-archive.com/haproxy@formilux.org/msg27963.html Let me know if something about it needs fixing. Thanks, PiBa-NL / Pieter

From 06224a3fcf7b39bf1bf0128a5bac3d0209bc2aab Mon Sep 17 00:00:00 2001
From: PiBa-NL
Date: Tue, 28 Nov 2017 23:26:08 +0100
Subject: [PATCH] BUG/MINOR: when master-worker is in daemon mode, detach from tty

This allows a calling script to show the first startup output and know when to stop reading from stdout so haproxy can daemonize.
---
 src/haproxy.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/haproxy.c b/src/haproxy.c
index c3c8281..a811577 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -2648,6 +2648,13 @@ int main(int argc, char **argv)
 	}
 
 	if (global.mode & (MODE_DAEMON | MODE_MWORKER)) {
+		if ((!(global.mode & MODE_QUIET) || (global.mode & MODE_VERBOSE)) &&
+		    ((global.mode & (MODE_DAEMON | MODE_MWORKER)) == (MODE_DAEMON | MODE_MWORKER))) {
+			/* detach from the tty, this is required to properly daemonize. */
+			fclose(stdin); fclose(stdout); fclose(stderr);
+			global.mode &= ~MODE_VERBOSE;
+			global.mode |= MODE_QUIET; /* ensure that we won't say anything from now */
+		}
 		struct proxy *px;
 		struct peers *curpeers;
 		int ret = 0;
-- 
2.10.1.windows.1
Re: [PATCH] BUG/MINOR: Check if master-worker pipe getenv succeeded, also allow pipe fd 0 as valid.
Hi William, Op 29-11-2017 om 1:15 schreef William Lallemand:

Hi Pieter, getenv returning NULL should never happen, but the test is wrong; it should have been a strtol with an errno check instead of an atol. However, that's overkill in this case; we just need to check the return value of getenv().

I've changed it from 'atol' to 'atoi' as you mentioned below (that was intentional, right?), and kept the null check.

There are a few things I'm not sure about though: What would/could possibly break if the mworker_pipe values are left as -1 and the process continues and tries to use them?

That does not seem to be a good idea, because we try to register the fd in the poller after that. We don't need to do this; it's better to quit the master-worker if something this simple failed, because we can't trust the process anymore.

It was using ha_warning before, so I kept it the same in the first patch. Now changed to ha_alert+exit(); I agree this looks safer if for some reason it ever happens.

- if (mworker_pipe[0] <= 0 || mworker_pipe[1] <= 0) {
+ mworker_pipe[0] = -1;
+ mworker_pipe[1] = -1;

We don't need to init to -1.

Ok.

In my opinion we can simplify by doing this, which is more secure, and assures us that it won't start with this kind of problem:

if (!rd || !wr) {
	ha_alert("[%s.main()] Cannot get master pipe FDs.\n", argv[0]);
	exit(1);
}
mworker_pipe[0] = atoi(rd);
mworker_pipe[1] = atoi(wr);

(I added "atexit_flag = 0;" before the exit(1) here.)

And we can do the same thing with the pipe return value:

ret = pipe(mworker_pipe);
if (ret < 0) {
	ha_alert("[%s.main()] Cannot create master pipe.\n", argv[0]);
	exit(1);
}

This code will guarantee that the whole master-worker quits if there is a problem.

Thanks for the review; a new patch is attached that basically incorporates all your comments.
Regards, PiBa-NL / Pieter

From 19e5f7f9edb5439288789d8b19770454d2ae834f Mon Sep 17 00:00:00 2001
From: PiBa-NL
Date: Tue, 28 Nov 2017 23:22:14 +0100
Subject: [PATCH] BUG/MINOR: Check if master-worker pipe getenv succeeded, also allow pipe fd 0 as valid.

On FreeBSD in quiet mode the stdin/stdout/stderr are closed, which lets the mworker_pipe use fd 0 and fd 1. Additionally exit() upon failure to create or get the master-worker pipe.
---
 src/haproxy.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/haproxy.c b/src/haproxy.c
index 891a021..0c87cb7 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -2679,7 +2679,8 @@ int main(int argc, char **argv)
 		/* master pipe to ensure the master is still alive */
 		ret = pipe(mworker_pipe);
 		if (ret < 0) {
-			ha_warning("[%s.main()] Cannot create master pipe.\n", argv[0]);
+			ha_alert("[%s.main()] Cannot create master pipe.\n", argv[0]);
+			exit(EXIT_FAILURE);
 		} else {
 			memprintf(&msg, "%d", mworker_pipe[0]);
 			setenv("HAPROXY_MWORKER_PIPE_RD", msg, 1);
@@ -2688,11 +2689,15 @@ int main(int argc, char **argv)
 			free(msg);
 		}
 	} else {
-		mworker_pipe[0] = atol(getenv("HAPROXY_MWORKER_PIPE_RD"));
-		mworker_pipe[1] = atol(getenv("HAPROXY_MWORKER_PIPE_WR"));
-		if (mworker_pipe[0] <= 0 || mworker_pipe[1] <= 0) {
-			ha_warning("[%s.main()] Cannot get master pipe FDs.\n", argv[0]);
+		char* rd = getenv("HAPROXY_MWORKER_PIPE_RD");
+		char* wr = getenv("HAPROXY_MWORKER_PIPE_WR");
+		if (!rd || !wr) {
+			ha_alert("[%s.main()] Cannot get master pipe FDs.\n", argv[0]);
+			atexit_flag = 0; // don't reexecute master process
+			exit(EXIT_FAILURE);
 		}
+		mworker_pipe[0] = atoi(rd);
+		mworker_pipe[1] = atoi(wr);
 	}
 }
-- 
2.10.1.windows.1
Re: [PATCH] BUG/MINOR: when master-worker is in daemon mode, detach from tty
Hi William, When you have time, please take a look below & attached :).

Op 29-11-2017 om 1:28 schreef William Lallemand: Hi Pieter,

diff --git a/src/haproxy.c b/src/haproxy.c
index c3c8281..a811577 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -2648,6 +2648,13 @@ int main(int argc, char **argv)
 	}
 
 	if (global.mode & (MODE_DAEMON | MODE_MWORKER)) {
+		if ((!(global.mode & MODE_QUIET) || (global.mode & MODE_VERBOSE)) &&
+		    ((global.mode & (MODE_DAEMON | MODE_MWORKER)) == (MODE_DAEMON | MODE_MWORKER))) {
+			/* detach from the tty, this is required to properly daemonize. */
+			fclose(stdin); fclose(stdout); fclose(stderr);
+			global.mode &= ~MODE_VERBOSE;
+			global.mode |= MODE_QUIET; /* ensure that we won't say anything from now */
+		}
 		struct proxy *px;
 		struct peers *curpeers;
 		int ret = 0;

I need to check that again later; in my opinion it should be done after the pipe() so we don't inherit the 0 and 1 FDs in the pipe.

FDs for the master-worker pipe can still be 0 and 1 if running in quiet mode, as stdin/stdout/stderr are still closed before creating the pipe then. Should the pipe be created earlier? I've moved the code to just before the mworker_wait() in the new attached patch. This should allow (all?) possible warnings to be output before closing the std FDs, and it still 'seems' to work properly.

We also need to rely on setsid() to do a proper tty detach.

I've added a setsid(), but I must admit I have no clue what it is doing exactly...

This is already done in -D mode without -W; maybe this part of the code should be moved elsewhere, but we have to be careful not to break the daemon mode without mworker.

I've tried most combinations of parameters like these:
1: -W
2: -W -q
3: -D -W
4: -D -W -q
5: -D
6: -D -q
7: -q
8: (without parameters)
Both by starting directly from an ssh console, and by running from my PHP script that reads the stdout/stderr output, and by reloading with USR2 in the -W modes. It seemed that the expected output, or lack thereof, was produced in all cases.
But it preferably also needs to be tested under systemd itself, as that is the intended use case, which I did not test at all :/ .. Also I did not change the config while running to include/exclude the 'quiet' or 'daemon' option or something like that; that seems like an odd thing to do. I'm not sure if the attached patch is OK for you like this, or needs to be implemented completely differently. I have made and tried to test the changed patch with the above cases, but I'm sure there are many things / combinations with other features I have not included. If I need to change it slightly somehow, please let me know; if you need time to look into it further, I can certainly wait :) I do not 'need' the feature urgently, or perhaps won't need it at all. Anyhow, when you have time to look into it, I look forward to your feedback :). Thanks in advance. Regards, PiBa-NL / Pieter

From c103dbd7837d49721ccadfb1aee9520e821a020f Mon Sep 17 00:00:00 2001
From: PiBa-NL
Date: Tue, 28 Nov 2017 23:26:08 +0100
Subject: [PATCH] BUG/MINOR: when master-worker is in daemon mode, detach from tty

This allows a calling script to show the first startup output and know when to stop reading from stdout so haproxy can daemonize.
---
 src/haproxy.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/haproxy.c b/src/haproxy.c
index 891a021..702501d 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -2749,7 +2749,7 @@ int main(int argc, char **argv)
 			//lseek(pidfd, 0, SEEK_SET); /* debug: emulate eglibc bug */
 			close(pidfd);
 		}
-
+
 		/* We won't ever use this anymore */
 		free(global.pidfile);
 		global.pidfile = NULL;
@@ -2757,6 +2757,16 @@ int main(int argc, char **argv)
 		if (global.mode & MODE_MWORKER) {
 			mworker_cleanlisteners();
 			deinit_pollers();
+
+			if ((!(global.mode & MODE_QUIET) || (global.mode & MODE_VERBOSE)) &&
+			    ((global.mode & (MODE_DAEMON | MODE_MWORKER)) == (MODE_DAEMON | MODE_MWORKER))) {
+				/* detach from the tty, this is required to properly daemonize. */
+				fclose(stdin); fclose(stdout); fclose(stderr);
+				global.mode &= ~MODE_VERBOSE;
+				global.mode |= MODE_QUIET; /* ensure that we won't say anything from now */
+
haproxy 1.8.1 email-alert with log-health-checks, 100% cpu usage / mailbomb
Hi List, Hereby a seemingly new case of 100% CPU usage / mailbomb on FreeBSD 11.1. Below seems to be the (close to) minimal config; there is no mailserver, and no webserver listening on those ports. The stats page is not requested (but without it haproxy won't start, as it doesn't see any binds then). If a mailserver is listening, then 800+ mails are received. Regards, PiBa-NL / Pieter

defaults
  option log-health-checks

listen HAProxyLocalStats
  bind 127.0.0.1:42200 name localstats
  mode http
  stats enable

mailers globalmailers
  mailer ex01 127.0.0.1:3325

backend ServerTest_http_ipv4
  mode http
  email-alert mailers globalmailers
  email-alert level info
  email-alert from haproxy@pfsense.local
  email-alert to m...@me.tld
  server ServerTest 127.0.0.1:33443 check inter 1

root@:~ # uname -a
FreeBSD 11.1-RELEASE-p4 FreeBSD 11.1-RELEASE-p4 #0: Tue Nov 14 06:12:40 UTC 2017 r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

It produces 600+ lines of console output; I've numbered some of the lines and skipped most repeating ones:

root@:~ # haproxy -f /root/hapconf.conf
[WARNING] 337/193939 (44649) : Health check for server ServerTest_http_ipv4/ServerTest failed, reason: Layer4 connection problem, info: "Connection error during SSL handshake (Broken pipe)", check duration: 0ms, status: 0/2 DOWN.
[WARNING] 337/193939 (44649) : Server ServerTest_http_ipv4/ServerTest is DOWN. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 337/193939 (44649) : backend 'ServerTest_http_ipv4' has no server available!
[WARNING] 337/193939 (44649) : Health check for server ServerTest_http_ipv4/ServerTest failed, reason: Layer4 connection problem, info: "Connection refused at step 1 of tcp-check (connect)", check duration: 1ms, status: 0/1 DOWN.
[WARNING] 337/193939 (44649) : Health check for server ServerTest_http_ipv4/ServerTest failed, reason: Layer4 connection problem, info: "Connection refused at step 1 of tcp-check (connect)", check duration: 0ms, status: 0/1 DOWN.
... same line repeated ...
10[WARNING] 337/193939 (44649) : Health check for server ServerTest_http_ipv4/ServerTest failed, reason: Layer4 connection problem, info: "Connection refused at step 1 of tcp-check (connect)", check duration: 0ms, status: 0/1 DOWN.
[WARNING] 337/193939 (44649) : Health check for server ServerTest_http_ipv4/ServerTest failed, reason: Layer4 connection problem, info: "Connection refused at step 1 of tcp-check (connect)", check duration: 0ms, status: 0/1 DOWN.
... repeating same line over and over...
203[WARNING] 337/193939 (44649) : Health check for server ServerTest_http_ipv4/ServerTest failed, reason: Layer4 connection problem, info: "Connection refused at step 1 of tcp-check (connect)", check duration: 0ms, status: 0/1 DOWN.
204[WARNING] 337/193942 (44649) : Health check for server ServerTest_http_ipv4/ServerTest failed, reason: Layer4 connection problem, info: "Connection refused at step 1 of tcp-check (connect)", check duration: 3059ms, status: 0/1 DOWN.
205[WARNING] 337/193942 (44649) : Health check for server ServerTest_http_ipv4/ServerTest failed, reason: Layer4 connection problem, info: "Connection refused at step 1 of tcp-check (connect)", check duration: 0ms, status: 0/1 DOWN.
... repeating same line over and
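For a flood like the above, a quick way to quantify the repetition is to collapse identical console lines into counts. A hedged sketch — the inline sample below stands in for the real haproxy output:

```shell
# Count duplicate console lines; the sample data stands in for the
# real 600+ line haproxy output.
log='[WARNING] Health check failed, status: 0/1 DOWN.
[WARNING] Health check failed, status: 0/1 DOWN.
[WARNING] Health check failed, status: 0/1 DOWN.
[ALERT] backend has no server available!'
printf '%s\n' "$log" | sort | uniq -c | sort -rn
```

Against the real process, something like `haproxy -f /root/hapconf.conf 2>&1 | sort | uniq -c | sort -rn | head` would show how many identical health-check warnings fired per second.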
[PATCH] BUG/MEDIUM: email-alert: don't set server check status from a email-alert task
Hi List, Simon and Baptiste, sending this to both of you as it's both tcp-check and email related and you are the maintainers of those parts. The patch subject and content basically say it all (I hope). It is intended to fix yesterday's report: https://www.mail-archive.com/haproxy@formilux.org/msg28158.html Please let me know if it is OK, or should be done differently. Thanks in advance, PiBa-NL / Pieter

From bf80b0398c08f94bebec30feaaddda422cb87ba1 Mon Sep 17 00:00:00 2001
From: PiBa-NL
Date: Wed, 6 Dec 2017 01:35:43 +0100
Subject: [PATCH] BUG/MEDIUM: email-alert: don't set server check status from a email-alert task

This avoids a possible 100% cpu usage deadlock on an EMAIL_ALERTS_LOCK and avoids sending lots of emails when 'option log-health-checks' is used. Changing the server state and possibly queueing a new email while processing the email alert is avoided by checking whether the check task being processed belongs to process_email_alert.

This needs to be backported to 1.8.
---
 src/checks.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/checks.c b/src/checks.c
index eaf84a2..55bfde2 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -72,6 +72,7 @@ static int tcpcheck_main(struct check *);
 static struct pool_head *pool_head_email_alert = NULL;
 static struct pool_head *pool_head_tcpcheck_rule = NULL;

+static struct task *process_email_alert(struct task *t);

 static const struct check_status check_statuses[HCHK_STATUS_SIZE] = {
@@ -198,6 +199,9 @@ const char *get_analyze_status(short analyze_status) {
  */
 static void set_server_check_status(struct check *check, short status, const char *desc)
 {
+	if (check->task->process == process_email_alert)
+		return; // email alerts should not change the status of the server
+
 	struct server *s = check->server;
 	short prev_status = check->status;
 	int report = 0;
--
2.10.1.windows.1
crash with regtest: /reg-tests/connection/h00001.vtc after commit f157384
Hi List, Willy, Current 1.9-dev master ( 6e0d8ae ) crashes with regtest /reg-tests/connection/h1.vtc; stack below. It fails after commit f157384. Can someone check? Thanks. Regards, PiBa-NL (Pieter)

Program terminated with signal 11, Segmentation fault.
#0  0x0057f34f in connect_server (s=0x802616500) at src/backend.c:1384
1384            HA_ATOMIC_ADD(&srv->counters.connect, 1);
(gdb) bt full
#0  0x0057f34f in connect_server (s=0x802616500) at src/backend.c:1384
        cli_conn = (struct connection *) 0x8026888c0
        srv_conn = (struct connection *) 0x802688a80
        old_conn = (struct connection *) 0x0
        srv_cs = (struct conn_stream *) 0x8027b8180
        srv = (struct server *) 0x0
        reuse = 0
        reuse_orphan = 0
        err = 0
        i = 5
#1  0x004a8acc in sess_update_stream_int (s=0x802616500) at src/stream.c:928
        conn_err = 8
        srv = (struct server *) 0x0
        si = (struct stream_interface *) 0x802616848
        req = (struct channel *) 0x802616510
#2  0x004a37c2 in process_stream (t=0x80265c320, context=0x802616500, state=257) at src/stream.c:2302
        srv = (struct server *) 0x0
        s = (struct stream *) 0x802616500
        sess = (struct session *) 0x8027be000
        rqf_last = 9469954
        rpf_last = 2147483648
        rq_prod_last = 7
        rq_cons_last = 0
        rp_cons_last = 7
        rp_prod_last = 0
        req_ana_back = 0
        req = (struct channel *) 0x802616510
        res = (struct channel *) 0x802616570
        si_f = (struct stream_interface *) 0x802616808
        si_b = (struct stream_interface *) 0x802616848
#3  0x005e9da7 in process_runnable_tasks () at src/task.c:432
        t = (struct task *) 0x80265c320
        state = 257
        ctx = (void *) 0x802616500
        process = (struct task *(*)(struct task *, void *, unsigned short)) 0x4a0480
        t = (struct task *) 0x80265c320
        max_processed = 200
#4  0x00511592 in run_poll_loop () at src/haproxy.c:2620
        next = 0
        exp = 0
#5  0x0050dc00 in run_thread_poll_loop (data=0x802637080) at src/haproxy.c:2685
---Type <return> to continue, or q <return> to quit---
        start_lock = {lock = 0, info = {owner = 0, waiters = 0, last_location = {function = 0x0, file = 0x0, line = 0}}}
        ptif = (struct per_thread_init_fct *) 0x92ee30
        ptdf = (struct per_thread_deinit_fct *) 0x0
#6  0x0050a2b6 in main (argc=4, argv=0x7fffea48) at src/haproxy.c:3314
        tids = (unsigned int *) 0x802637080
        threads = (pthread_t *) 0x802637088
        i = 1
        old_sig = {__bits = 0x7fffe770}
        blocked_sig = {__bits = 0x7fffe780}
        err = 0
        retry = 200
        limit = {rlim_cur = 4042, rlim_max = 4042}
        errmsg = 0x7fffe950 ""
        pidfd = -1
Current language: auto; currently minimal

haproxy -vv
HA-Proxy version 1.9-dev10-6e0d8ae 2018/12/14
Copyright 2000-2018 Willy Tarreau

Build options :
  TARGET  = freebsd
  CPU     = generic
  CC      = cc
  CFLAGS  = -DDEBUG_THREAD -DDEBUG_MEMORY -pipe -g -fstack-protector -fno-strict-aliasing -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -fno-strict-overflow -Wno-address-of-packed-member -Wno-null-dereference -Wno-unused-label -DFREEBSD_PORTS -DFREEBSD_PORTS
  OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_ACCEPT4=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_STATIC_PCRE=1 USE_PCRE_JIT=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.2k-freebsd 26 Jan 2017
Running on OpenSSL version : OpenSSL 1.0.2k-freebsd 26 Jan 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
Built with Lua version : Lua 5.3.4
Built with transparent proxy support using: IP_BINDANY IPV6_BINDANY
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with PCRE version : 8.40 2017-01-11
Running on PCRE version : 8.40 2017-01-11
PCRE library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with multi-threading support.

Available polling systems :
      kqueue : pref=300,  test result OK
        poll : pref=200,  test result OK
      select : pref=150,  test result OK
Total: 3 (3 usable), will use kqueue.
Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
              h2 : mode=HTTP       side=FE
              h2 : mode=HTX        side=FE
       <default> : mode=HTX        side=FE|BE
       <default> : mode=TCP|HTTP   side=FE|BE

Available filters :
        [SPOE] spoe
        [COMP] compression
        [CACHE] cache
        [TRACE] trace
stdout logging makes syslog logging fail.. 1.9-dev10-6e0d8ae
Hi List, Willy, stdout logging makes syslog logging fail.. A regtest that reproduces the issue is attached. The attached test (a modification of /log/b0.vtc) fails just by adding a stdout logger:

*** h1 0.0 debug|[ALERT] 348/000831 (51048) : sendmsg()/writev() failed in logger #2: Socket operation on non-socket (errno=38)

which apparently modifies the syslog behavior? Tested with version 1.9-dev10-6e0d8ae, but I think it never worked since stdout logging was introduced. Regards, PiBa-NL (Pieter)

# commit d02286d
# BUG/MINOR: log: pin the front connection when front ip/ports are logged
#
# Mathias Weiersmueller reported an interesting issue with logs which Lukas
# diagnosed as dating back from commit 9b061e332 (1.5-dev9). When front
# connection information (ip, port) are logged in TCP mode and the log is
# emitted at the end of the connection (eg: because %B or any log tag
# requiring LW_BYTES is set), the log is emitted after the connection is
# closed, so the address and ports cannot be retrieved anymore.
#
# It could be argued that we'd make a special case of these to immediately
# retrieve the source and destination addresses from the connection, but it
# seems cleaner to simply pin the front connection, marking it "tracked" by
# adding the LW_XPRT flag to mention that we'll need some of these elements
# at the last moment. Only LW_FRTIP and LW_CLIP are affected. Note that after
# this change, LW_FRTIP could simply be removed as it's not used anywhere.
#
# Note that the problem doesn't happen when using %[src] or %[dst] since
# all sample expressions set LW_XPRT.
varnishtest "Wrong ip/port logging" feature ignore_unknown_macro server s1 { rxreq txresp } -start syslog Slg_1 -level notice { recv recv recv info expect ~ \"dip\":\"${h1_fe_1_addr}\",\"dport\":\"${h1_fe_1_port}.*\"ts\":\"[cC]D\",\" } -start haproxy h1 -conf { global log stdout format short daemon log ${Slg_1_addr}:${Slg_1_port} local0 defaults log global timeout connect 3000 timeout client 1 timeout server 1 frontend fe1 bind "fd@${fe_1}" mode tcp log-format {\"dip\":\"%fi\",\"dport\":\"%fp\",\"c_ip\":\"%ci\",\"c_port\":\"%cp\",\"fe_name\":\"%ft\",\"be_name\":\"%b\",\"s_name\":\"%s\",\"ts\":\"%ts\",\"bytes_read\":\"%B\"} default_backendbe_app backend be_app server app1 ${s1_addr}:${s1_port} } -start client c1 -connect ${h1_fe_1_sock} { txreq -url "/" delay 0.02 } -run syslog Slg_1 -wait
Re: Quick update on 1.9
Hi Willy,

On 14-12-2018 at 22:32, Willy Tarreau wrote:
> if we manage to get haproxy.org to work reasonably stable this week-end, it will be a sign that we can release it.

There are still several known issues that should be addressed before 'release' imho:
- Compression corrupts data (Christopher is investigating): https://www.mail-archive.com/haproxy@formilux.org/msg32059.html
- Dispatch server crashes haproxy (I found it today): https://www.mail-archive.com/haproxy@formilux.org/msg32078.html
- stdout logging makes syslog logging fail (I mentioned it before, but I thought let's 'officially' re-report it now): https://www.mail-archive.com/haproxy@formilux.org/msg32079.html
- As you mention, haproxy serving the haproxy.org website apparently crashed several times today when you tried a recent build.. I think a week of running without a single crash would be a better indicator than a single week-end that a release could be imminent?
- Several of the '/checks/' regtests don't work. Might be a problem with varnishtest though, not sure.. But you already discovered that.

And that's just the things I am aware of a.t.m.. I'm usually not 'scared' to run a -dev version on my production box for a while and try a few new experimental features that seem useful to me over a weekend, but I do need to have the idea that it will 'work' as well as the version I update from, and to me it just doesn't seem there yet. (I would really like the compression to be functional before I try again..) So with still several known bugs to solve, imho it's not yet a good time to release it as a 'stable' version in a few days. Or did I misunderstand the 'sign' to release; is it one of several signs that needs to be checked? I think a -dev11, or perhaps an -RC if someone likes that term, would probably be more appropriate, before distros start including the new release expecting stability while it actually brings a seemingly largish potential of breaking some features that used to work.

So even current new commits are still introducing new breakage, while shortly before a release I would expect mostly little fixes to get committed. That 'new' features aren't 100% stable might not be a blocker, but existing features that used to work properly should imho not get released in a broken state.. my 2 cents. Regards, PiBa-NL (Pieter)
Re: Quick update on 1.9
Hi Willy,

On 15-12-2018 at 6:15, Willy Tarreau wrote:
>> - Compression corrupts data (Christopher is investigating): https://www.mail-archive.com/haproxy@formilux.org/msg32059.html
> This one was fixed, he had to leave quickly last evening so he couldn't respond, but it was due to some of my changes to avoid copies, I failed to grasp some corner cases of htx.

Could it be that it is not fixed/committed in the git repository? (b.t.w. I don't use htx in the vtc testfile..) "6e0d8ae BUG/MINOR: mworker: don't use unitialized mworker_proc struct master" seems to be the latest commit, and the .vtc file still produces files with different hashes for the 3 curl commands for me. Besides that, and as usual, thanks for your elaborate response on all the other subjects :). Regards, PiBa-NL (Pieter)
regtests - with option http-use-htx
Hi List, Willy, Trying to run some existing regtests with the added option:

    option http-use-htx

Using: HA-Proxy version 1.9-dev10-c11ec4a 2018/12/15, I get the below issues so far:

based on /reg-tests/connection/b0.vtc
Takes 8 seconds to pass, in a slightly modified manner with a 1.1 > 2.0 expectation change for syslog. This surely needs a closer look?
# top  TEST ./htx-test/connection-b0.vtc passed (8.490)

based on /reg-tests/stick-table/b1.vtc
The difference here is use=1 vs use=0. Maybe that is better, but then the 'old' expectation seems wrong, and the bug is the case without htx?
 h1   0.0 CLI recv|0x8026612c0: key=127.0.0.1 use=1 exp=0 gpt0=0 gpc0=0 gpc0_rate(1)=0 conn_rate(1)=1 http_req_cnt=1 http_req_rate(1)=1 http_err_cnt=0 http_err_rate(1)=0
 h1   0.0 CLI recv|
 h1   0.0 CLI expect failed ~ "table: http1, type: ip, size:1024, used:(0|1\n0x[0-9a-f]*: key=127\.0\.0\.1 use=0 exp=[0-9]* gpt0=0 gpc0=0 gpc0_rate\(1\)=0 conn_rate\(1\)=1 http_req_cnt=1 http_req_rate\(1\)=1 http_err_cnt=0 http_err_rate\(1\)=0)\n$"

Regards, PiBa-NL (Pieter)

#commit b406b87
# BUG/MEDIUM: connection: don't store recv() result into trash.data
#
# Cyril Bonté discovered that the proxy protocol randomly fails since
# commit 843b7cb ("MEDIUM: chunks: make the chunk struct's fields match
# the buffer struct"). This is because we used to store recv()'s return
# code into trash.data which is now unsigned, so it never compares as
# negative against 0. Let's clean this up and test the result itself
# without storing it first.
varnishtest "PROXY protocol random failures" feature ignore_unknown_macro syslog Slog_1 -repeat 8 -level info { recv expect ~ "Connect from .* to ${h1_ssl_addr}:${h1_ssl_port}" recv expect ~ "ssl-offload-http/http .* \"POST /[1-8] HTTP/2\\.0\"" } -start haproxy h1 -conf { global nbproc 4 nbthread 4 tune.ssl.default-dh-param 2048 stats bind-process 1 log ${Slog_1_addr}:${Slog_1_port} len 2048 local0 debug err defaults mode http timeout client 1s timeout server 1s timeout connect 1s log global option http-use-htx listen http bind-process 1 bind unix@${tmpdir}/http.socket accept-proxy name ssl-offload-http option forwardfor listen ssl-offload-http option httplog bind-process 2-4 bind "fd@${ssl}" ssl crt ${testdir}/common.pem ssl no-sslv3 alpn h2,http/1.1 server http unix@${tmpdir}/http.socket send-proxy } -start shell { HOST=${h1_ssl_addr} if [ "$HOST" = "::1" ] ; then HOST="\[::1\]" fi for i in 1 2 3 4 5 6 7 8 ; do urls="$urls https://$HOST:${h1_ssl_port}/$i"; done curl -i -k -d 'x=x' $urls & wait $! } syslog Slog_1 -wait # commit 3e60b11 # BUG/MEDIUM: stick-tables: Decrement ref_cnt in table_* converters # # When using table_* converters ref_cnt was incremented # and never decremented causing entries to not expire. # # The root cause appears to be that stktable_lookup_key() # was called within all sample_conv_table_* functions which was # incrementing ref_cnt and not decrementing after completion. # # Added stktable_release() to the end of each sample_conv_table_* # function and reworked the end logic to ensure that ref_cnt is # always decremented after use. # # This should be backported to 1.8 varnishtest "stick-tables: Test expirations when used with table_*" # As some macros for haproxy are used in this file, this line is mandatory. feature ignore_unknown_macro # Do nothing. server s1 { } -start haproxy h1 -conf { # Configuration file of 'h1' haproxy instance. 
    defaults
        mode http
        timeout connect 5s
        timeout server 30s
        timeout client 30s
        option http-use-htx

    frontend http1
        bind "fd@${my_frontend_fd}"
        stick-table size 1k expire 1ms type ip store conn_rate(10s),http_req_cnt,http_err_cnt,http_req_rate(10s),http_err_rate(10s),gpc0,gpc0_rate(10s),gpt0
        http-request track-sc0 req.hdr(X-Forwarded-For)
        http-request redirect location https://${s1_addr}:${s1_port}/ if { req.hdr(X-Forwarded-For),table_http_req_cnt(http1) -m int lt 0 }
        http-request redirect location https://${s1_addr}:${s1_port}/ if { req.hdr(X-Forwarded-For),table_trackers(http1) -m int lt 0 }
        http-request redirect location https://${s1_addr}:${s1_port}/ if { req.hdr(X-Forwarded-For),in_table(http1) -m int lt 0 }
        http-request redirect location https://${s1_addr}:${s1_port}/ if { req.hdr(X-Forwarded-For),table_bytes_in_rate(http1) -m int lt 0 }
        http-request redirect location https://${s1_addr}:${s1_port}/ if { req.hdr(X-Forwa
Re: regtests - with option http-use-htx
Hi Willy,

On 15-12-2018 at 17:06, Willy Tarreau wrote:
> Hi Pieter,
> On Sat, Dec 15, 2018 at 04:52:10PM +0100, PiBa-NL wrote:
>> Hi List, Willy, Trying to run some existing regtests with added option: option http-use-htx Using: HA-Proxy version 1.9-dev10-c11ec4a 2018/12/15 I get the below issues sofar: based on /reg-tests/connection/b0.vtc Takes 8 seconds to pass, in a slightly modified manor 1.1 > 2.0 expectation for syslog. This surely needs a closer look? # top TEST ./htx-test/connection-b0.vtc passed (8.490)
> It looks exactly like another issue we've found when a content-length is missing but the close is not seen, which is the same in your case with the first proxy returning the 503 error page by default. Christopher told me he understands what's happening in this situation (at least for the one we've met), I'm CCing him in case this report fuels his thoughts.

Ok thanks.

>> based on /reg-tests/stick-table/b1.vtc Difference here is the use=1 vs use=0 , maybe that is better, but then the 'old' expectation seems wrong, and the bug is the case without htx? h1 0.0 CLI recv|0x8026612c0: key=127.0.0.1 use=1 exp=0 gpt0=0 gpc0=0 gpc0_rate(1)=0 conn_rate(1)=1 http_req_cnt=1 http_req_rate(1)=1 http_err_cnt=0 http_err_rate(1)=0 h1 0.0 CLI recv| h1 0.0 CLI expect failed ~ "table: http1, type: ip, size:1024, used:(0|1\n0x[0-9a-f]*: key=127\.0\.0\.1 use=0 exp=[0-9]* gpt0=0 gpc0=0 gpc0_rate\(1\)=0 conn_rate\(1\)=1 http_req_cnt=1 http_req_rate\(1\)=1 http_err_cnt=0 http_err_rate\(1\)=0)\n$"
> Hmmm here I think we're really hitting corner cases depending on whether the tracked counters are released before or after the logs are emitted. In the case of htx, the logs are emitted slightly later than before, which may induce this. Quite honestly I'd be inclined to set use=[01] here in the regex to cover the race condition that exists in both cases, as there isn't any single good value. Christopher, are you also OK with this ? I can do the patch if you're OK.
It's not about emitting logs; it's about querying the stats admin socket, and even with an added 'delay 2' before doing so the results show the same difference with/without htx. I don't think it's a matter of 'timing'? Regards, PiBa-NL (Pieter)

**   c1   0.0 === expect resp.status == 503
     c1   0.0 EXPECT resp.status (503) == "503" match
***  c1   0.0 closing fd 7
**   c1   0.0 Ending
**   top  0.0 === delay 2
***  top  0.0 delaying 2 second(s)
**   top  2.1 === haproxy h1 -cli {
**   h1   2.1 CLI starting
**   h1   2.1 CLI waiting
***  h1   2.1 CLI connected fd 7 from ::1 26202 to ::1 26153
**   h1   2.1 === send "show table http1"
     h1   2.1 CLI send|show table http1
**   h1   2.1 === expect ~ "table: http1, type: ip, size:1024, used:(0|1\\n0x[...
***  h1   2.1 debug|0001:GLOBAL.accept(0005)=000b from [::1:26202] ALPN=
     h1   2.1 CLI connection normally closed
***  h1   2.1 CLI closing fd 7
***  h1   2.1 debug|0001:GLOBAL.srvcls[adfd:]
***  h1   2.1 debug|0001:GLOBAL.clicls[adfd:]
***  h1   2.1 debug|0001:GLOBAL.closed[adfd:]
     h1   2.1 CLI recv|# table: http1, type: ip, size:1024, used:1
     h1   2.1 CLI recv|0x8026612c0: key=127.0.0.1 use=1 exp=0 gpt0=0 gpc0=0 gpc0_rate(1)=0 conn_rate(1)=1 http_req_cnt=1 http_req_rate(1)=1 http_err_cnt=0 http_err_rate(1)=0
     h1   2.1 CLI recv|
     h1   2.1 CLI expect failed ~ "table: http1, type: ip, size:1024, used:(0|1\n0x[0-9a-f]*: key=127\.0\.0\.1 use=0 exp=[0-9]* gpt0=0 gpc0=0 gpc0_rate\(1\)=0 conn_rate\(1\)=1 http_req_cnt=1 http_req_rate\(1\)=1 http_err_cnt=0 http_err_rate\(1\)=0)\n$"
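Willy's suggested relaxation (accepting `use=[01]` in the expectation) can be sanity-checked against both observed outputs. A small sketch; the two sample lines below are abridged from the logs in this thread:

```shell
# The relaxed expectation must match both the htx (use=1) and the
# non-htx (use=0) variants of the 'show table' output.
re='key=127\.0\.0\.1 use=[01] exp=[0-9]+'
htx_line='0x8026612c0: key=127.0.0.1 use=1 exp=0 gpt0=0'
old_line='0x8026612c0: key=127.0.0.1 use=0 exp=3 gpt0=0'
printf '%s\n' "$htx_line" | grep -Eq "$re" && echo "htx variant matches"
printf '%s\n' "$old_line" | grep -Eq "$re" && echo "old variant matches"
```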
Re: corruption of data with compression in 1.9-dev10
Hi Christopher, Fix confirmed.

top  2.5 shell_out|File1 all OK
top  2.5 shell_out|File2 all OK
top  2.5 shell_out|File3 all OK

Thank you! Regards, PiBa-NL (Pieter)
d94f877 causes timeout in a basic connection test 1.9-dev11_d94f877
Hi List, Christopher, It seems like d94f877 causes a timeout in a pretty 'basic' connection test that transfers a little bit of data? Or at least the attached test fails to complete for me..

# top  TEST ./PB-TEST/basic_connection.vtc TIMED OUT (kill -9)
# top  TEST ./PB-TEST/basic_connection.vtc FAILED (120.236) signal=9

Please can you take a look :) Thanks in advance. Regards, PiBa-NL (Pieter)

# Checks a simple request
varnishtest "Checks a simple request"
feature ignore_unknown_macro

server s1 {
    rxreq
    txresp -bodylen 42202
} -start

haproxy h1 -conf {
    global
        stats socket /tmp/haproxy.socket level admin

    defaults
        mode http
        log global
        option httplog
        timeout connect 3s
        timeout client 4s
        timeout server 4s

    frontend fe1
        bind "fd@${fe_1}"
        default_backend b1

    backend b1
        http-reuse never
        server srv1 ${s1_addr}:${s1_port} #pool-max-conn 0
} -start

shell {
    HOST=${h1_fe_1_addr}
    if [ "${h1_fe_1_addr}" = "::1" ] ; then
        HOST="\[::1\]"
    fi
    curl -v -k "http://$HOST:${h1_fe_1_port}/CurlTest1"
} -run

server s1 -wait
[PATCH] REG-TEST: mailers: add new test for 'mailers' section
Hi List, Attached a new test to verify that the 'mailers' section is working properly. Currently with 1.9 the mailers section sends thousands of mails for my setup... As the test is rather slow, I have marked it with a starting letter 's'. Note that the test also fails on 1.6/1.7/1.8 but can be 'fixed' there by adding a 'timeout mail 200ms'.. (except on 1.6, which doesn't have that setting.) I don't think that should be needed though, if everything was working properly? If the test could be committed, and the related issues it exposes fixed, that would be neat ;) Thanks in advance, PiBa-NL (Pieter)

From 49a605bfadaafe25de0f084c7d1d449eef9c23aa Mon Sep 17 00:00:00 2001
From: PiBa-NL
Date: Sun, 23 Dec 2018 21:06:31 +0100
Subject: [PATCH] REG-TEST: mailers: add new test for 'mailers' section

This test verifies the mailers section works properly by checking that
it sends the proper amount of mails when health-checks are changing
and/or marking a server up/down.

The test currently fails on all versions of haproxy I tried, with varying results:
1.9.0 produces thousands of mails..
1.8.14 only sends 1 mail, needs a 200ms 'timeout mail' to succeed
1.7.11 only sends 1 mail, needs a 200ms 'timeout mail' to succeed
1.6 only sends 1 mail, (does not have the 'timeout mail' setting implemented)
---
 reg-tests/mailers/shealthcheckmail.lua | 105 +
 reg-tests/mailers/shealthcheckmail.vtc |  75 ++
 2 files changed, 180 insertions(+)
 create mode 100644 reg-tests/mailers/shealthcheckmail.lua
 create mode 100644 reg-tests/mailers/shealthcheckmail.vtc

diff --git a/reg-tests/mailers/shealthcheckmail.lua b/reg-tests/mailers/shealthcheckmail.lua
new file mode 100644
index ..9c75877b
--- /dev/null
+++ b/reg-tests/mailers/shealthcheckmail.lua
@@ -0,0 +1,105 @@
+
+local vtc_port1 = 0
+local mailsreceived = 0
+local mailconnectionsmade = 0
+local healthcheckcounter = 0
+
+core.register_action("bug", { "http-res" }, function(txn)
+  data = txn:get_priv()
+  if not data then
+    data = 0
+  end
+  data = data + 1
+  print(string.format("set to %d", data))
+  txn.http:res_set_status(200 + data)
+  txn:set_priv(data)
+end)
+
+core.register_service("luahttpservice", "http", function(applet)
+  local response = "?"
+  local responsestatus = 200
+  if applet.path == "/setport" then
+    vtc_port1 = applet.headers["vtcport1"][0]
+    response = "OK"
+  end
+  if applet.path == "/svr_healthcheck" then
+    healthcheckcounter = healthcheckcounter + 1
+    if healthcheckcounter < 2 or healthcheckcounter > 6 then
+      responsestatus = 403
+    end
+  end
+
+  applet:set_status(responsestatus)
+  if applet.path == "/checkMailCounters" then
+    response = "MailCounters"
+    applet:add_header("mailsreceived", mailsreceived)
+    applet:add_header("mailconnectionsmade", mailconnectionsmade)
+  end
+  applet:start_response()
+  applet:send(response)
+end)
+
+core.register_service("fakeserv", "http", function(applet)
+  applet:set_status(200)
+  applet:start_response()
+end)
+
+function RecieveAndCheck(applet, expect)
+  data = applet:getline()
+  if data:sub(1,expect:len()) ~= expect then
+    core.Info("Expected: "..expect.." but got:"..data:sub(1,expect:len()))
+    applet:send("Expected: "..expect.." but got:"..data.."\r\n")
+    return false
+  end
+  return true
+end
+
+core.register_service("mailservice", "tcp", function(applet)
+  core.Info("# Mailservice Called #")
+  mailconnectionsmade = mailconnectionsmade + 1
+  applet:send("220 Welcome\r\n")
+  local data
+
+  if RecieveAndCheck(applet, "EHLO") == false then
+    return
+  end
+  applet:send("250 OK\r\n")
+  if RecieveAndCheck(applet, "MAIL FROM:") == false then
+    return
+  end
+  applet:send("250 OK\r\n")
+  if RecieveAndCheck(applet, "RCPT TO:") == false then
+    return
+  end
+  applet:send("250 OK\r\n")
+  if RecieveAndCheck(applet, "DATA") == false then
+    return
+  end
+  applet:send("354 OK\r\n")
+  core.Info(" Send your mailbody")
+  local endofmail = false
+  local subject = ""
+  while endofmail ~= true do
+    data = apple
[PATCH] REGTEST: filters: add compression test
Hi Frederic, As requested, hereby the regtest sent for inclusion into the git repository, without randomization and with your .diff applied. It also outputs the expected and actual checksum if the test fails, so it's clear that that is the issue detected. Is it okay like this? Should the blob be bigger? As you mentioned needing a 10MB output to reproduce the original issue on your machine? Regards, PiBa-NL (Pieter)

From 64460dfeacef3d04af4243396007a606c2e5dbf7 Mon Sep 17 00:00:00 2001
From: PiBa-NL
Date: Sun, 23 Dec 2018 21:21:51 +0100
Subject: [PATCH] REGTEST: filters: add compression test

This test checks that data transferred with compression is correctly
received at different download speeds
---
 reg-tests/filters/b5.lua | 19
 reg-tests/filters/b5.vtc | 58
 2 files changed, 77 insertions(+)
 create mode 100644 reg-tests/filters/b5.lua
 create mode 100644 reg-tests/filters/b5.vtc

diff --git a/reg-tests/filters/b5.lua b/reg-tests/filters/b5.lua
new file mode 100644
index ..6dbe1d33
--- /dev/null
+++ b/reg-tests/filters/b5.lua
@@ -0,0 +1,19 @@
+
+local data = "abcdefghijklmnopqrstuvwxyz"
+local responseblob = ""
+for i = 1,1 do
+  responseblob = responseblob .. "\r\n" .. i .. data:sub(1, math.floor(i % 27))
+end
+
+http01applet = function(applet)
+  local response = responseblob
+  applet:set_status(200)
+  applet:add_header("Content-Type", "application/javascript")
+  applet:add_header("Content-Length", string.len(response)*10)
+  applet:start_response()
+  for i = 1,10 do
+    applet:send(response)
+  end
+end
+
+core.register_service("fileloader-http01", "http", http01applet)
diff --git a/reg-tests/filters/b5.vtc b/reg-tests/filters/b5.vtc
new file mode 100644
index ..5216cdaf
--- /dev/null
+++ b/reg-tests/filters/b5.vtc
@@ -0,0 +1,58 @@
+# Checks that compression doesnt cause corruption..
+ +varnishtest "Compression validation" +#REQUIRE_VERSION=1.6 + +feature ignore_unknown_macro + +haproxy h1 -conf { +global +# log stdout format short daemon + lua-load${testdir}/b5.lua + +defaults + modehttp + log global + option httplog + +frontend main-https + bind"fd@${fe1}" ssl crt ${testdir}/common.pem + compression algo gzip + compression type text/html text/plain application/json application/javascript + compression offload + use_backend TestBack if TRUE + +backend TestBack + server LocalSrv ${h1_fe2_addr}:${h1_fe2_port} + +listen fileloader + mode http + bind "fd@${fe2}" + http-request use-service lua.fileloader-http01 +} -start + +shell { +HOST=${h1_fe1_addr} +if [ "${h1_fe1_addr}" = "::1" ] ; then +HOST="\[::1\]" +fi + +md5=$(which md5 || which md5sum) + +if [ -z $md5 ] ; then +echo "MD5 checksum utility not found" +exit 1 +fi + +expectchecksum="4d9c62aa5370b8d5f84f17ec2e78f483" + +for opt in "" "--limit-rate 300K" "--limit-rate 500K" ; do +checksum=$(curl --compressed -k "https://$HOST:${h1_fe1_port}"; $opt | $md5 | cut -d ' ' -f1) +if [ "$checksum" != "$expectchecksum" ] ; then + echo "Expecting checksum $expectchecksum" + echo "Received checksum: $checksum" + exit 1; +fi +done + +} -run -- 2.18.0.windows.1
Re: [PATCH] REG-TEST: mailers: add new test for 'mailers' section
Changed subject of patch to the 'REGTEST' requirement.

On 23-12-2018 at 21:17, PiBa-NL wrote:
> Hi List, Attached a new test to verify that the 'mailers' section is working properly. Currently with 1.9 the mailers sends thousands of mails for my setup... As the test is rather slow i have marked it with a starting letter 's'. Note that the test also fails on 1.6/1.7/1.8 but can be 'fixed' there by adding a 'timeout mail 200ms'.. (except on 1.6 which doesn't have that setting.) I don't think that should be needed though if everything was working properly? If the test could be committed, and related issues exposed fixed that would be neat ;) Thanks in advance, PiBa-NL (Pieter)

From 8d63f5a39a9b4b326b636e42ccafcf0c2173d752 Mon Sep 17 00:00:00 2001
From: PiBa-NL
Date: Sun, 23 Dec 2018 21:06:31 +0100
Subject: [PATCH] REGTEST: mailers: add new test for 'mailers' section

This test verifies the mailers section works properly by checking that it sends the proper amount of mails when health-checks are changing and/or marking a server up/down. The test currently fails on all versions of haproxy I tried, with varying results: 1.9.0 produces thousands of mails..
1.8.14 only sends 1 mail, needs a 200ms 'timeout mail' to succeed
1.7.11 only sends 1 mail, needs a 200ms 'timeout mail' to succeed
1.6 only sends 1 mail, (does not have the 'timeout mail' setting implemented)
---
 reg-tests/mailers/shealthcheckmail.lua | 105 +
 reg-tests/mailers/shealthcheckmail.vtc | 75 ++
 2 files changed, 180 insertions(+)
 create mode 100644 reg-tests/mailers/shealthcheckmail.lua
 create mode 100644 reg-tests/mailers/shealthcheckmail.vtc

diff --git a/reg-tests/mailers/shealthcheckmail.lua b/reg-tests/mailers/shealthcheckmail.lua
new file mode 100644
index ..9c75877b
--- /dev/null
+++ b/reg-tests/mailers/shealthcheckmail.lua
@@ -0,0 +1,105 @@
+
+local vtc_port1 = 0
+local mailsreceived = 0
+local mailconnectionsmade = 0
+local healthcheckcounter = 0
+
+core.register_action("bug", { "http-res" }, function(txn)
+    data = txn:get_priv()
+    if not data then
+        data = 0
+    end
+    data = data + 1
+    print(string.format("set to %d", data))
+    txn.http:res_set_status(200 + data)
+    txn:set_priv(data)
+end)
+
+core.register_service("luahttpservice", "http", function(applet)
+    local response = "?"
+    local responsestatus = 200
+    if applet.path == "/setport" then
+        vtc_port1 = applet.headers["vtcport1"][0]
+        response = "OK"
+    end
+    if applet.path == "/svr_healthcheck" then
+        healthcheckcounter = healthcheckcounter + 1
+        if healthcheckcounter < 2 or healthcheckcounter > 6 then
+            responsestatus = 403
+        end
+    end
+
+    applet:set_status(responsestatus)
+    if applet.path == "/checkMailCounters" then
+        response = "MailCounters"
+        applet:add_header("mailsreceived", mailsreceived)
+        applet:add_header("mailconnectionsmade", mailconnectionsmade)
+    end
+    applet:start_response()
+    applet:send(response)
+end)
+
+core.register_service("fakeserv", "http", function(applet)
+    applet:set_status(200)
+    applet:start_response()
+end)
+
+function RecieveAndCheck(applet, expect)
+    data = applet:getline()
+    if data:sub(1,expect:len()) ~= expect then
+        core.Info("Expected: "..expect.." but got:"..data:sub(1,expect:len()))
+        applet:send("Expected: "..expect.." but got:"..data.."\r\n")
+        return false
+    end
+    return true
+end
+
+core.register_service("mailservice", "tcp", function(applet)
+    core.Info("# Mailservice Called #")
+    mailconnectionsmade = mailconnectionsmade + 1
+    applet:send("220 Welcome\r\n")
+    local data
+
+    if RecieveAndCheck(applet, "EHLO") == false then
+        return
+    end
+    applet:send("250 OK\r\n")
+    if RecieveAndCheck(applet, "MAIL FROM:") == false then
+        return
+    end
+    applet:send("250 OK\r\n")
+    if RecieveAndCheck(applet, "RCPT TO:") == false then
+        return
+    end
+    applet:send("250 OK\r\n")
+    if RecieveAndCheck(applet, "DATA") == false then
+        return
+    end
+    applet:send("354 OK\r\n")
+    core.Info(" Send your mailbody")
+    local endofmail = false
+
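The mock SMTP service above accepts each client command by prefix only: RecieveAndCheck() compares data:sub(1, expect:len()) against the expected verb, so "MAIL FROM:<anything>" passes as long as it starts with "MAIL FROM:". A standalone shell sketch of that same prefix check (the function name and sample inputs are illustrative, not part of the test):

```shell
# Prefix-match an SMTP command line against an expected verb, mirroring
# the Lua RecieveAndCheck() helper from the mailers test above.
receive_and_check() {
    data=$1
    expect=$2
    case "$data" in
        "$expect"*) return 0 ;;
        *) echo "Expected: $expect but got: $data" >&2; return 1 ;;
    esac
}

receive_and_check "EHLO client.example" "EHLO" && echo "EHLO accepted"
receive_and_check "MAIL FROM:<haproxy@local>" "MAIL FROM:" && echo "MAIL FROM accepted"
receive_and_check "HELO client.example" "EHLO" 2>/dev/null || echo "HELO rejected"
```

Counting how often this dialogue completes is exactly how the test derives its mailsreceived expectation.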
Re: [PATCH] REGTEST: filters: add compression test
Added LUA requirement into the test.. Op 23-12-2018 om 23:05 schreef PiBa-NL: Hi Frederic, As requested, hereby the regtest sent for inclusion into the git repository. Without randomization and with your .diff applied. Also outputting the expected and actual checksum if the test fails, so it's clear that that is the issue detected. Is it okay like this? Should the blob be bigger? As you mentioned needing a 10MB output to reproduce the original issue on your machine? Regards, PiBa-NL (Pieter)

From 29c3b9d344f360503bcd30f48558ca8a51df92ed Mon Sep 17 00:00:00 2001
From: PiBa-NL
Date: Sun, 23 Dec 2018 21:21:51 +0100
Subject: [PATCH] REGTEST: filters: add compression test

This test checks that data transferred with compression is correctly
received at different download speeds
---
 reg-tests/filters/b5.lua | 19
 reg-tests/filters/b5.vtc | 59
 2 files changed, 78 insertions(+)
 create mode 100644 reg-tests/filters/b5.lua
 create mode 100644 reg-tests/filters/b5.vtc

diff --git a/reg-tests/filters/b5.lua b/reg-tests/filters/b5.lua
new file mode 100644
index ..6dbe1d33
--- /dev/null
+++ b/reg-tests/filters/b5.lua
@@ -0,0 +1,19 @@
+
+local data = "abcdefghijklmnopqrstuvwxyz"
+local responseblob = ""
+for i = 1,1 do
+    responseblob = responseblob .. "\r\n" .. i .. data:sub(1, math.floor(i % 27))
+end
+
+http01applet = function(applet)
+    local response = responseblob
+    applet:set_status(200)
+    applet:add_header("Content-Type", "application/javascript")
+    applet:add_header("Content-Length", string.len(response)*10)
+    applet:start_response()
+    for i = 1,10 do
+        applet:send(response)
+    end
+end
+
+core.register_service("fileloader-http01", "http", http01applet)

diff --git a/reg-tests/filters/b5.vtc b/reg-tests/filters/b5.vtc
new file mode 100644
index ..2f4982cb
--- /dev/null
+++ b/reg-tests/filters/b5.vtc
@@ -0,0 +1,59 @@
+# Checks that compression doesnt cause corruption..
+
+varnishtest "Compression validation"
+#REQUIRE_VERSION=1.6
+#REQUIRE_OPTIONS=LUA
+
+feature ignore_unknown_macro
+
+haproxy h1 -conf {
+global
+#   log stdout format short daemon
+    lua-load ${testdir}/b5.lua
+
+defaults
+    mode http
+    log global
+    option httplog
+
+frontend main-https
+    bind "fd@${fe1}" ssl crt ${testdir}/common.pem
+    compression algo gzip
+    compression type text/html text/plain application/json application/javascript
+    compression offload
+    use_backend TestBack if TRUE
+
+backend TestBack
+    server LocalSrv ${h1_fe2_addr}:${h1_fe2_port}
+
+listen fileloader
+    mode http
+    bind "fd@${fe2}"
+    http-request use-service lua.fileloader-http01
+} -start
+
+shell {
+    HOST=${h1_fe1_addr}
+    if [ "${h1_fe1_addr}" = "::1" ] ; then
+        HOST="\[::1\]"
+    fi
+
+    md5=$(which md5 || which md5sum)
+
+    if [ -z $md5 ] ; then
+        echo "MD5 checksum utility not found"
+        exit 1
+    fi
+
+    expectchecksum="4d9c62aa5370b8d5f84f17ec2e78f483"
+
+    for opt in "" "--limit-rate 300K" "--limit-rate 500K" ; do
+        checksum=$(curl --compressed -k "https://$HOST:${h1_fe1_port}" $opt | $md5 | cut -d ' ' -f1)
+        if [ "$checksum" != "$expectchecksum" ] ; then
+            echo "Expecting checksum $expectchecksum"
+            echo "Received checksum: $checksum"
+            exit 1;
+        fi
+    done
+
+} -run
--
2.18.0.windows.1
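The shell section of this vtc reduces to one invariant: every download of the same body, at any curl rate limit, must yield the same checksum. A stripped-down sketch of that comparison logic (a fixed string stands in for the HTTPS download; the md5/md5sum fallback is the same portability trick the test itself uses for FreeBSD vs Linux):

```shell
# Pick whichever MD5 tool the platform provides (md5 on FreeBSD,
# md5sum on Linux), exactly as the vtc's shell block does.
md5=$(which md5 || which md5sum)
if [ -z "$md5" ] ; then
    echo "MD5 checksum utility not found"
    exit 1
fi

body="abcdefghijklmnopqrstuvwxyz"   # stand-in for the curl download
expectchecksum=$(printf '%s' "$body" | $md5 | cut -d ' ' -f1)

# Each simulated "rate-limited download" must yield identical bytes,
# hence an identical checksum; any compression corruption changes it.
for attempt in 1 2 3 ; do
    checksum=$(printf '%s' "$body" | $md5 | cut -d ' ' -f1)
    if [ "$checksum" != "$expectchecksum" ] ; then
        echo "Expecting checksum $expectchecksum"
        echo "Received checksum: $checksum"
        exit 1
    fi
done
echo "all downloads matched"
```

Comparing checksums rather than bodies keeps the failure output short even when the transferred blob is large.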
htx with compression issue, "Gunzip error: Body lacks gzip magics"
Hi List, When using compression with htx, and a slightly delayed body content it will prefix some rubbish and corrupt the gzip header.. Below output i get with attached test.. Removing http-use-htx 'fixes' the test. This happens with both 1.9.0 and todays commit a2dbeb2, not sure if this ever worked before.. c1 0.1 len|1A\r c1 0.1 chunk|\222\7\0\0\0\377\377\213\10\0\0\0\0\0\4\3JLJN\1\0\0\0\377\377 c1 0.1 len|0\r c1 0.1 bodylen = 26 ** c1 0.1 === expect resp.status == 200 c1 0.1 EXPECT resp.status (200) == "200" match ** c1 0.1 === expect resp.http.content-encoding == "gzip" c1 0.1 EXPECT resp.http.content-encoding (gzip) == "gzip" match ** c1 0.1 === gunzip c1 0.1 Gunzip error: Body lacks gzip magics Can someone take a look? Thanks in advance. Regards, PiBa-NL (Pieter) # Checks htx with compression and a short delay between headers and data send by the server varnishtest "Connection counters check" feature ignore_unknown_macro server s1 { rxreq txresp -nolen -hdr "Content-Length: 4" delay 0.05 send "abcd" } -start haproxy h1 -conf { global stats socket /tmp/haproxy.socket level admin defaults mode http option http-use-htx frontend fe1 bind "fd@${fe1}" compression algo gzip #filter trace name BEFORE-HTTP-COMP #filter compression #filter trace name AFTER-HTTP-COMP default_backend b1 backend b1 server srv1 ${s1_addr}:${s1_port} } -start # configure port for lua to call fe4 client c1 -connect ${h1_fe1_sock} { txreq -url "/" -hdr "Accept-Encoding: gzip" rxresp expect resp.status == 200 expect resp.http.content-encoding == "gzip" gunzip expect resp.body == "abcd" } -run
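For reference, the "gzip magics" that gunzip complains about are simply the fixed first bytes of any gzip stream: 0x1f 0x8b (the magic) followed by 0x08 (the deflate method byte). A small shell sketch, using byte values taken from the chunk dump above, showing why the corrupted body is rejected:

```shell
# A valid gzip stream always starts 1f 8b 08; varnishtest's gunzip
# step checks exactly these bytes before inflating.
good=$(printf '\037\213\010\000\000\000\000\000\004\003' | od -An -tx1 | tr -d ' \n' | cut -c1-6)

# The corrupted response in the trace had junk prepended instead,
# pushing the real 1f 8b 08 header further into the body.
bad=$(printf '\222\007\000\000\000\377\377' | od -An -tx1 | tr -d ' \n' | cut -c1-6)

echo "good stream starts: $good"   # 1f8b08 -> accepted
echo "bad stream starts:  $bad"    # 920700 -> "Body lacks gzip magics"
```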
Re: htx with compression issue, "Gunzip error: Body lacks gzip magics"
Hi Christopher, Willy, Op 2-1-2019 om 15:37 schreef Christopher Faulet: Le 29/12/2018 à 01:29, PiBa-NL a écrit : compression with htx, and a slightly delayed body content it will prefix some rubbish and corrupt the gzip header.. Hi Pieter, In fact, It is not a bug related to the compression. But a pure HTX one, about the defragmentation when we need space to store data. Here is a patch. It fixes the problem for me. Okay so the compression somehow 'triggers' this defragmentation to happen, are there simpler ways to make that happen 'on demand' ? Willy, if it is ok for you, I can merge it in upstream and backport it in 1.9. -- Christopher Faulet The patch fixes the reg-test for me as well, I guess its good to go :). Thanks. Regards, PiBa-NL (Pieter)
compression in defaults happens twice with 1.9.0
Hi List, Using both 1.9.0 and 2.0-dev0-909b9d8 compression happens twice when configured in defaults. This was noticed by user walle303 on IRC. Seems like a bug to me as 1.8.14 does not show this behavior. Attached a little regtest that reproduces the issue. Can someone take a look, thanks in advance. Regards, PiBa-NL (Pieter) s1 0.0 txresp|!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_ s1 0.0 txresp|"#$%&'()*+,-./0123456789:;<=>?@ABCD *** s1 0.0 shutting fd 4 ** s1 0.0 Ending *** h1 0.0 debug|:b1.srvrep[000a:adfd]: HTTP/1.1 200 OK *** h1 0.0 debug|:b1.srvhdr[000a:adfd]: Content-Length: 100 *** h1 0.0 debug|:b1.srvcls[000a:adfd] c1 0.0 rxhdr|HTTP/1.1 200 OK\r c1 0.0 rxhdr|Content-Encoding: gzip\r c1 0.0 rxhdr|Transfer-Encoding: chunked\r c1 0.0 rxhdr|Co57\r c1 0.0 rxhdr|\037\213\010 c1 0.0 rxhdrlen = 78 c1 0.0 http[ 0] |HTTP/1.1 c1 0.0 http[ 1] |200 c1 0.0 http[ 2] |OK c1 0.0 http[ 3] |Content-Encoding: gzip c1 0.0 http[ 4] |Transfer-Encoding: chunked c1 0.0 http[ 5] |Co57 c1 0.0 http[ 6] |\037\213\010 c1 10.2 HTTP rx timeout (fd:7 1 ms) # Checks compression defined in defaults doesnt happen twice varnishtest "Compression in defaults" feature ignore_unknown_macro server s1 { rxreq txresp -bodylen 100 } -start haproxy h1 -conf { defaults mode http compression algo gzip frontend fe1 bind "fd@${fe_1}" default_backend b1 backend b1 server srv1 ${s1_addr}:${s1_port} } -start client c1 -connect ${h1_fe_1_sock} { txreq -url "/" -hdr "Accept-Encoding: gzip" rxresp expect resp.status == 200 expect resp.http.content-encoding == "gzip" expect resp.http.transfer-encoding == "chunked" gunzip expect resp.bodylen == 100 } -run server s1 -wait
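What double compression means for the client can be illustrated outside haproxy with the ordinary gzip tool: compress a body twice and a single decompression yields another gzip stream (recognizable by its 1f 8b magic bytes) rather than the payload, which is why the test's single gunzip step times out instead of recovering the 100-byte body:

```shell
body="hello haproxy"

# Compressed once: a single decompression returns the original body.
once=$(printf '%s' "$body" | gzip -c | gzip -dc)
echo "$once"   # hello haproxy

# Compressed twice (the symptom reported here when 'compression algo
# gzip' sits in the defaults section): one decompression still leaves
# a gzip stream, starting with the 1f 8b magic.
inner=$(printf '%s' "$body" | gzip -c | gzip -c | gzip -dc | od -An -tx1 | tr -d ' \n' | cut -c1-4)
echo "inner layer starts: $inner"   # 1f8b
```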
Re: [PATCH] REG-TEST: mailers: add new test for 'mailers' section
Hi, 2 weeks passed without reply, so hereby a little 'bump'.. I know everyone has been busy, but it would be nice to get the test added, or at least the biggest issue of the 'mailbomb' fixed, before the next release. If it's 'scheduled' to get looked at later that's okay. Just making sure it isn't forgotten about :). The 23654 mails received for a failed server is a bit much.. c2 7.5 EXPECT resp.http.mailsreceived (23654) == "16" failed Regards, PiBa-NL (Pieter) Op 23-12-2018 om 23:37 schreef PiBa-NL: Changed subject of patch requirement to 'REGTEST'. Op 23-12-2018 om 21:17 schreef PiBa-NL: Hi List, Attached a new test to verify that the 'mailers' section is working properly. Currently with 1.9 the mailers sends thousands of mails for my setup... As the test is rather slow i have marked it with a starting letter 's'. Note that the test also fails on 1.6/1.7/1.8 but can be 'fixed' there by adding a 'timeout mail 200ms'.. (except on 1.6 which doesn't have that setting.) I don't think that should be needed though if everything was working properly? If the test could be committed, and related issues exposed fixed that would be neat ;) Thanks in advance, PiBa-NL (Pieter)
Re: [PATCH] REG-TEST: mailers: add new test for 'mailers' section
Hi Willy, Op 7-1-2019 om 15:25 schreef Willy Tarreau: Hi Pieter, On Sun, Jan 06, 2019 at 04:38:21PM +0100, PiBa-NL wrote: The 23654 mails received for a failed server is a bit much.. I agree. I really don't know much how the mails work to be honest, as I have never used them. I remember that we reused a part of the tcp-check infrastructure because by then it offered a convenient way to proceed with send/expect sequences. Maybe there's something excessive in the sequence there, such as a certain status code being expected at the end while the mail succeeds, I don't know. Given that this apparently has always been broken, For 1 part its always been broken (needing the short mailer timeout to send all expected mails), for the other part, at least until 1.8.14 it used to NOT send thousands of mails so that would be a regression in the current 1.9 version that should get fixed on a shorter term. I'm hesitant between merging this in the slow category or the broken one. My goal with "broken" was to keep the scripts that trigger broken behaviours that need to be addressed, rather than keep broken scripts. Indeed keeping broken scripts wouldn't be help-full in the long run, unless there is still the intent to fix them. It isn't what the makefile says about 'LEVEL 5' though. It says its for 'broken scripts' and to quickly disable them, not as you write here for scripts that show 'broken haproxy behavior'. My goal is to make sure we never consider it normal to have failures in the regular test suite, otherwise you know how it becomes, just like compiler warnings, people say "oh I didn't notice this new error in the middle of all other ones". Agreed, though i will likely fall into repeat some day, apology in advance ;).. I guess we could 'fix' the regtest by specifying the 'timeout mail 200', that would fix it for 1.7 and 1.8.. And might help for 1.9 regressiontests and to get it fixed to at least not send thousands of mails. 
We might forget about the short time requirement then though, which seems strange as well. And the test wouldn't be 1.6 compatible as it doesn't have that setting at all. Thus probably the best thing to do is to use it at level 5 so that it's easier to work on the bug without triggering false positives when doing regression testing. What's your opinion ? With a changed description for 'level 5' being 'shows broken haproxy behavior, to be fixed in a future release' i think it would fit in there nicely. Can you change the starting letter of the .vtc test (and the .lua and reference to that) to 'k' during committing? Or shall i re-send it? p.s. What do you think about the 'naming' of the test? 'k_healthcheckmail.vtc' or 'k0.vtc' personally i don't think the 'numbering' of tests makes them easier to use.?. thanks, Willy Regards, PiBa-NL (Pieter)
Re: compression in defaults happens twice with 1.9.0
Hi Christopher, Op 7-1-2019 om 16:32 schreef Christopher Faulet: Le 06/01/2019 à 16:22, PiBa-NL a écrit : Hi List, Using both 1.9.0 and 2.0-dev0-909b9d8 compression happens twice when configured in defaults. This was noticed by user walle303 on IRC. Seems like a bug to me as 1.8.14 does not show this behavior. Attached a little regtest that reproduces the issue. Can someone take a look, thanks in advance. Hi Pieter, Here is the patch that should fix this issue. Could you confirm please ? Thanks Works for me. Thanks! Regards, PiBa-NL (Pieter)
Re: regtests - with option http-use-htx
Hi Frederic, Op 8-1-2019 om 16:27 schreef Frederic Lecaille: On 12/15/18 4:52 PM, PiBa-NL wrote: Hi List, Willy, Trying to run some existing regtests with the added option: option http-use-htx Using: HA-Proxy version 1.9-dev10-c11ec4a 2018/12/15 I get the below issues so far: based on /reg-tests/connection/b0.vtc Takes 8 seconds to pass, in a slightly modified manner 1.1 > 2.0 expectation for syslog. This surely needs a closer look? # top TEST ./htx-test/connection-b0.vtc passed (8.490) based on /reg-tests/stick-table/b1.vtc Difference here is the use=1 vs use=0, maybe that is better, but then the 'old' expectation seems wrong, and the bug is the case without htx? Note that the server s1 never responds. Furthermore, the c1 client is run with the -run argument. This means that we wait for its termination before accessing the CLI. Then we check that there is no consistency issue with the stick-table: if the entry has expired we get only this line: table: http1, type: ip, size:1024, used:0 if not we get these two lines: table: http1, type: ip, size:1024, used:1 .* use=0 ... here used=1 means there is still an entry in the stick-table, and use=0 means it is not currently in use (I guess this is because the client has closed its connection). I do not reproduce your issue with this script, both on Linux and FreeBSD 11, with or without htx. Did you try with the 'old' development version (1.9-dev10-c11ec4a 2018/12/15)? I think in the current version it's already fixed; see my own test results below.
h1 0.0 CLI recv|0x8026612c0: key=127.0.0.1 use=1 exp=0 gpt0=0 gpc0=0 gpc0_rate(1)=0 conn_rate(1)=1 http_req_cnt=1 http_req_rate(1)=1 http_err_cnt=0 http_err_rate(1)=0 h1 0.0 CLI recv| h1 0.0 CLI expect failed ~ "table: http1, type: ip, size:1024, used:(0|1\n0x[0-9a-f]*: key=127\.0\.0\.1 use=0 exp=[0-9]* gpt0=0 gpc0=0 gpc0_rate\(1\)=0 conn_rate\(1\)=1 http_req_cnt=1 http_req_rate\(1\)=1 http_err_cnt=0 http_err_rate\(10000\)=0)\n$" Regards, PiBa-NL (Pieter) I tried again today with both 2.0-dev0-251a6b7 and 1.9.0-8223050 and 1.9-dev10-c11ec4a : HA-Proxy version 2.0-dev0-251a6b7 2019/01/08 - https://haproxy.org/ ## Without HTX # top TEST ./PB-TEST/2018/connection-b0.vtc passed (0.146) # top TEST ./PB-TEST/2018/stick-table-b1.vtc passed (0.127) 0 tests failed, 0 tests skipped, 2 tests passed ## With HTX # top TEST ./PB-TEST/2018/connection-b0.vtc passed (0.147) # top TEST ./PB-TEST/2018/stick-table-b1.vtc passed (0.127) 0 tests failed, 0 tests skipped, 2 tests passed HA-Proxy version 1.9.0-8223050 2018/12/19 - https://haproxy.org/ ## Without HTX # top TEST ./PB-TEST/2018/connection-b0.vtc passed (0.150) # top TEST ./PB-TEST/2018/stick-table-b1.vtc passed (0.128) 0 tests failed, 0 tests skipped, 2 tests passed ## With HTX # top TEST ./PB-TEST/2018/connection-b0.vtc passed (0.148) # top TEST ./PB-TEST/2018/stick-table-b1.vtc passed (0.127) 0 tests failed, 0 tests skipped, 2 tests passed HA-Proxy version 1.9-dev10-c11ec4a 2018/12/15 Copyright 2000-2018 Willy Tarreau ## Without HTX # top TEST ./PB-TEST/2018/connection-b0.vtc passed (0.146) # top TEST ./PB-TEST/2018/stick-table-b1.vtc passed (0.127) 0 tests failed, 0 tests skipped, 2 tests passed ## With HTX # top TEST ./PB-TEST/2018/connection-b0.vtc passed (8.646) * top 0.0 TEST ./PB-TEST/2018/stick-table-b1.vtc starting h1 0.0 CLI recv|# table: http1, type: ip, size:1024, used:1 h1 0.0 CLI recv|0x80262a200: key=127.0.0.1 use=1 exp=0 gpt0=0 gpc0=0 gpc0_rate(1)=0 conn_rate(1)=1 http_req_cnt=1 http_req_rate(1)=1 
http_err_cnt=0 http_err_rate(1)=0 h1 0.0 CLI recv| h1 0.0 CLI expect failed ~ "table: http1, type: ip, size:1024, used:(0|1\n0x[0-9a-f]*: key=127\.0\.0\.1 use=0 exp=[0-9]* gpt0=0 gpc0=0 gpc0_rate\(1\)=0 conn_rate\(1\)=1 http_req_cnt=1 http_req_rate\(1\)=1 http_err_cnt=0 http_err_rate\(1\)=0)\n$" * top 0.0 RESETTING after ./PB-TEST/2018/stick-table-b1.vtc ** h1 0.0 Reset and free h1 haproxy 92940 # top TEST ./PB-TEST/2018/stick-table-b1.vtc FAILED (0.127) exit=2 1 tests failed, 0 tests skipped, 1 tests passed With the 'old' 1.9-dev10 version and with HTX i can still reproduce the "passed (8.646)" and "use=1".. But both 1.9.0 and 2.0-dev don't show that behavior. I have not 'bisected' further, but i don't think there is anything to do a.t.m. regarding this old (already fixed) issue. Regards, PiBa-NL (Pieter)
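The vtc's expect regex keys on two different counters in the 'show table' CLI output: used= on the header line (entries stored in the table) and use= on the entry line (references currently held by a stream). A quick shell sketch pulling the entry counter out of a captured line like the one in the failing log:

```shell
# One entry line as captured from the CLI in the failing htx run above.
line='0x80262a200: key=127.0.0.1 use=1 exp=0 gpt0=0 gpc0=0 conn_rate(1)=1 http_req_cnt=1'

use=$(printf '%s\n' "$line" | sed -n 's/.* use=\([0-9]*\) .*/\1/p')
echo "use=$use"
# use=1: a stream still references the entry -- the state the regex
# rejects, since it only accepts use=0 once the client has finished.
```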
Re: [PATCH] REGTEST: filters: add compression test
Hi Frederic, Op 7-1-2019 om 10:13 schreef Frederic Lecaille: On 12/23/18 11:38 PM, PiBa-NL wrote: As requested hereby the regtest send for inclusion into the git repository. It is OK like that. Note that you patch do not add reg-test/filters/common.pem which could be a symlink to ../ssl/common.pem. Also note that since 8f16148Christopher's commit, we add such a line where possible: ${no-htx} option http-use-htx We should also rename your test files to reg-test/filters/h0.* Thank you. Fred. Together with these changes you have supplied me already off-list, i've also added a " --max-time 15" for the curl request, that should be sufficient for most systems to complete the 3 second testcase, and allows the shell command to complete without varnishtest killing it after a timeout and not showing any of the curl output.. One last question, currently its being added to a new folder: reg-test/filters/ , perhaps it should be in reg-test/compression/ ? If you agree that needs changing i guess that can be done upon committing it? Note that the test fails on my FreeBSD system when using HTX when using '2.0-dev0-251a6b7 2019/01/08', i'm not aware it ever worked (i didn't test it with HTX before..). top 15.2 shell_out|curl: (28) Operation timed out after 15036 milliseconds with 187718 bytes received Log attached.. Would it help to log it with the complete "filter trace name BEFORE / filter compression / filter trace name AFTER" ? Or are there other details i could try and gather? 
Regards, PiBa-NL (Pieter)

From 793e770b399157a1549a2655612a29845b165dd6 Mon Sep 17 00:00:00 2001
From: PiBa-NL
Date: Sun, 23 Dec 2018 21:21:51 +0100
Subject: [PATCH] REGTEST: filters: add compression test

This test checks that data transferred with compression is correctly
received at different download speeds
---
 reg-tests/filters/common.pem | 1 +
 reg-tests/filters/s0.lua | 19 ++
 reg-tests/filters/s0.vtc | 59
 3 files changed, 79 insertions(+)
 create mode 12 reg-tests/filters/common.pem
 create mode 100644 reg-tests/filters/s0.lua
 create mode 100644 reg-tests/filters/s0.vtc

diff --git a/reg-tests/filters/common.pem b/reg-tests/filters/common.pem
new file mode 12
index ..a4433d56
--- /dev/null
+++ b/reg-tests/filters/common.pem
@@ -0,0 +1 @@
+../ssl/common.pem
\ No newline at end of file

diff --git a/reg-tests/filters/s0.lua b/reg-tests/filters/s0.lua
new file mode 100644
index ..2cc874b9
--- /dev/null
+++ b/reg-tests/filters/s0.lua
@@ -0,0 +1,19 @@
+
+local data = "abcdefghijklmnopqrstuvwxyz"
+local responseblob = ""
+for i = 1,1 do
+    responseblob = responseblob .. "\r\n" .. i .. data:sub(1, math.floor(i % 27))
+end
+
+http01applet = function(applet)
+    local response = responseblob
+    applet:set_status(200)
+    applet:add_header("Content-Type", "application/javascript")
+    applet:add_header("Content-Length", string.len(response)*10)
+    applet:start_response()
+    for i = 1,10 do
+        applet:send(response)
+    end
+end
+
+core.register_service("fileloader-http01", "http", http01applet)

diff --git a/reg-tests/filters/s0.vtc b/reg-tests/filters/s0.vtc
new file mode 100644
index ..231344a6
--- /dev/null
+++ b/reg-tests/filters/s0.vtc
@@ -0,0 +1,59 @@
+# Checks that compression doesnt cause corruption..
+
+varnishtest "Compression validation"
+#REQUIRE_VERSION=1.6
+
+feature ignore_unknown_macro
+
+haproxy h1 -conf {
+global
+#   log stdout format short daemon
+    lua-load ${testdir}/s0.lua
+
+defaults
+    mode http
+    log global
+    ${no-htx} option http-use-htx
+    option httplog
+
+frontend main-https
+    bind "fd@${fe1}" ssl crt ${testdir}/common.pem
+    compression algo gzip
+    compression type text/html text/plain application/json application/javascript
+    compression offload
+    use_backend TestBack if TRUE
+
+backend TestBack
+    server LocalSrv ${h1_fe2_addr}:${h1_fe2_port}
+
+listen fileloader
+    mode http
+    bind "fd@${fe2}"
+    http-request use-service lua.fileloader-http01
+} -start
+
+shell {
+    HOST=${h1_fe1_addr}
+    if [ "${h1_fe1_addr}" = "::1" ] ; then
+        HOST="\[::1\]"
+    fi
+
+    md5=$(which md5 || which md5sum)
+
+    if [ -z $md5 ] ; then
+        echo "MD5 checksum utility not found"
+        exit 1
+    fi
+
+    expectchecksum="4d9c62aa5370b8d5f84f17ec2e78f483"
+
+    for opt in "" "--limit-rate 300K" "--limit-rate 500K" ; do
+        checksum=$(curl --max-time 15 --compressed -k "https://$HOST:${h1_fe1_port}" $opt | $m
coredump in h2_process_mux with 1.9.0-8223050
Hi List, Willy, Got a coredump of 1.9.0-8223050 today, see below. Would this be 'likely' the same one with the 'PRIORITY' that 1.9.1 fixes? I don't have any idea what the exact circumstance request/response was.. Anyhow i updated my system to 2.0-dev0-251a6b7 for the moment, lets see if something strange happens again. Might take a few days though, IF it still occurs.. Regards, PiBa-NL (Pieter) Core was generated by `/usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid'. Program terminated with signal SIGSEGV, Segmentation fault. #0 0x004b91c7 in h2_process_mux (h2c=0x802657480) at src/mux_h2.c:2434 2434 src/mux_h2.c: No such file or directory. (gdb) bt full #0 0x004b91c7 in h2_process_mux (h2c=0x802657480) at src/mux_h2.c:2434 h2s = 0x80262c7a0 h2s_back = 0x80262ca40 #1 0x004b844d in h2_send (h2c=0x802657480) at src/mux_h2.c:2560 flags = 0 conn = 0x8026dc300 done = 0 sent = 1 #2 0x004b8a49 in h2_process (h2c=0x802657480) at src/mux_h2.c:2640 conn = 0x8026dc300 #3 0x004b32e1 in h2_wake (conn=0x8026dc300) at src/mux_h2.c:2715 h2c = 0x802657480 #4 0x005c8158 in conn_fd_handler (fd=7) at src/connection.c:190 conn = 0x8026dc300 flags = 0 io_available = 0 #5 0x005e3c7c in fdlist_process_cached_events (fdlist=0x9448f0 ) at src/fd.c:441 fd = 7 old_fd = 7 e = 117 #6 0x005e377c in fd_process_cached_events () at src/fd.c:459 No locals. 
#7 0x00514296 in run_poll_loop () at src/haproxy.c:2655 next = 762362654 exp = 762362654 #8 0x00510b78 in run_thread_poll_loop (data=0x802615970) at src/haproxy.c:2684 start_lock = 0 ptif = 0x92ed10 ptdf = 0x0 #9 0x0050d1a6 in main (argc=6, argv=0x7fffec60) at src/haproxy.c:3313 tids = 0x802615970 threads = 0x802615998 i = 1 old_sig = {__bits = {0, 0, 0, 0}} blocked_sig = {__bits = {4227856759, 4294967295, 4294967295, 4294967295}} err = 0 retry = 200 limit = {rlim_cur = 2040, rlim_max = 2040} errmsg = "\000\354\377\377\377\177\000\000\230\354\377\377\377\177\000\000`\354\377\377\377\177\000\000\006\000\000\000\000\000\000\000\f\373\353\230\373\032\351~\240\270\223\000\000\000\000\000X\354\377\377\377\177\000\000\230\354\377\377\377\177\000\000`\354\377\377\377\177\000\000\006\000\000\000\000\000\000\000\000\354\377\377\377\177\000\000\302z\000\002\b\000\000\000\001\000\000" pidfd = 17 (gdb)
Re: [PATCH] REGTEST: filters: add compression test
Thank you Christopher & Frederic. Op 9-1-2019 om 14:47 schreef Christopher Faulet: Le 09/01/2019 à 10:43, Frederic Lecaille a écrit : On 1/8/19 11:25 PM, PiBa-NL wrote: Hi Frederic, Hi Pieter, Op 7-1-2019 om 10:13 schreef Frederic Lecaille: On 12/23/18 11:38 PM, PiBa-NL wrote: As requested hereby the regtest send for inclusion into the git repository. It is OK like that. Note that you patch do not add reg-test/filters/common.pem which could be a symlink to ../ssl/common.pem. Also note that since 8f16148Christopher's commit, we add such a line where possible: ${no-htx} option http-use-htx We should also rename your test files to reg-test/filters/h0.* Thank you. Fred. Together with these changes you have supplied me already off-list, i've also added a " --max-time 15" for the curl request, that should be sufficient for most systems to complete the 3 second testcase, and allows the shell command to complete without varnishtest killing it after a timeout and not showing any of the curl output.. One last question, currently its being added to a new folder: reg-test/filters/ , perhaps it should be in reg-test/compression/ ? If you agree that needs changing i guess that can be done upon committing it? I have modified your patch to move your new files to reg-test/compression. I have also applied this to it ;) : 's/\r$//' Note that the test fails on my FreeBSD system when using HTX when using '2.0-dev0-251a6b7 2019/01/08', i'm not aware it ever worked (i didn't test it with HTX before..). top 15.2 shell_out|curl: (28) Operation timed out after 15036 milliseconds with 187718 bytes received Ok, I will take some time to have a look at this BSD specific issue. Note that we can easily use the CLI at the end of the script to troubleshooting anything. Log attached.. Would it help to log it with the complete "filter trace name BEFORE / filter compression / filter trace name AFTER" ? Or are there other details i could try and gather? 
I do not fill at ease enough on compression/filter topics to reply to your question Pieter ;) Nevertheless I think your test deserve to be merged. *The patch to be merged is attached to this mail*. Thank a lot Pieter. Thanks Fred and Pieter, now merged. I've just updated the patch to add the list of required options in the VTC file. Hereby just a little confirmation that this works well now :) also in my tests. Regards, PiBa-NL (Pieter)
Re: Lots of mail from email alert on 1.9.x
Hi Johan, Olivier, Willy, Op 10-1-2019 om 17:00 schreef Johan Hendriks: I just updated to 1.9.1 on my test system. We noticed that when a server fails we now get tons of mail, and with tons we mean a lot. After a client backend server fails we usually get 1 mail on 1.8.x now with 1.9.1 within 1 minute we have the following. mailq | grep -B2 l...@testdomain.nl | grep '^[A-F0-9]' | awk '{print $1}' | sed 's/*//' | postsuper -d - postsuper: Deleted: 19929 messages My setting from the backend part is as follows. email-alert mailers alert-mailers email-alert from l...@testdomain.nl email-alert to not...@testdomain.nl server webserver09 11.22.33.44:80 check Has something changed in 1.9.x (it was on 1.9.0 also) regards Johan Hendriks Its a 'known issue' see: https://www.mail-archive.com/haproxy@formilux.org/msg32290.html a 'regtest' is added in that mail thread also to aid developers in reproducing the issue and validating a possible fix. @Olivier, Willy, may i assume this mailbomb feature is 'planned' to get fixed in 1.9.2 ? (perhaps a bugtracker with a 'target version' would be nice ;) ?) Regards, PiBa-NL (Pieter)
Re: Lots of mail from email alert on 1.9.x
Hi Olivier, Op 11-1-2019 om 19:17 schreef Olivier Houchard: Ok so erm, I'd be lying if I claimed I enjoy working on the check code, or that I understand it fully. However, after talking with Willy and Christopher, I think I may have comed with an acceptable solution, and the attached patch should fix it (at least by getting haproxy to segfault, but it shouldn't mailbomb you anymore). Pieter, I'd be very interested to know if it still work with your setup. It's a different way of trying to fix what you tried ot fix with 1714b9f28694d750d446917672dd59c46e16afd7 I'd like to be sure I didn't break it for you again:) Regards, Olivier (Slightly modified patches, I think there were a potential race condition when running with multiple threads). Olivier Thanks for this 'change in behavior' ;). Indeed the mailbomb is fixed, and it seems the expected mails get generated and delivered, but a segfault also happens on occasion. Not with the regtest as it was, but with a few minor modifications (adding a unreachable mailserver, and giving it a little more time seems to be the most reliable reproduction a.t.m.) it will crash consistently after 11 seconds.. So i guess the patch needs a bit more tweaking. Regards, PiBa-NL (Pieter) Core was generated by `haproxy -d -f /tmp/vtc.37274.4b8a1a3a/h1/cfg'. Program terminated with signal SIGSEGV, Segmentation fault. 
#0 0x00500955 in chk_report_conn_err (check=0x802616a10, errno_bck=0, expired=1) at src/checks.c:689 689 dns_trigger_resolution(check->server->dns_requester); (gdb) bt full #0 0x00500955 in chk_report_conn_err (check=0x802616a10, errno_bck=0, expired=1) at src/checks.c:689 cs = 0x8027de0c0 conn = 0x802683180 err_msg = 0x80266d0c0 " at step 1 of tcp-check (connect)" chk = 0x80097b848 step = 1 comment = 0x0 #1 0x005065a5 in process_chk_conn (t=0x802656640, context=0x802616a10, state=513) at src/checks.c:2261 check = 0x802616a10 proxy = 0x8026c3000 cs = 0x8027de0c0 conn = 0x802683180 rv = 0 ret = 0 expired = 1 #2 0x0050596e in process_chk (t=0x802656640, context=0x802616a10, state=513) at src/checks.c:2330 check = 0x802616a10 #3 0x004fe0a2 in process_email_alert (t=0x802656640, context=0x802616a10, state=513) at src/checks.c:3210 check = 0x802616a10 q = 0x802616a00 alert = 0x7fffe340 #4 0x005f2523 in process_runnable_tasks () at src/task.c:435 t = 0x802656640 state = 513 ctx = 0x802616a10 process = 0x4fdeb0 t = 0x8026566e0 max_processed = 200 #5 0x005163a2 in run_poll_loop () at src/haproxy.c:2619 next = 1062130135 exp = 1062129684 #6 0x00512ff8 in run_thread_poll_loop (data=0x8026310f0) at src/haproxy.c:2684 start_lock = 0 ptif = 0x935d40 ---Type to continue, or q to quit--- ptdf = 0x0 #7 0x0050f626 in main (argc=4, argv=0x7fffead8) at src/haproxy.c:3313 tids = 0x8026310f0 threads = 0x8026310f8 i = 1 old_sig = {__bits = {0, 0, 0, 0}} blocked_sig = {__bits = {4227856759, 4294967295, 4294967295, 4294967295}} err = 0 retry = 200 limit = {rlim_cur = 4046, rlim_max = 4046} errmsg = "\000\352\377\377\377\177\000\000\000\353\377\377\377\177\000\000\330\352\377\377\377\177\000\000\004\000\000\000\000\000\000\00 0\b\250\037\315})5:`)\224\000\000\000\000\000\320\352\377\377\377\177\000\000\000\353\377\377\377\177\000\000\330\352\377\377\377\177\000\000\004 \000\000\000\000\000\00 reg-tests/mailers/k_healthcheckmail.vtc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) 
diff --git a/reg-tests/mailers/k_healthcheckmail.vtc b/reg-tests/mailers/k_healthcheckmail.vtc
index d3af3589..820191c8 100644
--- a/reg-tests/mailers/k_healthcheckmail.vtc
+++ b/reg-tests/mailers/k_healthcheckmail.vtc
@@ -48,6 +48,7 @@ defaults
 # timeout mail 20s
 # timeout mail 200ms
   mailer smtp1 ${h1_femail_addr}:${h1_femail_port}
+  mailer smtp2 ipv4@192.0.2.100:1025
 } -start
@@ -62,7 +63,7 @@ client c1 -connect ${h1_luahttpservice_sock} {
 delay 2
 server s2 -repeat 5 -start
-delay 5
+delay 10
 client c2 -connect ${h1_luahttpservice_sock} {
 timeout 2
Re: Lots of mail from email alert on 1.9.x
Hi Willy, Olivier,

On 12-1-2019 at 13:11, Willy Tarreau wrote: Hi Pieter, it is needed to prepend this at the beginning of chk_report_conn_err() : if (!check->server) return; We need to make sure that check->server is properly tested everywhere. With a bit of luck this one was the only remnant. Thanks! Willy

With the check above added, mail alerts seem to work properly here, or at least as well as they used to. Once the patches and the above addition get committed, that leaves the other 'low priority' issue of needing a short timeout to send the exact number of 'expected' mails. EXPECT resp.http.mailsreceived (10) == "16" failed To be honest i only noticed it due to making the regtest, and double-checking what to expect.. When i validated mails on my actual environment it seemed to work properly. (Though the server i took out to test has a health-check with a 60 second interval..) Anyhow it's been like this for years afaik, i guess it won't matter much if it stays like this a bit longer. Regards, PiBa-NL (Pieter)
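For clarity, the guard Willy describes would sit at the very top of chk_report_conn_err() in src/checks.c. A sketch reconstructed from the mail and the backtrace above — not the literal committed patch, and the surrounding context lines are approximate:

```diff
--- a/src/checks.c
+++ b/src/checks.c
@@ static void chk_report_conn_err(struct check *check, int errno_bck, int expired)
 {
+	/* email alert "checks" carry no server; without this guard the
+	 * check->server->dns_requester dereference further down (the
+	 * dns_trigger_resolution() call at checks.c:689 in the backtrace)
+	 * crashes on a NULL pointer */
+	if (!check->server)
+		return;
```

The early return is safe for email alerts because everything the function reports (check failures, DNS re-resolution) only applies to real server checks.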
stats webpage crash, htx and scope filter, [PATCH] REGTEST is included
Hi List, I've configured haproxy with htx, and when i try to filter the stats webpage by sending this request: "GET /?;csv;scope=b1" to '2.0-dev0-762475e 2019/01/10', it will crash with the trace below. 1.9.0 and 1.9.1 are also affected. Can someone take a look? Thanks in advance. A regtest is attached that reproduces the behavior, and which i think could be included into the haproxy repository. Regards, PiBa-NL (Pieter)

Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00564fe7 in strnistr (str1=0x802631048 "fe1", len_str1=3, str2=0x804e3bf4c, len_str2=3) at src/standard.c:3657
3657 while (toupper(*start) != toupper(*str2)) {
(gdb) bt full
#0 0x00564fe7 in strnistr (str1=0x802631048 "fe1", len_str1=3, str2=0x804e3bf4c, len_str2=3) at src/standard.c:3657
        pptr = 0x804e3bf4c
        sptr = 0x80271df80 "\330?"
        start = 0x802631048 "fe1"
        slen = 3
        plen = 3
        tmp1 = 0
        tmp2 = 4294959392
#1 0x004d01d3 in stats_dump_proxy_to_buffer (si=0x8026416d8, htx=0x8027c8e40, px=0x8026b3c00, uri=0x802638000) at src/stats.c:2079
        appctx = 0x802678380
        s = 0x802641400
        rep = 0x802641470
        sv = 0x8027c8e40
        svs = 0x33be1e0
        l = 0x4d31df
        flags = 0
#2 0x004d4139 in stats_dump_stat_to_buffer (si=0x8026416d8, htx=0x8027c8e40, uri=0x802638000) at src/stats.c:2652
        appctx = 0x802678380
        rep = 0x802641470
        px = 0x8026b3c00
#3 0x004d56bb in htx_stats_io_handler (appctx=0x802678380) at src/stats.c:3299
        si = 0x8026416d8
        s = 0x802641400
        req = 0x802641410
        res = 0x802641470
        req_htx = 0x8027c8e40
        res_htx = 0x8027c8e40
#4 0x004d2546 in http_stats_io_handler (appctx=0x802678380) at src/stats.c:3367
        si = 0x8026416d8
        s = 0x802641400
        req = 0x802641410
        res = 0x802641470
#5 0x005f729f in task_run_applet (t=0x8026566e0, context=0x802678380, state=16385) at src/applet.c:85
        app = 0x802678380
        si = 0x8026416d8
#6 0x005f2533 in process_runnable_tasks () at src/task.c:435
        t = 0x8026566e0
        state = 16385
        ctx = 0x802678380
        process = 0x5f7200
        t = 0x8026566e0
        max_processed = 199
#7 0x005163b2 in run_poll_loop () at
src/haproxy.c:2619
        next = 0
        exp = 1137019023
#8 0x00513008 in run_thread_poll_loop (data=0x8026310f0) at src/haproxy.c:2684
        start_lock = 0
        ptif = 0x935d40
        ptdf = 0x0
#9 0x0050f636 in main (argc=4, argv=0x7fffeb08) at src/haproxy.c:3313
        tids = 0x8026310f0
        threads = 0x8026310f8
        i = 1
        old_sig = {__bits = {0, 0, 0, 0}}
        blocked_sig = {__bits = {4227856759, 4294967295, 4294967295, 4294967295}}
        err = 0
        retry = 200
        limit = {rlim_cur = 4052, rlim_max = 4052}
        errmsg = "\000\353\377\377\377\177\000\000\060\353\377\377\377\177\000\000\b\353\377\377\377\177\000\000\004\000\000\000\000\000\000\000t\240\220?\260|6\224`)\224\000\000\000\000\000\000\353\377\377\377\177\000\000\060\353\377\377\377\177\000\000\b\353\377\377\377\177\000\000\004\000\000\000\000\000\000\000\240\352\377\377\377\177\000\000R\201\000\002\b\000\000\000\001\000\000"
        pidfd = -1

From 838ecb4e153c1d859d0a49e0554ff050ff10033c Mon Sep 17 00:00:00 2001
From: PiBa-NL
Date: Sat, 12 Jan 2019 21:57:48 +0100
Subject: [PATCH] REGTEST: checks basic stats webpage functionality

This regtest verifies that the stats webpage can be used to change a server state to maintenance or drain, and that filtering the page scope will result in a filtered page.
---
 .../h_webstats-scope-and-post-change.vtc | 83 +++
 1 file changed, 83 insertions(+)
 create mode 100644 reg-tests/webstats/h_webstats-scope-and-post-change.vtc

diff --git a/reg-tests/webstats/h_webstats-scope-and-post-change.vtc b/reg-tests/webstats/h_webstats-scope-and-post-change.vtc
new file mode 100644
index ..a77483b5
--- /dev/null
+++ b/reg-tests/webstats/h_webstats-scope-and-post-change.vtc
@@ -0,0 +1,83 @@
+varnishtest "Webgui stats page check filtering with scope and changing server state"
+#REQUIRE_VERSION=1.6
+
+feature ignore_unknown_macro
+
+server s1 {
+} -start
+
+haproxy h1 -conf {
+  global
+    stats socket /tmp/haproxy.socket level admin
+
+  defaults
+    mode http
+    ${no-htx} option http-use-htx
+
+  frontend fe1
+    bind "fd@${fe1}"
+    stats enable
+    stats refresh 5s
+    stats uri /
+    stats admin if TRUE
+
+  backend b1
+    server srv1 ${s1_addr}:${s1_port}
+    server srv2 ${s1_addr}:${s1_port}
+    server srv3 ${s1_addr}:${s1_port}
+
+  backend b2
+    server srv1 ${s1_addr}:${s1_port}
+    server srv2 ${s1_addr}:${s1_port}
+
Re: Get client IP
Hi,

On 13-1-2019 at 13:11, Aleksandar Lazic wrote: Hi. On 13.01.2019 at 12:17, Vũ Xuân Học wrote: Hi, Please help me to solve this problem. I use HAProxy version 1.5.18, SSL transparent mode, and I can not get the client IP in my .net mvc website. With mode http, I can use option forwardfor to catch the client ip, but with tcp mode, my web reads X_Forwarded_For as null. My diagram: Client => Firewall => HAProxy => Web I read the HAProxy document and tried to use send-proxy. But when I use send-proxy, I can not access my web. This is my config:

frontend test2233
    bind *:2233
    option forwardfor
    default_backend testecus

backend testecus
    mode http
    server web1 192.168.0.151:2233 check

The above config works, and I can get the client IP.

That's good, as it's `mode http` and therefore haproxy can see the http traffic. Indeed it can insert the http forwardfor header with 'mode http'.

Config with SSL:

frontend ivan
    bind 192.168.0.4:443
    mode tcp
    option tcplog
    #option forwardfor
    reqadd X-Forwarded-Proto:\ https

This can't work, as you use `mode tcp` and therefore haproxy can't see the http traffic. From my point of view you now have 2 options.

* use https termination on haproxy. Then you can add this http header.

That's one option indeed.

* use accept-proxy in the bind line. This option requires that the firewall is able to send the PROXY PROTOCOL header to haproxy. https://cbonte.github.io/haproxy-dconv/1.5/configuration.html#5.1-accept-proxy

I don't expect a firewall to send such a header. And if i understand correctly the 'webserver' would need to be configured to accept the proxy-protocol.
The modification to make in haproxy would be to configure send-proxy[-v2-ssl-cn] http://cbonte.github.io/haproxy-dconv/1.9/snapshot/configuration.html#5.2-send-proxy And here is how to configure it with, for example, nginx: https://wakatime.com/blog/23-how-to-scale-ssl-with-haproxy-and-nginx

The different modes are described in the doc https://cbonte.github.io/haproxy-dconv/1.5/configuration.html#4-mode Here is a blog post about a basic setup of haproxy with ssl https://www.haproxy.com/blog/how-to-get-ssl-with-haproxy-getting-rid-of-stunnel-stud-nginx-or-pound/

    acl tls req.ssl_hello_type 1
    tcp-request inspect-delay 5s
    tcp-request content accept if tls
    # Define hosts
    acl host_1 req.ssl_sni -i ebh.vn
    acl host_2 req.ssl_sni hdr_end(host) -i einvoice.com.vn
    use_backend eBH if host_1
    use_backend einvoice443 if host_2

backend eBH
    mode tcp
    balance roundrobin
    option ssl-hello-chk
    server web1 192.168.0.153:443 maxconn 3 check #cookie web1
    server web1 192.168.0.154:443 maxconn 3 check #cookie web2

The above config doesn't work, and I can not get the client ip. I try server web1 192.168.0.153:443 send-proxy and try server web1 192.168.0.153:443 send-proxy-v2 but I can't access my web.

This is expected, as the Firewall does not send the PROXY PROTOCOL header and the bind line is not configured for that. Firewalls by themselves will never use the proxy-protocol at all. That it doesn't work with send-proxy on the haproxy server line is likely because the webservice that is receiving the traffic isn't configured to accept the proxy protocol. How to configure a ".net mvc website" to accept that is something i don't know; whether that is even possible at all i can't say..

Many thanks, Best regards Aleks Thanks & Best Regards! * VU XUAN HOC

Regards, PiBa-NL (Pieter)
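To make the trade-off concrete, here is a rough sketch of keeping `mode tcp` passthrough while still conveying the client IP via the PROXY protocol. The haproxy names and addresses are taken from the thread; the nginx directives come from its realip module and `listen` option — treat this as an illustration, not a drop-in config, since the receiving webserver must opt in or it will reject the connection:

```
# haproxy: TLS passthrough, client IP carried in the PROXY protocol header
backend eBH
    mode tcp
    balance roundrobin
    option ssl-hello-chk
    server web1 192.168.0.153:443 maxconn 3 check send-proxy-v2

# the receiving webserver must expect the header, e.g. for nginx:
#   server {
#       listen 443 ssl proxy_protocol;
#       set_real_ip_from 192.168.0.0/24;
#       real_ip_header proxy_protocol;
#   }
```

With IIS/".net mvc" backends there is no equivalent built-in PROXY protocol support as far as i know, which is why TLS termination on haproxy plus a forwarded header remains the simpler route there.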
Re: stats webpage crash, htx and scope filter, [PATCH] REGTEST is included
Hi Christopher,

On 14-1-2019 at 11:17, Christopher Faulet wrote: On 12/01/2019 at 23:23, PiBa-NL wrote: Hi List, I've configured haproxy with htx and when i try to filter the stats webpage. Sending this request: "GET /?;csv;scope=b1" to '2.0-dev0-762475e 2019/01/10' it will crash with the trace below. 1.9.0 and 1.9.1 are also affected. Can someone take a look? Thanks in advance. A regtest is attached that reproduces the behavior, and which i think could be included into the haproxy repository. Pieter, Here is the patch that should fix this issue. This was "just" an oversight when the stats applet was adapted to support the HTX. If it's ok for you, I'll also merge your regtest. Thanks

It seems the patch did not change/fix the crash.? Below looks pretty much the same as previously. Did i fail to apply the patch properly.? It seems to have 'applied' properly, checking a few lines of the touched code manually. As for the regtest, yes please merge that if it's okay as-is, perhaps after the fix is also ready :). Regards, PiBa-NL (Pieter)

Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x005658e7 in strnistr (str1=0x802631048 "fe1", len_str1=3, str2=0x271dfcc, len_str2=3) at src/standard.c:3657
3657 while (toupper(*start) != toupper(*str2)) {
(gdb) bt full
#0 0x005658e7 in strnistr (str1=0x802631048 "fe1", len_str1=3, str2=0x271dfcc, len_str2=3) at src/standard.c:3657
        pptr = 0x271dfcc
        sptr = 0x6995d3 "text/plain"
        start = 0x802631048 "fe1"
        slen = 3
        plen = 3
        tmp1 = 0
        tmp2 = 4294958728
#1 0x004d09ff in stats_dump_proxy_to_buffer (si=0x8026416d8, htx=0x8027c8e40, px=0x8026b3c00, uri=0x802638000) at src/stats.c:2087
        scope_ptr = 0x271dfcc
        appctx = 0x802678380
        s = 0x802641400
        rep = 0x802641470
        sv = 0x8027c8e40
        svs = 0x343e1e0
        l = 0x4d3a8f
        flags = 0
#2 0x004d49e9 in stats_dump_stat_to_buffer (si=0x8026416d8, htx=0x8027c8e40, uri=0x802638000) at src/stats.c:2664
Re: stats webpage crash, htx and scope filter, [PATCH] REGTEST is included
Hi Christopher, Op 15-1-2019 om 10:48 schreef Christopher Faulet: Le 14/01/2019 à 21:53, PiBa-NL a écrit : Hi Christopher, Op 14-1-2019 om 11:17 schreef Christopher Faulet: Le 12/01/2019 à 23:23, PiBa-NL a écrit : Hi List, I've configured haproxy with htx and when i try to filter the stats webpage. Sending this request: "GET /?;csv;scope=b1" to '2.0-dev0-762475e 2019/01/10' it will crash with the trace below. 1.9.0 and 1.9.1 are also affected. Can someone take a look? Thanks in advance. A regtest is attached that reproduces the behavior, and which i think could be included into the haproxy repository. Pieter, Here is the patch that should fix this issue. This was "just" an oversight when the stats applet has been adapted to support the HTX. If it's ok for you, I'll also merge your regtest. Thanks It seems the patch did not change/fix the crash.? Below looks pretty much the same as previously. Did i fail to apply the patch properly.? It seems to have 'applied' properly checking a few lines of the touched code manually. As for the regtest, yes please merge that if its okay as-is, perhaps after the fix is also ready :). Hi Pieter, Sorry, I made my patch too quickly. It seemed ok, but obviously not... This new one should do the trick. Well.. 'something' changed, still crashing though.. but at a different place. Regards, PiBa-NL (Pieter) Program terminated with signal SIGSEGV, Segmentation fault. #0 0x004d3770 in htx_sl_p2 (sl=0x0) at include/common/htx.h:237 237 return ist2(HTX_SL_P2_PTR(sl), HTX_SL_P2_LEN(sl)); (gdb) bt full #0 0x004d3770 in htx_sl_p2 (sl=0x0) at include/common/htx.h:237 No locals. #1 0x004d3665 in htx_sl_req_uri (sl=0x0) at include/common/htx.h:252 No locals. 
#2 0x004d1125 in stats_scope_ptr (appctx=0x802678540, si=0x8026416d8) at src/stats.c:268 req = 0x802641410 htx = 0x80271df80 uri = {ptr = 0x60932e "H\213E\320H\211E\370H\213E\370H\201\304\260", len = 4304914720} p = 0x4802631048 0x4802631048> #3 0x004d8505 in stats_send_htx_redirect (si=0x8026416d8, htx=0x8027c8e40) at src/stats.c:3162 scope_ptr = 0x5f80f5 <__pool_get_first+21> "H\211E\310H\203}\310" scope_txt = "\000\342\377\377\377\177\000\000\351}M\000\000\000\000\000x\024d\002\b\000\000\000x\024d\002" s = 0x802641400 uri = 0x802638000 appctx = 0x802678540 sl = 0x8027c8e40 flags = 8 #4 0x004d60fb in htx_stats_io_handler (appctx=0x802678540) at src/stats.c:3337 si = 0x8026416d8 s = 0x802641400 req = 0x802641410 res = 0x802641470 req_htx = 0x8027c8e40 res_htx = 0x8027c8e40 #5 0x004d2d36 in http_stats_io_handler (appctx=0x802678540) at src/stats.c:3393 si = 0x8026416d8 s = 0x802641400 req = 0x802641410 res = 0x802641470 #6 0x005f7d5f in task_run_applet (t=0x802656780, context=0x802678540, state=16385) at src/applet.c:85 app = 0x802678540 si = 0x8026416d8 #7 0x005f3023 in process_runnable_tasks () at src/task.c:435 t = 0x802656780 state = 16385 ctx = 0x802678540 process = 0x5f7cc0 t = 0x802656780 max_processed = 200 #8 0x00516ca2 in run_poll_loop () at src/haproxy.c:2620 next = 0 exp = 1394283990 #9 0x005138f8 in run_thread_poll_loop (data=0x8026310e8) at src/haproxy.c:2685 start_lock = 0 ptif = 0x936d40 ptdf = 0x0 #10 0x0050ff26 in main (argc=4, argv=0x7fffeb08) at src/haproxy.c:3314 tids = 0x8026310e8 threads = 0x8026310f0 i = 1 old_sig = {__bits = {0, 0, 0, 0}} blocked_sig = {__bits = {4227856759, 4294967295, 4294967295, 4294967295}} err = 0 retry = 200 limit = {rlim_cur = 4051, rlim_max = 4051} errmsg = 
"\000\353\377\377\377\177\000\000\060\353\377\377\377\177\000\000\b\353\377\377\377\177\000\000\004\000\000\000\000\000\000\000\376\310\311\070\333\207d\000`9\224\000\000\000\000\000\000\353\377\377\377\177\000\000\060\353\377\377\377\177\000\000\b\353\377\377\377\177\000\000\004\000\000\000\000\000\000\000\240\352\377\377\377\177\000\000R\201\000\002\b\000\000\000\001\000\000" pidfd = -1
Re: stats webpage crash, htx and scope filter, [PATCH] REGTEST is included
Hi Willy, Christopher,

On 16-1-2019 at 17:32, Willy Tarreau wrote: On Wed, Jan 16, 2019 at 02:28:56PM +0100, Christopher Faulet wrote: here is a new patch, again. Willy, I hope it will be good for the release 1.9.2. This one works :). OK so I've merged it now, thank you! Willy

On 14-1-2019 at 11:17, Christopher Faulet wrote: If it's ok for you, I'll also merge your regtest. Can you add the regtest as well into the git repo? Regards, PiBa-NL (Pieter)
Re: [PATCH] REG-TEST: mailers: add new test for 'mailers' section
Hi Christopher,

On 21-1-2019 at 15:28, Christopher Faulet wrote: Hi Pieter, About the timing issue, could you try the following patch please ? With it, I can run the regtest about email alerts without any error. Thanks, -- Christopher Faulet

The regtest works for me as well with this patch, without needing the 'timeout mail' setting. I think we can call it fixed once committed. Thanks, PiBa-NL (Pieter)
Re: haproxy 1.9.2 with boringssl
Hi Aleksandar, Just FYI.

On 22-1-2019 at 22:08, Aleksandar Lazic wrote: But this could be a known bug and is fixed in the current git -

## Starting vtest ##
Testing with haproxy version: 1.9.2
#    top  TEST ./reg-tests/mailers/k_healthcheckmail.vtc FAILED (7.808) exit=2
1 tests failed, 0 tests skipped, 32 tests passed
## Gathering results ##
## Test case: ./reg-tests/mailers/k_healthcheckmail.vtc ##
## test results in: "/tmp/haregtests-2019-01-22_20-56-31.QMI0Ue/vtc.5740.39907fe1"
c27.0 EXPECT resp.http.mailsreceived (11) == "16" failed

This was indeed identified as a bug, and is fixed in the current master. The impact of this was rather low though, and this specific issue of a few 'missing' mails under certain configuration circumstances existed for years before it was spotted with the regtest. https://www.mail-archive.com/haproxy@formilux.org/msg32190.html http://git.haproxy.org/?p=haproxy.git;a=commit;h=774c486cece942570b6a9d16afe236a16ee12079 Regards, PiBa-NL (Pieter)
h1-client to h2-server host header / authority conversion failure.?
Hi List, Attached a regtest which i 'think' should pass.

** s1 0.0 === expect tbl.dec[1].key == ":authority"
s1 0.0 EXPECT tbl.dec[1].key (host) == ":authority" failed

It seems to me the Host <> Authority conversion isn't happening properly.? But maybe i'm just making a mistake in the test case... I was using HA-Proxy version 2.0-dev0-f7a259d 2019/01/24 with this test. The test was inspired by the attempt to connect to mail.google.com , as discussed in the "haproxy 1.9.2 with boringssl" mail thread.. Not sure if this is the main problem, but it seems suspicious to me.. Regards, PiBa-NL (Pieter)

varnishtest "Check H1 client to H2 server with HTX."
feature ignore_unknown_macro

syslog Slog_1 -repeat 1 -level info {
  recv
} -start

server s1 -repeat 2 {
  rxpri
  stream 0 {
    txsettings
    rxsettings
    txsettings -ack
  } -run
  stream 1 {
    rxreq
    expect tbl.dec[1].key == ":authority"
    expect tbl.dec[1].value == "domain.tld"
    txresp
  } -run
} -start

haproxy h1 -conf {
  global
    log ${Slog_1_addr}:${Slog_1_port} len 2048 local0 debug err

  defaults
    mode http
    timeout client 2s
    timeout server 2s
    timeout connect 1s
    log global
    option http-use-htx

  frontend fe1
    option httplog
    bind "fd@${fe1}"
    default_backend b1

  backend b1
    server s1 ${s1_addr}:${s1_port} proto h2

  frontend fe2
    option httplog
    bind "fd@${fe2}" proto h2
    default_backend b2

  backend b2
    server s2 ${s1_addr}:${s1_port} proto h2
} -start

client c1 -connect ${h1_fe1_sock} {
  txreq -url "/" -hdr "host: domain.tld"
  rxresp
  expect resp.status == 200
} -run

client c2 -connect ${h1_fe2_sock} {
  txpri
  stream 0 {
    txsettings -hdrtbl 0
    rxsettings
  } -run
  stream 1 {
    txreq -req GET -url /3 -litIdxHdr inc 1 huf "domain.tld"
    rxresp
    expect resp.status == 200
  } -run
} -run
#syslog Slog_1 -wait
Re: h1-client to h2-server host header / authority conversion failure.?
Hi Willy, List, Just a little check, was the below mail received properly with the 6 attachments (vtc/vtc/log/png/png/pcapng).? (As it didn't show up on the mail-archive.) Regards, PiBa-NL (Pieter)

On 26-1-2019 at 21:04, PiBa-NL wrote: Hi Willy, On 25-1-2019 at 17:04, Willy Tarreau wrote: Hi Pieter, On Fri, Jan 25, 2019 at 01:01:19AM +0100, PiBa-NL wrote: Hi List, Attached a regtest which i 'think' should pass. ** s1 0.0 === expect tbl.dec[1].key == ":authority" s1 0.0 EXPECT tbl.dec[1].key (host) == ":authority" failed It seems to me the Host <> Authority conversion isn't happening properly.? But maybe i'm just making a mistake in the test case... I was using HA-Proxy version 2.0-dev0-f7a259d 2019/01/24 with this test. The test was inspired by the attempt to connect to mail.google.com , as discussed in the "haproxy 1.9.2 with boringssl" mail thread.. Not sure if this is the main problem, but it seems suspicious to me.. It's not as simple, :authority is only required for CONNECT and is optional for other methods with Host as a fallback. Clients are encouraged to use it instead of the Host header field, according to paragraph 8.1.2.3, but there is nothing indicating that a gateway may nor should build one from scratch when translating HTTP/1.1 to HTTP/2. In fact the authority part is generally not present in the URIs we receive as a gateway, so what we'd put there would be completely reconstructed from the host header field. I don't even know if all servers are fine with authority only instead of Host. Please note, I'm not against changing this, I just want to be sure we actually fix something and that we don't break anything. Thus if you have any info indicating there is an issue with this one missing, it could definitely help. Thanks! Willy

Today i've given it another shot (connecting to mail.google.com). Is there a way in haproxy to directly 'manipulate' the h2 headers? Setting the h2 header with set-header :authority didn't seem to work.?
See attached some logs, a packet capture, and a vtc that uses google's servers itself. It seems google replies "Header: :status: 400 Bad Request" but leaves me 'guessing' why it would be invalid. Also the 'body' doesn't get downloaded, but haproxy terminates the connection, which curl then reports as missing bytes.. There are a few differences between the 2 get requests, authority and scheme.. But i also wonder if that is the actual packet with the issue, H2 isn't quite as simple as H1 used to be ;). Also with "h2-client-mail.google.vtc" the first request succeeds, but the second, where the Host header is used, fails. I think this shows there is a 'need' for the :authority header to be present? Or i mixed something up...

p.s. Wireshark doesn't nicely show/dissect the http2 requests made by vtest, probably because for example the first magic packet is spread out over multiple tcp packets. Is there a way to make it send them in 1 go, or make haproxy 'buffer' the short packets into bigger complete packets? I tried putting a little listen/bind/server section in the request path, but it just forwarded the small packets as-is.. Regards, PiBa-NL (Pieter)
Re: h1-client to h2-server host header / authority conversion failure.?
Hi Willy,

On 2-2-2019 at 0:01, Willy Tarreau wrote: On Fri, Feb 01, 2019 at 09:43:13PM +0100, PiBa-NL wrote: The 'last' part is in TCP mode, and is intended like that to allow me to run tcpdump/wireshark on the un-encrypted traffic, and being certain that haproxy would not modify it before sending. But maybe the test contained a 'half done' edit as well, ill attach a new test now. OK. I've not tried 1.9 .. but did try with '2.0-dev0-ff5dd74 2019/01/31', that should contain the fix as well right.?. I just rechecked and no, these ones were added after ff5dd74 :

9c9da5e MINOR: muxes: Don't bother to LIST_DEL(&conn->list) before calling conn_
dc21ff7 MINOR: debug: Add an option that causes random allocation failures.
3c4e19f BUG/MEDIUM: backend: always release the previous connection into its own
3e45184 BUG/MEDIUM: htx: check the HTX compatibility in dynamic use-backend rule
9c4f08a BUG/MINOR: tune.fail-alloc: Don't forget to initialize ret.
1da41ec BUG/MINOR: backend: check srv_conn before dereferencing it
5be92ff BUG/MEDIUM: mux-h2: always omit :scheme and :path for the CONNECT method
053c157 BUG/MEDIUM: mux-h2: always set :authority on request output
32211a1 BUG/MEDIUM: stream: Don't forget to free s->unique_id in stream_free().

It's 053c157 which fixes it. You scared me, I thought I had messed up with the commit :-) I tested again here and it still works for me. Cheers, Willy

Sorry, indeed all 4 tests pass. ( Using 2.0-dev0-32211a1 2019/02/01 ) I must have mixed up the git-id to sync up with in my makefile, thought i picked the last one.. Sorry for the noise! Thanks for fixing and re-checking :) Regards, PiBa-NL (Pieter)
regtest, response length check failure for /reg-tests/http-capture/h00000.vtc with HTX enabled, using 2.0-dev1
Hi List, Christopher, With 2.0-dev1-6c1b667 and 2.0-dev1-12a7184 i get the 'failure' below when running reg-tests with HTX enabled. (without HTX the test passes) Seems this commit made it return different results: http://git.haproxy.org/?p=haproxy.git;a=commit;h=b8d2ee040aa21f2906a4921e5e1c7afefb7e I 'think' the syslog output for a single request/response should remain the same with/without htx? Or should the size check be less strict or accept 1 of 2 possible outcomes with/without htx.? Regards, PiBa-NL (Pieter) S 0.0 syslog|<134>Feb 26 20:42:52 haproxy[56313]: ::1:46091 [26/Feb/2019:20:42:52.065] fe be/srv 0/0/0/2/2 200 17473 - - 1/1/0/0/0 0/0 {HPhx8n59qjjNBLjP} {htb56qDdCcbRVTfS} "GET / HTTP/1.1" ** S 0.0 === expect ~ "[^:\\[ ]\\[${h_pid}\\]: .* .* fe be/srv .* 200 176... S 0.0 EXPECT FAILED ~ "[^:\[ ]\[56313\]: .* .* fe be/srv .* 200 17641 - - .* .* {HPhx8n59qjjNBLjP} {htb56qDdCcbRVTfS} "GET / HTTP/1\.1""
Re: haproxy reverse proxy to https streaming backend
Hi Thomas,

On 14-3-2019 at 20:28, Thomas Schmiedl wrote: Hello, I never got a reply from the original author of xupnpd2 to fix the hls-handling, so I created a lua-script (thanks to Thierry Fournier), but it's too slow for the router cpu. Could someone rewrite the script to a lua-c-module?

I don't think making this exact code a lua-c-module would solve the issue; lua is not a 'slow' language. But I do wonder if regex is the right tool for data manipulation..

Regards, Thomas

test.cfg:
global
    lua-load /var/media/ftp/playlist.lua

frontend main
    mode http
    bind *:8080
    acl is_index_m3u8 path -m end /index.m3u8
    http-request use-service lua.playlist if is_index_m3u8
    default_backend forward

backend forward
    mode http
    server gjirafa puma.gjirafa.com:443 ssl verify none

playlist.lua:
core.register_service("playlist", "http", function(applet)
    local tcp = core.tcp()
    tcp:connect_ssl("51.75.52.73", 443)
    tcp:send("GET ".. applet.path .." HTTP/1.1\r\nConnection: Close\r\nHost: puma.gjirafa.com\r\n\r\n")
    local body = tcp:receive("*a")
    local result = string.match(body,"^.*(#EXTM3U.-)#EXTINF")
    result = result .. string.match(body,"(...%d+.ts%d+.ts%d+.ts)[\r\n|0]*$")

I think an 'easier' regex might already improve performance, can you try this one for example ?: result = result .. string.match(body,"(#EXTINF:%d+[/.]%d+,\n%d+[/.]ts.#EXTINF:%d[/.]%d%d%d,.%d+[/.]ts.#EXTINF:%d+[/.]%d+,\n%d+[/.]ts)[\r\n|0]*$") With my test using 'https://rextester.com/l/lua_online_compiler' and a little sample m3u8 it seemed to work faster anyhow.
    applet:set_status(200)
    applet:add_header("Content-Type", "application/x-mpegURL")
    applet:add_header("content-length", string.len(result))
    applet:add_header("Connection", "close")
    applet:start_response()
    applet:send(result)
end)

On 19.02.2019 at 21:31, Thomas Schmiedl wrote: On 19.02.2019 at 05:29, Willy Tarreau wrote: Hello Thomas, On Sun, Feb 17, 2019 at 05:55:29PM +0100, Thomas Schmiedl wrote: Hello Bruno, I think the problem is the parsing of the .m3u8-playlist in xupnpd2. The first entry to the .ts-file is 4 hours behind the actual time. But I have no c++ experience to change the code. For me if it works but not correctly like this, it clearly indicates there is a (possibly minor) incompatibility between the client and the server. It just happens that if your client doesn't support https, it was never tested against this server and very likely needs to be adapted to work correctly. Is it possible in haproxy to manipulate the playlist file (server response), so that only the last .ts-entries will be available and returned to xupnpd2? No, haproxy doesn't manipulate contents. Not only is it completely out of the scope of a load balancing proxy, but it would also encourage some users to try to work around some of their deployment issues in the ugliest possible way, causing even more trouble (and frankly, on *every* infrastructure where you find such horrible tricks deployed, the admins implore you to help them because they're in big trouble and are stuck with no option left to fix the issues they've created). If it's only a matter of modifying one file on the fly, you may manage to do it using Lua : instead of forwarding the request to the server, you send it to a Lua function, which itself makes the request to the server, buffers the response, rewrites it, then sends it back to the client. You must just make sure to only send there the requests for the playlist file and nothing else. Could someone send me such a lua-script example and how to include it in haproxy.
Thanks I personally think this is ugly compared to trying to fix the faulty client. Maybe you can report your issue to the author(s) and share your config to help them reproduce it ? Regards, Willy Regards, PiBa-NL (Pieter)
Re: haproxy reverse proxy to https streaming backend
Hi Thomas,

On 15-3-2019 at 15:24, Thomas Schmiedl wrote: Hello Pieter, thanks for your help, it works well now. The regex solution was my only idea, because I'm not a developer. I know the haproxy workaround isn't the best solution, but nobody would fix the xupnpd2 hls-handling. Maybe you could help me again. I see that the playlist has 2 "states" ("header" tags).

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-TARGETDURATION:50
#EXT-X-DISCONTINUITY

and

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:
#EXT-X-TARGETDURATION:2

The result from the lua-script (header tags) should always be:

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:
#EXT-X-TARGETDURATION:2

Something like this might do the trick? Just a 'fixed' header as the result with only the 'found' media sequence number inserted ?:

local data = [=[
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:01234
#EXT-X-TARGETDURATION:50
#EXT-X-DISCONTINUITY
]=]
local mediaseq_dummy,mediasequence,mediaseq_eol = string.match(data, "(#EXT[-]X[-]MEDIA[-]SEQUENCE:)(%d+)(\n)")
local result = [=[
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:]=]..mediasequence..[=[
#EXT-X-TARGETDURATION:2
]=]
print("Result:\n"..result)

Result:
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:01234
#EXT-X-TARGETDURATION:2

Thanks, Thomas

On 15.03.2019 at 00:27, PiBa-NL wrote: Hi Thomas, On 14-3-2019 at 20:28, Thomas Schmiedl wrote: Hello, I never got a reply from the original author of xupnpd2 to fix the hls-handling, so I created a lua-script (thanks to Thierry Fournier), but it's too slow for the router cpu. Could someone rewrite the script to a lua-c-module? I don't think making this exact code a lua-c-module would solve the issue, lua is not a 'slow' language. But I do wonder if regex is the right tool for data manipulation..
Regards, Thomas test.cfg: global lua-load /var/media/ftp/playlist.lua frontend main mode http bind *:8080 acl is_index_m3u8 path -m end /index.m3u8 http-request use-service lua.playlist if is_index_m3u8 default_backend forward backend forward mode http server gjirafa puma.gjirafa.com:443 ssl verify none playlist.lua: core.register_service("playlist", "http", function(applet) local tcp = core.tcp() tcp:connect_ssl("51.75.52.73", 443) tcp:send("GET ".. applet.path .." HTTP/1.1\r\nConnection: Close\r\nHost: puma.gjirafa.com\r\n\r\n") local body = tcp:receive("*a") local result = string.match(body,"^.*(#EXTM3U.-)#EXTINF") result = result .. string.match(body,"(...%d+.ts%d+.ts%d+.ts)[\r\n|0]*$") I think a 'easier' regex might already improve performance, can you try this one for example ?: result = result .. string.match(body,"(#EXTINF:%d+[/.]%d+,\n%d+[/.]ts.#EXTINF:%d[/.]%d%d%d,.%d+[/.]ts.#EXTINF:%d+[/.]%d+,\n%d+[/.]ts)[\r\n|0]*$") With my test using 'https://rextester.com/l/lua_online_compiler' and a little sample m3u8 it seemed to work faster anyhow. applet:set_status(200) applet:add_header("Content-Type", "application/x-mpegURL") applet:add_header("content-length", string.len(result)) applet:add_header("Connection", "close") applet:start_response() applet:send(result) end) Am 19.02.2019 um 21:31 schrieb Thomas Schmiedl: Am 19.02.2019 um 05:29 schrieb Willy Tarreau: Hello Thomas, On Sun, Feb 17, 2019 at 05:55:29PM +0100, Thomas Schmiedl wrote: Hello Bruno, I think the problem is the parsing of the .m3u8-playlist in xupnpd2. The first entry to the .ts-file is 4 hours behind the actual time. But I have no c++ experience to change the code. For me if it works but not correctly like this, it clearly indicates there is a (possibly minor) incompatibility between the client and the server. It just happens that if your client doesn't support https, it was never tested against this server and very likely needs to be adapted to work correctly. 
Is it possible in haproxy to manipulate the playlist file (server response), so that only the last .ts-entries will be available and returned to xupnpd2?
No, haproxy doesn't manipulate contents. Not only is it completely out of the scope of a load balancing proxy, but it would also encourage some users to try to work around some of their deployment issues in the ugliest possible way, causing even more trouble (and frankly, on *every* infrastructure where you find such horrible tricks deployed, the admins implore you to help them because they're in big trouble and are stuck with no option left to fix the issues they've created). If it's only a matter of modifying one file on the
Re: DNS Resolver Issues
Hi Daniel, Baptiste,

@Daniel, can you remove the 'addr loadbalancer-internal.xxx.yyy' from the server check? It seems to me that that name is not being resolved by the 'resolvers'. And even if it were, it would be kinda redundant in the example, as it is the same as the server name. Not sure how far the scenarios below are all explained by this though..

@Baptiste, is it intentional that a wrong 'addr' dns name makes haproxy fail to start, despite having the supposedly never failing 'default-server init-addr last,libc,none'? Is it possibly a good feature request to support re-resolving a dns name for the addr setting as well?

Regards, PiBa-NL (Pieter)

Op 21-3-2019 om 20:37 schreef Daniel Schneller:
Hi! Thanks for the response. I had looked at the "hold" directives, but since they all seem to have reasonable defaults, I did not touch them. I specified 10s explicitly, but it did not make a difference. I did some more tests, however, and it seems to have more to do with the number of responses for the initial(?) DNS queries. Hopefully these three tables make sense and don't get mangled in the mail. The "templated" proxy is defined via "server-template" with 3 "slots". The "regular" one just as "server".

Test 1: Start out with both "valid" and "broken" DNS entries. Then comment out/add back one at a time as described in (1)-(5). Each time after changing /etc/hosts, restart dnsmasq and check haproxy via hatop. Haproxy started fresh once dnsmasq was set up to (1).

| state state /etc/hosts | regular templated
|- (1) BRK| UP/L7OK DOWN/L4TOUT VALID | MAINT/resolution | UP/L7OK
| (2) BRK| DOWN/L4TOUT DOWN/L4TOUT #VALID | MAINT/resolution | MAINT/resolution
| (3) #BRK | UP/L7OK UP/L7OK VALID | MAINT/resolution | MAINT/resolution
| (4) BRK| UP/L7OK UP/L7OK VALID | DOWN/L4TOUT | MAINT/resolution
| (5) BRK| DOWN/L4TOUT DOWN/L4TOUT #VALID | MAINT/resolution | MAINT/resolution

This all looks normal and as expected.
As soon as the "VALID" DNS entry is present, the UP state follows within a few seconds.

Test 2: Start out "valid only" (1) and proceed as described in (2)-(5), again restarting dnsmasq each time, and haproxy reloaded after dnsmasq was set up to (1).

| state state /etc/hosts | regular templated
| (1) #BRK | UP/L7OK MAINT/resolution VALID | MAINT/resolution | UP/L7OK
| (2) BRK| UP/L7OK DOWN/L4TOUT VALID | MAINT/resolution | UP/L7OK
| (3) #BRK | UP/L7OK MAINT/resolution VALID | MAINT/resolution | UP/L7OK
| (4) BRK| UP/L7OK DOWN/L4TOUT VALID | MAINT/resolution | UP/L7OK
| (5) BRK| DOWN/L4TOUT DOWN/L4TOUT #VALID | MAINT/resolution | MAINT/resolution

Everything good here, too. Adding the broken DNS entry does not bring the proxies down until only the broken one is left.

Test 3: Start out "broken only" (1). Again, same as before, haproxy restarted once dnsmasq was initialized to (1).

| state state /etc/hosts | regular templated
| (1) BRK
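For reference, the two setups the tables compare can be sketched roughly like this (a sketch only; names, addresses and the resolvers parameters are illustrative, not Daniel's actual config):

```
resolvers mydns
    nameserver local 127.0.0.1:53
    hold valid 10s

backend b_regular
    default-server init-addr last,libc,none
    server s1 app.example.com:80 check resolvers mydns

backend b_templated
    default-server init-addr last,libc,none
    # three slots; slots without a usable DNS answer show as MAINT/resolution
    server-template srv 3 app.example.com:80 check resolvers mydns
```

The 'server-template' slots beyond the number of returned DNS records are the ones that sit in MAINT/resolution in the tables above.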
Re: How to allow Client Requests at a given rate
choServer.cpp:117] > current rate : 2488

It looks to me like it's +- exactly the configured 10 requests that got allowed in that minute, summing up the rate numbers listed above.

>>> until almost 60 no http request are received to back ends
>> this time gap varies with every run ...
>>> after 60 secs rate limits are applied properly
>>>> E0422 11:00:07.690192 18653 EchoServer.cpp:117] > current rate : 1
E0422 11:00:10.411736 18653 EchoServer.cpp:117] > current rate : 1
E0422 11:00:11.412317 18653 EchoServer.cpp:117] > current rate : 1679
E0422 11:00:12.412369 18653 EchoServer.cpp:117] > current rate : 1667
E0422 11:00:13.451706 18653 EchoServer.cpp:117] > current rate : 1668
E0422 11:00:14.453778 18653 EchoServer.cpp:117] > current rate : 1668
E0422 11:00:15.457597 18653 EchoServer.cpp:117] > current rate : 1645
E0422 11:00:16.458938 18653 EchoServer.cpp:117] > current rate : 1762
E0422 11:00:17.470010 18653 EchoServer.cpp:117] > current rate : 1598

Can I get some info on the issue: is this a known issue, or am I missing some config for rate limiting to be applied properly? Thanks in advance, Badari

I wonder if, instead of allowing 10 requests per minute, you would like 1666 requests to be allowed per second? Which should effectively be similar, besides that 'bursts' of requests will be blocked sooner.. To do this use 1s instead of 1m for the 'http_req_rate(1m)', and put the 1666 as a limit in the map file... Still you might see a burst of 1000 requests in the first millisecond, and only 666 allowed in the other 999 milliseconds (theoretically?). But also it's probably not really relevant on which ms a request is allowed or blocked. You could argue that allowing 2 requests per millisecond would achieve almost the desired benchmark result. But then if there is nothing to do, and a few 10 users send a request at the same millisecond, you might block 8... while the server has actually little to do...
and though managing this on a millisecond level is likely ridiculous, it's just to make it a bit more clear that a short 'burst' of requests isn't necessarily bad and that requests aren't always expected to come in all at the same speed.. So depending on the expected runtime of a request and when the server will start to have trouble, the current 10/minute might be perfectly fine.. or make it 1 per 10 seconds?

So to sum things up: the limiting is working, and it's allowing 10 requests in the first minute, just as specified. So in that regard it's working correctly already..

Regards, PiBa-NL (Pieter)
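For readers landing on this thread: the "track a 1-second rate and compare it against a budget" idea could look roughly like this (a sketch, not Badari's actual config; the 1666 budget is the per-second figure discussed above, and the map-file lookup from the original setup is left out for brevity):

```
frontend fe_main
    bind :8080
    # track per-source request rate over a 1-second window
    stick-table type ip size 100k expire 10s store http_req_rate(1s)
    http-request track-sc0 src
    # deny once the observed 1s rate exceeds the per-second budget
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 1666 }
    default_backend be_servers
```

The short window is exactly what trades the one-big-burst-per-minute behaviour for smaller per-second bursts, as described above.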
2.0-dev5-ea8dd94 - conn_fd_handler() - dumps core - Program terminated with signal 11, Segmentation fault.
Hi Olivier, It seems this commit ea8dd94 broke something for my FreeBSD11 system. Before that commit (almost) all vtest's succeed. After it several cause core-dumps. (and keep doing that including the current HEAD: 03abf2d ) Can you take a look at the issue? Below in this mail are the following: - gdb# bt full of one of the crashed tests.. - summary of failed tests Regards, PiBa-NL (Pieter) gdb --core /tmp/haregtests-2019-06-05_20-40-20.7ZSvbo/vtc.65353.510907b0/h1/haproxy.core ./work/haproxy-ea8dd94/haproxy GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-marcel-freebsd"... Core was generated by `/usr/ports/net/haproxy-devel/work/haproxy-ea8dd94/haproxy -d -f /tmp/haregtests-'. Program terminated with signal 11, Segmentation fault. Reading symbols from /lib/libcrypt.so.5...done. Loaded symbols for /lib/libcrypt.so.5 Reading symbols from /lib/libz.so.6...done. Loaded symbols for /lib/libz.so.6 Reading symbols from /lib/libthr.so.3...done. Loaded symbols for /lib/libthr.so.3 Reading symbols from /usr/lib/libssl.so.8...done. Loaded symbols for /usr/lib/libssl.so.8 Reading symbols from /lib/libcrypto.so.8...done. Loaded symbols for /lib/libcrypto.so.8 Reading symbols from /usr/local/lib/liblua-5.3.so...done. Loaded symbols for /usr/local/lib/liblua-5.3.so Reading symbols from /lib/libm.so.5...done. Loaded symbols for /lib/libm.so.5 Reading symbols from /lib/libc.so.7...done. Loaded symbols for /lib/libc.so.7 Reading symbols from /libexec/ld-elf.so.1...done. 
Loaded symbols for /libexec/ld-elf.so.1
#0  conn_fd_handler (fd=55) at src/connection.c:201
201             conn->mux->wake && conn->mux->wake(conn) < 0)
(gdb) bt full
#0  conn_fd_handler (fd=55) at src/connection.c:201
        conn = (struct connection *) 0x8027e7000
        flags = 0
        io_available = 1
#1  0x005fe3f2 in fdlist_process_cached_events (fdlist=0xa66ac0) at src/fd.c:452
        fd = 55
        old_fd = 55
        e = 39
        locked = 0
#2  0x005fdefc in fd_process_cached_events () at src/fd.c:470
No locals.
#3  0x0051c15d in run_poll_loop () at src/haproxy.c:2553
        next = 0
        wake = 0
#4  0x00519ba7 in run_thread_poll_loop (data=0x0) at src/haproxy.c:2607
        ptaf = (struct per_thread_alloc_fct *) 0x94f158
        ptif = (struct per_thread_init_fct *) 0x94f168
        ptdf = (struct per_thread_deinit_fct *) 0x7fffe640
        ptff = (struct per_thread_free_fct *) 0x610596
#5  0x0051620d in main (argc=4, argv=0x7fffe648) at src/haproxy.c:3286
        blocked_sig = {__bits = 0x7fffe398}
        old_sig = {__bits = 0x7fffe388}
        i = 16
        err = 0
        retry = 200
        limit = {rlim_cur = 234908, rlim_max = 234909}
        errmsg = 0x7fffe550 ""
        pidfd = -1
Current language: auto; currently minimal

## Starting vtest ##
Testing with haproxy version: 2.0-dev5-ea8dd94
# top TEST reg-tests/lua/txn_get_priv.vtc FAILED (0.308) exit=2
# top TEST reg-tests/ssl/wrong_ctx_storage.vtc FAILED (0.308) exit=2
# top TEST reg-tests/compression/lua_validation.vtc FAILED (0.433) exit=2
# top TEST reg-tests/checks/tls_health_checks.vtc TIMED OUT (kill -9)
# top TEST reg-tests/checks/tls_health_checks.vtc FAILED (20.153) signal=9
# top TEST reg-tests/peers/tls_basic_sync_wo_stkt_backend.vtc TIMED OUT (kill -9)
# top TEST reg-tests/peers/tls_basic_sync_wo_stkt_backend.vtc FAILED (20.137) signal=9
# top TEST reg-tests/peers/tls_basic_sync.vtc TIMED OUT (kill -9)
# top TEST reg-tests/peers/tls_basic_sync.vtc FAILED (20.095) signal=9
# top TEST reg-tests/connection/proxy_protocol_random_fail.vtc TIMED OUT (kill -9)
# top TEST reg-tests/connection/proxy_protocol_random_fail.vtc FAILED (20.195) signal=9
7 tests failed, 0 tests skipped, 29 tests passed

## Gathering results ##
## Test case: reg-tests/compression/lua_validation.vtc ##
## test results in: "/tmp/haregtests-2019-06-05_20-40-20.7ZSvbo/vtc.65353.510907b0"
top 0.4 shell_exit not as expected: got 0x0001 wanted 0x
h1 0.4 Bad exit status: 0x008b exit 0x0 signal 11 core 128
## Test case: reg-tests/peers/tls_basic_sync.vtc ##
## test results in: "/tmp/haregtests-2019-06-05_20-40-20.7ZSvbo/vtc.65353.5e9ca234"
## Test case: reg-tests/connection/proxy_protocol_random_fail.vtc ##
## test results in: "/tmp/haregtests-2019-06-05_20-40-20.7ZSvbo/vtc.65353.19403d36"
Re: 2.0-dev5-ea8dd94 - conn_fd_handler() - dumps core - Program terminated with signal 11, Segmentation fault.
Hi Olivier, Op 6-6-2019 om 18:20 schreef Olivier Houchard: Hi Pieter, On Wed, Jun 05, 2019 at 09:00:22PM +0200, PiBa-NL wrote: Hi Olivier, It seems this commit ea8dd94 broke something for my FreeBSD11 system. Before that commit (almost) all vtest's succeed. After it several cause core-dumps. (and keep doing that including the current HEAD: 03abf2d ) Can you take a look at the issue? Below in this mail are the following: - gdb# bt full of one of the crashed tests.. - summary of failed tests Regards, PiBa-NL (Pieter) Indeed, there were a few issues. I know pushed enough patches so that I only fail one reg test, which also failed before the offending commit (reg-tests/compression/basic.vtc). Can you confirm it's doing better for you too ? Thanks ! Olivier Looks better for me :). Testing with haproxy version: 2.0-dev5-7b3a79f 0 tests failed, 0 tests skipped, 36 tests passed This includes the /compression/basic.vtc for me. p.s. This result doesn't "always" happen. But at least it seems 'just as good' as before ea8dd94. For example i still see this on my tests: 1 tests failed, 0 tests skipped, 35 tests passed ## Gathering results ## ## Test case: ./work/haproxy-7b3a79f/reg-tests/http-rules/converters_ipmask_concat_strcmp_field_word.vtc ## ## test results in: "/tmp/haregtests-2019-06-06_20-22-18.5Z9PR6/vtc.4579.056b0e93" c1 0.3 EXPECT resp.status (504) == "200" failed But then another test run of the same binary again says '36 passed'.. so it seems some tests are rather timing sensitive, or maybe a other variable doesn't play nice.. Anyhow the core-dump as reported is fixed. Ill try and find why the testresults are a bit inconsistent when running them repeatedly.. Anyhow ill send a new mail for that if i find something conclusive :). Thanks, PiBa-NL (Pieter)
Re: haproxy 2.0-dev5-a689c3d - A bogus STREAM [0x805547500] is spinning at 100000 calls per second and refuses to die, aborting now!
Hi Willy, Op 7-6-2019 om 9:03 schreef Willy Tarreau: Hi again Pieter, On Tue, Jun 04, 2019 at 04:59:06PM +0200, Willy Tarreau wrote: Whatever the values, no single stream should be woken up 100k times per second or it definitely indicates a bug (spinning loop that leads to reports of 100% CPU)! I'll see if I can get something out of this. So just for the record, this is expected to be fixed in dev6 (it's the major change there). I'm interested in your feedback on this one, of course! Willy The stream does not spin anymore with dev6 so that seems to work alright. Thanks. Regards, PiBa-NL (Pieter)
slow healthchecks after dev6+ with added commit "6ec902a MINOR: threads: serialize threads initialization"
Hi Willy,

After the commit "6ec902a MINOR: threads: serialize threads initialization", however, i have failing / slow health checks in the tls_health_checks.vtc test. Before that, the Layer7-OK takes 5ms; after this commit the healthcheck takes up to 75ms or even more.. It causes the 20ms connect/server timeouts also to fail the test fairly often, but not always.. Seems like something isn't quite right there. Can you check?

Regards, PiBa-NL (Pieter)

Log can be seen below (p.s. i added milliseconds output also to the vtest log..):

*** h2 0.299 debug|[WARNING] 157/231456 (78988) : Health check for server be2/srv1 succeeded, reason: Layer7 check passed, code: 200, info: "OK", check duration: 149ms, status: 1/1 UP.

## With HTX
* top 0.000 TEST ./work/haproxy-6ec902a/reg-tests/checks/tls_health_checks.vtc starting
top 0.000 extmacro def pwd=/usr/ports/net/haproxy-devel
top 0.000 extmacro def no-htx=
top 0.000 extmacro def localhost=127.0.0.1
top 0.000 extmacro def bad_backend=127.0.0.1 19775
top 0.000 extmacro def bad_ip=192.0.2.255
top 0.000 macro def testdir=/usr/ports/net/haproxy-devel/./work/haproxy-6ec902a/reg-tests/checks
top 0.000 macro def tmpdir=/tmp/vtc.78981.41db79fd
** top 0.000 === varnishtest "Health-check test over TLS/SSL"
* top 0.000 VTEST Health-check test over TLS/SSL
** top 0.000 === feature ignore_unknown_macro
** top 0.000 === server s1 {
** s1 0.000 Starting server
s1 0.000 macro def s1_addr=127.0.0.1
s1 0.000 macro def s1_port=19776
s1 0.000 macro def s1_sock=127.0.0.1 19776
* s1 0.000 Listen on 127.0.0.1 19776
** top 0.001 === server s2 {
** s2 0.001 Starting server
s2 0.001 macro def s2_addr=127.0.0.1
s2 0.001 macro def s2_port=19777
s2 0.001 macro def s2_sock=127.0.0.1 19777
* s2 0.001 Listen on 127.0.0.1 19777
** top 0.002 === syslog S1 -level notice {
** S1 0.002 Starting syslog server
S1 0.002 macro def S1_addr=127.0.0.1
S1 0.002 macro def S1_port=14641
S1 0.002 macro def S1_sock=127.0.0.1 14641
* S1 0.002 Bound on 127.0.0.1 14641
** s2
0.002 Started on 127.0.0.1 19777 (1 iterations) ** s1 0.002 Started on 127.0.0.1 19776 (1 iterations) ** top 0.002 === haproxy h1 -conf { ** S1 0.002 Started on 127.0.0.1 14641 (level: 5) ** S1 0.002 === recv h1 0.007 macro def h1_cli_sock=::1 19778 h1 0.007 macro def h1_cli_addr=::1 h1 0.007 macro def h1_cli_port=19778 h1 0.007 setenv(cli, 8) h1 0.007 macro def h1_fe1_sock=::1 19779 h1 0.007 macro def h1_fe1_addr=::1 h1 0.007 macro def h1_fe1_port=19779 h1 0.007 setenv(fe1, 9) h1 0.007 macro def h1_fe2_sock=::1 19780 h1 0.007 macro def h1_fe2_addr=::1 h1 0.007 macro def h1_fe2_port=19780 h1 0.007 setenv(fe2, 10) ** h1 0.007 haproxy_start h1 0.007 opt_worker 0 opt_daemon 0 opt_check_mode 0 h1 0.007 argv|exec "haproxy" -d -f "/tmp/vtc.78981.41db79fd/h1/cfg" h1 0.007 conf| global h1 0.007 conf|\tstats socket "/tmp/vtc.78981.41db79fd/h1/stats.sock" level admin mode 600 h1 0.007 conf| stats socket "fd@${cli}" level admin h1 0.007 conf| h1 0.007 conf| global h1 0.007 conf| tune.ssl.default-dh-param 2048 h1 0.007 conf| h1 0.007 conf| defaults h1 0.007 conf| mode http h1 0.007 conf| timeout client 20 h1 0.007 conf| timeout server 20 h1 0.007 conf| timeout connect 20 h1 0.007 conf| h1 0.007 conf| backend be1 h1 0.007 conf| server srv1 127.0.0.1:19776 h1 0.007 conf| h1 0.007 conf| backend be2 h1 0.007 conf| server srv2 127.0.0.1:19777 h1 0.007 conf| h1 0.007 conf| frontend fe1 h1 0.007 conf| option httplog h1 0.007 conf| log 127.0.0.1:14641 len 2048 local0 debug err h1 0.007 conf| bind "fd@${fe1}" ssl crt /usr/ports/net/haproxy-devel/./work/haproxy-6ec902a/reg-tests/checks/common.pem h1 0.007 conf| use_backend be1 h1 0.007 conf| h1 0.007 conf| frontend fe2 h1 0.007 conf| option tcplog h1 0.007 conf| bind "fd@${fe2}" ssl crt /usr/ports/net/haproxy-devel/./work/haproxy-6ec902a/reg-tests/checks/common.pem h1 0.007 conf| use_backend be2 h1 0.007 XXX 12 @637 *** h1 0.008 PID: 78985 h1 0.008 macro def h1_pid=78985 h1 0.008 macro def h1_name=/tmp/vtc.78981.41db79fd/h1 ** top 0.008 
=== syslog S2 -level notice { ** S2 0.008 Starting syslog server S2 0.008 macro def S2_addr=127.0.0.1 S2 0.008 macro def S2_port=35409 S
Re: slow healthchecks after dev6+ with added commit "6ec902a MINOR: threads: serialize threads initialization"
Hi Willy,

Op 10-6-2019 om 11:09 schreef Willy Tarreau:
Hi Pieter,
On Sat, Jun 08, 2019 at 06:07:09AM +0200, Willy Tarreau wrote:
Hi Pieter,
On Fri, Jun 07, 2019 at 11:32:18PM +0200, PiBa-NL wrote:
Hi Willy, After the commit "6ec902a MINOR: threads: serialize threads initialization" however i have failing / slow health checks in the tls_health_checks.vtc test. Before that the Layer7-OK takes 5ms, after this commit the healthcheck takes up to 75ms or even more.. It causes the 20ms connect/server timeouts also to fail the test fairly often but not always..
This is very strange, as the modification only involves threads startup.
Hmmm actually I'm starting to think about a possibility I need to verify. I suspect it may happen that a thread manages to finish its initialization before others request synchronization, thus believes it's alone and starts. I'm going to have a deeper look at this problem with this in mind. I didn't notice the failed check here but I'll hammer it a bit more.
Sorry for the long silence, it was harder than I thought. So I never managed to reproduce this typical issue, even by adding random delays here and there, but I managed to see that some threads were starting the event loop before others were done initializing, which will obviously result in issues such as missed events that could result in what you observed. I initially thought I could easily add a synchronization step using the current two bit fields (and spent my whole week-end writing parallel algorithms and revisiting all our locking mechanism just because of this). After numerous failed attempts, I later figured that I needed to represent more than 4 states per thread and that 2 bits are not enough. Bah... at least I had fun time... Thus I added a new field and a simple function to allow the code to start in synchronous steps.
We now initialize one thread at a time, then once they are all initialized we enable the listeners, and once they are enabled, we start the pollers in all threads. It is pretty obvious from the traces that it now does the right thing. However since I couldn't reproduce the health check issue you were facing, I'm interested in knowing if it's still present with the latest master, as it could also uncover another issue. Thanks! Willy

Things certainly look better again now regarding this issue. Running the test repeatedly, and manually looking over the results, it's pretty much as good as it was before. There seems to be a 1 ms increase in the check-duration, but maybe this is because of the moved initialization, which on startup delays the first test a millisecond or something? Below some test results that are based on manual observation and some in-my-head filtering of the console output.. (mistakes included ;) )

repeat 10 ./vt -v ./work/haproxy-*/reg-tests/checks/tls_health_checks.vtc | grep Layer7 | grep OK | grep WARNING

Commit-ID , min-max time for +-95% check durations , comment
e4d7c9d , 6 - 9 ms , all tests pass (1 test out of +- a hundred showed 29ms, none below 6ms and almost half of them show 7ms)
6ec902a , 11 - 150 ms , of the 12 tests that passed
e186161 , 5 - 8 ms , all tests pass (1 test used 15 ms, more than half the tests show 5ms check duration, the majority of the remainder show 6ms)

I'm not sure if this deserves further investigation at the moment; i think it does not. Thanks for spending your weekend on this :) that wasn't my intention.

Regards, PiBa-NL (Pieter)
Re: slow healthchecks after dev6+ with added commit "6ec902a MINOR: threads: serialize threads initialization"
Hi Willy, Op 10-6-2019 om 16:14 schreef Willy Tarreau: Hi Pieter, On Mon, Jun 10, 2019 at 04:06:13PM +0200, PiBa-NL wrote: Things certainly look better again now regarding this issue. Ah cool! Running the test repeatedly, and manually looking over the results its pretty much as good as it was before. There seems to be a 1 ms increase in the check-duration, but maybe this is because of the moved initialization which on startup delays the first test a millisecond or something? It should not. At this point I think it can be anything including measurement noise or even thread assigment on startup! Below some test results that are based on manual observation and some in my head filtering of the console output.. (mistakes included ;) ) repeat 10 ./vt -v ./work/haproxy-*/reg-tests/checks/tls_health_checks.vtc | grep Layer7 | grep OK | grep WARNING Commit-ID , min-max time for +-95% check durations , comment e4d7c9d , 6 - 9 ms , all tests pass (1 tests out of +- a hundred showed 29ms , none below 6ms and almost half of them show 7ms) Great! 6ec902a , 11 - 150 ms , of the 12 tests that passed That's quite a difference indeed. e186161 , 5 - 8 ms , all tests pass (1 test used 15 ms, more than half the tests show 5ms check duration the majority of the remainder show 6ms) OK! I'm not sure if this deserves further investigation at the moment, i think it does not. Thanks for spending your weekend on this :) that wasn't my intention. Oh don't worry, you know I'm a low-level guy, just give me a problem to solve with a few bits available only and I can spend countless hours on it! Others entertain themselves playing games, for me this is a game :-) Thanks a lot for testing, at least we know there isn't another strange thing hidden behind. Cheers, Willy After a bit more fiddling i noticed that the new startup method seems more CPU intensive. Also it can be seen the vtest does take a bit longer to pass 1.3sec v.s. 
0.8sec even though the health-check durations themselves are short as expected. Also it's using quite a bit more 'user' cpu. I was wondering if this is a consequence of the new init sequence, or perhaps some improvement is still needed there? I noticed this after trying to run multiple tests simultaneously again; they interfered more with each other than they used to..

2.0-dev6-e4d7c9d 2019/06/10
** h1 1.279 WAIT4 pid=63820 status=0x0002 (user 4.484293 sys 0.054781)
** h2 1.394 WAIT4 pid=63823 status=0x0002 (user 4.637692 sys 0.015588)
# top TEST ./test/tls_health_checks-org.vtc passed (1.395)

Before the latest changes it used less 'user':

2.0-dev6-e186161 2019/06/07
** h1 0.783 WAIT4 pid=65811 status=0x0002 (user 1.077052 sys 0.031218)
** h2 0.897 WAIT4 pid=65814 status=0x0002 (user 0.341360 sys 0.037928)
# top TEST ./test/tls_health_checks-org.vtc passed (0.899)

And with 'nbthread 1' the user cpu usage is even more dramatically lower with the same test..

2.0-dev6-e4d7c9d 2019/06/10
** h1 0.684 WAIT4 pid=67990 status=0x0002 (user 0.015203 sys 0.015203)
** h2 0.791 WAIT4 pid=67993 status=0x0002 (user 0.013551 sys 0.009034)
# top TEST ./test/tls_health_checks-org.vtc passed (0.793)

2.0-dev6-e186161 2019/06/07
** h1 0.682 WAIT4 pid=65854 status=0x0002 (user 0.007158 sys 0.021474)
** h2 0.790 WAIT4 pid=65857 status=0x0002 (user 0.007180 sys 0.014361)

If a single threaded haproxy process can run with 0.015 user-cpu-usage, i would not have expected it to require 4.4 on a 16 core cpu for the same startup & actions. Where it should be easier to spawn a second thread with the already parsed config, instead of more expensive? Even if it parses the config once in each thread separately it doesn't make sense to me. So i thought also to try with 'nbthread 8' and that still seems to be 'alright' as seen below.. so i guess with the default of nbthread 16 the h1 and h2 get into some conflict fighting over the available cores??
And haproxy by default will use all cores since 2.0-dev3, so i guess it might cause some undesirable effects in the field once it gets released and isn't the only process running on a machine; and even if it is the only intensive process, i wonder what other VMs might think about it on the same hypervisor, though i know VMs always give 'virtual performance' ;) ..

Running with nbthread 8, still relatively low user usage & test time:

2.0-dev6-e4d7c9d 2019/06/10
** h1 0.713 WAIT4 pid=68467 status=0x0002 (user 0.197443 sys 0.022781)
** h2 0.824 WAIT4 pid=68470 status=0x0002 (user 0.184567 sys 0.026366)
# top TEST ./test/tls_health_checks-org.vtc passed (0.825)

Hope you can make sense of some of this. Sorry for not noticing earlier; i guess i was too focused on only the health-check-duration. Or maybe it's just me interpreting the numbers wrongly, that's surely also an option.

Regards, PiBa-NL (Pieter)
Re: slow healthchecks after dev6+ with added commit "6ec902a MINOR: threads: serialize threads initialization"
Hi Willy,

Op 11-6-2019 om 11:37 schreef Willy Tarreau:
On Tue, Jun 11, 2019 at 09:06:46AM +0200, Willy Tarreau wrote:
I'd like you to give it a try in your environment to confirm whether or not it does improve things. If so, I'll clean it up and merge it. I'm also interested in any reproducer you could have, given that the made up test case I did above doesn't even show anything alarming.
No need to waste your time anymore, I now found how to reproduce it with this config:

global
    stats socket /tmp/sock1 mode 666 level admin
    nbthread 64

backend stopme
    timeout server 1s
    option tcp-check
    tcp-check send "debug dev exit\n"
    server cli unix@/tmp/sock1 check

Then I run it in loops bound to different CPU counts:

$ time for i in {1..20}; do
    taskset -c 0,1,2,3 ./haproxy -db -f slow-init.cfg >/dev/null 2>&1
done

With a single CPU, it can take up to 10 seconds to run the loop on commits e186161 and e4d7c9d, while it takes 0.18 second with the patch. With 4 CPUs like above, it takes 1.5s with e186161, 2.3s with e4d7c9d and 0.16 second with the patch. The tests I had run consisted in starting hundreds of thousands of listeners to amplify the impact of the start time, but in the end it was diluting the extra time in an already very long time. Running it in loops like above is quite close to what regtests do and explains why I couldn't spot the difference (e.g. a few hundreds of ms at worst among tens of seconds). Thus I'm merging the patch now (cleaned up already and tested as well without threads). Let's hope it's the last time :-)
Thanks, Willy

Seems i kept you busy for another day.. But the result is there: it looks 100% fixed to me :).
Running without nbthread, and as such using 16 threads of the VM i'm using, i now get this:

2.0-dev7-ca3551f 2019/06/11
** h1 0.732 WAIT4 pid=80796 status=0x0002 (user 0.055515 sys 0.039653)
** h2 0.846 WAIT4 pid=80799 status=0x0002 (user 0.039039 sys 0.039039)
# top TEST ./test/tls_health_checks-org.vtc passed (0.848)

Also with repeating the testcase 1000 times while running 10 of them in parallel, only 1 of them failed with a timeout:

S1 0.280 syslog|<134>Jun 11 22:24:48 haproxy[88306]: ::1:63856 [11/Jun/2019:22:24:48.074] fe1/1: Timeout during SSL handshake

I think together with the really short timeouts in the testcase itself this is an excellent result. I'm considering this one fully fixed, thanks again.

Regards, PiBa-NL (Pieter)
haproxy -v doesn't show commit used when building from 2.0 repository?
Hi List,

I have built haproxy 2.0.3-0ff395c from sources, however after running a 'haproxy -v' it shows up as: 'HA-Proxy version 2.0.3 2019/07/23 - https://haproxy.org/'. This isn't really correct imho, as it's a version based on code committed on date 7/30. And i kinda expected the commit-id to be part of the version shown? Did i do something wrong? I thought the commit should automatically become part of the version. Though it's very well possible i've broken the local freebsd makefile i'm using.. When building from the master repository it seems to work fine though. If it's caused by the contents of the repository, can it be changed? I find it really useful to see which commit a certain compiled haproxy binary was based upon. Thanks in advance :).

Regards, PiBa-NL (Pieter)
Re: haproxy -v doesn't show commit used when building from 2.0 repository?
Hi Willy,

Op 1-8-2019 om 6:21 schreef Willy Tarreau:
Hi Pieter,
On Wed, Jul 31, 2019 at 10:56:54PM +0200, PiBa-NL wrote:
Hi List, I have build haproxy 2.0.3-0ff395c from sources however after running a 'haproxy -v' it shows up as: 'HA-Proxy version 2.0.3 2019/07/23 - https://haproxy.org/' this isn't really correct imho as its a version based on code committed on date 7/30. And i kinda expected the commit-id to be part of the version shown?
I know what's happening, I always forget to do it with each new major release. We're using Git attributes to automatically patch files "SUBVERS" and "VERDATE" when creating the archive:

$ cat info/attributes
SUBVERS export-subst
VERDATE export-subst

And this is something I forget to re-create with each new repository. I've fixed it now. It will be OK with new snapshots starting tomorrow.
Thanks!
Willy

Works for me: building the latest commit in the 2.0 repository, haproxy -v now shows "HA-Proxy version 2.0.3-7343c71 2019/08/01 - https://haproxy.org/" for me. Thanks for your quick fix & reply :).

Regards, PiBa-NL (Pieter)
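The export-subst mechanism Willy mentions can be demonstrated on any throwaway repository: git expands $Format:...$ placeholders only when the tree is exported with git archive, which is how a file like SUBVERS picks up the commit id in release tarballs. A small self-contained sketch (file and user names illustrative; requires git in PATH):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp" && git init -q repo && cd repo

# the placeholder is expanded only at 'git archive' time, per .gitattributes
echo '-$Format:%h$' > SUBVERS
echo 'SUBVERS export-subst' > .gitattributes
git add SUBVERS .gitattributes
git -c user.email=you@example.com -c user.name=you commit -qm init

# export the tree: the extracted SUBVERS now carries the abbreviated hash
mkdir "$tmp/out"
git archive HEAD | tar -x -f - -C "$tmp/out"
cat "$tmp/out/SUBVERS"
```

In the work tree the file still holds the literal placeholder; only the archived copy is substituted, which is why a freshly cloned repository without the attributes set up produces a version string with no commit id.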
Re: freebsd builds are broken for few days - 30ee1ef, proxy_protocol_random_fail.vtc fails because scheme and host are now present in the syslog output.
Hi Ilya, Willy,

Op 13-10-2019 om 19:30 schreef Илья Шипицин:
https://cirrus-ci.com/github/haproxy/haproxy
I'll bisect if noone else knows what's going on

@Ilya, thanks for checking my favorite platform, FreeBSD ;).

@Willy, this 30ee1ef <http://git.haproxy.org/?p=haproxy.git;a=commit;h=30ee1ef> (MEDIUM: h2: use the normalized URI encoding for absolute form requests) commit 'broke' the expect value of the vtest; i don't know why other platforms don't see the same change in syslog output though.. Anyhow this is the output i get when running the /reg-tests/connection/proxy_protocol_random_fail.vtc:

Slog_1 0.033 syslog|<134>Oct 14 19:56:34 haproxy[78982]: ::1:47040 [14/Oct/2019:19:56:34.391] ssl-offload-http~ ssl-offload-http/http 0/0/0/0/0 503 222 - - 1/1/0/0/0 0/0 "POST https://[::1]:47037/1 HTTP/2.0"
** Slog_1 0.033 === expect ~ "ssl-offload-http/http .* \"POST /[1-8] HTTP/(2\\.0...
Slog_1 0.033 EXPECT FAILED ~ "ssl-offload-http/http .* "POST /[1-8] HTTP/(2\.0|1\.1)""

If i change the vtc vtest file from:

expect ~ "ssl-offload-http/http .* \"POST /[1-8] HTTP/(2\\.0|1\\.1)\""

To:

expect ~ "ssl-offload-http/http .* \"POST https://[[]::1]:[0-9]{1,5}/[1-8] HTTP/(2\\.0|1\\.1)\""

or:

expect ~ "ssl-offload-http/http .* \"POST https://[[]${h1_ssl_addr}]:${h1_ssl_port}/[1-8] HTTP/(2\\.0|1\\.1)\""

Then the test succeeds for me... but now the question is: should or shouldn't the scheme and host be present in the syslog output on all platforms? Or should the regex contain an (optional?) check for this extra part? (Also note that even with these added variables in my second regex attempt, it's still using square brackets around the IPv6 address.. not sure if all machines would use ipv6 for their localhost connection..)

Regards, PiBa-NL (Pieter)
commit 246c024 - breaks loading crt-list with .ocsp files present
Hi William, I'm having an issue with the latest master code, 2.1-dev2-4a66013. It does compile, but it doesn't want to load my crt-list with .ocsp files present for the certificates mentioned. The commit that broke this is 246c024.

# haproxy -v
HA-Proxy version 2.1-dev2-4a66013 2019/10/14 - https://haproxy.org/
# haproxy -f ./PB-TEST/ultimo_testcase/xxx/haproxy.cfg -d
[ALERT] 286/223026 (39111) : parsing [./PB-TEST/ultimo_testcase/xxx/haproxy.cfg:61] : 'bind 0.0.0.0:443' : 'crt-list' : error processing line 1 in file '/usr/ports-pb_haproxy-devel/PB-TEST/ultimo_testcase/xxx/rtrcld.xxx.crt_list' : (null)
[ALERT] 286/223026 (39111) : Error(s) found in configuration file : ./PB-TEST/ultimo_testcase/xxx/haproxy.cfg
[ALERT] 286/223026 (39111) : Fatal errors found in configuration.

Content of the crt-list file (removing the alpn stuff doesn't help):

/usr/ports-pb_haproxy-devel/PB-TEST/ultimo_testcase/xxx/rtrcld.xxx.pem [ alpn h2,http/1.1]
/usr/ports-pb_haproxy-devel/PB-TEST/ultimo_testcase/xxx/rtrcld.xxx/rtrcld.xxx_5ab0da70ab0cc.pem [ alpn h2,http/1.1]

The last line is an empty one.. but it already complains about line 1, which seems valid, and the .pem file exists.. The exact same config loads fine with commits before 246c024. I do have a 'filled' .ocsp file present, but no matter whether it's outdated, empty or correct, the error above stays. When the .ocsp is absent it complains about line 2 of the crt-list, which has its own .ocsp as well.. Can you take a look? Thanks in advance. Regards, PiBa-NL (Pieter)
Re: freebsd builds are broken for few days - 30ee1ef, proxy_protocol_random_fail.vtc fails because scheme and host are now present in the syslog output.
Hi Christopher, It seems you fixed/changed the issue I noticed below a few minutes ago in commit 452e578 :), thanks. One question remaining on my side: is it expected that some platforms will use the 'normalized' URI and other platforms just the regular / ? Regards, PiBa-NL (Pieter) Op 14-10-2019 om 21:22 schreef PiBa-NL: Hi Ilya, Willy, Op 13-10-2019 om 19:30 schreef Илья Шипицин: https://cirrus-ci.com/github/haproxy/haproxy I'll bisect if no one else knows what's going on @Ilya, thanks for checking my favorite platform, FreeBSD ;). @Willy, this 30ee1ef <http://git.haproxy.org/?p=haproxy.git;a=commit;h=30ee1ef> (MEDIUM: h2: use the normalized URI encoding for absolute form requests) commit 'broke' the expect value of the vtest; I don't know why other platforms don't see the same change in syslog output though.. Anyhow, this is the output I get when running /reg-tests/connection/proxy_protocol_random_fail.vtc:

 Slog_1 0.033 syslog|<134>Oct 14 19:56:34 haproxy[78982]: ::1:47040 [14/Oct/2019:19:56:34.391] ssl-offload-http~ ssl-offload-http/http 0/0/0/0/0 503 222 - - 1/1/0/0/0 0/0 "POST https://[::1]:47037/1 HTTP/2.0"
 ** Slog_1 0.033 === expect ~ "ssl-offload-http/http .* \"POST /[1-8] HTTP/(2\\.0...
 Slog_1 0.033 EXPECT FAILED ~ "ssl-offload-http/http .* "POST /[1-8] HTTP/(2\.0|1\.1)""

If I change the vtc test file from:

expect ~ "ssl-offload-http/http .* \"POST /[1-8] HTTP/(2\\.0|1\\.1)\""

to:

expect ~ "ssl-offload-http/http .* \"POST https://[[]::1]:[0-9]{1,5}/[1-8] HTTP/(2\\.0|1\\.1)\""

or:

expect ~ "ssl-offload-http/http .* \"POST https://[[]${h1_ssl_addr}]:${h1_ssl_port}/[1-8] HTTP/(2\\.0|1\\.1)\""

then the test succeeds for me... But now the question is: should or shouldn't the scheme and host be present in the syslog output on all platforms? Or should the regex contain an (optional?) check for this extra part? (Also note that even with these added variables, my second regex attempt still uses square brackets around the IPv6 address.. not sure if all machines would use ipv6 for their localhost connection..) Regards, PiBa-NL (Pieter)
Re: commit 246c024 - breaks loading crt-list with .ocsp files present
Op 15-10-2019 om 13:52 schreef William Lallemand: I pushed the fix. Thanks Fix confirmed. Thank you.
Re: freebsd ci is broken - commit 08fa16e - curl download stalls in reg-tests/compression/lua_validation.vtc
Hi Ilya, Thanks! Op 14-1-2020 om 07:48 schreef Илья Шипицин: Hello, since https://github.com/haproxy/haproxy/commit/08fa16e397ffb1c6511b98ade2a3bfff9435e521 freebsd CI is red: https://cirrus-ci.com/task/5960933184897024 I'd say "it is something with CI itself"; when I run the same tests locally on freebsd, it is green. PiBa? thanks, Ilya Shipitcin Sadly I do get the same problem on my test server (version info below; its version 11.1 is a bit outdated, but hasn't failed me before...). Below is a part of the output that the test generates for me. The first curl request seems to succeed, but the second one runs into a timeout. When compiled with the commit before 08fa16e <https://github.com/haproxy/haproxy/commit/08fa16e397ffb1c6511b98ade2a3bfff9435e521> it does not show that behaviour.. The current latest commit (24c928c) is still affected..

top shell_out| % Total % Received % Xferd Average Speed Time Time Time Current top shell_out| Dload Upload Total Spent Left Speed top shell_out|\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r100 418k 0 418k 0 0 1908k 0 --:--:-- --:--:-- --:--:-- 1908k top shell_out| % Total % Received % Xferd Average Speed Time Time Time Current top shell_out| Dload Upload Total Spent Left Speed top shell_out|\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r100 141k 0 141k 0 0 284k 0 --:--:-- --:--:-- --:--:-- 284k\r100 343k 0 343k 0 0 156k 0 --:--:-- 0:00:02 --:--:-- 156k\r100 343k 0 343k 0 0 105k 0 --:--:-- 0:00:03 --:--:-- 105k\r100 343k 0 343k 0 0 81274 0 --:--:-- 0:00:04 --:--:-- 81274\r100 343k 0 343k 0 0 65228 0 --:--:-- 0:00:05 --:--:-- 65240\r100 343k 0 343k 0 0 54481 0 --:--:-- 0:00:06 --:--:-- 34743\r100 343k 0 343k 0 0 46768 0 --:--:-- 0:00:07 --:--:-- 0\r100 343k 0 343k 0 0 40968 0 --:--:-- 0:00:08 --:--:-- 0\r100 343k 0 343k 0 0 36452 0 --:--:-- 0:00:09 --:--:-- 0\r100 343k 0 343k 0 0 32830 0 --:--:-- 0:00:10 --:--:-- 0\r100 343k 0 343k 0 0 29865 0 --:--:-- 0:00:11 --:--:-- 0\r100 343k 0 343k 0 0 27395 0 --:--:-- 0:00:12 --:--:-- 0\r100 343k 0 343k 0 0 25297 0 --:--:-- 0:00:13 --:--:-- 0\r100 343k 0 343k 0 0 23500 0 --:--:-- 0:00:14 --:--:-- 0\r100 343k 0 343k 0 0 23431 0 --:--:-- 0:00:15 --:--:-- 0
top shell_out|curl: (28) Operation timed out after 15002 milliseconds with 351514 bytes received
top shell_out|Expecting checksum 4d9c62aa5370b8d5f84f17ec2e78f483
top shell_out|Received checksum: da2d120aedfd693eeba9cf1e578897a8
top shell_status = 0x0001
top shell_exit not as expected: got 0x0001 wanted 0x
* top RESETTING after ./work/haproxy-08fa16e/reg-tests/compression/lua_validation.vtc

Should I update to a newer FreeBSD version, or is it likely unrelated and in need of some developer attention? Do you (Willy or anyone) need more information from my side? Or is there a patch I can try to validate? Regards, PiBa-NL (Pieter) Yes, I'm running a somewhat outdated OS here: FreeBSD freebsd11 11.1-RELEASE FreeBSD 11.1-RELEASE #0 r321309: Fri Jul 21 02:08:28 UTC 2017 r...@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64 Version used: haproxy -vv HA-Proxy version 2.2-dev0-08fa16e 2020/01/08 - https://haproxy.org/ Status: development branch - not safe for use in production. 
Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open Build options : TARGET = freebsd CPU = generic CC = cc CFLAGS = -pipe -g -fstack-protector -fno-strict-aliasing -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -fno-strict-overflow -Wno-null-dereference -Wno-unused-label -Wno-unused-parameter -Wno-sign-compare -Wno-ignored-qualifiers -Wno-unused-command-line-argument -Wno-missing-field-initializers -Wno-address-of-packed-member -DFREEBSD_PORTS -DFREEBSD_PORTS OPTIONS = USE_PCRE=1 USE_PCRE_JIT=1 USE_REGPARM=1 USE_STATIC_PCRE=1 USE_GETADDRINFO=1 USE_OPENSSL=1 USE_LUA=1 USE_ACCEPT4=1 USE_ZLIB=1 Feature list : -EPOLL +KQUEUE -MY_EPOLL -MY_SPLICE -NETFILTER +PCRE +PCRE_JIT -PCRE2 -PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED +REGPARM +STATIC_PCRE -STATIC_PCRE2 +TPROXY -LINUX_TPROXY -LINUX_SPLICE +LIBCRYPT -CRYPT_H -VSYSCALL +GETADDRINFO +OPENSSL +LUA -FUTEX +ACCEPT4 -MY_ACCEPT4 +ZLIB -SLZ +CPU_AFFINITY -TFO -NS -DL -RT -DEVICEATLAS -51DEGREES -WURFL -SYSTEMD -OBSOLETE_LINKER -PRCTL -THREAD_DUMP -EVPORTS Default settings : bufsize = 16384, maxrewrite = 1024, maxpollevents = 200 Built wi
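The "Expecting checksum / Received checksum" lines above come from the test comparing an md5 digest of the downloaded body against a known-good value, so a stalled download yields a short body and therefore a different digest. A minimal Python sketch of that comparison (the byte strings below are made-up stand-ins, not the actual test payload):

```python
import hashlib

def body_matches(expected_md5: str, body: bytes) -> bool:
    # Compare the md5 of the received body against the expected checksum,
    # as the lua_validation.vtc shell step does with its downloaded file.
    return hashlib.md5(body).hexdigest() == expected_md5

full = b"x" * 1000          # stand-in for the complete response body
truncated = full[:700]      # stand-in for a stalled, incomplete download

expected = hashlib.md5(full).hexdigest()
assert body_matches(expected, full)          # complete download passes
assert not body_matches(expected, truncated) # 343k-of-418k stall fails
```

Either a truncated or a corrupted body fails the check the same way, which is why the test reports only a checksum mismatch rather than the cause.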
Re: freebsd ci is broken - commit 08fa16e - curl download stalls in reg-tests/compression/lua_validation.vtc
Hi Ilya, Willy, Op 14-1-2020 om 21:40 schreef Илья Шипицин: PiBa, how many CPU cores are you running? it turned out that I run tests on a very low vm, which only has 1 core, and tests pass. cirrus-ci as far as I remember does have many cores. I was running with 16 cores.. can you find a single core vm? Well, I reconfigured the VM to have 1 core, but the same issue still shows up, though not every time the test is run, and actually a bit less often.. Below are some additional test results with different kqueue / vCPU settings..

*VM with 1 vCPU*
Running: ./vtest/VTest-master/vtest -Dno-htx=no -l -k -b 50M -t 5 -n 20 ./work/haproxy-08fa16e/reg-tests/compression/lua_validation.vtc
Results in: 4 tests failed, 0 tests skipped, 16 tests passed
Adding "nokqueue" in the vtc file I get:
8 tests failed, 0 tests skipped, 12 tests passed
4 tests failed, 0 tests skipped, 16 tests passed
So it's a bit random, but the 'nokqueue' directive does not seem to affect results much..

*With 16 vCPU*
Without nokqueue: 16 tests failed, 0 tests skipped, 4 tests passed
With nokqueue (using poll): 17 tests failed, 0 tests skipped, 3 tests passed
The failure rate certainly seems higher with many cores..

*Using commit 0eae632 it works OK*
Just to be sure I re-tested on 16 cores with 2.2-dev0-0eae632, but that nicely passes: 0 tests failed, 0 tests skipped, 20 tests passed

Regards, PiBa-NL (Pieter)
Re: freebsd ci is broken - commit 08fa16e - curl download stalls in reg-tests/compression/lua_validation.vtc
Hi Olivier, Willy, Ilya, Thanks! I confirm 2.2-dev0-ac81474 fixes this issue for me, and cirrus-ci also shows 'all green' again :). Running the same test with 16 vCPU and kqueue enabled it's 'all okay': 0 tests failed, 0 tests skipped, 200 tests passed. Op 15-1-2020 om 19:20 schreef Olivier Houchard: Hi guys, On Tue, Jan 14, 2020 at 09:45:34PM +0100, Willy Tarreau wrote: Hi guys, On Tue, Jan 14, 2020 at 08:02:51PM +0100, PiBa-NL wrote: Below a part of the output that the test generates for me. The first curl request seems to succeed, but the second one runs into a timeout.. When compiled with the commit before 08fa16e <https://github.com/haproxy/haproxy/commit/08fa16e397ffb1c6511b98ade2a3bfff9435e521> Ah, and unsurprisingly I'm the author :-/ I'm wondering why it only affects FreeBSD (very likely kqueue in fact; I suppose it works if you start with -dk). Maybe something subtle escaped me in the poller after the previous changes. Should I update to a newer FreeBSD version, or is it likely unrelated and in need of some developer attention? Do you (Willy or anyone) need more information from my side? Or is there a patch I can try to validate? I don't think I need more info for now, and your version has nothing to do with this (until proven otherwise). I apparently really broke something there. I think I have a FreeBSD VM somewhere; in the worst case I'll ask Olivier for some help :-) To give you a quick update: we are investigating this, and I'm still not really sure why it only affects FreeBSD, but we fully understood the problem and it should be fixed by now. Regards, Olivier Regards, PiBa-NL (Pieter)
mcli vtest broken by commit.?. MEDIUM: connections: Get ride of the xprt_done callback.
Hi Olivier, Just to let you know: this commit seems to have broken a few regtests: http://git.haproxy.org/?p=haproxy.git;a=commit;h=477902bd2e8c1e978ad43d22dba1f28525bb797a https://api.cirrus-ci.com/v1/task/5885732300521472/logs/main.log Testing with haproxy version: 2.2-dev1

#top TEST reg-tests/mcli/mcli_show_info.vtc TIMED OUT (kill -9)
#top TEST reg-tests/mcli/mcli_show_info.vtc FAILED (10.044) signal=9
#top TEST reg-tests/mcli/mcli_start_progs.vtc TIMED OUT (kill -9)
#top TEST reg-tests/mcli/mcli_start_progs.vtc FAILED (10.019) signal=9

I can reproduce it on my own FreeBSD machine as well; the testcase just sits and waits until the vtest timeout strikes. Do you need more info? If so, what can I provide? Regards, Pieter
dns fails to process response / hold valid? (since commit 2.2-dev0-13a9232)
Hi List, Baptiste, After updating haproxy I found that the DNS resolver is no longer working for me. I also wonder about the exact effect that 'hold valid' should have. I pointed haproxy at an 'Unbound 1.9.4' dns server that does the recursive resolving of the dns requests made by haproxy. Before commit '2.2-dev0-13a9232, released 2020/01/22 (use additional records from SRV responses)', resolving of a server name seemed to work properly. After this commit all responses are counted as 'invalid' in the socket stats. Attached is also a pcap of the dns traffic, which shows a short capture of a single attempt where 3 retries for both A and AAAA records show up. An additional record of type 'OPT' is present in the response.. And the exact same exchange keeps repeating every 5 seconds. As for 'hold valid' (tested with the commit before this one): the stats page of haproxy seems to show the server in 'resolution' status way before the 3 minute 'hold valid' has passed when I simply disconnect the network of the server running the Unbound DNS server. Though I guess that is less important than dns working at all in the first place.. If any additional information is needed please let me know :). Can you/someone take a look? Thanks in advance. p.s. I think I read something about a 'vtest' that can test the haproxy DNS functionality; if you have an example that does this I would be happy to provide a vtest with a reproduction of the issue, though I guess it will be kinda 'slow' if it needs to test for hold valid timings..
Regards, PiBa-NL (Pieter)

haproxy config:

resolvers globalresolvers
	nameserver pfs_routerbox 192.168.0.18:53
	resolve_retries 3
	timeout retry 200
	hold valid 3m
	hold nx 10s
	hold other 15s
	hold refused 20s
	hold timeout 25s
	hold obsolete 30s
	timeout resolve 5s

frontend nu_nl
	bind 192.168.0.19:433 name 192.168.0.19:433 ssl crt-list /var/etc/haproxy/nu_nl.crt_list
	mode http
	log global
	option http-keep-alive
	timeout client 3
	use_backend nu.nl_ipvANY

backend nu.nl_ipvANY
	mode http
	id 2113
	log global
	timeout connect 3
	timeout server 3
	retries 3
	option httpchk GET / HTTP/1.0\r\nHost:\ nu.nl\r\nAccept:\ */*
	server nu_nl nu.nl:443 id 2114 ssl check inter 1 verify none resolvers globalresolvers check-sni nu.nl resolve-prefer ipv4

haproxy_socket.sh show resolvers:

Resolvers section globalresolvers
 nameserver pfs_routerbox:
  sent:        216
  snd_error:   0
  valid:       0
  update:      0
  cname:       0
  cname_error: 0
  any_err:     108
  nx:          0
  timeout:     0
  refused:     0
  other:       0
  invalid:     108
  too_big:     0
  truncated:   0
  outdated:    0

haproxy -vv:
HA-Proxy version 2.2-dev0-13a9232 2020/01/22 - https://haproxy.org/
Status: development branch - not safe for use in production. 
Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open Build options : TARGET = freebsd CPU = generic CC = cc CFLAGS = -pipe -g -fstack-protector -fno-strict-aliasing -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -fno-strict-overflow -Wno-null-dereference -Wno-unused-label -Wno-unused-parameter -Wno-sign-compare -Wno-ignored-qualifiers -Wno-unused-command-line-argument -Wno-missing-field-initializers -Wno-address-of-packed-member -DFREEBSD_PORTS -DFREEBSD_PORTS OPTIONS = USE_PCRE=1 USE_PCRE_JIT=1 USE_REGPARM=1 USE_STATIC_PCRE=1 USE_GETADDRINFO=1 USE_OPENSSL=1 USE_LUA=1 USE_ACCEPT4=1 USE_ZLIB=1 Feature list : -EPOLL +KQUEUE -MY_EPOLL -MY_SPLICE -NETFILTER +PCRE +PCRE_JIT -PCRE2 -PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED +REGPARM +STATIC_PCRE -STATIC_PCRE2 +TPROXY -LINUX_TPROXY -LINUX_SPLICE +LIBCRYPT -CRYPT_H -VSYSCALL +GETADDRINFO +OPENSSL +LUA -FUTEX +ACCEPT4 -MY_ACCEPT4 +ZLIB -SLZ +CPU_AFFINITY -TFO -NS -DL -RT -DEVICEATLAS -51DEGREES -WURFL -SYSTEMD -OBSOLETE_LINKER -PRCTL -THREAD_DUMP -EVPORTS Default settings : bufsize = 16384, maxrewrite = 1024, maxpollevents = 200 Built with multi-threading support (MAX_THREADS=64, default=2). Built with OpenSSL version : OpenSSL 1.1.1a-freebsd 20 Nov 2018 Running on OpenSSL version : OpenSSL 1.1.1a-freebsd 20 Nov 2018 OpenSSL library supports TLS extensions : yes OpenSSL library supports SNI : yes OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3 Built with Lua version : Lua 5.3.5 Built with transparent proxy support using: IP_BINDANY IPV6_BINDANY Built with PCRE version : 8.43 2019-02-23 Running on PCRE version : 8.43 2019-02-23 PCRE library supports JIT : yes Encrypted password support via crypt(3): yes Built with zlib version : 1.2.11 Running on zlib versio
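For reference, the 'OPT' additional record mentioned above is the EDNS0 pseudo-record from RFC 6891: it lives in the additional section, uses the root name, has TYPE 41, and repurposes the CLASS field as the advertised UDP payload size. A small Python sketch of its wire format (an illustration of what shows up in the pcap, not haproxy's actual parsing code):

```python
import struct

# Wire format of a minimal EDNS0 OPT pseudo-record (RFC 6891) as found in
# the additional section of a DNS response: root name (one zero byte),
# TYPE=41 (OPT), the UDP payload size where CLASS normally sits,
# TTL reused as extended RCODE/flags (0 here), and an empty RDATA.
def build_opt_record(udp_payload_size=4096):
    name = b"\x00"  # root domain name
    fixed = struct.pack("!HHIH",
                        41,                # TYPE: OPT
                        udp_payload_size,  # CLASS field: requestor's UDP size
                        0,                 # TTL field: ext-RCODE/version/flags
                        0)                 # RDLENGTH: no options follow
    return name + fixed

rec = build_opt_record()
assert rec[0] == 0                                # root name
assert struct.unpack("!H", rec[1:3])[0] == 41     # record type is OPT
assert len(rec) == 11                             # 1 + 2 + 2 + 4 + 2 bytes
```

A response parser that walks additional records strictly expecting A/AAAA/SRV/CNAME entries would trip over this pseudo-record, which is one plausible way every response could end up counted as 'invalid'.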
Re: [PATCH] BUG/MEDIUM: email-alert: don't set server check status from a email-alert task
Hi Christopher, Willy, Op 7-12-2017 om 19:33 schreef Willy Tarreau: On Thu, Dec 07, 2017 at 04:27:16PM +0100, Christopher Faulet wrote: Honestly, I don't know which version is the best. Just let me know guys :-) imho Christopher's patch is smaller and probably easier to maintain and eventually remove, without adding (unneeded) code to set_server_check_status(). Though it is a bit less obvious to me that it will have the same effect, it works just as well. Email alerts should probably be rewritten to not use the checks. This was the only solution to do connections by hand when Simon implemented it. That's not true anymore. I agree, and I think I was the one asking Simon to do it like this back then, even though he didn't like this solution. That was an acceptable tradeoff in my opinion, with very limited impact on existing code. Now with applets being much more flexible we could easily reimplement a more complete and robust SMTP engine not relying on hijacking the tcp-check engine anymore. Willy A 'smtp engine' for sending email-alerts might be nice eventually, but that is not easily done 'today' (not by me anyhow). (Would it group messages together if multiple are created within a short time-span?) As for the current issue/patch, I prefer the solution Christopher found/made. I made a new version of it with a bit of extra comments inside the code, removed an unrelated white-space change, and added a matching patch description. Or perhaps Christopher can create it under his own name? Either way is fine with me. :) Regards, PiBa-NL / Pieter

From 3129e1ae21e41a026f6d067b3658f6643835974c Mon Sep 17 00:00:00 2001
From: PiBa-NL
Date: Wed, 6 Dec 2017 01:35:43 +0100
Subject: [PATCH] BUG/MEDIUM: email-alert: don't set server check status from a email-alert task

This avoids possible 100% cpu usage deadlock on a EMAIL_ALERTS_LOCK and avoids sending lots of emails when 'option log-health-checks' is used.
It is avoided to change the server state and possibly queue a new email while processing the email alert by setting check->status to HCHK_STATUS_UNKNOWN, which will exit set_server_check_status(..) early.

This needs to be backported to 1.8.
---
 src/checks.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/checks.c b/src/checks.c
index eaf84a2..3a6f020 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -3145,7 +3145,7 @@ static struct task *process_email_alert(struct task *t)
 	t->expire = now_ms;
 	check->server = alert->srv;
 	check->tcpcheck_rules = &alert->tcpcheck_rules;
-	check->status = HCHK_STATUS_INI;
+	check->status = HCHK_STATUS_UNKNOWN; // the UNKNOWN status is used to exit set_server_check_status(.) early
 	check->state |= CHK_ST_ENABLED;
 }
-- 
2.10.1.windows.1
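As a rough illustration of the mechanism the patch description relies on, here is a toy Python model (not HAProxy's actual C code; names and statuses are simplified stand-ins): with the check status pre-set to 'unknown', the status-setting routine bails out before it can touch server state or queue another alert.

```python
# Toy model of the early-exit behaviour described in the patch:
# set_server_check_status() returns immediately when the current check
# status is UNKNOWN, so the email-alert pseudo-check can never change
# server state or queue further email alerts recursively.
HCHK_STATUS_UNKNOWN = 0
HCHK_STATUS_INI = 1
HCHK_STATUS_L7OKD = 2

def set_server_check_status(check, new_status, events):
    if check["status"] == HCHK_STATUS_UNKNOWN:
        return                      # early exit: nothing recorded, no alert
    check["status"] = new_status
    events.append("server state updated / alert queued")

events = []
email_alert_check = {"status": HCHK_STATUS_UNKNOWN}   # after the patch
set_server_check_status(email_alert_check, HCHK_STATUS_L7OKD, events)
assert events == []                 # the email-alert task stays inert

health_check = {"status": HCHK_STATUS_INI}            # a real health check
set_server_check_status(health_check, HCHK_STATUS_L7OKD, events)
assert events == ["server state updated / alert queued"]
```

The point of the one-line patch is exactly this asymmetry: real health checks keep updating state, while the pseudo-check borrowed by the email-alert task becomes a no-op for status updates.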
Re: Status change from MAINT to UP
Hi Johan, Op 13-12-2017 om 17:31 schreef Johan Hendriks: When I use the show stat command I get different results? Just a guess: are you using nbproc > 1? Are multiple (old?) haproxy processes running? Perhaps including the config used could help diagnose. And 'haproxy -vv' output is always appreciated. Regards, PiBa-NL