[PATCH] BUG/MINOR: ssl_sock.c: use PATH_MAX only when defined
Hi Willy,

We are seeing build failures of 1.5 w/ SSL on Debian's Hurd builder machines,
due to the use of PATH_MAX (which is undefined on Hurd) when loading SSL
certificates. You can see the build log here:

https://buildd.debian.org/status/fetch.php?pkg=haproxy&arch=hurd-i386&ver=1.5%7Edev19-1&stamp=1372199388

The attached patch should fix this issue, by modifying the code in question to
use a dynamically allocated buffer while checking against PATH_MAX if
appropriate.

Regards,
Apollon

From 865e8c1ed5bc8dfa61eff6c33e5b59b6a554db96 Mon Sep 17 00:00:00 2001
From: Apollon Oikonomopoulos <apoi...@gmail.com>
Date: Mon, 12 Aug 2013 12:22:26 +0300
Subject: [PATCH] BUG/MINOR: ssl_sock.c: use PATH_MAX only when defined

bind_parse_crt() unconditionally uses PATH_MAX, which is not guaranteed to be
defined by POSIX. In fact, GNU Hurd does not have a limit on path sizes and
thus leaves PATH_MAX undefined, causing OpenSSL-enabled builds on Hurd to
fail.

We fix this by using dynamic allocation of the path buffer and checking the
actual path length whenever PATH_MAX is defined.
---
 src/ssl_sock.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index ce1712d..655dc77 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -2543,17 +2543,25 @@ static int bind_parse_ciphers(char **args, int cur_arg, struct proxy *px, struct
 /* parse the "crt" bind keyword */
 static int bind_parse_crt(char **args, int cur_arg, struct proxy *px, struct bind_conf *conf, char **err)
 {
-	char path[PATH_MAX];
+	char *path;
+	size_t path_length;

 	if (!*args[cur_arg + 1]) {
 		memprintf(err, "'%s' : missing certificate location", args[cur_arg]);
 		return ERR_ALERT | ERR_FATAL;
 	}

 	if ((*args[cur_arg + 1] != '/' ) && global.crt_base) {
-		if ((strlen(global.crt_base) + 1 + strlen(args[cur_arg + 1]) + 1) > PATH_MAX) {
+		path_length = strlen(global.crt_base) + 1 + strlen(args[cur_arg + 1]) + 1;
+#ifdef PATH_MAX
+		if (path_length > PATH_MAX) {
 			memprintf(err, "'%s' : path too long", args[cur_arg]);
 			return ERR_ALERT | ERR_FATAL;
 		}
+#endif
+		if ((path = malloc(path_length)) == NULL) {
+			memprintf(err, "'%s' : unable to allocate path buffer", args[cur_arg]);
+			return ERR_ALERT | ERR_FATAL;
+		}
 		sprintf(path, "%s/%s", global.crt_base, args[cur_arg + 1]);
 		if (ssl_sock_load_cert(path, conf, px, err) > 0)
 			return ERR_ALERT | ERR_FATAL;
--
1.7.10.4
Re: [PATCH] BUG/MINOR: ssl_sock.c: use PATH_MAX only when defined
On 12:34 Mon 12 Aug , Willy Tarreau wrote:
> Hi Apollon,
>
> Unfortunately, this patch introduces a memory leak. Since the path
> variable is just a temporary one, better use alloca() instead in order
> to dynamically allocate on the stack.

Ooops, completely forgot to free(). Never submit patches directly after
vacation :-(

> Anyway I'd prefer something simpler : let's define PATH_MAX in compat.h
> if it is not defined.

Yes, that's probably a cleaner solution.

> Thanks,
> Willy

Thanks,
Apollon
Re: Static haproxy/openssl build error
Hi Julien,

On 10:35 Sat 28 Sep , Julien Vehent wrote:
> On 2013-09-27 21:17, Lukas Tribus wrote:
> > Try stable OpenSSL 1.0.1e, that should work.
>
> I did, and I get the exact same error. The problem happens on Debian
> Squeeze. My laptop on Fedora 19 doesn't have the issue.

HAProxy's build system does not currently support static linking against
OpenSSL. The reason it works on Fedora 19 is probably that the system already
ships OpenSSL 1.0.1 in the system library path; I bet that if you run ldd
against the generated haproxy binary, you will see it is dynamically linked
against the system's OpenSSL, not statically against your own version.

The only way to statically link against OpenSSL would be to patch HAProxy's
Makefile to do more or less what is done with static PCRE: pass -Wl,-Bstatic
before -lssl and -Wl,-Bdynamic afterwards. The following patch should do it
(although it should really introduce a new flag):

diff --git a/Makefile b/Makefile
index 5e12af8..fc4d1bc 100644
--- a/Makefile
+++ b/Makefile
@@ -521,7 +521,7 @@ ifneq ($(USE_OPENSSL),)
 # to specify -I and -L if your OpenSSL library is not in the standard path.
 BUILD_OPTIONS += $(call ignore_implicit,USE_OPENSSL)
 OPTIONS_CFLAGS += -DUSE_OPENSSL
-OPTIONS_LDFLAGS += -lssl -lcrypto
+OPTIONS_LDFLAGS += -Wl,-Bstatic -lssl -lcrypto -Wl,-Bdynamic
 OPTIONS_OBJS += src/ssl_sock.o src/shctx.o
 ifneq ($(USE_PRIVATE_CACHE),)
 OPTIONS_CFLAGS += -DUSE_PRIVATE_CACHE
--

Regards,
Apollon

PS: personally I wouldn't link statically against OpenSSL. I'd just
co-install libssl1.0.0 along with the system's libssl0.9.8.
Re: Static haproxy/openssl build error
Hi all,

On 13:35 Sun 29 Sep , Willy Tarreau wrote:
> Yes, and I have fixed this two weeks ago. The problem is that the ADDINC
> and ADDLIB variables are not suited for passing single-component paths
> since they suffix everything. Look what it results in your build log :
>
>   -lcrypt -lssl -lcrypto -L/usr/lib -Wl,-Bstatic -lpcreposix \
>   -lpcre -Wl,-Bdynamic -L/tmp/staticlibssl/lib -ldl
>
> As you can see, -lssl and -lcrypto are looked up in your system path.
> Since commit 9a05945bd0, you now have an explicit set of SSL_INC/SSL_LIB
> variables, just like with PCRE, that you can point to your openssl
> location. I'm using this to build with a static openssl version, as this
> time it's safe, as you can see below :

The real culprit is `-L/usr/lib' in the linker line above, which appears
before `-L/tmp/staticlibssl/lib'. Here the relative order *is* important
(unlike the relative order of -l and -L flags), as is the fact that /usr/lib
is already in the system search path (and shouldn't really appear in a -L
option at all).

I should have spotted that earlier on; it was reported to us as Debian bug
#722777 [1], and it's there because of USE_PCRE: when USE_PCRE is enabled and
PCREDIR is not specified, PCRE_LIB will be forced to `pcre-config --prefix`/lib
(which 99% of the time will be /usr/lib) and included in LDFLAGS. If libssl.so
is in that path (which will happen 99% of the time if PCRE is in the system
library path and the system does not use multiarch triplets), it will be
picked up instead. So in this case, it's the relative position of the PCRE vs
ADDLIB arguments in the linker command line that makes ADDLIB unable to
override system libraries by default.

The same also goes for PCRE_INC, although gcc is a bit smarter than ld: "If
the directory dir is a standard system include directory, the option is
ignored to ensure that the default search order for system directories and
the special treatment of system headers are not defeated".
By the way, my previous patch on the Makefile worked because it forced ld to
consider only static archives for OpenSSL, which were (of course) not present
in Debian's /usr/lib.

IMHO we should have a closer look at this, as it has caused trouble at least
twice (once with mips builds as per [1], and once now on Julien's system).
PCRE_LIB is currently added even on systems where it doesn't make any sense;
for example, on Debian Wheezy with multiarch, PCRE resides in
/usr/lib/x86_64-linux-gnu, but the Makefile adds a -L/usr/lib which has
nothing to do with PCRE. My guess is that PCRE_LIB/PCRE_INC are not generally
needed and should be explicitly specified when desired.

Regards,
Apollon

[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=722777
Re: Static haproxy/openssl build error
Hi Willy,

On 18:30 Sun 29 Sep , Willy Tarreau wrote:
> That's problematic. The issue was created by the addition of pcre-config
> because some people could not build on their systems in the past. The
> whole mess started with the lib64 subdir that broke basically every
> software relying on $FOO/lib. So some users had PCRE in
> /usr/local/{include,lib64} and could not specify a working PCREDIR. Thus
> we switched to pcre-config. And finally since pcre-config is still wrong
> on some systems (eg: when you want to use an alternate version such as
> the JIT-enabled one), we offered the support for separate PCRE_LIB /
> PCRE_INC.

Still, this approach does not work properly on most distributions anymore.
Debian and Ubuntu are shipping most of the libraries under
/usr/lib/<arch>-linux-gnu, but the libdir generated using ${prefix}/lib is
still /usr/lib. pcre-config can output the correct path:

  $ pcre-config --libs
  -L/usr/lib/x86_64-linux-gnu -lpcre

but we would still end up inserting a system library path using -L.

> So maybe we should in fact stop setting PCREDIR to $(pcre-config
> --prefix), which will result in PCRE_INC/PCRE_LIB remaining silent
> unless PCREDIR is forced. I suspect the following patch should fix it :
>
> diff --git a/Makefile b/Makefile
> index 0529e89..89f9f39 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -544,7 +544,6 @@ ifneq ($(USE_PCRE)$(USE_STATIC_PCRE)$(USE_PCRE_JIT),)
>  # Forcing PCREDIR to an empty string will let the compiler use the default
>  # locations.
> -PCREDIR := $(shell pcre-config --prefix 2>/dev/null || echo /usr/local)
>  ifneq ($(PCREDIR),)
>  PCRE_INC := $(PCREDIR)/include
>  PCRE_LIB := $(PCREDIR)/lib

Looks good to me; that's essentially what we're doing for the Debian package
right now anyway. Another approach would be to manually specify the path to
pcre-config (e.g. using a PCRE_CONFIG variable) if desired and get
`pcre-config --libs` and `pcre-config --cflags` from there if specified. But
pcre is a simple use-case; just specifying includes and libdir should be
enough.
Thanks, Apollon
[PATCH 1.5 0/4] Manpage fixes
Hi,

The following set of patches updates the manpage, adding missing options,
removing obsolete options and fixing external references.

Regards,
Apollon

Apollon Oikonomopoulos (4):
  DOC: add missing options to the manpage
  DOC: add manpage references to all system calls
  DOC: update manpage reference to haproxy-en.txt
  DOC: remove -s and -l options from the manpage

 doc/haproxy.1 | 57 ++---
 1 file changed, 38 insertions(+), 19 deletions(-)

-- 
1.7.10.4
[PATCH 1.5 2/4] DOC: add manpage references to all system calls
Add a man section to every system call reference, giving users pointers to
the respective manpages.

Signed-off-by: Apollon Oikonomopoulos <apoi...@gmail.com>
---
 doc/haproxy.1 | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/doc/haproxy.1 b/doc/haproxy.1
index 0fb2538..a2ac857 100644
--- a/doc/haproxy.1
+++ b/doc/haproxy.1
@@ -110,25 +110,25 @@ Show even more statistics (implies '\-s').

 .TP
 \fB\-dk\fP
-Disable use of kqueue(). kqueue() is available only on BSD systems.
+Disable use of \fBkqueue\fP(2). \fBkqueue\fP(2) is available only on BSD systems.

 .TP
 \fB\-ds\fP
-Disable use of speculative epoll(). epoll() is available only on Linux 2.6
-and some custom Linux 2.4 systems.
+Disable use of speculative \fBepoll\fP(7). \fBepoll\fP(7) is available only on
+Linux 2.6 and some custom Linux 2.4 systems.

 .TP
 \fB\-de\fP
-Disable use of epoll(). epoll() is available only on Linux 2.6
+Disable use of \fBepoll\fP(7). \fBepoll\fP(7) is available only on Linux 2.6
 and some custom Linux 2.4 systems.

 .TP
 \fB\-dp\fP
-Disables use of poll(). select() might be used instead.
+Disables use of \fBpoll\fP(2). \fBselect\fP(2) might be used instead.

 .TP
 \fB\-dS\fP
-Disables use of splice(), which is broken on older kernels.
+Disables use of \fBsplice\fP(2), which is broken on older kernels.

 .TP
 \fB\-db\fP
-- 
1.7.10.4
[PATCH 1.5 3/4] DOC: update manpage reference to haproxy-en.txt
The manpage refers to haproxy-en.txt, which is obsolete. Update the reference
to point to configuration.txt, together with the location on Debian systems.
Also capitalize "Debian".

Signed-off-by: Apollon Oikonomopoulos <apoi...@gmail.com>
---
 doc/haproxy.1 | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/doc/haproxy.1 b/doc/haproxy.1
index a2ac857..85244af 100644
--- a/doc/haproxy.1
+++ b/doc/haproxy.1
@@ -198,9 +198,8 @@ This signal is intercepted and ignored on systems without \fBMSG_NOSIGNAL\fP.

 .SH SEE ALSO

-A much better documentation can be found in haproxy-en.txt. On debian
-systems, you can find this file in
-/usr/share/doc/haproxy/haproxy-en.txt.gz.
+A much better documentation can be found in configuration.txt. On Debian
+systems, you can find this file in /usr/share/doc/haproxy/configuration.txt.gz.

 .SH AUTHOR
-- 
1.7.10.4
[PATCH 1.5 1/4] DOC: add missing options to the manpage
Document -L, -v(v), -C, -dS and -dM, as they were missing from the manpage.

Signed-off-by: Apollon Oikonomopoulos <apoi...@gmail.com>
---
 doc/haproxy.1 | 32 +++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/doc/haproxy.1 b/doc/haproxy.1
index 48717ad..0fb2538 100644
--- a/doc/haproxy.1
+++ b/doc/haproxy.1
@@ -6,7 +6,7 @@ HAProxy \- fast and reliable http reverse proxy and load balancer

 .SH SYNOPSIS

-haproxy \-f configuration\ file [\-n\ maxconn] [\-N\ maxconn] [\-d] [\-D] [\-q] [\-V] [\-c] [\-p\ pidfile] [\-s] [\-l] [\-dk] [\-ds] [\-de] [\-dp] [\-db] [\-m\ megs] [{\-sf|\-st}\ pidlist...]
+haproxy \-f configuration\ file [\-L\ name] [\-n\ maxconn] [\-N\ maxconn] [\-C\ dir] [\-v|\-vv] [\-d] [\-D] [\-q] [\-V] [\-c] [\-p\ pidfile] [\-s] [\-l] [\-dk] [\-ds] [\-de] [\-dp] [\-db] [\-dM[byte]] [\-m\ megs] [{\-sf|\-st}\ pidlist...]

 .SH DESCRIPTION

@@ -37,6 +37,13 @@ instances without risking the system's stability.
 Specify configuration file path.

 .TP
+\fB\-L name\fP
+Set the local instance's peer name. Peers are defined in the \fBpeers\fP
+configuration section and used for syncing stick tables between different
+instances. If this option is not specified, the local hostname is used as peer
+name.
+
+.TP
 \fB\-n maxconn\fP
 Set the high limit for the total number of simultaneous connections.

@@ -45,6 +52,18 @@ Set the high limit for the per-listener number of simultaneous connections.

 .TP
+\fB\-C dir\fP
+Change directory to \fIdir\fP before loading any files.
+
+.TP
+\fB\-v\fP
+Display HAProxy's version.
+
+.TP
+\fB\-vv\fP
+Display HAProxy's version and all build options.
+
+.TP
 \fB\-d\fP
 Start in foregreound with debugging mode enabled.
 When the proxy runs in this mode, it dumps every connections,
@@ -108,6 +127,10 @@ and some custom Linux 2.4 systems.
 Disables use of poll(). select() might be used instead.

 .TP
+\fB\-dS\fP
+Disables use of splice(), which is broken on older kernels.
+
+.TP
 \fB\-db\fP
 Disables background mode (stays in foreground, useful for debugging).
 For debugging, the '\-db' option is very useful as it temporarily
@@ -116,6 +139,13 @@ stopped by simply pressing Ctrl-C, without having to edit the config nor
 run full debug.

 .TP
+\fB\-dM[byte]\fP
+Initializes all allocated memory areas with the given \fIbyte\fP. This makes
+it easier to detect bugs resulting from uninitialized memory accesses, at the
+expense of touching all allocated memory once. If \fIbyte\fP is not
+specified, it defaults to 0x50 (ASCII 'P').
+
+.TP
 \fB\-m megs\fP
 Enforce a memory usage limit to a maximum of megs megabytes.
-- 
1.7.10.4
[PATCH 1.5 4/4] DOC: remove -s and -l options from the manpage
These options are no longer supported since 1.3, so remove them from the
manpage.

Signed-off-by: Apollon Oikonomopoulos <apoi...@gmail.com>
---
 doc/haproxy.1 | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/doc/haproxy.1 b/doc/haproxy.1
index 85244af..c479cc8 100644
--- a/doc/haproxy.1
+++ b/doc/haproxy.1
@@ -6,7 +6,7 @@ HAProxy \- fast and reliable http reverse proxy and load balancer

 .SH SYNOPSIS

-haproxy \-f configuration\ file [\-L\ name] [\-n\ maxconn] [\-N\ maxconn] [\-C\ dir] [\-v|\-vv] [\-d] [\-D] [\-q] [\-V] [\-c] [\-p\ pidfile] [\-s] [\-l] [\-dk] [\-ds] [\-de] [\-dp] [\-db] [\-dM[byte]] [\-m\ megs] [{\-sf|\-st}\ pidlist...]
+haproxy \-f configuration\ file [\-L\ name] [\-n\ maxconn] [\-N\ maxconn] [\-C\ dir] [\-v|\-vv] [\-d] [\-D] [\-q] [\-V] [\-c] [\-p\ pidfile] [\-dk] [\-ds] [\-de] [\-dp] [\-db] [\-dM[byte]] [\-m\ megs] [{\-sf|\-st}\ pidlist...]

 .SH DESCRIPTION

@@ -99,16 +99,6 @@ Ask the process to write down each of its children's pids to this file in
 daemon mode.

 .TP
-\fB\-s\fP
-Show statistics (only if compiled in).
-Statistics are only available if compiled in with the 'STATTIME' option.
-It's only used during code optimization phases, and will soon disappear.
-
-.TP
-\fB\-l\fP
-Show even more statistics (implies '\-s').
-
-.TP
 \fB\-dk\fP
 Disable use of \fBkqueue\fP(2). \fBkqueue\fP(2) is available only on BSD systems.
-- 
1.7.10.4
Re: Static haproxy/openssl build error
Hi Vincent,

On 12:19 Mon 30 Sep , Vincent Bernat wrote:
> For me, pcre-config --libs does not use `-L`. Dunno why this is the case
> for Apollon.

My version of pcre-config (8.30, also tested with 8.31) includes:

  libS=
  if test "${prefix}/lib/x86_64-linux-gnu" != "/usr/lib" ; then
    libS="-L${prefix}/lib/x86_64-linux-gnu"
  fi

So it seems it tries to be smart, but is not multiarch-aware (I'll open a bug
against it). This would obviously work correctly in squeeze, except that the
version in squeeze lacks even this simple test :-)

pkg-config, on the other hand, works better:

  $ pkg-config --libs libpcre
  -lpcre

Apollon
Re: Static haproxy/openssl build error
On 13:49 Mon 30 Sep , Apollon Oikonomopoulos wrote:
> Hi Vincent,
>
> On 12:19 Mon 30 Sep , Vincent Bernat wrote:
> > For me, pcre-config --libs does not use `-L`. Dunno why this is the
> > case for Apollon.
>
> My version of pcre-config (8.30, also tested with 8.31) includes:
>
>   libS=
>   if test "${prefix}/lib/x86_64-linux-gnu" != "/usr/lib" ; then
>     libS="-L${prefix}/lib/x86_64-linux-gnu"
>   fi

Update: Debian's 8.31 (testing/unstable) actually does not emit -L using
pcre-config. I had looked at the source, but did a wrong ./configure. The
actual script shipped with the package tests /usr/lib against /usr/lib. I
assume that's the version you're running, Vincent, right?

Apollon
[PATCH] BUG/MINOR: reject malformed HTTP/0.9 requests
RFC 1945 (§4.1) defines an HTTP/0.9 request (Simple-Request) as:

  Simple-Request = "GET" SP Request-URI CRLF

HAProxy tries to automatically upgrade HTTP/0.9 requests to HTTP/1.0, by
appending "HTTP/1.0" to the request and setting the Request-URI to "/" if it
was not present. The latter however is RFC-incompatible, as HTTP/0.9 requests
must already have a Request-URI according to the definition above.
Additionally, http_upgrade_v09_to_v10() does not check whether the request
method is indeed GET (the mandatory method for HTTP/0.9). As a result, any
single- or double-word request line is regarded as a valid HTTP request.

We fix this by failing in http_upgrade_v09_to_v10() if the request method is
not GET or the request URI is not present.
---
 src/proto_http.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/proto_http.c b/src/proto_http.c
index df33991..c23fa54 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -1777,14 +1777,16 @@ static int http_upgrade_v09_to_v10(struct http_txn *txn)
 	if (msg->sl.rq.v_l != 0)
 		return 1;

+	/* RFC 1945 allows only GET for HTTP/0.9 requests */
+	if (txn->meth != HTTP_METH_GET)
+		return 0;
+
 	cur_end = msg->chn->buf->p + msg->sl.rq.l;
 	delta = 0;
 	if (msg->sl.rq.u_l == 0) {
-		/* if no URI was set, add "/" */
-		delta = buffer_replace2(msg->chn->buf, cur_end, cur_end, " /", 2);
-		cur_end += delta;
-		http_msg_move_end(msg, delta);
+		/* HTTP/0.9 requests *must* have a request URI, per RFC 1945 */
+		return 0;
 	}

 	/* add HTTP version */
 	delta = buffer_replace2(msg->chn->buf, cur_end, cur_end, " HTTP/1.0\r\n", 11);
-- 
1.9.1
src_inc_gpc0 not working with IPv4 source and IPv6 stick-tables
Hi,

While experimenting with counters in a dual-stack setup, I noticed that
src_inc_gpc0 does not seem to work for IPv4 clients looked up against "type
ipv6" stick-tables. The following configuration:

  global
      log 127.0.0.1 local0
      user haproxy
      group haproxy
      stats socket /var/run/haproxy.sock user root group root level admin

  frontend test
      mode http
      bind 127.0.0.1:
      bind ::1:
      stick-table type ipv6 size 1k expire 1h store gpc0,http_req_rate(5m)
      tcp-request connection track-sc0 src
      redirect prefix http://example.com if { src_inc_gpc0 ge 0 }

with a 1.5-dev22 instance and the following scenario:

  $ for i in {1..4}; do curl -6 http://localhost:/; done
  $ for i in {1..4}; do curl -4 http://localhost:/; done

yields these results:

  $ echo "show table test" | sudo socat STDIO /var/run/haproxy.sock
  # table: test, type: ipv6, size:1024, used:2
  0x17efd50: key=::1 use=0 exp=3596869 gpc0=4 http_req_rate(30)=4
  0x17effb0: key=:::127.0.0.1 use=0 exp=3598516 gpc0=0 http_req_rate(30)=4

Notice that while the http_req_rate is properly calculated in both cases,
gpc0 is not incremented for the mapped IPv4 client. Using sc0_inc_gpc0 on the
other hand works as expected.

Any ideas?

Regards,
Apollon
Re: src_inc_gpc0 not working with IPv4 source and IPv6 stick-tables
Hi Thierry,

On 21:31 Mon 14 Apr , Thierry FOURNIER wrote:
> Hi,
>
> Thanks for the bug report, this is fixed in the current HAProxy
> version.

Tested and it works fine, thanks for the fix!

Apollon
Re: Recommended strategy for running 1.5 in production
Hi Jonathan,

On 22:27 Mon 14 Apr , Jonathan Matthews wrote:
> Hi all - I've been running 1.4 for a number of years, but am pondering
> moving some as-yet-unreleased apps to 1.5, for SSL and ACL-ish reasons.
> I'd like to ask how you, 1.5 sysadmins and devs, track the development
> version, and how you decide which version to run in production.

We switched from 1.4 to 1.5 last September, after running 1.4 for 2+ years.
The switch was made primarily due to SSL and proper IPv6 support; we had been
using nginx for SSL termination previously, which ran fine, but made our
setup a bit more complex than we'd like. We've been very satisfied with 1.5,
both in terms of performance and features. The features we've been looking at
more recently are stick-tables and counters, and we're planning to use them
primarily for abuse detection and throttling.

> Do you just run 1.5-dev${LATEST}? The latest snapshot? Do you follow
> the list here and cherry-pick important bug fixes? I don't feel I have
> a firm understanding of the status of the different, co-existing
> codebases that one could call 1.5 at any given time. And nor do I have
> the C-skills and time to review every commit.
>
> What do /you/ do, fellow sysadmins? How do you run, upgrade and
> maintain confidence in your chosen version of 1.5 in production?

We run a -dev version - not necessarily the latest one, just one that is
running stable and has no security issues. In general I follow the list on a
daily basis (not reading through every mail though), and I always keep a
clone of the git repository around and periodically check the output of

  git log --grep BUG/ v1.5-dev${my_release}..

for anything serious (especially BUG/MAJOR commits). If anything too serious
arises, we are willing to cherry-pick changes on top of our production
version (which is always deployed using Debian packages).
The bugfix commits usually refer to either the commit or the version that
introduced the bug, so it is easy to determine whether we are affected or
not.

We do upgrades in a three-step process: we upgrade the backup node of our
active-backup setup, then fail traffic over to it while keeping the old
active node as a backup running the previous version. After everything has
run smoothly for a week or so, we upgrade the other node as well.

Personally, I consider 1.5 pretty stable at this point, and most bugs I've
seen (even the important ones) are either corner cases or bugs in new
features that won't break existing configurations. The only exception was
-dev20, which introduced lots of changes and new functionality, and it was
probably expected that there would be some breakage. That said, I prefer to
run milestone releases rather than daily snapshots or latest git.

Regards,
Apollon
Re: Recommended strategy for running 1.5 in production
(Cc'ing the Debian maintainers as well)

Hi all,

On 19:28 Wed 16 Apr , Willy Tarreau wrote:
> On Wed, Apr 16, 2014 at 07:14:31PM +0300, pablo platt wrote:
> > An official Ubuntu dev repo will also make testing easier. It's much
> > easier to use apt-get than building from source and figuring out
> > command line options.

Actually, Vincent Bernat maintains a PPA with rebuilds of our Debian packages
from experimental, which should be handy for Ubuntu users:

  https://launchpad.net/~vbernat/+archive/haproxy-1.5

> I think we're getting close to a release so we should not harrass
> distro maintainers with that *now* (but we could have done years ago).
> That reminds me that I tend not to always realize how much time slips
> between versions, and to forget that sometimes a previous version has
> some bugs. What I'd expect from our users is to sometimes complain
> loudly and insist on having a new dev release when the latest snapshot
> has become more reliable than the last dev release, if that makes their
> lives easier. A new version doesn't cost much (1 hour to read the
> changelog, write a human-friendly summary in an announce e-mail and
> update the site).

With my Debian hat on, I'd like to complain a bit about 1.5. Of course we
appreciate your dedication to making HAProxy rock-solid and feature-complete,
and at this point, as a user, 1.5 has been pretty stable for me (and the new
features are definitely worth the wait). However, as Debian maintainers we
will probably not replace 1.4 with 1.5 in our main track (unstable - testing
- wheezy-backports) until 1.5-final is out; we would like to make sure that
we end up with a proper 1.5 release in Debian Jessie (and not a development
snapshot at any rate) that both upstream and ourselves will be willing to
support. Unfortunately, this means that 1.5 currently gets less user exposure
(at least via Debian and Ubuntu), potentially slowing down the stabilization
process.

So please, leave some features aside for 1.6 ;-)

Regards,
Apollon
Segfault during soft-stop
Hi all,

Current master segfaults during soft-stop with the following config:

  global
      log /dev/log local0

  frontend test
      mode http
      bind 127.0.0.1:
      bind ::1:
      redirect prefix http://example.com

gdb shows:

  Program received signal SIGUSR1, User defined signal 1.
  [WARNING] 106/123715 (18187) : Stopping frontend test in 0 ms.

  Program received signal SIGSEGV, Segmentation fault.
  0x in ?? ()
  (gdb) bt
  #0  0x in ?? ()
  #1  0x00410c52 in process_runnable_tasks (next=0x7fffe00c) at src/task.c:240
  #2  0x0040933c in run_poll_loop () at src/haproxy.c:1297
  #3  0x00406a12 in main (argc=<optimized out>, argv=<optimized out>) at src/haproxy.c:1628

HAProxy was built using:

  make TARGET=linux2628 USE_PCRE=1 PCREDIR= USE_OPENSSL=1 USE_ZLIB=1

Regards,
Apollon
Re: Segfault during soft-stop
Hi Lukas,

On 12:09 Thu 17 Apr , Lukas Tribus wrote:
> Hi,
>
> > gdb shows:
> >
> >   Program received signal SIGUSR1, User defined signal 1.
> >   [WARNING] 106/123715 (18187) : Stopping frontend test in 0 ms.
> >
> >   Program received signal SIGSEGV, Segmentation fault.
> >   0x in ?? ()
> >   (gdb) bt
> >   #0  0x in ?? ()
> >   #1  0x00410c52 in process_runnable_tasks (next=0x7fffe00c) at src/task.c:240
> >   #2  0x0040933c in run_poll_loop () at src/haproxy.c:1297
> >   #3  0x00406a12 in main (argc=<optimized out>, argv=<optimized out>) at src/haproxy.c:1628
> >
> > HAProxy was built using:
> >
> >   make TARGET=linux2628 USE_PCRE=1 PCREDIR= USE_OPENSSL=1 USE_ZLIB=1
>
> In a quick test I could not reproduce the crash here. Can you reproduce
> this easily in a non-production environment?
>
> Can you compile with CFLAGS="-g -O0" in the make command, to avoid that
> the compiler optimizes out too much, and provide the gdb output of
> "backtrace full"?

Strange; after a full cleanup and rebuild it doesn't segfault anymore, so
please disregard this, and apologies for the noise.

Regards,
Apollon
Re: Segfault during soft-stop
On 12:30 Thu 17 Apr , Lukas Tribus wrote:
> Hi,
>
> > > Can you compile with CFLAGS="-g -O0" in the make command, to avoid
> > > that the compiler optimizes out too much, and provide the gdb
> > > output of "backtrace full"?
> >
> > Strange; after a full cleanup and rebuild it doesn't segfault
> > anymore, so please disregard this, and apologies for the noise.
>
> Can you double-check that the crash doesn't happen on a clean rebuild
> without the CFLAGS I asked for? I just want to make sure that -O0
> doesn't hide this possible bug.

I confirm that it doesn't happen with -O2. I suppose something was wrong with
my build system; I'll investigate a bit more and let you know if I find
anything out.

Thanks,
Apollon
[PATCH 0/3] systemd wrapper improvements
Hi,

The following patches improve the systemd wrapper in a couple of ways:

 - The systemd wrapper itself is now re-executed on SIGUSR2, allowing a new
   version of both HAProxy and the wrapper to be gracefully loaded.
 - We use standard error for message logging, which seems to get more
   reliably to systemd's journal than stdout.
 - We also use systemd's support for prefixing messages with the log level
   to differentiate between message severities.
 - Finally, we propagate the exit status of HAProxy to systemd, instead of
   always returning success.

Regards,
Apollon

Apollon Oikonomopoulos (3):
  MINOR: systemd wrapper: re-execute on SIGUSR2
  MINOR: systemd wrapper: improve logging
  MINOR: systemd wrapper: propagate exit status

 src/haproxy-systemd-wrapper.c | 69 ++-
 1 file changed, 49 insertions(+), 20 deletions(-)

-- 
1.9.1
[PATCH 1/3] MINOR: systemd wrapper: re-execute on SIGUSR2
Re-execute the systemd wrapper on SIGUSR2 and before reloading HAProxy,
making it possible to load a completely new version of HAProxy (including a
new version of the systemd wrapper) gracefully. Since the wrapper accepts no
command-line arguments of its own, re-execution is signaled using the
HAPROXY_SYSTEMD_REEXEC environment variable. This is primarily intended to
help seamless upgrades of distribution packages.
---
 src/haproxy-systemd-wrapper.c | 54 ---
 1 file changed, 40 insertions(+), 14 deletions(-)

diff --git a/src/haproxy-systemd-wrapper.c b/src/haproxy-systemd-wrapper.c
index 8485dcd..e373483 100644
--- a/src/haproxy-systemd-wrapper.c
+++ b/src/haproxy-systemd-wrapper.c
@@ -18,9 +18,11 @@
 #include <unistd.h>
 #include <sys/wait.h>

+#define REEXEC_FLAG "HAPROXY_SYSTEMD_REEXEC"
+
 static char *pid_file = "/run/haproxy.pid";
-static int main_argc;
-static char **main_argv;
+static int wrapper_argc;
+static char **wrapper_argv;

 static void locate_haproxy(char *buffer, size_t buffer_size)
 {
@@ -42,6 +44,11 @@ static void spawn_haproxy(char **pid_strv, int nb_pid)
 {
 	char haproxy_bin[512];
 	pid_t pid;
+	int main_argc;
+	char **main_argv;
+
+	main_argc = wrapper_argc - 1;
+	main_argv = wrapper_argv + 1;

 	pid = fork();
 	if (!pid) {
@@ -96,15 +103,10 @@ static int read_pids(char ***pid_strv)

 static void sigusr2_handler(int signum __attribute__((unused)))
 {
-	int i;
-	char **pid_strv = NULL;
-	int nb_pid = read_pids(&pid_strv);
+	setenv(REEXEC_FLAG, "1", 1);
+	printf("haproxy-systemd-wrapper: re-executing\n");

-	spawn_haproxy(pid_strv, nb_pid);
-
-	for (i = 0; i < nb_pid; ++i)
-		free(pid_strv[i]);
-	free(pid_strv);
+	execv(wrapper_argv[0], wrapper_argv);
 }

 static void sigint_handler(int signum __attribute__((unused)))
@@ -140,16 +142,40 @@ int main(int argc, char **argv)
 {
 	int status;

-	--argc; ++argv;
-	main_argc = argc;
-	main_argv = argv;
+	wrapper_argc = argc;
+	wrapper_argv = argv;

+	--argc; ++argv;
 	init(argc, argv);

 	signal(SIGINT, sigint_handler);
 	signal(SIGUSR2, sigusr2_handler);

-	spawn_haproxy(NULL, 0);
+	if (getenv(REEXEC_FLAG) != NULL) {
+		/* We are being re-executed: restart HAProxy gracefully */
+		int i;
+		char **pid_strv = NULL;
+		int nb_pid = read_pids(&pid_strv);
+		sigset_t sigs;
+
+		unsetenv(REEXEC_FLAG);
+		spawn_haproxy(pid_strv, nb_pid);
+
+		/* Unblock SIGUSR2 which was blocked by the signal handler
+		 * before re-exec */
+		sigprocmask(SIG_BLOCK, NULL, &sigs);
+		sigdelset(&sigs, SIGUSR2);
+		sigprocmask(SIG_SETMASK, &sigs, NULL);
+
+		for (i = 0; i < nb_pid; ++i)
+			free(pid_strv[i]);
+		free(pid_strv);
+	}
+	else {
+		/* Start a fresh copy of HAProxy */
+		spawn_haproxy(NULL, 0);
+	}
+
 	status = -1;
 	while (-1 != wait(&status) || errno == EINTR)
 		;
-- 
1.9.1
[PATCH 3/3] MINOR: systemd wrapper: propagate exit status
Use HAProxy's exit status as the systemd wrapper's exit status instead of
always returning EXIT_SUCCESS, permitting the use of systemd's
`Restart = on-failure' logic.
---
 src/haproxy-systemd-wrapper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/haproxy-systemd-wrapper.c b/src/haproxy-systemd-wrapper.c
index d4baa90..ba07ebe 100644
--- a/src/haproxy-systemd-wrapper.c
+++ b/src/haproxy-systemd-wrapper.c
@@ -184,5 +184,5 @@ int main(int argc, char **argv)
 	fprintf(stderr, SD_NOTICE "haproxy-systemd-wrapper: exit, haproxy RC=%d\n",
 	        status);

-	return EXIT_SUCCESS;
+	return status;
 }
-- 
1.9.1
[PATCH 2/3] MINOR: systemd wrapper: improve logging
Use standard error for logging messages, as it seems that this gets
messages to the systemd journal more reliably. Also use systemd's support
for specifying log levels via stderr to apply different levels to
messages.
---
 src/haproxy-systemd-wrapper.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/haproxy-systemd-wrapper.c b/src/haproxy-systemd-wrapper.c
index e373483..d4baa90 100644
--- a/src/haproxy-systemd-wrapper.c
+++ b/src/haproxy-systemd-wrapper.c
@@ -19,6 +19,8 @@
 #include <sys/wait.h>
 
 #define REEXEC_FLAG "HAPROXY_SYSTEMD_REEXEC"
+#define SD_DEBUG "<7>"
+#define SD_NOTICE "<5>"
 
 static char *pid_file = "/run/haproxy.pid";
 static int wrapper_argc;
@@ -68,10 +70,10 @@ static void spawn_haproxy(char **pid_strv, int nb_pid)
 	}
 	argv[argno] = NULL;
 
-	printf("%s", "haproxy-systemd-wrapper: executing ");
+	fprintf(stderr, SD_DEBUG "haproxy-systemd-wrapper: executing ");
 	for (i = 0; argv[i]; ++i)
-		printf("%s ", argv[i]);
-	puts("");
+		fprintf(stderr, "%s ", argv[i]);
+	fprintf(stderr, "\n");
 
 	execv(argv[0], argv);
 	exit(0);
@@ -104,7 +106,7 @@ static int read_pids(char ***pid_strv)
 static void sigusr2_handler(int signum __attribute__((unused)))
 {
 	setenv(REEXEC_FLAG, "1", 1);
-	printf("haproxy-systemd-wrapper: re-executing\n");
+	fprintf(stderr, SD_NOTICE "haproxy-systemd-wrapper: re-executing\n");
 
 	execv(wrapper_argv[0], wrapper_argv);
 }
@@ -117,7 +119,7 @@ static void sigint_handler(int signum __attribute__((unused)))
 	for (i = 0; i < nb_pid; ++i) {
 		pid = atoi(pid_strv[i]);
 		if (pid > 0) {
-			printf("haproxy-systemd-wrapper: SIGINT -> %d\n", pid);
+			fprintf(stderr, SD_DEBUG "haproxy-systemd-wrapper: SIGINT -> %d\n", pid);
 			kill(pid, SIGINT);
 			free(pid_strv[i]);
 		}
@@ -180,6 +182,7 @@ int main(int argc, char **argv)
 	while (-1 != wait(&status) || errno == EINTR)
 		;
 
-	printf("haproxy-systemd-wrapper: exit, haproxy RC=%d\n", status);
+	fprintf(stderr, SD_NOTICE "haproxy-systemd-wrapper: exit, haproxy RC=%d\n",
+	        status);
 
 	return EXIT_SUCCESS;
 }
-- 
1.9.1
Re: haproxy-1.5-dev23 and ssl handshake failure
Hi all,

On 22:59 Wed 23 Apr , Willy Tarreau wrote:
> Hi again Markus,
>
> I've checked my own logs and found SSL handshake failures starting on
> April 8th, or the day after Heartbleed was disclosed, as can be seen
> below with the number of errors per day :
>
>   # err  date
>       2  Mar 27
>       2  Mar 28
>       1  Mar 29
>       2  Mar 30
>       3  Mar 31
>       3  Apr  1
>       7  Apr  2
>       1  Apr  3
>       2  Apr  4
>       8  Apr  5
>      24  Apr  6
>       2  Apr  7
>     619  Apr  8
>       2  Apr  9
>       2  Apr 10
>     158  Apr 11
>       6  Apr 12
>       2  Apr 13
>     158  Apr 14
>     157  Apr 15
>     168  Apr 16
>     109  Apr 17
>       7  Apr 18
>       7  Apr 19
>       7  Apr 20
>     110  Apr 21
>     497  Apr 22
>     123  Apr 23
>
> Interestingly, my version was neither upgraded nor restarted during this
> period, so it cannot be caused by a code change, and is very likely
> caused by bots trying the attack. So I think it's also possible that
> you're experiencing the same things and that you didn't notice them
> before upgrading and checking your logs.
>
> Hoping this helps,
> Willy

We see similar results with -dev19:

  20140401   378
  20140402   922
  20140403   346
  20140404   370
  20140405   807
  20140406   501
  20140407   445
  20140408  3509
  20140409   360
  20140410  1143
  20140411  1525
  20140412   989
  20140413   991
  20140414  1217
  20140415  1139
  20140416  1141
  ...

Note the spike on the 8th of April, matching the Heartbleed hypothesis.

These can be all sorts of failures occurring before the handshake is
completed. I sampled a couple of requests using tcpdump: one of them was a
plain HTTP request on the HTTPS port, and in the other one the client sent
a close-notify TLS alert 250 ms after receiving the certificate
(indicating perhaps a network issue).

To put things in perspective, on the 8th of April we had a total of 1.38
million SSL connections¹, so these failures account for roughly 0.25%.
Granted that on that day we were expecting a lot of unfinished handshakes
probing the heartbeat vulnerability, I wouldn't worry much.

¹ actually unique source IP:source port entries

Regards,
Apollon
Re: debian repository http://haproxy.debian.net/
Hi Ghislain,

On 14:01 Fri 23 May , Ghislain wrote:
> hello there,
>
> Could you tell me if those packages comes from the haproxy team ?
> from the packages:

It depends on what you mean by "the haproxy team". They come from the team
that maintains the package in Debian itself, that means Debian developers
and maintainers and not HAProxy developers.

Regards,
Apollon
Re: debian repository http://haproxy.debian.net/
On 17:10 Fri 23 May , Ghislain wrote:
> On 23/05/2014 15:23, Baptiste wrote:
> > It is not provided by us (HAProxy.com) if this is what you mean.
> > Baptiste
>
> yes that's what i meant. Thanks for both answer and thanks for the
> product, and the packages ! In any case from my high throne in the sky
> looking under at other petty humans i will consider worthy of my trust
> the debian team, amen ;p

Thanks :)

> Now that we have cleared this, that raise another question as i may been
> a little paranoid as most sysadmin should be so, do you think this:
>
> /etc/apt/preferences.d/haproxy
>
>   Package: haproxy
>   Pin: origin haproxy.debian.net, version 1.5*
>   Pin-Priority: 995
>
>   Package: *
>   Pin: origin haproxy.debian.net
>   Pin-Priority: -10
>
> will make sure this repository can never ever touch anything else than
> the 'haproxy' package ?

I think the Pin: line of the first entry is not valid, the following
should work:

  Package: haproxy
  Pin: version 1.5*
  Pin: origin haproxy.debian.net
  Pin-Priority: 777

  Package: *
  Pin: origin haproxy.debian.net
  Pin-Priority: -10

You can always check the result using `apt-cache policy haproxy'. You
should also allow haproxy-dbg.

> i start to have a repository per process those days if i do not limit
> those a little they will one day come to bite me !

Another option would be to set up your own local repository using e.g.
reprepro and import only the packages generated from the haproxy source.
This is what I've been using at work to control a lot of upstream
repositories and stage packages when/if we want to.

Note that this repository is only a temporary solution, until 1.5-final is
out. Then we will upload it to unstable and wheezy-backports proper.

Regards,
Apollon
Re: debian repository http://haproxy.debian.net/
On 11:14 Wed 04 Jun , Ghislain wrote:
> just in case, last update failed:
>
>   Setting up haproxy (1.5~dev26-1~bpo70+1) ...
>   Starting haproxy: haproxy/usr/sbin/haproxy already running.
>   failed!
>   invoke-rc.d: initscript haproxy, action start failed.
>   dpkg: error processing haproxy (--configure):

Ah, that's a bug in the initscript with upgrades from 1.5~dev24+. We'll
fix it soon, thanks for the report.

Apollon
Re: HAproxy crash when reload with systemd
Hi Kevin,

On 15:51 Wed 23 Jul , Kevin COUSIN wrote:
> Here is the reload command in the systemd unit file:
>
>   ExecReload=/bin/bash -c "exec /usr/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf $MAINPID"

This command will not work, because it will replace the already running
haproxy process and systemd does not really like this. You have to use the
dedicated systemd wrapper shipped with haproxy instead. There is also a
sample systemd unit file in contrib/systemd in HAProxy's source that you
can use as-is or as a reference.

Regards,
Apollon
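For reference, the wrapper-based unit shipped in contrib/systemd looks roughly like this (the paths are build-time defaults and may differ per distribution; reloads go through SIGUSR2 to the wrapper rather than replacing the process):

```ini
[Unit]
Description=HAProxy Load Balancer
After=network.target

[Service]
# The wrapper stays in the foreground and manages the actual haproxy
# processes; a reload is just a signal to it, not a new process.
ExecStart=/usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
ExecReload=/bin/kill -USR2 $MAINPID
Restart=always

[Install]
WantedBy=multi-user.target
```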
Re: Maintenance release frequency
Hi Willy,

On 19:18 Thu 02 Oct , Willy Tarreau wrote:
> So my question is : what do you think about the current maintenance
> release frequency ? Do you think we should release more often, which
> also means that some people might upgrade for no good reason, or get
> used to miss versions ? Do you think that instead we should simply
> consider that when someone asks for a release here on the list, it's
> likely the good time for it ?

As you stated, currently there are two distinct release modes, driven by
different needs:

 - Security vulnerabilities/major bugs (the event-driven release mode):
   I think everyone will agree that in these cases a release should be as
   immediate as possible.

 - Medium/minor bugs (the never-important-enough-to-warrant-a-release
   mode): these tend to pile up in a long queue of commits and wait until
   either someone asks for a release, or something bad happens.

However, even if these fixes are really not that serious, they do fix
existing problems. Furthermore, usually someone has complained about these
problems, possibly more people have experienced them but not bothered to
report them, and most of those affected would like to see them fixed in a
reasonable amount of time without cherry-picking commits from git
themselves.

Note that the combination of a large number of unreleased minor fixes with
a forced release due to a major bug has higher chances of exposing users
to undetected regressions since the last release, precisely at the moment
you want all people to upgrade urgently.

IMHO, with the 1.5 series being maintenance-only (i.e. no new features), I
think we can apply a time-based rule specifying an upper limit, something
like "release on the 15th of every month (or the 1st Monday)", as long as
there is something to release of course.

It's probably tempting to consider schedules like "release before the
oldest unreleased git commit is 1 month old", but I think it's essential
to stick to a schedule that is easy to remember (by adding a periodic
reminder) and difficult to prolong, e.g. by waiting for 2 weeks without
bugs.

By the way, we are approaching the Debian freeze, due on Nov 5, which
means that I would be really happy to see a release by Oct 15. That said,
I was about to put on the "that guy who asks for a release" hat :-)

Regards,
Apollon
[PATCH] BUG/MEDIUM: systemd: set KillMode to 'mixed'
By default systemd will send SIGTERM to all processes in the service's
control group. In our case, this includes the wrapper, the master process
and all worker processes.

Since commit c54bdd2a the wrapper actually catches SIGTERM and survives to
see the master process getting killed by systemd, and regards this as an
error, placing the unit in a failed state during `systemctl stop'.

Since the wrapper now handles SIGTERM by itself, we switch the kill mode
to 'mixed', which means that systemd will deliver the initial SIGTERM to
the wrapper only, and if the actual haproxy processes don't exit after a
given amount of time (default: 90s), a SIGKILL is sent to all remaining
processes in the control group. See systemd.kill(5) for more information.

This should also be backported to 1.5.
---
 contrib/systemd/haproxy.service.in | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/systemd/haproxy.service.in b/contrib/systemd/haproxy.service.in
index 1a3d2c0..0bc5420 100644
--- a/contrib/systemd/haproxy.service.in
+++ b/contrib/systemd/haproxy.service.in
@@ -5,6 +5,7 @@ After=network.target
 [Service]
 ExecStart=@SBINDIR@/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
 ExecReload=/bin/kill -USR2 $MAINPID
+KillMode=mixed
 Restart=always
 
 [Install]
-- 
2.1.1
Re: [PATCH] BUG/MEDIUM: systemd: set KillMode to 'mixed'
Hi Willy,

On 11:26 Thu 09 Oct , Willy Tarreau wrote:
> Hi Apollon,
>
> On Wed, Oct 08, 2014 at 03:14:41PM +0300, Apollon Oikonomopoulos wrote:
> > By default systemd will send SIGTERM to all processes in the service's
> > control group. In our case, this includes the wrapper, the master
> > process and all worker processes. Since commit c54bdd2a the wrapper
> > actually catches SIGTERM and survives to see the master process
> > getting killed by systemd, and regards this as an error, placing the
> > unit in a failed state during systemctl stop.
>
> Then shouldn't we fix this by letting the wrapper die after receiving
> the SIGTERM ? Otherwise I'm happy to merge your patch, but I'd rather
> ensure that we don't encounter yet another issue.

The wrapper does exit on its own when the haproxy master process exits,
which happens as soon as all worker processes exit. The problem is that
the wrapper wants to control all worker processes on its own, while
systemd second-guesses it by delivering SIGTERM to all processes by
default.

> I'm really amazed by the amount of breakage these new service managers
> are causing to a simple process management that has been working well
> for over 40 years of UNIX existence now, and the difficulty we have to
> work around this whole mess!

I guess every new system has its difficulties and learning curve,
especially when it breaks implicit assumptions that have held for a long
time.

Cheers,
Apollon
Re: [PATCH] BUG/MEDIUM: systemd: set KillMode to 'mixed'
On 11:44 Thu 09 Oct , Willy Tarreau wrote:
> On Thu, Oct 09, 2014 at 12:35:10PM +0300, Apollon Oikonomopoulos wrote:
> > The wrapper does exit on its own when the haproxy master process
> > exits, which happens as soon as all worker processes exit. The problem
> > is that the wrapper wants to control all worker processes on its own,
> > while systemd second-guesses it by delivering SIGTERM to all processes
> > by default.
>
> OK, so I'm merging your patch if you think it's the best solution.

Well, I think it's the most sane thing to do and is behaviour-compatible
with the current wrapper version.

> > I guess every new system has its difficulties and learning curve,
> > especially when it breaks implicit assumptions that hold for a long
> > time.
>
> Well, we're far away from the learning curve, we're writing a wrapper to
> help a stupid service manager handle daemons, because the people who
> wrote it did not know that in the unix world, there were some services
> running in background. ps aux could have educated them by discovering
> that there were other processes than ps and their shell :-/

Truth is, we're writing a wrapper to handle gracefully reloading HAProxy
by completely replacing the master process. Other than that, systemd is
plain happy with just HAProxy running in the foreground using -Ds. I even
have a suspicion that we don't need the wrapper at all to do graceful
reloading. I have to do some experiments on that and I'll come back to
you.

Thanks,
Apollon
Re: [PATCH] BUG/MEDIUM: systemd: set KillMode to 'mixed'
On 12:07 Thu 09 Oct , Willy Tarreau wrote:
> Anyway we're not there to discuss the benefits or defaults of systemd,
> some major distros have adopted it and now we have to work around its
> breakages so that users can continue to use their systems as if it was
> still a regular, manageable UNIX system.

Trust me, there are lots of things I don't like about systemd either. But
given that it seems to be here to stay for a while, I don't think we have
much choice :)

Regards,
Apollon
Re: OpenSSL 1.1.0 support => merged
Hi Willy, Dirkjan,

On 21:12 Tue 08 Nov , Willy Tarreau wrote:
> Hi Dirkjan,
>
> I finally merged your patch after discussing with Emeric. He's fine with
> it as well.

Thanks for this. Is it too much of a hassle to ask for a 1.6 backport? We
currently have a release-critical bug in Debian for OpenSSL 1.1
compatibility[1], so it would greatly help us. I could go ahead and try to
make a backport myself, however I admit I'm a bit reluctant to touch
OpenSSL-related code at this point.

Thanks!
Apollon

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=828337
Re: OpenSSL 1.1.0 support => merged
On 12:07 Wed 09 Nov , Willy Tarreau wrote:
> On Wed, Nov 09, 2016 at 11:44:41AM +0200, Apollon Oikonomopoulos wrote:
> > Thanks for this. Is it too much of a hassle to ask for a 1.6 backport?
>
> Given that it breaks support for older versions (0.9.8 at least), for
> now it's out of the question. And it has received only limited testing.
> If we manage to stabilise the patch to properly handle all versions
> where 1.6 currently works, then maybe the question could be
> reconsidered.

Agreed, thanks for the clarification.

> > We currently have a release-critical bug in Debian for OpenSSL 1.1
> > compatibility[1], so it would greatly help us. I could go ahead and
> > try to make a backport myself, however I admit I'm a bit reluctant to
> > touch OpenSSL-related code at this point.
>
> You should definitely avoid it, the testing is insufficient for now.
>
> Another, better option would be to upgrade the haproxy package to 1.7
> for the next debian release so that it matches the new openssl version
> as well. There are (too) few changes in 1.7 compared to 1.6, it mostly
> accumulated all the fixes that resulted from the bugs coming with the
> new architecture brought in 1.6. I consider 1.7 almost as stable as 1.6,
> and will encourage users to upgrade. I don't know how much time left you
> have to decide on a version for a new distro (I don't know the process
> at all).

Let's say that we must have settled with a stable-enough version by early
December. Is there a chance there will be a final 1.7 release by then?

Regards,
Apollon
Re: multiproc ssl recommendations
Hi Willy, Elias,

On 08:33 Fri 09 Dec , Willy Tarreau wrote:
> On Thu, Dec 01, 2016 at 02:53:25PM +0100, Elias Abacioglu wrote:
> > # Should I use core 0 on each CPU for backends (proc 1+15) or should I
> > # use core 1 (proc 2+16)?
>
> Backends are processed on the same CPU as the frontend which passes them
> the traffic, so the bind-process has no effect there. In fact
> bind-process on a backend means "at least on these processes".
>
> That's why it's better to proceed like this (stupid numbers, just so
> that you get the idea):
>
>    listen ssl-offload
>        bind-process 2-50
>        bind :443 ssl process 2
>        ...
>        bind :443 ssl process 50

I wonder if a `per-process' keyword would make sense here. I find

    bind :443 ssl per-process

more concise than 15 or 20 individual bind lines. This would have the same
effect as N bind lines, one for each process in the bind-process list.

>        server clear 127.0.0.1:1 send-proxy-v2
>
>    frontend clear
>        bind-process 1
>        bind 127.0.0.1:1 accept-proxy

Would you recommend using unix sockets for process-to-process
communication, at least to get rid of the TCP state overhead?

Regards,
Apollon
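To make the suggestion concrete, the hypothetical `per-process' keyword would let the second configuration below stand in for the first (this keyword is not implemented; the process numbers are illustrative):

```
listen ssl-offload
    bind-process 2-50
    bind :443 ssl process 2
    bind :443 ssl process 3
    # ... one bind line per process ...
    bind :443 ssl process 50
    server clear 127.0.0.1:1 send-proxy-v2

# would collapse, if the parser auto-expanded it, to:
listen ssl-offload
    bind-process 2-50
    bind :443 ssl per-process
    server clear 127.0.0.1:1 send-proxy-v2
```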
Re: multiproc ssl recommendations
Hi Pavlos,

On 17:31 Fri 09 Dec , Pavlos Parissis wrote:
> On 09/12/2016 08:54, Apollon Oikonomopoulos wrote:
> > I wonder if a `per-process' keyword would make sense here. I find
> >
> >     bind :443 ssl per-process
> >
> > more concise than 15 or 20 individual bind lines. This would have the
> > same effect as N bind lines, one for each process in the bind-process
> > list.
>
> If you have bind per process then all sockets are bound separately and
> you get X listening sockets on port 443, which results to have one
> distinct socket in each process with its own queue (SYN backlog queues
> and etc), and the kernel's load balancing works much better.

That's true, yes. However what I'm saying is that some syntactic sugar to
have the parser auto-expand a single "bind" directive to create N sockets
instead of one would be nice.

Regards,
Apollon
Re: Haproxy refuses new connections when doing a reload followed by a restart
Hi all,

On 22:01 Wed 04 Oct , Lukas Tribus wrote:
> Hello Moemen,
>
> Am 04.10.2017 um 19:21 schrieb Moemen MHEDHBI:
> > I am wondering if this is actually an expected behaviour and if maybe
> > that restart/stop should just shutdown the process and its open
> > connections. I have made the following tests:
> > 1/ keep an open connection then do a restart will work correctly
> > without waiting for existing connections to be closed.
>
> You're right, I got confused there.
>
> Stop or restart is supposed to kill existing connections without any
> timeouts, and systemd would signal it with a SIGTERM to the
> systemd-wrapper:
>
> https://cbonte.github.io/haproxy-dconv/1.7/management.html#4
>
> > I think it makes more sense to say that restart will not wait for
> > established connections.
>
> Correct, that's the documented behavior as per haproxy documentation and
> really the implicit assumption when talking about stopping/restarting in
> a process management context (I need more coffee).
>
> > Otherwise there will be no difference between reload and restart
> > unless there is something else am not aware of.
> > If we need to fix 2/, a possible solution would be:
> > - Set killmode to "control-group" rather than "mixed" (the current
> >   value) in systemd unit file.
>
> Indeed the mixed killmode was a conscious choice:
> https://marc.info/?l=haproxy&m=141277054505608&w=2
>
> I guess the problem is that when a reload happens before a restart and
> the pre-reload systemd-wrapper process is still alive, systemd gets
> confused by that old process and therefore refrains from starting up the
> new instance.

The biggest issue here is that we are using a signal to trigger the reload
(which is a complex, non-atomic operation) and let things settle on their
own. Systemd assumes that as soon as the signal is delivered (i.e. the
ExecReload command is done), the reload is finished, while in our case the
reload is finished when the old haproxy process is really dead.

Using a signal to trigger the reload is handy, so we could keep that, but
the wrapper would need some changes to make reloads more robust:

 1. It should use sd_notify(3) to communicate the start/stop/reload status
    to systemd (that would also mean converting the actual service to
    Type=notify). This way no other operation will be performed on the
    unit until the reload is finished and the process group is in a
    known-good state.

 2. It should handle the old process better: apart from relying on the new
    haproxy process for killing the old one, it should explicitly SIGKILL
    it after a given timeout if it's not dead yet, and make sure reloads
    are timeboxed.

IIUC, in 1.8 the wrapper has been replaced by the master process, which
seems to do point 2 above, but point 1 is something that should still be
handled IMHO.

Regards,
Apollon
Re: Haproxy refuses new connections when doing a reload followed by a restart
On 16:17 Thu 12 Oct , William Lallemand wrote:
> Hi,
>
> On Thu, Oct 12, 2017 at 01:19:58PM +0300, Apollon Oikonomopoulos wrote:
> > The biggest issue here is that we are using a signal to trigger the
> > reload (which is a complex, non-atomic operation) and let things
> > settle on their own. Systemd assumes that as soon as the signal is
> > delivered (i.e. the ExecReload command is done), the reload is
> > finished, while in our case the reload is finished when the old
> > haproxy process is really dead. Using a signal to trigger the reload
> > is handy, so we could keep that, but the wrapper would need some
> > changes to make reloads more robust:
> >
> > 1. It should use sd_notify(3) to communicate the start/stop/reload
> >    status to systemd (that would also mean converting the actual
> >    service to Type=notify). This way no other operation will be
> >    performed on the unit until the reload is finished and the process
> >    group is in a known-good state.
> >
> > 2. It should handle the old process better: apart from relying on the
> >    new haproxy process for killing the old one, it should explicitly
> >    SIGKILL it after a given timeout if it's not dead yet and make sure
> >    reloads are timeboxed.
> >
> > IIUC, in 1.8 the wrapper has been replaced by the master process which
> > seems to do point 2 above, but point 1 is something that should still
> > be handled IMHO.
>
> One helpful feature I read in the documentation is the usage of
> sd_notify(.. "READY=1"). It can be useful for configuration files that
> take time to process, for example those with a lot of ssl frontends.
> This signal could be sent once the children have been forked.
>
> It's difficult to know when a reload is completely finished (old
> processes killed) in case of long TCP sessions. So, if we use this
> system there is a risk to trigger a timeout in systemd on the reload,
> isn't it?

The Reload timeout is apparently controlled by TimeoutStartSec in systemd.

> feature for the reload, it should be done after the fork of the new
> processes, not after the leaving of the old processes, because the
> processes are ready to receive traffic at this stage.

That's true. OTOH the problem with haproxy-systemd-wrapper is that once it
re-execs itself it loses track of the old processes completely (IIRC),
combined with the fact that old processes may eat up a lot of memory.
There are cases where you would prefer breaking a long TCP session after
30s if it would give you back 2GB of RSS, to having the process lying
around just for one client.

> However I'm not sure it's that useful, you can know when a process is
> ready using the logs, and it will add specific code for systemd and a
> dependency.
>
> Are there really advantages to letting systemd know when a reload is
> finished or when a process is ready?

Yes, there are. systemd will only perform a single operation on a unit at
a time, and will queue up the rest. When you inform systemd that something
(startup/reload) is in progress, it will not let any other action happen
until the first operation is finished. Otherwise it's trivial to issue a
ton of reloads in a row that will leave a ton of old processes lying
around until they terminate.

The other advantage with Type=notify services is that systemd will wait
for READY=1 before starting units with After=haproxy (although HAProxy is
really a "leaf" kind of service).

Note that the dependency is really thin, and you can always make it a
compile-time option.

Regards,
Apollon
Re: Haproxy refuses new connections when doing a reload followed by a restart
On 17:30 Thu 12 Oct , William Lallemand wrote:
> On Thu, Oct 12, 2017 at 05:50:52PM +0300, Apollon Oikonomopoulos wrote:
> > Yes, there are. systemd will only perform a single operation on a unit
> > at a time, and will queue up the rest. When you inform systemd that
> > something (startup/reload) is in progress, it will not let any other
> > action happen until the first operation is finished. Now it's trivial
> > to issue a ton of reloads in a row that will leave a ton of old
> > processes lying around until they terminate.
>
> I don't think you can, either with the master-worker or the wrapper, it
> was one of the problems we had in the past.
>
> The master-worker waits to be ready to handle the signals, and the
> wrapper waits for a pipe to be closed on the children side to handle
> signals.

Interesting, thanks! I guess I'm still stuck in the 1.6 era, I have some
catch-up to do :)

Regards,
Apollon