500s with IH--- termination state on haproxy 2.4.16
Testing an upgrade from haproxy 2.4.15 to 2.4.16 and started getting some 500s with session state IH---, only on load balancers upgraded to 2.4.16. It's only affecting requests from a couple of customers (about 0.08% of traffic) and I haven't reproduced it in a testing environment yet, so I assume there's something unusual about these requests. All the failed requests are POSTs over HTTP/2.0, but so are most of the successful requests, so I don't think that's necessarily relevant. haproxy isn't logging anything unusual at the time (just the normal HTTP request log; no exceptions being logged or anything).

Any thoughts on what might have changed in 2.4.16 to cause this? Is there any way to ask haproxy to log something specifically when it hits whatever internal error causes an I state? I'm going to roll back to 2.4.15 for now.

James Brown
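(Not a full answer, but one low-cost way to dig into an I (internal error) termination state is to pull HAProxy's error captures from the stats socket. A sketch, assuming the socket path below; "show errors" captures protocol parsing errors per proxy and may or may not catch this particular failure:)

```
global
    # admin-level stats socket, if you don't already have one
    stats socket /var/run/haproxy.sock mode 600 level admin

# Then, from a shell on the load balancer, after a failure:
#   echo "show errors" | socat stdio /var/run/haproxy.sock
#   echo "show sess"   | socat stdio /var/run/haproxy.sock
```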
Re: [ANNOUNCE] HTTP/2 vulnerabilities from 2.0 to 2.5-dev
> What happens is that on the entry point, the :scheme, :authority and :path
> fields are concatenated to rebuild a full URI that is then passed along the
> chain, but the Host header is set from :authority before this concatenation
> is performed. As such, the Host header field used internally may not always
> match the authority part of the recomposed URI. Examples:
>
>     H2 request
>       :method:    "GET"
>       :scheme:    "http://localhost/?orig="
>       :authority: "example.org"
>       :path:      "/"
>
> or:
>
>     H2 request
>       :method:    "GET"
>       :scheme:    "http"
>       :authority: "example.org"
>       :path:      ".local/"
>
> An internal Host header will be built with "example.org", then the complete
> URI will become "http://localhost/?orig=example.org/" in the first example,
> or "http://example.org.local/" in the second example, and this URI will be
> used to build the HTTP/2 request on the server side, dropping the unneeded
> Host header field. In HTTP/1 there is no such issue as the URI is dropped
> and the Host is kept. Thus if the configuration contains some routing rules
> based on the Host header field, a target HTTP/2 server might receive a
> different :authority than the one that was expected to be routed there.
>
> A workaround consists in rewriting the URI as itself before processing the
> Host header field, which will have for effect to resynchronize the Host
> header field with the recomposed URI, making sure both haproxy and the
> backend server will always see the same value:
>
>     http-request set-uri %[url]
>
> 3) Mismatch between ":authority" and "Host"
>
> The HTTP/2 specification (RFC7540) implicitly allows the "Host" header
> and the ":authority" header field to differ and further mentions that the
> contents of ":authority" may be used to build "Host" if this one is
> missing. This results in an ambiguous situation analogous to the one
> above, because rules built based on the "Host" field will match against a
> possibly different "Host" header field that will be dropped when the
> request is forwarded to an HTTP/2 backend server. An HTTP/1 server will
> not be affected since HTTP/2 requests are forwarded to HTTP/1 in origin
> form, i.e. without the authority part. Example:
>
>     H2 request
>       :method:    "GET"
>       :scheme:    "http"
>       :authority: "victim.com"
>       :path:      "/"
>       Host:       "example.org"
>
> Internal switching rules using the "Host" header field will see
> "example.org", but when the request is passed to an H2 server, "Host" will
> be dropped and "victim.com" will be used by this server to fill the
> missing "Host" header.
>
> The new H2 specification in progress ("http2bis") addresses this issue by
> proposing that "Host" is always ignored on input in favor of ":authority",
> which remains more consistent with what is done along the chain. This is
> the solution adopted by the fix here.
>
> A workaround consists in using the same rule as for the previous issue,
> before the Host header field is used by any switching rule (typically in
> the frontend), which will have for effect to rewrite the "Host" part
> according to the contents of the ":authority" field:
>
>     http-request set-uri %[url]
>
> 4) Affected versions
>
> - versions 1.7 do not support H2 and are not affected
> - versions 1.8 only support H2 in legacy mode and are not affected
> - versions 2.0 prior to 2.0.24 are affected by the :method bug
> - versions 2.2 prior to 2.2.16 are affected by all 4 bugs
> - versions 2.3 prior to 2.3.13 are affected by all 4 bugs
> - versions 2.4 prior to 2.4.3 are affected by all 4 bugs
> - versions 2.5 prior to 2.5-dev4 are affected by all 4 bugs
>
> 5) Instant remediation
>
> Several solutions are usable against all of these issues in affected
> versions before upgrading:
>
> - disabling HTTP/2 communication with servers by removing "proto h2" from
>   "server" lines is sufficient to address the ":authority", ":scheme", and
>   ":path" issues if the servers are known *not* to be vulnerable to the
>   issue described in the ":method" attack above. This probably is the
>   easiest solution when using trusted mainstream backend servers such as
>   Apache, NGINX or Varnish, especially since very few configurations make
>   use of H2 to communicate with servers.
>
> - placing the two following rules at the beginning of every HTTP frontend:
>
>     http-request reject if { method -m reg [^A-Z0-9] }
>     http-request set-uri %[url]
>
> - in version 2.0, disabling HTX processing will force the request to be
>   reprocessed by the internal HTTP/1 parser (but this is not compatible
>   with H2 servers nor FastCGI servers):
>
>     no option http-use-htx
>
> - commenting out "alpn h2" advertisement on all "bind" lines in frontends,
>   and disabling H2 processing entirely by placing the following line in
>   the global section:
>
>     tune.h2.max-concurrent-streams 0
>
> - in versions 2.2 and above it is possible to refine filtering per frontend
>   by disabling "alpn h2" per bind line and by disabling HTTP/1 to HTTP/2
>   upgrade by placing this option in the respective frontends:
>
>     option disable-h2-upgrade
>
> Many thanks to Tim for helping getting these issues resolved!
> Willy

--
James Brown
Engineer
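To make the second ":path" example in the advisory concrete: a toy Python sketch (my own illustration, not HAProxy's actual code) of what happens when the pseudo-headers are naively glued back into a URI with no per-field validation. With a crafted :path, the authority seen in the rebuilt URI no longer matches the :authority field that populated the Host header.

```python
def recompose_uri(scheme: str, authority: str, path: str) -> str:
    """Naively rebuild a full URI from H2 pseudo-headers, with no
    validation of the individual fields -- the hazard the advisory
    describes. (Illustrative only, not HAProxy's implementation.)"""
    return scheme + "://" + authority + path

# Well-formed request recomposes as expected:
print(recompose_uri("http", "example.org", "/"))        # http://example.org/

# But a crafted :path grafts itself onto the authority, so routing done on
# the rebuilt URI sees "example.org.local", not "example.org":
print(recompose_uri("http", "example.org", ".local/"))  # http://example.org.local/
```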
Re: http-response set-header and redirect
Thanks!

On Fri, Jun 11, 2021 at 11:36 AM Tim Düsterhus wrote:
> James,
>
> On 6/11/21 8:28 PM, James Brown wrote:
> > Is there any reason (performance or otherwise) to use http-response
> > instead of just turning everything into http-after-response?
>
> There is a difference: if an http-response rule fails [1] then a standard
> error page will be emitted. For this error page the http-after-response
> rules will need to be evaluated. They might fail as well, aborting the
> processing and causing a very simple 500 Internal Server Error to be
> emitted. This will suppress any other error (e.g. 503, 403, …).
>
> So complex http-after-response rules might cause additional (debugging)
> issues in error situations.
>
> I recommend using them for the most essential stuff only. In my case
> that is the Strict-Transport-Security header and a request ID response
> header.
>
> Best regards
> Tim Düsterhus
>
> [1] e.g. if there's insufficient memory to add the header.

--
James Brown
Engineer
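(For readers skimming the thread, a minimal sketch of the split Tim recommends; the header values are illustrative, not from the original posts:)

```
frontend www
    mode http
    bind :80   # illustrative bind

    # http-after-response also runs for HAProxy-generated responses
    # (error pages, redirects) -- keep it to the essentials:
    http-after-response set-header Strict-Transport-Security "max-age=31536000"

    # Everything else stays in http-response, which only touches
    # responses coming from servers:
    http-response set-header Cache-Control "no-store"
```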
Re: http-response set-header and redirect
Is there any reason (performance or otherwise) to use http-response instead of just turning everything into http-after-response? On Fri, Jun 11, 2021 at 11:07 AM Tim Düsterhus wrote: > James, > > On 6/11/21 8:03 PM, James Brown wrote: > > Is there any way to set a HTTP header on a redirect being emitted by > > haproxy? > > To also match HAProxy generated responses (including redirects and error > pages) you will need to use 'http-after-response': > > > https://cbonte.github.io/haproxy-dconv/2.4/configuration.html#http-after-response > > Best regards > Tim Düsterhus > -- James Brown Engineer
http-response set-header and redirect
Is there any way to set an HTTP header on a redirect being emitted by haproxy? Given the following simplified config:

    global
        log stdout user

    defaults
        log global
        timeout client 9s
        timeout server 10s
        timeout connect 1s

    frontend test_fe
        mode http
        http-response set-header Foo Bar
        bind localhost:
        redirect prefix https://www.example.com

It appears that the Foo header is not set when the redirect is emitted. Is there any way to configure HAProxy to process `http-response` statements on a redirect?

--
James Brown
Engineer
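(Per Tim's answer earlier in this thread, the working variant of the config above swaps http-response for http-after-response, which also matches HAProxy-generated responses; a sketch:)

```
frontend test_fe
    mode http
    # http-after-response also runs for HAProxy-generated responses,
    # including the redirect below:
    http-after-response set-header Foo Bar
    bind localhost:
    redirect prefix https://www.example.com
```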
Re: lua function core.get_info() broken in haproxy 2.2.7
Ah, never mind, I see that this was already fixed in master in 3ddec3ee7d344112b4e4fbde317f8886a20d66a0. On Fri, Jan 29, 2021 at 6:01 PM James Brown wrote: > As of haproxy 2.2.7, the core.get_info() lua function no longer works. > Calling it raises a runtime error of the following: > > [ALERT] 029/015726 (65429) : lua init: runtime error: table index is nil > from [C] field 'get_info', test.lua:2 C function line 1. > > This worked in haproxy 2.2.6 and all previous versions. > > I did a `git bisect` of the history between v2.2.6 and v2.2.7 and it > claims that the offending commit > is ce3e98eb1f9240b7633c5de6855dd115e61708ae; I can verify that the > reproduction below is OK with b50434 and fails with ce3e98eb although I > can't for the life of me tell you why, since ce3e98eb looks pretty > innocuous to me. > > Reproduction files: > > *test.lua* > > function poc() > local info = core.get_info() > end > > core.register_init(poc) > > > *test.cfg* > > global > log stdout user > lua-load "test.lua" > > defaults > log global > timeout client 9s > timeout server 10s > timeout connect 1s > > frontend test_fe > mode http > bind localhost: > default_backend test_be > > > backend test_be > mode http > server localhost 127.0.0.1:9998 > > > -- > James Brown > Engineer > -- James Brown Engineer
lua function core.get_info() broken in haproxy 2.2.7
As of haproxy 2.2.7, the core.get_info() lua function no longer works. Calling it raises a runtime error of the following:

    [ALERT] 029/015726 (65429) : lua init: runtime error: table index is nil from [C] field 'get_info', test.lua:2 C function line 1.

This worked in haproxy 2.2.6 and all previous versions.

I did a `git bisect` of the history between v2.2.6 and v2.2.7 and it claims that the offending commit is ce3e98eb1f9240b7633c5de6855dd115e61708ae; I can verify that the reproduction below is OK with b50434 and fails with ce3e98eb, although I can't for the life of me tell you why, since ce3e98eb looks pretty innocuous to me.

Reproduction files:

test.lua:

    function poc()
        local info = core.get_info()
    end

    core.register_init(poc)

test.cfg:

    global
        log stdout user
        lua-load "test.lua"

    defaults
        log global
        timeout client 9s
        timeout server 10s
        timeout connect 1s

    frontend test_fe
        mode http
        bind localhost:
        default_backend test_be

    backend test_be
        mode http
        server localhost 127.0.0.1:9998

--
James Brown
Engineer
stick table conn_cur broken with peer synchronization
I've noticed that sc0_conn_cur (tracking the current number of connections, keyed on an extracted field from the request) is much higher in 2.2 than it was in 2.1 and seems to no longer be correct.

For example, on one relatively un-loaded load balancer which only has around 940 total open TCP sockets (according to netstat), and where the "Sessions" section for this proxy in the stats interface shows a Cur value of 19 and a Max of 120, fetching the stick table through the control socket shows rows with conn_cur in the thousands. As far as I know, it should be impossible for conn_cur to be higher than the total number of in-flight sessions on the proxy...

The relevant haproxy config looks like:

    stick-table type string len 32 size 512 expire 5m store gpc0,gpc0_rate(5m),http_req_rate(10s),conn_cur peers lb
    http-request track-sc0 req.hdr(Authorization)

My intuition is that this is probably a bug with peer synchronization, because it only seems to happen when the "peers lb" line is at the end of the block.

--
James Brown
Engineer
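(For anyone trying to reproduce this: the per-entry counters can be filtered straight from the stats socket, which makes it easy to watch for impossible conn_cur values. A sketch, assuming the table name and socket path shown:)

```
# Dump only stick-table entries whose conn_cur exceeds a plausible bound
# ("my_backend" and the socket path are placeholders for your setup):
#
#   echo "show table my_backend data.conn_cur gt 100" | \
#       socat stdio /var/run/haproxy.sock
```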
Re: do we want to keep CentOS 6 builds?
I mean, I certainly do. And today's unstable haproxy is tomorrow's stable haproxy...

On Mon, Nov 16, 2020 at 1:48 PM Илья Шипицин wrote:
> we run CI only for master branch.
>
> do all those people want to run latest unstable haproxy on oldish RHEL 6 ?
>
> пн, 16 нояб. 2020 г. в 23:56, James Brown :
>
>> Since CentOS 6 / RHEL 6 is the last pre-systemd release, I think there
>> are lots of shops planning on keeping it around with various
>> community-supported backports for years to come. I would vote to keep it
>> around until it becomes an undue burden for CI, because there are still
>> tons of EL6 users out there that have no migration path.
>>
>> On Sun, Nov 15, 2020 at 1:55 PM John Lauro wrote:
>>
>>> CentOS 6 isn't EOL until the end of the month, so there is a couple of
>>> more weeks left.
>>>
>>> There is at least one place to pay for support through 2024.
>>> ($3/month/server)
>>>
>>> Might be good to keep for a bit past EOL, as I know when migrating
>>> services sometimes I'll throw a proxy server on the old server to the new
>>> one... and there will likely be some that don't make the Nov 30th deadline
>>> to retire all CentOS 6 servers.
>>>
>>> On Sun, Nov 15, 2020 at 11:15 AM Илья Шипицин wrote:
>>>
>>>> Hello,
>>>>
>>>> we still run cirrus-ci builds.
>>>> CentOS 6 is EOL.
>>>>
>>>> should we drop it?
>>>>
>>>> Ilya

--
James Brown
Engineer
Re: do we want to keep CentOS 6 builds?
Since CentOS 6 / RHEL 6 is the last pre-systemd release, I think there are lots of shops planning on keeping it around with various community-supported backports for years to come. I would vote to keep it around until it becomes an undue burden for CI, because there are still tons of EL6 users out there that have no migration path. On Sun, Nov 15, 2020 at 1:55 PM John Lauro wrote: > CentOS 6 isn't EOL until the end of the month, so there is a couple of > more weeks left. > > There is at least one place to pay for support through 2024. > ($3/month/server) > > Might be good to keep for a a bit past EOL, as I know when migrating > services sometimes I'll throw a proxy server on the old server to the new > one... and there will likely be some that don't make the Nov 30th deadline > to retire all Centos 6 servers. > > > On Sun, Nov 15, 2020 at 11:15 AM Илья Шипицин > wrote: > >> Hello, >> >> we still run cirrus-ci builds. >> CentOS 6 is EOL. >> >> should we drop it? >> >> Ilya >> > -- James Brown Engineer
Re: HTTP method ACLs broken in HAProxy 2.2.3
Thanks Christopher, I'll give it a shot today. On Fri, Sep 18, 2020 at 6:39 AM Christopher Faulet wrote: > Le 18/09/2020 à 10:47, Christopher Faulet a écrit : > > Le 18/09/2020 à 01:33, James Brown a écrit : > >> git bisect says that this regression was caused > >> by commit c89077713915f605eb5d716545f182c8d0bf5581 > >> > >> This makes little sense to me, since that commit doesn't touch anything > even > >> slightly related. > >> > >> As far as I can tell, the proximate issue is that PATCH is not a > "well-known > >> method" to HAproxy, despite being in RFC 5789. find_http_meth is > returning > >> HTTP_METH_OTHER (silently) for it, and there's something hinky with how > the > >> method ACL matcher handles HTTP_METH_OTHER. > >> > >> I tried to poke the pat_match_meth function with a debugger, but it's > not even > >> being called in v2.2.3 (it /is/ being called in v2.2.2). Did something > break in > >> how custom matchers are called? > >> > > > > I'm able to reproduce the bug. In fact, it was introduced by the commit > > 05f3910f5 ("BUG/MEDIUM: htx: smp_prefetch_htx() must always validate the > > direction"). I attached a patch to fix the bug. > > > > Sorry, it is the wrong patch. I attached a totally unrelated and outdated > patch. > Anyway, I pushed the fix in the 2.3-dev: > >http://git.haproxy.org/?p=haproxy.git;a=commitdiff;h=d2414a23 > > It is not backported yet to the 2.2, but you can apply it. > > -- > Christopher Faulet > > -- James Brown Engineer
Re: HTTP method ACLs broken in HAProxy 2.2.3
git bisect says that this regression was caused by commit c89077713915f605eb5d716545f182c8d0bf5581

This makes little sense to me, since that commit doesn't touch anything even slightly related.

As far as I can tell, the proximate issue is that PATCH is not a "well-known method" to HAProxy, despite being in RFC 5789. find_http_meth is returning HTTP_METH_OTHER (silently) for it, and there's something hinky with how the method ACL matcher handles HTTP_METH_OTHER.

I tried to poke the pat_match_meth function with a debugger, but it's not even being called in v2.2.3 (it *is* being called in v2.2.2). Did something break in how custom matchers are called?

On Thu, Sep 17, 2020 at 3:48 PM James Brown wrote:
> One of our configurations includes the following snippet:
>
>     acl allowed_method method HEAD GET POST PUT PATCH DELETE OPTIONS
>     http-request deny if !allowed_method
>
> In HAProxy 2.2.2 and below, this correctly blocks all requests that are
> not HEAD, GET, PUT, POST, PATCH, DELETE, or OPTIONS.
>
> In HAProxy 2.2.3, this blocks PATCH requests. It seems to *only* be
> broken for PATCH requests.
>
> The word "PATCH" does not occur in the diff between 2.2.2 and 2.2.3, which
> is concerning.

--
James Brown
Engineer
HTTP method ACLs broken in HAProxy 2.2.3
One of our configurations includes the following snippet:

    acl allowed_method method HEAD GET POST PUT PATCH DELETE OPTIONS
    http-request deny if !allowed_method

In HAProxy 2.2.2 and below, this correctly blocks all requests that are not HEAD, GET, PUT, POST, PATCH, DELETE, or OPTIONS.

In HAProxy 2.2.3, this blocks PATCH requests. It seems to *only* be broken for PATCH requests.

The word "PATCH" does not occur in the diff between 2.2.2 and 2.2.3, which is concerning.

--
James Brown
Engineer
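(A possible interim workaround until a fix lands, untested against 2.2.3: match the method as a string with an explicit match type instead of relying on the well-known-method pattern matcher, so PATCH never goes through the HTTP_METH_OTHER path:)

```
# Workaround sketch (untested): string/regex comparison on the method
# sample, bypassing the method pattern type entirely.
acl allowed_method method -m reg ^(HEAD|GET|POST|PUT|PATCH|DELETE|OPTIONS)$
http-request deny if !allowed_method
```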
missing date() fetcher in lua api in haproxy 2.2
In earlier versions of haproxy, the txn.f object had a `date()` method when called from a fetch context, which returned the current unix timestamp. In HAProxy 2.2, this method is removed. The `date()` fetch is still documented and works fine from bare haproxy config, so I'm not sure why Lua can't hit it.

For a minimal reproducer, stick the following in "helpers.lua":

    core.register_fetches("date_wrapper", function(txn)
        core.Debug("fetching r_u_i\n")
        local date = txn.f:date()
        core.Debug(string.format("current date is %d", date))
        return string.format("%08x", date)
    end)

Then run the following haproxy config:

    global
        maxconn 65536
        lua-load "helpers.lua"
        hard-stop-after 30m

    defaults
        timeout server 10s
        timeout client 10s
        timeout connect 1s

    frontend test_fe
        mode http
        bind 127.0.0.1:9993
        http-response add-header Lua-Fetch %[lua.date_wrapper]
        http-response add-header Regular-Fetch %[date]
        default_backend test_be

    backend test_be
        mode http
        server localhost 127.0.0.1:9992

This will work fine in haproxy 2.0.x (didn't test 2.1.x yet) and will fail on haproxy 2.2.x with:

    Lua sample-fetch 'date_wrapper': runtime error: helpers.lua:3: attempt to call a nil value (method 'date') from helpers.lua:3 C function line 1

My best guess is that this is related to ae6f125c <https://github.com/haproxy/haproxy/commit/ae6f125c7b33454770aaa363101384e8daafc2a2> but I don't really understand how sample fetchers are mapped into the Lua API, so this is just a guess. Is it possible that two-optional-argument sample fetchers just don't work?

I can work around this by doing `os.time()`, but I'd just as soon reuse haproxy's unix timestamp rather than make another system call, and afaik os.time() isn't portably guaranteed to be a unix timestamp or to be in UTC.

Unrelatedly: the "Lua reference manual" linked in the documentation <http://www.haproxy.org/#docs> (http://www.arpalert.org/src/haproxy-lua-api/2.0dev/index.html) now returns a 403. I believe the correct URL is https://www.arpalert.org/src/haproxy-lua-api/2.2dev/index.html.

--
James Brown
Engineer
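(For completeness, the os.time() workaround mentioned above would look like this, with the caveat from the text that Lua does not portably guarantee os.time() is a UTC unix timestamp:)

```lua
core.register_fetches("date_wrapper", function(txn)
    -- Workaround sketch: use os.time() instead of txn.f:date() while
    -- the fetch is unavailable in 2.2. On POSIX systems os.time()
    -- returns seconds since the unix epoch, but that is
    -- platform-defined, not guaranteed by the Lua spec.
    return string.format("%08x", os.time())
end)
```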
Plaintext HTTP/2 and HTTP/1.1 in the same proxy with HAproxy 2.2
CEATLAS -51DEGREES -WURFL -SYSTEMD -OBSOLETE_LINKER +PRCTL +THREAD_DUMP -EVPORTS

    Default settings :
      bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

    Built with multi-threading support (MAX_THREADS=64, default=2).
    Built with OpenSSL version : OpenSSL 1.0.2u  20 Dec 2019
    Running on OpenSSL version : OpenSSL 1.0.2u  20 Dec 2019
    OpenSSL library supports TLS extensions : yes
    OpenSSL library supports SNI : yes
    OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
    Built with Lua version : Lua 5.3.5
    Built with zlib version : 1.2.3
    Running on zlib version : 1.2.3
    Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
    Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
    Built with PCRE version : 7.8 2008-09-05
    Running on PCRE version : 7.8 2008-09-05
    PCRE library supports JIT : no (USE_PCRE_JIT not set)
    Encrypted password support via crypt(3): yes
    Built with gcc compiler version 4.4.7 20120313 (Red Hat 4.4.7-23)

    Available polling systems :
          epoll : pref=300,  test result OK
           poll : pref=200,  test result OK
         select : pref=150,  test result OK
    Total: 3 (3 usable), will use epoll.

    Available multiplexer protocols :
    (protocols marked as <default> cannot be specified using 'proto' keyword)
           fcgi : mode=HTTP  side=BE     mux=FCGI
      <default> : mode=HTTP  side=FE|BE  mux=H1
             h2 : mode=HTTP  side=FE|BE  mux=H2
      <default> : mode=TCP   side=FE|BE  mux=PASS

    Available services : none

    Available filters :
        [SPOE] spoe
        [COMP] compression
        [TRACE] trace
        [CACHE] cache
        [FCGI] fcgi-app

--
James Brown
Engineer
Re: haproxy 2.0.14 failing to bind peer sockets
It seems to also fail on cold start when there's a `peers` block that is unused. It's very mysterious! We aren't actually using the peers for anything in this config any more, so I'm going to strip it out for now and proceed with testing 2.0.14. On Mon, Apr 6, 2020 at 1:56 PM Willy Tarreau wrote: > On Mon, Apr 06, 2020 at 01:50:56PM -0700, James Brown wrote: > > I actually messed up testing last week; reverting Tim's commit appears to > > fix it. > > OK that's very useful, thanks! However you didn't respond to my other > question: > > > > > James, just to confirm, does it fail to start from a cold start or > only > > > > on reloads ? > > Willy > -- James Brown Engineer
Re: haproxy 2.0.14 failing to bind peer sockets
I actually messed up testing last week; reverting Tim's commit appears to fix it.

On Fri, Apr 3, 2020 at 5:41 AM Willy Tarreau wrote:
> On Fri, Apr 03, 2020 at 02:27:05PM +0200, Willy Tarreau wrote:
> > On Thu, Apr 02, 2020 at 12:32:32PM -0700, James Brown wrote:
> > > I reverted that commit, but it doesn't appear to have fixed the issue.
> > >
> > > I also tried adding a stick-table using this peers group to my config (this
> > > test cluster didn't actually have any stick-tables), but it still fails at
> > > startup with the same error.
> >
> > James, just to confirm, does it fail to start from a cold start or only
> > on reloads ?
>
> I'm trying with this config and this command:
>
>     global
>         stats socket /tmp/sock1 mode 666 level admin expose-fd listeners
>         stats timeout 1d
>
>     peers p
>         peer peer1 127.0.0.1:8521
>         peer peer2 127.0.0.1:8522
>
>     listen l
>         mode http
>         bind 127.0.0.1:2501
>         timeout client 10s
>         timeout server 10s
>         timeout connect 10s
>         stick-table size 200 expire 10s type ip peers p store server_id
>         stick on src
>         server s 127.0.0.1:8000
>
>     $ ./haproxy -D -L peer1 -f peers.cfg -p /tmp/haproxy.pid
>     $ ./haproxy -D -L peer1 -f peers.cfg -p /tmp/haproxy.pid -sf $(pidof haproxy) -x /tmp/sock1
>     $ ./haproxy -D -L peer1 -f peers.cfg -p /tmp/haproxy.pid -sf $(pidof haproxy) -x /tmp/sock1
>     $ ./haproxy -D -L peer1 -f peers.cfg -p /tmp/haproxy.pid -sf $(pidof haproxy) -x /tmp/sock1
>
> For now I can't figure how to reproduce it :-/ If you manage to modify
> this config to trigger the issue that would be great!
>
> Willy

--
James Brown
Engineer
Re: haproxy 2.0.14 failing to bind peer sockets
I reverted that commit, but it doesn't appear to have fixed the issue. I also tried adding a stick-table using this peers group to my config (this test cluster didn't actually have any stick-tables), but it still fails at startup with the same error. On Thu, Apr 2, 2020 at 11:28 AM Tim Düsterhus wrote: > James, > > Am 02.04.20 um 19:53 schrieb James Brown: > > I'm upgrading one of our test clusters from 2.0.13 to 2.0.14 and our > > regular graceful-restart process is failing with: > > > > [ALERT] 092/174647 (114374) : [/usr/sbin/haproxy.main()] Some protocols > > failed to start their listeners! Exiting. > > I suppose this commit might be at fault here: > > https://github.com/haproxy/haproxy/commit/a2cfd7e356f4d744294b510b05d88bf58304db25 > > Try reverting it to see whether it fixes the issue. > > Best regards > Tim Düsterhus > -- James Brown Engineer
haproxy 2.0.14 failing to bind peer sockets
I'm upgrading one of our test clusters from 2.0.13 to 2.0.14 and our regular graceful-restart process is failing with:

    [ALERT] 092/174647 (114374) : [/usr/sbin/haproxy.main()] Some protocols failed to start their listeners! Exiting.

Looking at strace, it looks like the bind(2) call for the peer socket is failing. Did something change about the order in which peer sockets are bound? Our peers block is pretty straightforward and hasn't changed in several years:

    peers lb
        peer devlb1west 10.132.46.130:7778
        peer devlb2west 10.132.37.135:7778

Our graceful restart command looks like:

    /usr/sbin/haproxy -f /path/to/haproxy.config -p /home/srvelb/run/haproxy.pid -sf 70409 -x /path/to/admin/mode/socket

and also hasn't changed since the addition of domain-socket FD passing in 1.8. I notice a bunch of peer-related commits got pulled into 2.0.14... Anyone else seen this?

--
James Brown
Engineer
Re: Recommendations for deleting headers by regexp in 2.x?
So how should we move this proposal forward? I'm glad to contribute more patches...

On Fri, Jan 24, 2020 at 2:15 AM Willy Tarreau wrote:
> On Fri, Jan 24, 2020 at 10:26:34AM +0100, Christopher Faulet wrote:
> > Le 24/01/2020 à 09:17, Willy Tarreau a écrit :
> > > On Fri, Jan 24, 2020 at 08:28:33AM +0100, Christopher Faulet wrote:
> > > > Le 23/01/2020 à 19:59, James Brown a écrit :
> > > > > I spent a couple of minutes and made the attached (pretty bad) patch to
> > > > > add a del-header-by-prefix.
> > > >
> > > > Just an idea. Instead of adding a new action, it could be cleaner to extend
> > > > the del-header action adding some keywords. Something like:
> > > >
> > > >     http-request del-header begin-with
> > > >     http-request del-header end-with
> > > >     http-request del-header match
> > > >
> > > > It could be also extended to replace-header and replace-value actions.
> > >
> > > I would also prefer to extend existing syntax, however it's problematic
> > > to insert optional words *before* arguments. This will complicate the
> > > parsing, and even config manipulation scripts.
> > >
> > > That's why I thought we could instead just append an extra optional
> > > keyword after the name, even though it's less elegant.
> >
> > From the configuration parsing point of view, it is more or less the same
> > thing. You must test if the second argument is defined or not. And in fact,
> > moving it after the header name is not a "better" solution because there is
> > an optional condition too at the end. So this one will not be the last one.
>
> No, it's more complicated this way because you have to check each and every
> word to figure the syntax. Example: how do you mention that you want to
> remove the header field matching regex "unless"? You'd have to do this:
>
>     http-request del-header match unless
>
> And it's ambiguous, as it can either mean:
>
>     - delete header name "match" unless a condition which needs to be parsed,
>       and once figured invalid, you can roll back;
>     - delete header described by regex "unless" with no condition
>
> When you do it the other way around it's way easier, because the name
> always being the first argument makes sure the second one is limited to
> a small subset (match/prefix/if/unless for example):
>
>     http-request del-header unless match
>
> A variant of this could be to use the same syntax as the options we already
> use on ACL matches, which are "-m reg", "-m beg", "-m end". But these will
> also need to be placed after to avoid the same ambiguity (since "-m" is a
> token hence a valid header name). That would give for example:
>
>     http-request del-header server
>     http-request del-header x-private- -m beg
>     http-request del-header x-.*company -m reg
>     http-request del-header -tracea -m end
>
> Willy

--
James Brown
Engineer
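(For readers finding this thread later: the trailing "-m" form Willy sketches is, from memory, the syntax that ultimately shipped in later HAProxy releases (2.5+, if I recall correctly); check your version's documentation before relying on it:)

```
# Later releases accept an ACL-style match-type suffix on del-header:
http-request del-header x-private- -m beg
http-request del-header x-.*company -m reg
```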
Re: Recommendations for deleting headers by regexp in 2.x?
(2 usable), will use epoll.
>
> > Available filters :
> >     [SPOE] spoe
> >     [CACHE] cache
> >     [FCGI] fcgi-app
> >     [TRACE] trace
> >     [COMP] compression
> > Using epoll() as the polling mechanism.
> > [WARNING] 022/210543 (19765) : [./haproxy.main()] Cannot raise FD limit to 2071, limit is 1024. This will fail in >= v2.3
> > [ALERT] 022/210543 (19765) : [./haproxy.main()] FD limit (1024) too low for maxconn=1024/maxsock=2071. Please raise 'ulimit-n' to 2071 or more to avoid any trouble. This will fail in >= v2.3
> > ==19765== Thread 2:
> > ==19765== Syscall param timer_create(evp.sigev_value) points to uninitialised byte(s)
> > ==19765==    at 0x5292FE0: timer_create@@GLIBC_2.3.3 (timer_create.c:78)
> > ==19765==    by 0x53824D: init_wdt_per_thread (wdt.c:146)
> > ==19765==    by 0x4B1D84: run_thread_poll_loop (haproxy.c:2723)
> > ==19765==    by 0x50796B9: start_thread (pthread_create.c:333)
> > ==19765==    by 0x559E41C: clone (clone.S:109)
> > ==19765== Address 0x643ea64 is on thread 2's stack
> > ==19765== in frame #1, created by init_wdt_per_thread (wdt.c:131)
> > ==19765==
> > ==19765== Thread 1:
> > ==19765== Syscall param timer_create(evp.sigev_value) points to uninitialised byte(s)
> > ==19765==    at 0x5292FE0: timer_create@@GLIBC_2.3.3 (timer_create.c:78)
> > ==19765==    by 0x53824D: init_wdt_per_thread (wdt.c:146)
> > ==19765==    by 0x4B1D84: run_thread_poll_loop (haproxy.c:2723)
> > ==19765==    by 0x40760C: main (haproxy.c:3483)
> > ==19765== Address 0xffefffe84 is on thread 1's stack
> > ==19765== in frame #1, created by init_wdt_per_thread (wdt.c:131)
> > ==19765==
> > :test_fe.accept(0004)=0010 from [:::127.0.0.1:48036] ALPN=
> > :test_fe.clireq[0010:]: GET / HTTP/1.1
> > :test_fe.clihdr[0010:]: host: localhost:
> > :test_fe.clihdr[0010:]: user-agent: curl/7.47.0
> > :test_fe.clihdr[0010:]: accept: */*
> > 0001:test_fe.accept(0004)=0010 from [:::127.0.0.1:48036] ALPN=
> > 0001:test_fe.clicls[0011:]
> > 0001:test_fe.closed[0011:]
> > ==19765== Invalid read of size 8
> > ==19765==    at 0x499DD5: back_handle_st_con (backend.c:1937)
> > ==19765==    by 0x427353: process_stream (stream.c:1662)
> > ==19765==    by 0x5023E9: process_runnable_tasks (task.c:461)
> > ==19765==    by 0x4B1E78: run_poll_loop (haproxy.c:2630)
> > ==19765==    by 0x4B1E78: run_thread_poll_loop (haproxy.c:2783)
> > ==19765==    by 0x40760C: main (haproxy.c:3483)
> > ==19765== Address 0x18 is not stack'd, malloc'd or (recently) free'd
> > ==19765==
> > ==19765==
> > ==19765== Process terminating with default action of signal 11 (SIGSEGV)
> > ==19765== Access not within mapped region at address 0x18
> > ==19765==    at 0x499DD5: back_handle_st_con (backend.c:1937)
> > ==19765==    by 0x427353: process_stream (stream.c:1662)
> > ==19765==    by 0x5023E9: process_runnable_tasks (task.c:461)
> > ==19765==    by 0x4B1E78: run_poll_loop (haproxy.c:2630)
> > ==19765==    by 0x4B1E78: run_thread_poll_loop (haproxy.c:2783)
> > ==19765==    by 0x40760C: main (haproxy.c:3483)
> > ==19765== If you believe this happened as a result of a stack
> > ==19765== overflow in your program's main thread (unlikely but
> > ==19765== possible), you can try to increase the size of the
> > ==19765== main thread stack using the --main-stacksize= flag.
> > ==19765== The main thread stack size used in this run was 8388608.
> > ==19765==
> > ==19765== HEAP SUMMARY:
> > ==19765==     in use at exit: 2,005,950 bytes in 224 blocks
> > ==19765==   total heap usage: 269 allocs, 45 frees, 2,115,657 bytes allocated
> > ==19765==
> > ==19765== LEAK SUMMARY:
> > ==19765==    definitely lost: 0 bytes in 0 blocks
> > ==19765==    indirectly lost: 0 bytes in 0 blocks
> > ==19765==      possibly lost: 864 bytes in 3 blocks
> > ==19765==    still reachable: 2,005,086 bytes in 221 blocks
> > ==19765==         suppressed: 0 bytes in 0 blocks
> > ==19765== Rerun with --leak-check=full to see details of leaked memory
> > ==19765==
> > ==19765== For counts of detected and suppressed errors, rerun with: -v
> > ==19765== Use --track-origins=yes to see where uninitialised values come from
> > ==19765== ERROR SUMMARY: 5 errors from 3 contexts (suppressed: 0 from 0)
> > fish: “valgrind ./haproxy -d -f ./cras…” terminated by signal SIGKILL (Forced quit)
>
> Best regards
> Tim Düsterhus

--
James Brown
Engineer
Re: Recommendations for deleting headers by regexp in 2.x?
Glad to do any other debugging you'd like. Just running `make TARGET=linux-glibc USE_NS=` or `make TARGET=osx`; nothing fancy. On Thu, Jan 23, 2020 at 12:00 PM Willy Tarreau wrote: > On Thu, Jan 23, 2020 at 11:54:17AM -0800, James Brown wrote: > > Where master == f22758d12af5e9f3919f24bf913b883a62df7d93, the following > > config fails on both linux-glibc and osx: > > > > global > > maxconn 1024 > > > > defaults > > timeout client 9s > > timeout server 9s > > timeout connect 1s > > > > frontend test_fe > > mode http > > bind ::: > > use_backend test_be > > > > backend test_be > > mode http > > server localhost 127.0.0.1:1 > > > > Every request crashes immediately before connecting to the backend. > > I'm impressed, I'm unable to reproduce it! > > $ telnet 0 > Trying 0.0.0.0... > Connected to 0. > Escape character is '^]'. > GET / HTTP/1.1 > > HTTP/1.1 200 OK > connection: close > > > Backtrace: > > > > Program received signal SIGSEGV, Segmentation fault. > > back_handle_st_con (s=0x94abd0) at src/backend.c:1937 > > 1937 if (!conn->mux && !(conn->flags & CO_FL_WAIT_XPRT)) { > > (gdb) bt > > #0 back_handle_st_con (s=0x94abd0) at src/backend.c:1937 > > #1 0x0042ae75 in process_stream (t=0x94b020, context=0x94abd0, > > state=) at src/stream.c:1662 > > #2 0x005083c2 in process_runnable_tasks () at src/task.c:461 > > #3 0x004bb36b in run_poll_loop (data=) at > > src/haproxy.c:2630 > > #4 run_thread_poll_loop (data=) at > src/haproxy.c:2783 > > #5 0x004bdba5 in main (argc=, argv= > optimized out>) at src/haproxy.c:3483 > > > > Segfault is on the same line on OS X and Linux. > > I'm pretty sure the connection is null (or almost null as derived from > the CS) though that should not happen at this place. I'll have another > look at this one tomorrow. Additionally this "if" block will be entirely > removed :-) But I really want to understand how we manage to enter there > with an invalid connection. > > Thank you! > Willy > -- James Brown Engineer
Re: Recommendations for deleting headers by regexp in 2.x?
Where master == f22758d12af5e9f3919f24bf913b883a62df7d93, the following config fails on both linux-glibc and osx: global maxconn 1024 defaults timeout client 9s timeout server 9s timeout connect 1s frontend test_fe mode http bind ::: use_backend test_be backend test_be mode http server localhost 127.0.0.1:1 Every request crashes immediately before connecting to the backend. Backtrace: Program received signal SIGSEGV, Segmentation fault. back_handle_st_con (s=0x94abd0) at src/backend.c:1937 1937 if (!conn->mux && !(conn->flags & CO_FL_WAIT_XPRT)) { (gdb) bt #0 back_handle_st_con (s=0x94abd0) at src/backend.c:1937 #1 0x0042ae75 in process_stream (t=0x94b020, context=0x94abd0, state=) at src/stream.c:1662 #2 0x005083c2 in process_runnable_tasks () at src/task.c:461 #3 0x004bb36b in run_poll_loop (data=) at src/haproxy.c:2630 #4 run_thread_poll_loop (data=) at src/haproxy.c:2783 #5 0x004bdba5 in main (argc=, argv=) at src/haproxy.c:3483 Segfault is on the same line on OS X and Linux. On Thu, Jan 23, 2020 at 11:49 AM Willy Tarreau wrote: > On Thu, Jan 23, 2020 at 11:05:57AM -0800, James Brown wrote: > > Update: I rebased back to the last non-segfaulting commit and this > patch's > > functionality appears to work in very limited testing. > > Cool, thanks for doing it. I've quickly read it and I'm also convinced it > must work. I'll take more time tomorrow doing a more in-depth review and > suggesting some parts to split into different patches. > > I'm also interested in knowing more about this segfault in master, because > for me all reg-tests were OK before I pushed last update, thus if you have > a reproducer I'm very interested :-) > > Thanks, > Willy > -- James Brown Engineer
Re: Recommendations for deleting headers by regexp in 2.x?
Update: I rebased back to the last non-segfaulting commit and this patch's functionality appears to work in very limited testing. On Thu, Jan 23, 2020 at 10:59 AM James Brown wrote: > I spent a couple of minutes and made the attached (pretty bad) patch to > add a del-header-by-prefix. > > Unfortunately, I made it off of master before noticing that master > segfaults on every request, so I haven't tested it yet. At least it > compiles... Feel free to use it, throw it away, or whatever else suits your > fancy. > > > > On Thu, Jan 23, 2020 at 9:26 AM James Brown wrote: > >> Yes, they’re all identified by a prefix. >> >> On Thu, Jan 23, 2020 at 02:03 Willy Tarreau wrote: >> >>> Hi James, >>> >>> On Wed, Jan 22, 2020 at 04:19:41PM -0800, James Brown wrote: >>> > We're upgrading from 1.8 to 2.x and one of the things I've noticed is >>> that >>> > reqidel and rspidel seem to be totally gone in 2.1... What's the new >>> > recommendation to delete headers from request/response based on a >>> regular >>> > expression? Do I have to write a Lua action to do this now? I read >>> through >>> > the documentation for http-request and http-response and there doesn't >>> seem >>> > to be an `http-request del-header-by-regex`... >>> > >>> > Our use case is that we have dozens of different internal headers >>> behind a >>> > prefix, and we promise that we'll strip them all for incoming requests >>> and >>> > outgoing responses at the edge load balancer. That is harder to do if >>> we >>> > can't delete all headers matching a certain regex... >>> >>> That's an intereting use case, which I find totally legitimate and that >>> we need to figure how to address. In 2.0 you can still rely on rspdel >>> but we then need to have a solution for 2.2. Probably that in the short >>> term using Lua will be the easiest solution. And maybe we'd need to add >>> a new action such as "del-headers" which would take a regex or a prefix. >>> By the way, are all your headers identified by the same prefix ? 
I'm >>> asking because if that's the case, maybe we could append an optional >>> argument to del-header to mention that we want to delete all those >>> starting with this prefix and not just this exact one. >>> >>> Willy >>> >> -- >> James Brown >> Engineer >> > > > -- > James Brown > Engineer > -- James Brown Engineer
Re: Recommendations for deleting headers by regexp in 2.x?
I spent a couple of minutes and made the attached (pretty bad) patch to add a del-header-by-prefix. Unfortunately, I made it off of master before noticing that master segfaults on every request, so I haven't tested it yet. At least it compiles... Feel free to use it, throw it away, or whatever else suits your fancy. On Thu, Jan 23, 2020 at 9:26 AM James Brown wrote: > Yes, they’re all identified by a prefix. > > On Thu, Jan 23, 2020 at 02:03 Willy Tarreau wrote: > >> Hi James, >> >> On Wed, Jan 22, 2020 at 04:19:41PM -0800, James Brown wrote: >> > We're upgrading from 1.8 to 2.x and one of the things I've noticed is >> that >> > reqidel and rspidel seem to be totally gone in 2.1... What's the new >> > recommendation to delete headers from request/response based on a >> regular >> > expression? Do I have to write a Lua action to do this now? I read >> through >> > the documentation for http-request and http-response and there doesn't >> seem >> > to be an `http-request del-header-by-regex`... >> > >> > Our use case is that we have dozens of different internal headers >> behind a >> > prefix, and we promise that we'll strip them all for incoming requests >> and >> > outgoing responses at the edge load balancer. That is harder to do if we >> > can't delete all headers matching a certain regex... >> >> That's an intereting use case, which I find totally legitimate and that >> we need to figure how to address. In 2.0 you can still rely on rspdel >> but we then need to have a solution for 2.2. Probably that in the short >> term using Lua will be the easiest solution. And maybe we'd need to add >> a new action such as "del-headers" which would take a regex or a prefix. >> By the way, are all your headers identified by the same prefix ? I'm >> asking because if that's the case, maybe we could append an optional >> argument to del-header to mention that we want to delete all those >> starting with this prefix and not just this exact one. 
>> >> Willy >> > -- > James Brown > Engineer > -- James Brown Engineer 0001-add-http-request-del-header-prefix-and-http-response.patch Description: Binary data
Re: Recommendations for deleting headers by regexp in 2.x?
Yes, they’re all identified by a prefix. On Thu, Jan 23, 2020 at 02:03 Willy Tarreau wrote: > Hi James, > > On Wed, Jan 22, 2020 at 04:19:41PM -0800, James Brown wrote: > > We're upgrading from 1.8 to 2.x and one of the things I've noticed is > that > > reqidel and rspidel seem to be totally gone in 2.1... What's the new > > recommendation to delete headers from request/response based on a regular > > expression? Do I have to write a Lua action to do this now? I read > through > > the documentation for http-request and http-response and there doesn't > seem > > to be an `http-request del-header-by-regex`... > > > > Our use case is that we have dozens of different internal headers behind > a > > prefix, and we promise that we'll strip them all for incoming requests > and > > outgoing responses at the edge load balancer. That is harder to do if we > > can't delete all headers matching a certain regex... > > That's an intereting use case, which I find totally legitimate and that > we need to figure how to address. In 2.0 you can still rely on rspdel > but we then need to have a solution for 2.2. Probably that in the short > term using Lua will be the easiest solution. And maybe we'd need to add > a new action such as "del-headers" which would take a regex or a prefix. > By the way, are all your headers identified by the same prefix ? I'm > asking because if that's the case, maybe we could append an optional > argument to del-header to mention that we want to delete all those > starting with this prefix and not just this exact one. > > Willy > -- James Brown Engineer
Recommendations for deleting headers by regexp in 2.x?
We're upgrading from 1.8 to 2.x and one of the things I've noticed is that reqidel and rspidel seem to be totally gone in 2.1... What's the new recommendation to delete headers from request/response based on a regular expression? Do I have to write a Lua action to do this now? I read through the documentation for http-request and http-response and there doesn't seem to be an `http-request del-header-by-regex`... Our use case is that we have dozens of different internal headers behind a prefix, and we promise that we'll strip them all for incoming requests and outgoing responses at the edge load balancer. That is harder to do if we can't delete all headers matching a certain regex... -- James Brown Engineer
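[Editor's sketch of the prefix-stripping use case described above. The `-m beg` matching flag on del-header did not exist at the time of this thread and is an assumption to verify against your haproxy version's manual (it postdates 2.1); on versions without it, a Lua action is the fallback discussed in the replies. The header prefix is illustrative.]

```
frontend edge
    mode http
    # Strip every internal header behind the agreed prefix, in both
    # directions, at the edge load balancer.
    http-request  del-header x-internal- -m beg
    http-response del-header x-internal- -m beg
```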
Re: HTTP/2 header issue: "Accept-Ranges" -> "Accept-Language"
I have tested that patch and it seems to work. https://www.easypost.com is now returning the correct headers over HTTP/2. Thanks for the quick turnaround! On Mon, Nov 19, 2018 at 7:54 PM Willy Tarreau wrote: > On Mon, Nov 19, 2018 at 11:55:04PM +0100, Willy Tarreau wrote: > > > I assume this is a bug in the HPACK encoder, given that in the static > > > table definition [1], accept-language has index 17, while > > > accept-ranges has 18, which is correctly documented in > > > src/hpack-tbl.c, but the comment on line 105 in src/hpack-enc.c makes > > > me doubt our implementation: > > > > > > out->str[len++] = 0x51; // literal with indexing -- > > > name="accept-ranges" (idx 17) > > > > > > > > > Too much string magic going on there for me to provide prompt > > > solution, but I assume this will be a quick fix for Willy. > > > > Gloups! I'm quite ashamed, totally ashamed, almost red. I'll take a > > look at this tomorrow. Thanks for the report! > > I've just pushed the fix. I'm attaching the backported version for your > convenience (as it will not apply as-is to 1.8). > > Thanks! > Willy > -- James Brown Engineer
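[Editor's note: the bug above comes down to a single HPACK byte. A "literal header field with incremental indexing" whose name references the static table starts with 0x40 ORed with the table index, and in RFC 7541's static table index 17 is accept-language while 18 is accept-ranges. The sketch below is illustrative Python, not haproxy's actual encoder code.]

```python
# RFC 7541 Appendix A static table, entries 17 and 18 (the relevant pair).
HPACK_STATIC = {17: "accept-language", 18: "accept-ranges"}

def literal_with_indexing(index: int) -> int:
    """First byte of an HPACK 'literal header field with incremental
    indexing' whose name references a static-table entry: bit pattern
    01xxxxxx, i.e. 0x40 | index, for indexes that fit in 6 bits."""
    assert 0 < index < 0x40
    return 0x40 | index

# The buggy encoder emitted 0x51 while intending "accept-ranges":
assert literal_with_indexing(17) == 0x51  # actually names accept-language
assert literal_with_indexing(18) == 0x52  # the byte that was needed
print(HPACK_STATIC[0x51 & 0x3F])  # accept-language
```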
HTTP/2 header issue: "Accept-Ranges" -> "Accept-Language"
Here's a strange thing I've noticed: When using HTTP/2, HAproxy is rewriting the "Accept-Ranges" response header into "Accept-Language". You can see this yourself by comparing the output of curl --http2 -vs -o /dev/null https://www.easypost.com curl --http1.1 -vs -o /dev/null https://www.easypost.com Here's what I see % diff <(curl --http1.1 -vs -o /dev/null https://www.easypost.com 2>&1 | egrep -o '< [^:]+:' | sed -e 's/< //' | tr '[:upper:]' '[:lower:]') <(curl --http2 -vs -o /dev/null https://www.easypost.com 2>&1 | egrep -o '< [^:]+:' | sed -e 's/< //' | tr '[:upper:]' '[:lower:]') 14c14 < accept-ranges: --- > accept-language: The backend in this case is HTTP/1.1; HAproxy is doing the 1.1 -> 2 conversion itself. This is with HAproxy 1.8.14. I have not tested with HAproxy 1.9. Any thoughts? -- James Brown Systems Engineer
Re: 'stick': unknown fetch method 'res.cook_beg'
I think the preferred format now is req.cook(cookie_name) -m beg cookie_value Check out §7.1.3 of the manual <http://cbonte.github.io/haproxy-dconv/1.8/configuration.html#7.1.3> for more information. On Tue, Oct 30, 2018 at 9:55 AM Gibson, Brian (IMS) wrote: > I’m attempting to use a stick table to get all of a users sessions when > using shibboleth to point to the same backend server to simplify a few > configurations I have on the backend. > > > > Here is the specific code I’m using > > stick-table type string len 64 size 100k expire 15m peers mypeers > > stick store-response res.cook_beg(_shibboleth_) > > stick match req.cook_beg(_shibboleth_) > > > > When I attempt to load that configuration file I get an error saying the > message in the subject line. > > > > For reference here is the output of haproxy –vv > > > > HA-Proxy version 1.8.13 2018/07/30 > > Copyright 2000-2018 Willy Tarreau > > > > Build options : > > TARGET = linux2628 > > CPU = generic > > CC = gcc > > CFLAGS = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement > -fwrapv -fno-strict-overflow -Wno-unused-label > > OPTIONS = USE_LINUX_TPROXY=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 > USE_SYSTEMD=1 USE_PCRE2=1 USE_PCRE2_JIT=1 > > > > Default settings : > > maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200 > > > > Built with OpenSSL version : OpenSSL 1.1.1 11 Sep 2018 > > Running on OpenSSL version : OpenSSL 1.1.1 11 Sep 2018 > > OpenSSL library supports TLS extensions : yes > > OpenSSL library supports SNI : yes > > OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3 > > Built with transparent proxy support using: IP_TRANSPARENT > IPV6_TRANSPARENT IP_FREEBIND > > Encrypted password support via crypt(3): yes > > Built with multi-threading support. 
> > Built with PCRE2 version : 10.31 2018-02-12 > > PCRE2 library supports JIT : yes > > Built with zlib version : 1.2.7 > > Running on zlib version : 1.2.7 > > Compression algorithms supported : identity("identity"), > deflate("deflate"), raw-deflate("deflate"), gzip("gzip") > > Built with network namespace support. > > > > Available polling systems : > > epoll : pref=300, test result OK > >poll : pref=200, test result OK > > select : pref=150, test result OK > > Total: 3 (3 usable), will use epoll. > > > > Available filters : > > [SPOE] spoe > > [COMP] compression > > [TRACE] trace > > -- > > Information in this e-mail may be confidential. It is intended only for > the addressee(s) identified above. If you are not the addressee(s), or an > employee or agent of the addressee(s), please note that any dissemination, > distribution, or copying of this communication is strictly prohibited. If > you have received this e-mail in error, please notify the sender of the > error. > -- James Brown Engineer
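[Editor's sketch of the fix suggested above: the removed `res.cook_beg`/`req.cook_beg` fetch methods are replaced by the plain `res.cook`/`req.cook` fetches (the `-m beg` form in the reply applies to ACL matching, not to stick rules). Syntax per section 7.1.3 of the 1.8 manual; the cookie name and table parameters are taken from the question and should be adapted.]

```
backend shibboleth_be
    stick-table type string len 64 size 100k expire 15m peers mypeers
    # Store the session cookie the backend sets, then route subsequent
    # requests carrying it to the same server.
    stick store-response res.cook(_shibboleth_)
    stick match req.cook(_shibboleth_)
```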
Re: Lots of PR state failed connections with HTTP/2 on HAProxy 1.8.14
Y'all are quite right: one of the machines inverted the order of restarting with the new config and updating the package and was advertising the h2 ALPN with HAProxy 1.7.11. Sorry to take up so much time with a silly question. Cheers! On Wed, Oct 24, 2018 at 12:21 AM Aleksandar Lazic wrote: > Am 24.10.2018 um 09:18 schrieb Igor Cicimov: > > > > > > On Wed, 24 Oct 2018 5:06 pm Aleksandar Lazic > <mailto:al-hapr...@none.at>> wrote: > > > > Hi. > > > > Am 24.10.2018 um 03:02 schrieb Igor Cicimov: > > > On Wed, Oct 24, 2018 at 9:16 AM James Brown > <mailto:jbr...@easypost.com>> wrote: > > >> > > >> I tested enabling HTTP/2 on the frontend for some of our sites > today and > > immediately started getting a flurry of failures. Browsers (at least > Chrome) > > showed a lot of SPDY protocol errors and the HAProxy logs had a lot > of lines > > ending in > > >> > > >> https_domain_redacted/ -1/-1/-1/-1/100 400 187 - - PR-- > 49/2/0/0/0 0/0 > > >> > > > > > > Possible reasons: > > > > > > 1. You don't have openssl v1.0.2 installed (assuming you use > openssl) > > > on a server(s) > > > 2. You have changed your config for h2 suport but your server(s) is > > > still running haproxy 1.7 (i.e. hasn't been restarted after upgrade > > > and still using the old 1.7 binary instead 1.8) > > > > That's one of the reason why we need to know the exact version. > > > > James can you post the output of `haproxy -vv` and some more > information about > > your setup. > > > > > > This can return the correct version but it still does not mean the runnig > > process is actually using it (has not been restarted after upgrade). > > Full Ack. That's the reason why we need some more information's about the > setup ;-) > > > Regards > > Aleks > > > > >> There were no useful or interesting errors logged to syslog. No > sign of > > any resources being exhausted (conntrack seems fine, etc). The times > varied > > but Ta was always low (usually around 100ms). 
I have not been able to > > reproduce this issue in a staging environment, so it may be > something "real > > browsers" do that doesn't show up with h2load et al. > > >> > > >> Turning off HTTP/2 (setting "alpn http/1.1") completely solves > the problem. > > >> > > >> The following timeouts are set on all of the affected frontends: > > >> > > >> retries 3 > > >> timeout client 9s > > >> timeout connect 3s > > >> timeout http-keep-alive 5m > > >> tcp-request inspect-delay 4s > > >> option http-server-close > > >> > > >> Additionally, we set maxconn to a very high value (20480). > > >> > > >> Backends generally have timeout server set to a largeish value > (90-300 > > seconds, depending on the backend). > > >> > > >> Anything jump out at anyone? > > >> -- > > >> James Brown > > >> Systems & Network Engineer > > >> EasyPost > > > > > > > -- James Brown Engineer
Re: Lots of PR state failed connections with HTTP/2 on HAProxy 1.8.14
The config is several thousand lines long and includes a bunch of material non-public info, but if there are parts you think may be relevant I can try to snip them out. The HAProxy version (as noted in the subject of the e-mail) is 1.8.14 (which is, I believe, the latest 1.8 release). Chrome treats HTTP/2 as SPDY and shows the SPDY error page when an HTTP/2 protocol error occurs. I think this may actually have been a configuration issue on a single server (looking at the logs in a bit more detail); I'm going to do some more testing to see what's different there. On Tue, Oct 23, 2018 at 3:23 PM Aleksandar Lazic wrote: > Hi. > > SPDY is not HTTP/2 . > > Please can you share the config and the haproxy version. > > Best regards > Aleks > > ------ > *Von:* James Brown > *Gesendet:* 24. Oktober 2018 00:13:37 MESZ > *An:* HAProxy > *CC:* jared > *Betreff:* Lots of PR state failed connections with HTTP/2 on HAProxy > 1.8.14 > > I tested enabling HTTP/2 on the frontend for some of our sites today and > immediately started getting a flurry of failures. Browsers (at least > Chrome) showed a lot of SPDY protocol errors and the HAProxy logs had a lot > of lines ending in > > https_domain_redacted/ -1/-1/-1/-1/100 400 187 - - PR-- 49/2/0/0/0 > 0/0 > > There were no useful or interesting errors logged to syslog. No sign of > any resources being exhausted (conntrack seems fine, etc). The times varied > but Ta was always low (usually around 100ms). I have not been able to > reproduce this issue in a staging environment, so it may be something "real > browsers" do that doesn't show up with h2load et al. > > Turning off HTTP/2 (setting "alpn http/1.1") completely solves the problem. > > The following timeouts are set on all of the affected frontends: > > retries 3 > timeout client 9s > timeout connect 3s > timeout http-keep-alive 5m > tcp-request inspect-delay 4s > option http-server-close > > Additionally, we set maxconn to a very high value (20480). 
> > Backends generally have timeout server set to a largeish value (90-300 > seconds, depending on the backend). > > Anything jump out at anyone? > -- > James Brown > Systems & Network Engineer > EasyPost > -- James Brown Engineer
Lots of PR state failed connections with HTTP/2 on HAProxy 1.8.14
I tested enabling HTTP/2 on the frontend for some of our sites today and immediately started getting a flurry of failures. Browsers (at least Chrome) showed a lot of SPDY protocol errors and the HAProxy logs had a lot of lines ending in https_domain_redacted/ -1/-1/-1/-1/100 400 187 - - PR-- 49/2/0/0/0 0/0 There were no useful or interesting errors logged to syslog. No sign of any resources being exhausted (conntrack seems fine, etc). The times varied but Ta was always low (usually around 100ms). I have not been able to reproduce this issue in a staging environment, so it may be something "real browsers" do that doesn't show up with h2load et al. Turning off HTTP/2 (setting "alpn http/1.1") completely solves the problem. The following timeouts are set on all of the affected frontends: retries 3 timeout client 9s timeout connect 3s timeout http-keep-alive 5m tcp-request inspect-delay 4s option http-server-close Additionally, we set maxconn to a very high value (20480). Backends generally have timeout server set to a largeish value (90-300 seconds, depending on the backend). Anything jump out at anyone? -- James Brown Systems & Network Engineer EasyPost
Re: Possibility to modify PROXY protocol header
I think if you use the `http-request set-src` directive it'll populate the PROXY headers in addition to the internal logging &c. On Fri, Jul 27, 2018 at 7:05 AM bjun...@gmail.com wrote: > Hi, > > is there any possibilty to modify the client ip in the PROXY Protocol > header before it is send to a backend server? > > My use case is a local integration/functional testing suite (multiple > local docker containers for testing the whole stack - haproxy, cache layer, > webserver, etc.). > > I would like to test functionalities which are dependent of/need specific > IP ranges or IP addresses. > > > Best Regards / Mit freundlichen Grüßen > > Bjoern > -- James Brown Engineer
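[Editor's sketch of the suggestion above, for the docker-based test rig described in the question. With `http-request set-src`, the rewritten source address is what haproxy logs and what it places in the PROXY protocol header toward the server. The header name carrying the fake client IP is an assumption for illustration.]

```
frontend test_fe
    mode http
    # Let the test suite inject the client IP it wants to simulate.
    http-request set-src hdr(x-test-client-ip)
    default_backend test_be

backend test_be
    # send-proxy now carries the overridden address to the backend.
    server app 127.0.0.1:8080 send-proxy
```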
Reloading maps?
Is there any good way to reload a map, short of either (a) reloading haproxy every time the map changes, or (b) feeding the entire map into the control socket as a series of `set map` statements? I've got a map generated by an external program; we're currently doing (b) and it feels a little fragile... -- James Brown Engineer
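[Editor's sketch of option (b) above, the runtime-API approach: build a `clear map` plus `add map` command stream and push it to the stats socket. The map path and socket path are illustrative; `clear map` and `add map` are real runtime API commands, but verify the one-command-per-connection behavior against your version's management documentation.]

```python
import socket

def reload_map_commands(map_path, entries):
    """Build the command stream that atomically-ish replaces a map's
    contents: clear it, then re-add every key/value pair."""
    lines = ["clear map %s" % map_path]
    lines += ["add map %s %s %s" % (map_path, k, v) for k, v in entries]
    return "\n".join(lines) + "\n"

def push(sock_path, payload):
    # One command per connection is the safe default for the stats
    # socket (it closes after a single command unless in prompt mode).
    for line in payload.splitlines():
        s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        s.connect(sock_path)
        s.sendall((line + "\n").encode())
        s.close()

cmds = reload_map_commands("/etc/haproxy/redirects.map",
                           [("/old", "/new"), ("/a", "/b")])
print(cmds.splitlines()[0])  # clear map /etc/haproxy/redirects.map
```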
Re: action on server state change
We address this kind of thing using an external daemon which receives and parses syslog messages from haproxy. On Fri, May 5, 2017 at 1:10 AM, Stephan Mueller wrote: > Hi, > > is it possible to define some custom action on a server state > change, e.g. execution of a script? > > Background: Recently, I had some issues with a flapping service - it > worked well without load ;p - but rapidly degraded in UP state. In > result it was flapping UP/DOWN. rise/fall could only scale my problem > in time and also agent-checks could not help as is the service > considered itself functional all the time (other story). > > As a first countermeasure it would be nice, to have some kind of > flapping detection direct in haproxy itself. For example it would be > helpful to have a separate log for server UP/DOWN messages. Then its > much easier to see frequent state changes. > > Thanks for your thoughts > > -- James Brown Engineer
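[Editor's sketch of the external-daemon approach mentioned above: parse haproxy's state-change syslog lines and invoke a handler, which also gives you the "separate log for server UP/DOWN messages" the question asks for. The message wording matched below follows haproxy's health-check transition logs; treat the exact format as an assumption to verify against your own logs.]

```python
import re

# Matches lines like:
#   Server app_be/web1 is DOWN, reason: Layer4 timeout, ...
STATE_RE = re.compile(r"Server (?P<backend>[^/]+)/(?P<server>\S+) is "
                      r"(?P<state>UP|DOWN)")

def handle_line(line, on_change):
    """Call on_change(backend, server, state) for state-change lines;
    ignore everything else (normal request logs, etc.)."""
    m = STATE_RE.search(line)
    if m:
        on_change(m.group("backend"), m.group("server"), m.group("state"))

events = []
handle_line("Server app_be/web1 is DOWN, reason: Layer4 timeout, "
            "check duration: 1001ms. 2 active and 0 backup servers left.",
            lambda *a: events.append(a))
print(events)  # [('app_be', 'web1', 'DOWN')]
```

Counting events per server in a sliding window on top of this is one way to build the flapping detection the question asks about.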
Re: Bug: send-proxy-v2 sends PROXY protocol on agent checks
I had to move it down a couple of lines because I'm on 1.7.5 not master but it seems to work fine. Thanks for the quick response as always, Willy. On Wed, May 3, 2017 at 10:23 PM, Willy Tarreau wrote: > On Wed, May 03, 2017 at 08:21:12PM -0700, James Brown wrote: > > If the send-proxy-v2 flag is set on a server, the PROXY (v2) is emitted > on > > agent checks. > > > > If send-proxy is set on a server, no PROXY protocol is emitted on agent > > checks. > > > > I rather think that the correct behavior is not to send the PROXY > protocol > > on agent checks... > > You're totally right, this is stupid. Could you please check if the > following patch fixes it ? > > diff --git a/src/checks.c b/src/checks.c > index 778fc6a..bee7101 100644 > --- a/src/checks.c > +++ b/src/checks.c > @@ -1559,7 +1559,7 @@ static int connect_conn_chk(struct task *t) > ret = SF_ERR_INTERNAL; > if (proto->connect) > ret = proto->connect(conn, check->type, quickack ? 2 : 0); > - if (s->check.send_proxy) { > + if (s->check.send_proxy && !(check->state & CHK_ST_AGENT)) { > conn->send_proxy_ofs = 1; > conn->flags |= CO_FL_SEND_PROXY; > } > > Thanks, > Willy > -- James Brown Engineer
Bug: send-proxy-v2 sends PROXY protocol on agent checks
If the send-proxy-v2 flag is set on a server, the PROXY (v2) is emitted on agent checks. If send-proxy is set on a server, no PROXY protocol is emitted on agent checks. I rather think that the correct behavior is not to send the PROXY protocol on agent checks... -- James Brown Engineer
Re: haproxy deleting domain socket on graceful reload if backlog overflows
Hi Andrew: Thanks for your feedback, but I'm describing a very specific bug wherein the old haproxy will unlink the new haproxy's bound unix domain socket upon reload due to a race condition in the domain socket cleanup code if a listen overflow occurs while the graceful is in process. On Wed, Apr 12, 2017 at 11:39 AM, Andrew Smalley wrote: > HI James > > When you do a graceful reload of haproxy this is what happens. > > 1. the old process will accept no more connections and the stats page is > stopped and so is the socket > 2. a new haproxy instance is started where new clients get connected to, > and this has the live socket > 3. when the old haproxy instance has no more clients left it dies silently > leaving all the clients on the new haproxy instance. > > This is expected behavior as you want the first haproxy to die when the > last client leaves. > > > Regards > > Andrew Smalley > > Loadbalancer.org Ltd. > > > > On 12 April 2017 at 19:32, James Brown wrote: > >> This just hit us again on a different set of load balancers... if there's >> a listen socket overflow on a domain socket during graceful, haproxy >> completely deletes the domain socket and becomes inaccessible. >> >> On Tue, Feb 21, 2017 at 6:47 PM, James Brown wrote: >> >>> Under load, we're sometimes seeing a situation where HAProxy will >>> completely delete a bound unix domain socket after a reload. 
>>> >>> The "bad flow" looks something like the following: >>> >>> >>>- haproxy is running on pid A, bound to /var/run/domain.sock (via a >>>bind line in a frontend) >>>- we run `haproxy -sf A`, which starts a new haproxy on pid B >>>- pid B binds to /var/run/domain.sock.B >>>- pid B moves /var/run/domain.sock.B to /var/run/domain.sock (in >>>uxst_bind_listener) >>>- in the mean time, there are a zillion connections to >>>/var/run/domain.sock and pid B isn't started up yet; backlog is exhausted >>>- pid B signals pid A to shut down >>>- pid A runs the destroy_uxst_socket function and tries to connect >>>to /var/run/domain.sock to see if it's still in use. The connection fails >>>(because the backlog is full). Pid A unlinks /var/run/domain.sock. >>>Everything is sad forever now. >>> >>> I'm thinking about just commenting out the call to destroy_uxst_socket >>> since this is all on a tmpfs and we don't really care if spare sockets are >>> leaked when/if we change configuration in the future. Arguably, the >>> solution should be something where we don't overflow the listen socket at >>> all; I'm thinking about also binding to a TCP port on localhost and just >>> using that for the few seconds it takes to reload (since otherwise we run >>> out of ephemeral sockets to 127.0.0.1); it still seems wrong for haproxy to >>> unlink the socket, though. >>> >>> This has proven extremely irritating to reproduce (since it only occurs >>> if there's enough load to fill up the backlog on the socket between when >>> pid B starts up and when pid A shuts down), but I'm pretty confident that >>> what I described above is happening, since periodically on reloads the >>> domain socket isn't there and this code fits. >>> >>> Our configs are quite large, so I'm not reproducing them here. 
The >>> reason we bind on a domain socket at all is because we're running two sets >>> of haproxies — one in multi-process mode doing TCP-mode SSL termination >>> pointing back over a domain socket to a single-process haproxy applying all >>> of our actual config. >>> >>> -- >>> James Brown >>> Systems >>> Engineer >>> >> >> >> >> -- >> James Brown >> Engineer >> > > -- James Brown Engineer
Re: haproxy deleting domain socket on graceful reload if backlog overflows
This just hit us again on a different set of load balancers... if there's a listen socket overflow on a domain socket during graceful, haproxy completely deletes the domain socket and becomes inaccessible. On Tue, Feb 21, 2017 at 6:47 PM, James Brown wrote: > Under load, we're sometimes seeing a situation where HAProxy will > completely delete a bound unix domain socket after a reload. > > The "bad flow" looks something like the following: > > >- haproxy is running on pid A, bound to /var/run/domain.sock (via a >bind line in a frontend) >- we run `haproxy -sf A`, which starts a new haproxy on pid B >- pid B binds to /var/run/domain.sock.B >- pid B moves /var/run/domain.sock.B to /var/run/domain.sock (in >uxst_bind_listener) >- in the mean time, there are a zillion connections to >/var/run/domain.sock and pid B isn't started up yet; backlog is exhausted >- pid B signals pid A to shut down >- pid A runs the destroy_uxst_socket function and tries to connect to >/var/run/domain.sock to see if it's still in use. The connection fails >(because the backlog is full). Pid A unlinks /var/run/domain.sock. >Everything is sad forever now. > > I'm thinking about just commenting out the call to destroy_uxst_socket > since this is all on a tmpfs and we don't really care if spare sockets are > leaked when/if we change configuration in the future. Arguably, the > solution should be something where we don't overflow the listen socket at > all; I'm thinking about also binding to a TCP port on localhost and just > using that for the few seconds it takes to reload (since otherwise we run > out of ephemeral sockets to 127.0.0.1); it still seems wrong for haproxy to > unlink the socket, though. 
> > This has proven extremely irritating to reproduce (since it only occurs if > there's enough load to fill up the backlog on the socket between when pid B > starts up and when pid A shuts down), but I'm pretty confident that what I > described above is happening, since periodically on reloads the domain > socket isn't there and this code fits. > > Our configs are quite large, so I'm not reproducing them here. The reason > we bind on a domain socket at all is because we're running two sets of > haproxies — one in multi-process mode doing TCP-mode SSL termination > pointing back over a domain socket to a single-process haproxy applying all > of our actual config. > > -- > James Brown > Systems > Engineer > -- James Brown Engineer
Re: Feature request: routing a TCP stream based on Cipher Suites in a TLS ClientHello
Unfortunately, that feature only works with OpenSSL 1.0.2 (which, incidentally, would be a good thing to note in the documentation)... On Wed, Feb 22, 2017 at 4:39 PM, Lukas Tribus wrote: > Hello James, > > > Am 23.02.2017 um 01:11 schrieb James Brown: > >> Right now, the "best" way I'm aware of to serve both an RSA and an ECDSA >> certificate on the same IP to different clients is to use req.ssl_ec_ext < >> http://cbonte.github.io/haproxy-dconv/1.7/configuration. >> html#7.3.5-req.ssl_ec_ext> >> to determine if a set of supported elliptic curves was passed in the >> ClientHello. >> > > No, you don't have to do this anymore. > > Forget the TCP frontend with req.ssl_ec_ext, you can configure multiple > cert types > directly as per [1]. > > Its a simple as naming the actual files "example.pem.rsa" and > "example.pem.ecdsa" and > point to it by its base name "ssl crt example.pem". > > > Regards, > Lukas > > [1] http://cbonte.github.io/haproxy-dconv/1.7/configuration.html#5.1-crt > -- James Brown Engineer
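[Editor's sketch of the multi-cert bundle described in the reply: with haproxy 1.7 built against OpenSSL >= 1.0.2 (the constraint James notes above), haproxy selects the certificate type per ClientHello when the files share a base name. Paths are examples.]

```
# Files on disk:
#   /etc/haproxy/certs/example.pem.rsa
#   /etc/haproxy/certs/example.pem.ecdsa
frontend https_in
    # Point at the base name; haproxy picks RSA or ECDSA per client.
    bind :443 ssl crt /etc/haproxy/certs/example.pem
```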
Feature request: routing a TCP stream based on Cipher Suites in a TLS ClientHello
Right now, the "best" way I'm aware of to serve both an RSA and an ECDSA certificate on the same IP to different clients is to use req.ssl_ec_ext <http://cbonte.github.io/haproxy-dconv/1.7/configuration.html#7.3.5-req.ssl_ec_ext> to determine if a set of supported elliptic curves was passed in the ClientHello. Unfortunately, if clients disable ECDSA cipher suites (either manually or through poor defaults), the EC extension block will still be present, but the client will be unable to negotiate a handshake with an ECDSA-using server. It would be nice to be able to direct clients with no ECDSA cipher suites to the RSA backend instead.

It would be useful to have a set of booleans available at the same level as req.ssl_ec_ext for determining whether various families of cipher suites are present. I envision something like req.ssl_rsa_supported, req.ssl_dsa_supported, and req.ssl_ecdsa_supported. I suppose we could also just add a fetcher that exposes the entire client cipher-suite list as a string and then use a regexp to determine if, e.g., the string "-ECDSA" occurs in that list, but that seems somewhat failure-prone.

Thoughts?

--
James Brown
Engineer
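For context, the req.ssl_ec_ext routing approach described above might be sketched like this (hedged; the frontend/backend names are invented, and the exact ACL form may differ by version):

```
# TCP-mode frontend that inspects the ClientHello before routing.
frontend fe_tls
    mode tcp
    bind :443
    tcp-request inspect-delay 5s
    # wait for a ClientHello (SSL hello type 1) before deciding
    tcp-request content accept if { req.ssl_hello_type 1 }
    # route clients advertising the supported-elliptic-curves extension
    use_backend ecdsa_servers if { req.ssl_ec_ext 1 }
    default_backend rsa_servers
```

As the mail notes, this keys on the extension's presence, not on whether any ECDSA cipher suite was actually offered.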
haproxy deleting domain socket on graceful reload if backlog overflows
Under load, we're sometimes seeing a situation where HAProxy will completely delete a bound unix domain socket after a reload. The "bad flow" looks something like the following:

- haproxy is running on pid A, bound to /var/run/domain.sock (via a bind line in a frontend)
- we run `haproxy -sf A`, which starts a new haproxy on pid B
- pid B binds to /var/run/domain.sock.B
- pid B moves /var/run/domain.sock.B to /var/run/domain.sock (in uxst_bind_listener)
- in the meantime, there are a zillion connections to /var/run/domain.sock and pid B isn't started up yet; the backlog is exhausted
- pid B signals pid A to shut down
- pid A runs the destroy_uxst_socket function and tries to connect to /var/run/domain.sock to see if it's still in use. The connection fails (because the backlog is full). Pid A unlinks /var/run/domain.sock. Everything is sad forever now.

I'm thinking about just commenting out the call to destroy_uxst_socket, since this is all on a tmpfs and we don't really care if spare sockets are leaked when/if we change configuration in the future. Arguably, the solution should be something where we don't overflow the listen socket at all; I'm thinking about also binding to a TCP port on localhost and just using that for the few seconds it takes to reload (since otherwise we run out of ephemeral sockets to 127.0.0.1); it still seems wrong for haproxy to unlink the socket, though.

This has proven extremely irritating to reproduce (since it only occurs if there's enough load to fill up the backlog on the socket between when pid B starts up and when pid A shuts down), but I'm pretty confident that what I described above is happening, since periodically on reloads the domain socket isn't there and this code fits.

Our configs are quite large, so I'm not reproducing them here.
The reason we bind on a domain socket at all is because we're running two sets of haproxies — one in multi-process mode doing TCP-mode SSL termination pointing back over a domain socket to a single-process haproxy applying all of our actual config.

--
James Brown
Systems Engineer
Re: HTTP 429 Too Many Requests
+1

I am also using a fake backend with no servers and a 503 errorfile, and it confuses everybody who looks at the config or the metrics. Being able to directly emit a 429 would be fantastic.

On Fri, Jun 24, 2016 at 10:30 AM, Daniel Schneller <daniel.schnel...@centerdevice.com> wrote:
> Hello!
>
> We use haproxy as an L7 rate limiter based on tracking certain header
> fields and URLs. A more detailed description of what we do can be found
> in a blog post I wrote about this some time ago:
>
> https://blog.codecentric.de/en/2014/12/haproxy-http-header-rate-limiting
>
> Our exact setup has changed a bit since then, but the gist remains the
> same:
>
> * Calculate the rate of requests by tracking those with identical
>   authorization header values
> * If they exceed a threshold, slow the client down (tarpit) and ask
>   them to come back after a certain period by sending them HTTP 429:
>
>   HTTP/1.1 429 Too Many Requests
>   Cache-Control: no-cache
>   Connection: close
>   Content-Type: text/plain
>   Retry-After: 60
>
>   Too Many Requests (HAP429).
>
> I am currently refactoring our haproxy config to make it more readable
> and maintainable; while doing so, I would like to get rid of the
> somewhat crude pseudo backend in which I specify the errorfile for
> status code 500, replacing 500 with 429 when sending it out to the
> client. This, of course, leads to the status code being 500 in the logs
> and other inconveniences.
>
> My suggestion for how to handle this would be an extension to the
> "http-request deny" directive. Currently it will always respond with
> HTTP status code 403. If there were a configuration setting allowing me
> to specify a different code (like 429 in my case) as the reason for the
> rejection, that would be an elegant solution. Using an "http-request
> set-header" would even allow me to specify different values for the
> "Retry-After:" header to inform well-written clients after which time
> they should come back and try again.
> Does that sound like a sensible addition?
>
> Cheers,
> Daniel
>
> --
> Daniel Schneller
> Principal Cloud Engineer
>
> CenterDevice GmbH
> https://www.centerdevice.de

--
James Brown
Engineer
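For concreteness, the "fake backend" workaround both posters describe might look like the sketch below (hedged; the backend name and file path are invented, and the errorfile contents are assumed from the response Daniel quotes):

```
# /etc/haproxy/errors/429.http is a complete raw HTTP response that
# haproxy sends verbatim in place of its stock 503 page, e.g.:
#
#   HTTP/1.1 429 Too Many Requests
#   Cache-Control: no-cache
#   Connection: close
#   Content-Type: text/plain
#   Retry-After: 60
#
#   Too Many Requests (HAP429).

backend rate_limited
    mode http
    # no servers on purpose: every request gets the "503" error page,
    # whose body is actually a hand-written 429 response
    errorfile 503 /etc/haproxy/errors/429.http
```

The drawback, as noted above, is that haproxy still logs a 503 (or 500) internally even though the client sees a 429.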
Managing `Expect: 100-continue` in HAProxy?
We've got HAProxy sitting in front of a menagerie of web servers, none of which handle `Expect: 100-continue` in any way, shape, or form. When someone hits us with a POST from cURL, there's a kind of irritating 1 second delay while cURL waits for the "HTTP/1.1 100 Continue" response. Rather than try to solve it in every application server, I was wondering if there's any way to force HAProxy to send a "100 Continue" response when it gets a POST with "Expect: 100-continue" (and then delete the Expect from the proxied request, of course). It seems like there's already code for sending a 100 Continue if the `http-buffer-request` option is set, so I guess I'm just asking about the feasibility of making that behavior a stand-alone option without having to put the whole request in RAM. -- James Brown Engineer
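As the message notes, the existing 100-continue code path is gated on request buffering; if buffering the body is an acceptable stopgap, enabling that behavior is just the following (a sketch, assuming haproxy 1.6+ where `option http-buffer-request` exists):

```
defaults
    mode http
    # haproxy waits for (and buffers) the whole request body itself;
    # on this code path it answers "HTTP/1.1 100 Continue" on behalf
    # of the backend when the client sent "Expect: 100-continue"
    option http-buffer-request
```

The feature request in this mail is precisely to get the 100-continue reply without paying the cost of holding the whole body in RAM.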
Re: "show servers state" shows nothing?
(gentle bump)

On Mon, Apr 25, 2016 at 11:36 AM, James Brown wrote:
> Here's the top of the file. None of the backends override the
> load-server-state-from-file setting that's made in `defaults`. There
> are 106 backends defined.
>
> global
>     log ${LOG_DGRAM_SYSLOG} local0
>     log /var/run/epservices/syslog_bridge.sock local0
>     daemon
>     maxconn 4096
>     stats socket /var/run/epservices/lbng/haproxy.sock level admin
>     tune.ssl.default-dh-param 1024
>     server-state-file /srv/var/lbng/state
>
> defaults
>     mode http
>     option httplog
>     load-server-state-from-file global
>     log global
>     retries 3
>     timeout client 9s
>     timeout connect 3s
>     timeout server 90s
>
> On Sun, Apr 24, 2016 at 9:07 AM, Baptiste wrote:
>> On Thu, Apr 21, 2016 at 2:54 AM, James Brown wrote:
>>> I'm trying to set up state-file saving on 1.6.4, but "show servers state"
>>> doesn't return anything. It works fine if I specify an individual backend
>>> (e.g., "show servers state foo_be"), but not if I run it "bare" (which the
>>> manual suggests should print out states for all backends).
>>>
>>> Any thoughts?
>>
>> Hi,
>>
>> Could you share the relevant part of the configuration?
>>
>> Baptiste

--
James Brown
Engineer
Re: "show servers state" shows nothing?
Here's the top of the file. None of the backends override the load-server-state-from-file setting that's made in `defaults`. There are 106 backends defined.

global
    log ${LOG_DGRAM_SYSLOG} local0
    log /var/run/epservices/syslog_bridge.sock local0
    daemon
    maxconn 4096
    stats socket /var/run/epservices/lbng/haproxy.sock level admin
    tune.ssl.default-dh-param 1024
    server-state-file /srv/var/lbng/state

defaults
    mode http
    option httplog
    load-server-state-from-file global
    log global
    retries 3
    timeout client 9s
    timeout connect 3s
    timeout server 90s

On Sun, Apr 24, 2016 at 9:07 AM, Baptiste wrote:
> On Thu, Apr 21, 2016 at 2:54 AM, James Brown wrote:
>> I'm trying to set up state-file saving on 1.6.4, but "show servers state"
>> doesn't return anything. It works fine if I specify an individual backend
>> (e.g., "show servers state foo_be"), but not if I run it "bare" (which the
>> manual suggests should print out states for all backends).
>>
>> Any thoughts?
>
> Hi,
>
> Could you share the relevant part of the configuration?
>
> Baptiste

--
James Brown
Engineer
agent-check sends PROXY protocol
It appears that if a server is configured to send the PROXY protocol *and* the server does not have a `check port` set, the agent check will always send the PROXY protocol. This doesn't seem to be documented anywhere, and it's kind of strange (especially since non-agent checks have the check-send-proxy flag available to control whether the PROXY protocol is emitted). It's not hard to make my agent support receiving the PROXY protocol, but it's kind of strange, since nothing's actually being proxied. Thoughts? -- James Brown Engineer
"show servers state" shows nothing?
I'm trying to set up state-file saving on 1.6.4, but "show servers state" doesn't return anything. It works fine if I specify an individual backend (e.g., "show servers state foo_be"), but not if I run it "bare" (which the manual suggests should print out states for all backends). Any thoughts? -- James Brown Engineer
Re: Increased CPU usage after upgrading 1.5.15 to 1.5.16
Calling DES functions is kind of suspicious. I'd expect any clients made in the last decade or so to be negotiating AES (which is much, *much* faster than DES) with either the default settings or any reasonably-secure custom settings. Can you check what cipher suites you've negotiated in production? If something is causing you to negotiate a 3DES-based cipher suite instead of an AES (preferably AES-GCM)-based cipher suite, that would definitely explain increased CPU usage.

On Thu, Apr 7, 2016 at 5:25 AM, Lukas Tribus wrote:
> Hi,
>
> On 05.04.2016 at 10:17, Nenad Merdanovic wrote:
>> I am not sure, as I haven't even been able to reliably reproduce it on 1.5
>> (though we are running with some backports from 1.6) as it seems to be
>> traffic-pattern related. On one workload I exhibit an instant and constant
>> jump in CPU usage (from 40% to 80-100%, about 50:50 sys:usr), but on the
>> other, there are just some very short spikes to 100%.
>
> I've played around with an unscientific testcase (single session, large
> 10MB response), perf and ltrace, and while the number of SSL_write calls
> are the same, OpenSSL seems to be doing more low-level stuff in functions
> like _x86_DES_encrypt and _x86_DES_decrypt.
>
> So this commit does make OpenSSL uncomfortable in some way, although it is
> probably not related to the number of SSL_write calls.
>
> Not sure if this is helpful.
>
> cheers,
> lukas

--
James Brown
Engineer
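One low-effort way to check what production clients actually negotiate (a sketch; the frontend name and certificate path are assumptions) is to log the negotiated cipher and protocol, which haproxy exposes as the %sslc and %sslv log-format variables:

```
frontend fe_https
    mode http
    bind :443 ssl crt /etc/haproxy/site.pem
    # %sslc = negotiated cipher (e.g. ECDHE-RSA-AES128-GCM-SHA256)
    # %sslv = negotiated protocol version (e.g. TLSv1.2)
    log-format "%ci:%cp [%t] %ft %b/%s %ST %B %sslc %sslv"
```

A sudden prevalence of DES-CBC3-* ciphers in these logs would corroborate the 3DES theory above.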
Re: servers multiple sources
Templating out (or entirely procedurally generating) your HAProxy config file is a must once you exceed the bare minimum of complexity. :-) Best of luck!

On Tue, Mar 22, 2016 at 3:16 AM, Beluc wrote:
> well, it can become a real mess with lots of servers and sources :)
> but sure, it works!
>
> 2016-03-21 19:21 GMT+01:00 James Brown:
>> Why not just add each server multiple times with a different source
>> parameter and a different name?
>>
>> Something like
>>
>> backend my_be
>>     mode tcp
>>     server server1_src1 10.1.0.1 source 10.0.0.1
>>     server server1_src2 10.1.0.1 source 10.0.0.2
>>     server server2_src1 10.1.0.2 source 10.0.0.1
>>     server server2_src2 10.1.0.2 source 10.0.0.2
>>
>> On Mon, Mar 21, 2016 at 8:20 AM, Beluc wrote:
>>> Hi,
>>>
>>> We're trying to find a way to have multiple sources per server and
>>> thus bypass 64k connections per server.
>>>
>>> We already tried with SNAT iptables:
>>> iptables -t nat -A POSTROUTING -o eth2 -j SNAT --to 10.0.0.1-10.0.10
>>>
>>> without success, because the kernel hashes the real source ip and real
>>> destination ip, so only one natted source ip is used (aka the same as
>>> using one different source per server).
>>>
>>> Any idea on achieving this? maybe in lua?
>>>
>>> Regards,

--
James Brown
Engineer
Re: servers multiple sources
Why not just add each server multiple times with a different source parameter and a different name? Something like:

backend my_be
    mode tcp
    server server1_src1 10.1.0.1 source 10.0.0.1
    server server1_src2 10.1.0.1 source 10.0.0.2
    server server2_src1 10.1.0.2 source 10.0.0.1
    server server2_src2 10.1.0.2 source 10.0.0.2

On Mon, Mar 21, 2016 at 8:20 AM, Beluc wrote:
> Hi,
>
> We're trying to find a way to have multiple sources per server and
> thus bypass 64k connections per server.
>
> We already tried with SNAT iptables:
> iptables -t nat -A POSTROUTING -o eth2 -j SNAT --to 10.0.0.1-10.0.10
>
> without success, because the kernel hashes the real source ip and real
> destination ip, so only one natted source ip is used (aka the same as
> using one different source per server).
>
> Any idea on achieving this? maybe in lua?
>
> Regards,

--
James Brown
Engineer
Re: gpc0_rate computing incorrectly with peer replication turned on [in 1.6.3]
I believe that there used to be server IDs in here, but no longer. Peering in 1.5 seems to just ignore it, but 1.6 does something strange under load. Still haven't reproduced in a test environment, though. On Wednesday, February 24, 2016, Bryan Talbot wrote: > On Wed, Feb 24, 2016 at 6:05 PM, James Brown > wrote: > > > > We use a gpc0 counter for rate-limiting certain requests in our > application. It was working fine with 1.5.14, but as soon as I upgraded to > 1.6.3, we started seeing the gpc0_rate value go crazy – it's currently > showing values in the hundreds of thousands when the underlying gpc0 > counter has > > stick-table type string len 32 size 512 expire 5m store > gpc0,gpc0_rate(5m),http_req_rate(10s) peers lbsj > > > > > > > I didn't realize that stick tables without a server-id entry like this > would be replicated to remotes. My reading of the docs for 1.5 and 1.6 > stick-table peers option makes it seem like ONLY stick-table entries with a > server-id are replicated to remotes. Maybe this is not the case? > > Entries which associate keys to server IDs are kept synchronized with the > remote peers declared in this section. > > > > https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#stick-table > https://cbonte.github.io/haproxy-dconv/configuration-1.6.html#stick-table > > -Bryan > > > P.S. Now I think I know why we got a bunch of 'too many errors' responses > from EasyPost today! > > -- James Brown Engineer
gpc0_rate computing incorrectly with peer replication turned on [in 1.6.3]
We use a gpc0 counter for rate-limiting certain requests in our application. It was working fine with 1.5.14, but as soon as I upgraded to 1.6.3, we started seeing the gpc0_rate value go crazy – it's currently showing values in the hundreds of thousands when the underlying gpc0 counter has *never* been incremented (and has a value of 0). This only occurs if I have peer replication turned on for the relevant stick table -- disabling peer replication and resetting the table puts everything back to normal.

Some examples from the `show table` command on the relevant table (with sensitive data x'd out):

0x305ef9c: key=Bearer\ use=0 exp=294451 gpc0=0 gpc0_rate(30)=2281 http_req_rate(1)=7
0x306058c: key=Bearer\ use=0 exp=299766 gpc0=0 gpc0_rate(30)=2737 http_req_rate(1)=18
0x30615cc: key=Bearer\ use=1 exp=298819 gpc0=0 gpc0_rate(30)=3285 http_req_rate(1)=2
0x306101c: key=Bearer\ use=0 exp=296676 gpc0=0 gpc0_rate(30)=3959 http_req_rate(1)=1
0x305edfc: key=Bearer\ use=0 exp=258288 gpc0=0 gpc0_rate(30)=5190 http_req_rate(1)=0
0x305df5c: key=Bearer\ use=0 exp=254686 gpc0=0 gpc0_rate(30)=5936 http_req_rate(1)=0
0x321a2dc: key=Bearer\ use=0 exp=136673 gpc0=0 gpc0_rate(30)=22826 http_req_rate(1)=0
0x3061aac: key=Bearer\ use=0 exp=299854 gpc0=0 gpc0_rate(30)=261368 http_req_rate(1)=2
0x305fafc: key=Bearer\ use=0 exp=299041 gpc0=0 gpc0_rate(30)=262366 http_req_rate(1)=1
0x306183c: key=Bearer\ use=0 exp=226854 gpc0=0 gpc0_rate(30)=299375 http_req_rate(1)=0
0x3060cdc: key=Bearer\ use=0 exp=281845 gpc0=0 gpc0_rate(30)=489641 http_req_rate(1)=0

And the relevant config snippet (again, somewhat trimmed for this audience):

frontend https_easypost_com
    mode http
    # [... lots of config snipped here ...]
    # rate-limiting by auth code
    tcp-request inspect-delay 4s
    tcp-request content track-sc0 req.hdr(Authorization)
    stick-table type string len 32 size 512 expire 5m store gpc0,gpc0_rate(5m),http_req_rate(10s) peers lbsj
    # Limit: 200 r/s/client
    acl high_rate sc0_http_req_rate gt 2000
    acl force_rate_limit req.hdr(X-Force-Status) eq rate_limit
    acl force_flooding req.hdr(X-Force-Status) eq flooding
    acl punish sc0_inc_gpc0 ge 0
    acl is_403 status 403
    acl is_402 status 402
    # allow no more than 10 401's or 402's every 5 minutes from a given client
    acl flooding4XXs sc0_gpc0_rate ge 10
    http-response allow if is_402 punish
    http-response allow if is_403 punish
    use_backend four_twenty_nine_bad_status if flooding4XXs or force_flooding
    default_backend easypost_be

The rate-limiting based on the http_req_rate counter seems fine (and none of the values in the table look nuts). I've been trying to reproduce this in our development environment with no luck (even when running with peer replication enabled), but within seconds of issuing a `clear table` in production, there are gpc0_rate values in the hundreds of thousands.

Has anyone seen any weird behavior like this?

--
James Brown
Engineer
Re: Multiplexing multiple services behind one agent (feature suggestion; patch attached)
Attached is a `git-format-patch`-formatted patch with some extra strduping and freeing.

On Fri, Oct 30, 2015 at 11:39 PM, Willy Tarreau wrote:
> Hi James,
>
> On Wed, Oct 28, 2015 at 10:27:22AM -0700, James Brown wrote:
>> Sorry for being thickheaded, Willy, but what's your decision here — do you
>> want me to make it per-Backend instead of per-Server, or do you want to
>> merge it as-is?
>
> Well, I think we can take it as-is then. The per-server setting doesn't
> block the ability to later add a per-backend setting anyway. However you
> need to fix one point in the patch: the string must be allocated per
> server (so that we don't cause a double-free when releasing it on exit).
> Please use strdup() to allocate the string from the default server, and
> please call free() on the server's string before assigning a new one, so
> that we don't leak small memory chunks when setting multiple default-server
> entries. Same when creating a new proxy (look for "fwdfor_hdr_name" as a
> hint about where you should have to free(defproxy->agent_send)). Also please
> ensure that you properly assign the string from the default proxy's
> default-server to the current proxy's. fwdfor_hdr_name is properly set
> regarding this so you won't have to search too long.
>
> Last point, please build your patch using "git format-patch", so that I
> can simply apply it. You used "git show", which is sufficient for a review
> but requires manual modifications. If you have a single patch in your
> branch, you can simply use "git format-patch -1" and you'll get the patch
> for the latest commit.
>
> Thanks!
> Willy

--
James Brown
Engineer

0001-Add-agent-send-server-parameter.patch
Description: Binary data
Re: Multiplexing multiple services behind one agent (feature suggestion; patch attached)
Sorry for being thickheaded, Willy, but what's your decision here — do you want me to make it per-Backend instead of per-Server, or do you want to merge it as-is? I'm glad to defer completely to your judgement on this. On Thu, Oct 22, 2015 at 1:12 PM, Willy Tarreau wrote: > On Thu, Oct 22, 2015 at 01:04:28PM -0700, James Brown wrote: > > It would be... more convenient for my use case to be able to encode the > > string in the config (we may have several backends for a different > service > > to correspond to phased deployment rollouts, and I'd prefer to encode the > > logic for mapping backend name to service name in my existing haproxy > > config templating script rather than adding it to the agent script). If > you > > think it's significantly better, I could make it a static string, though, > > and bite the bullet for complexity in the agent. > > OK. Stick to your configurable string then. If we later want something > automatic we can use a special keyword. You may even reserve the string > "auto" or something like this as an argument to indicate that the string > will be automatically filled. > > > We could also do something where we defined it per-backend (like the way > > `option httpchk` works now; I could easily change this to a backend > > parameter set with `option agent-send` or whatever). > > I was thinking about something like this as well. It's true that if you > want something automatic then you don't want to configure it on each and > every server! > > OK thus I think we can merge your patch then. > > Regards, > Willy > > -- James Brown Engineer
Re: Multiplexing multiple services behind one agent (feature suggestion; patch attached)
It would be... more convenient for my use case to be able to encode the string in the config (we may have several backends for a different service to correspond to phased deployment rollouts, and I'd prefer to encode the logic for mapping backend name to service name in my existing haproxy config templating script rather than adding it to the agent script). If you think it's significantly better, I could make it a static string, though, and bite the bullet for complexity in the agent. We could also do something where we defined it per-backend (like the way `option httpchk` works now; I could easily change this to a backend parameter set with `option agent-send` or whatever). On Thu, Oct 22, 2015 at 12:51 PM, Willy Tarreau wrote: > On Thu, Oct 22, 2015 at 11:37:47AM +0200, Baptiste wrote: > > This is interesting. > > Yep, that's in line with the HTTP option to send some server info > with HTTP checks (name, weight and I don't remember what). > > > That said, I'm suggesting an improvement: use the log format varialble. > > No, please no! > > Log-format only works with a session. Here checks don't have any session > and are completely out of band. Most tags and fetches will either not work > or simply crash the system. Not to mention the added complexity in the > config parser to deal with these cases. > > In order to simplify the config, I'd suggest just to always send the > exact same string (eg: "backend=XXX; server=YYY\n") which is easy to > expand with new tags in the future if needed (such as the LB node for > example). > > Regards, > Willy > > -- James Brown Engineer
Re: Multiplexing multiple services behind one agent (feature suggestion; patch attached)
That would definitely be more useful, although it would also require rewriting a lot of the code around how log variables are interpolated. Right now, it looks like interpolation is pretty tightly tied to the actual act of logging (the build_logline function is pretty attached to the idea of having an active session available, and I think it'd be a lot of work to detangle it into a "build_logline_from_backend" that just took a backend or somesuch).

On Thu, Oct 22, 2015 at 2:37 AM, Baptiste wrote:
> On Thu, Oct 22, 2015 at 3:59 AM, James Brown wrote:
>> Hello haproxy@:
>>
>> My name is James Brown; I wrote a small piece of software called hacheck
>> (https://github.com/Roguelazer/hacheck) which is designed to be a
>> healthcheck proxy for decentralized load balancer control (remove a node
>> from a load balancer without knowing where the load balancers are; helpful
>> once you start to have a truly, stupidly large number of load balancers).
>>
>> I am interested in using agent-checks instead of co-opting the existing
>> httpchk mechanism; unfortunately, it looks like there's no convenient way
>> to multiplex multiple services onto a single agent-port and reasonably
>> disambiguate them. For example, it'd be great if I could have a server
>> which runs one agent-check responder and can `MAINT` any of a dozen (or a
>> hundred) different services running on this box.
>>
>> I've attached a small patch which adds a new server parameter (agent-send)
>> which is a static string which will be sent to the agent on every server.
>> This allows me to generate configs that look like
>>
>> backend foo
>>     server web1 10.1.2.1:8001 agent-check agent-port 3334 agent-send "foo/web1\n"
>>     server web2 10.1.2.2:8001 agent-check agent-port 3334 agent-send "foo/web2\n"
>>
>> backend bar
>>     server web1 10.1.2.1:8002 agent-check agent-port 3334 agent-send "bar/web1\n"
>>     server web2 10.1.2.2:8002 agent-check agent-port 3334 agent-send "bar/web2\n"
>>
>> And have a single service (running on port 3334) which can easily MAINT or
>> UP either "foo" or "bar" depending on the value that it receives.
>>
>> The patch seems to work in my limited testing (that is to say, HAProxy
>> sends the string and doesn't segfault or leak infinite amounts of RAM).
>>
>> Does this sound useful to anyone else? Is it worth upstreaming the patch?
>> I welcome your thoughts.
>> --
>> James Brown
>> Engineer
>> EasyPost
>
> Hi James,
>
> This is interesting.
> That said, I'm suggesting an improvement: use the log format variables.
>
> So your configuration would become:
>
> backend foo
>     default-server agent-send "%b/%s\n"
>     server web1 10.1.2.1:8001 agent-check agent-port 3334
>     server web2 10.1.2.2:8001 agent-check agent-port 3334
>
> Baptiste

--
James Brown
Engineer
Re: HAproxy version 1.5 on centos 6.5
The .SPEC file included in the source (under examples/haproxy.spec) should build cleanly for CentOS 6 into an RPM (with rpmbuild). We modify it to change the TARGET to linux2628 and to enable openssl, but otherwise it seems to work fine. On Thu, Oct 22, 2015 at 1:00 AM, Wilence Yao wrote: > Hi, > I am a software developer from China. HAProxy is widely used in our > company and it help build our system stable and available. Thank you very > much for your efforts. > To make our system more stable and high availablity, our engineers > combine haproxy and keepalived to suffer from one point failure of > loadbalancer. > It's very excited to know haproxy peers to synchronize session. > > Unfortunately, our most production environments are centos 6.5. Rpm > installation output: > > >>> > $ rpm -ivh haproxy-1.5.14-3.1.x86_64.rpm > > warning: haproxy-1.5.14-3.1.x86_64.rpm: Header V3 RSA/SHA256 Signature, > key ID 8e1431d5: NOKEY > > error: Failed dependencies: > > libc.so.6(GLIBC_2.14)(64bit) is needed by haproxy-1.5.14-1.fc22.x86_64 > > libc.so.6(GLIBC_2.15)(64bit) is needed by haproxy-1.5.14-1.fc22.x86_64 > > libpcre.so.1()(64bit) is needed by haproxy-1.5.14-1.fc22.x86_64 > > systemd is needed by haproxy-1.5.14-1.fc22.x86_64 > >>> > Because of systemd dependency, we just can't install haproxy v1.5 in > centos 6.5. > > Do you have any solution or idea about this problem? > > > Thanks for any response. > > Best Regards. > > > Wilence Yao > -- James Brown Engineer
Multiplexing multiple services behind one agent (feature suggestion; patch attached)
Hello haproxy@:

My name is James Brown; I wrote a small piece of software called hacheck (https://github.com/Roguelazer/hacheck) which is designed to be a healthcheck proxy for decentralized load balancer control (remove a node from a load balancer without knowing where the load balancers are; helpful once you start to have a truly, stupidly large number of load balancers).

I am interested in using agent-checks instead of co-opting the existing httpchk mechanism; unfortunately, it looks like there's no convenient way to multiplex multiple services onto a single agent-port and reasonably disambiguate them. For example, it'd be great if I could have a server which runs one agent-check responder and can `MAINT` any of a dozen (or a hundred) different services running on this box.

I've attached a small patch which adds a new server parameter (agent-send) which is a static string which will be sent to the agent on every server. This allows me to generate configs that look like

backend foo
    server web1 10.1.2.1:8001 agent-check agent-port 3334 agent-send "foo/web1\n"
    server web2 10.1.2.2:8001 agent-check agent-port 3334 agent-send "foo/web2\n"

backend bar
    server web1 10.1.2.1:8002 agent-check agent-port 3334 agent-send "bar/web1\n"
    server web2 10.1.2.2:8002 agent-check agent-port 3334 agent-send "bar/web2\n"

And have a single service (running on port 3334) which can easily MAINT or UP either "foo" or "bar" depending on the value that it receives.

The patch seems to work in my limited testing (that is to say, HAProxy sends the string and doesn't segfault or leak infinite amounts of RAM).

Does this sound useful to anyone else? Is it worth upstreaming the patch? I welcome your thoughts.
--
James Brown
Engineer
EasyPost

commit dde76f0aadfcd386c36b27fb4ff49a5163bc9b93
Author: James Brown
Date:   Wed Oct 21 18:19:05 2015 -0700

    Add agent-send server parameter

    Causes HAProxy to emit a static string to the agent on every check, so
    that you can independently control multiple services running behind a
    single agent port.

diff --git a/doc/configuration.txt b/doc/configuration.txt
index b509238..2452223 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -10056,6 +10056,13 @@ agent-check

   Supported in default-server: No

+agent-send
+  If this option is specified, haproxy will send the given string (verbatim)
+  to the agent server upon connection. You could, for example, encode
+  the backend name into this string, which would enable your agent to send
+  different responses based on the backend. Make sure to include a '\n' if
+  you want to terminate your request with a newline.
+
 agent-inter
   The "agent-inter" parameter sets the interval between two agent checks
   to milliseconds. If left unspecified, the delay defaults to 2000 ms.
diff --git a/include/types/checks.h b/include/types/checks.h
index 02fc743..dd20184 100644
--- a/include/types/checks.h
+++ b/include/types/checks.h
@@ -176,6 +176,8 @@ struct check {
 	 * rise to rise+fall-1 = good */
 	int rise, fall;		/* time in iterations */
 	int type;		/* Check type, one of PR_O2_*_CHK */
+	char *send_string;	/* optionally send a string when connecting to the agent */
+	int send_string_len;	/* length of agent command string */
 	struct server *server;	/* back-pointer to server */
 	char **argv;		/* the arguments to use if running a process-based check */
 	char **envp;		/* the environment to use if running a process-based check */
diff --git a/src/checks.c b/src/checks.c
index ade2428..0e72a32 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -1459,6 +1459,10 @@ static int connect_conn_chk(struct task *t)
 		}
 	}

+	if ((check->type & PR_O2_LB_AGENT_CHK) && check->send_string_len) {
+		bo_putblk(check->bo, check->send_string, check->send_string_len);
+	}
+
 	/* prepare a new connection */
 	conn_init(conn);
diff --git a/src/server.c b/src/server.c
index 8ddff00..55c2678 100644
--- a/src/server.c
+++ b/src/server.c
@@ -984,6 +984,8 @@ int parse_server(const char *file, int linenum, char **args, struct proxy *curpr
 			newsrv->check.downinter = curproxy->defsrv.check.downinter;
 			newsrv->agent.use_ssl   = curproxy->defsrv.agent.use_ssl;
 			newsrv->agent.port      = curproxy->defsrv.agent.port;
+			newsrv->agent.send_string = curproxy->defsrv.agent.send_string;
+			newsrv->agent.send_string_len = curproxy->