Re: Haproxy 1.8.25 segfault
On Wed, May 27, 2020 at 11:48:05AM +1000, Igor Cicimov wrote: > Hi Willy, > > On Tue, May 26, 2020 at 4:43 PM Willy Tarreau wrote: > > > On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote: > > > We are getting segfaults with haproxy 1.8.25 > > > > By the way, does this mean you didn't get them with a previous version > > (presumably 1.8.24) ? There aren't that many fixes between 1.8.24 and > > 1.8.25, only 23. > > > > Yes, it started happening recently for some reason def on 1.8.25 only: OK, very useful! > Unfortunately (in context of figuring out the issue) we are not using lua > :-/ So I'm hardly seeing a candidate here, unless we have another latent bug that was woken up by one of the recent fixes. > One thing I noticed though was that there was an OCSP file that had landed > by mistake inside the SSL directory HAP is loading the certificates from. > Do you think something like that can cause this to happen over the course > of time? I don't but thought worth mentioning since that was the only diff > I could see from our standard config elsewhere. I don't think so, but thanks for mentioning it, we never know and any track should be considered! Cheers, Willy
Re: Haproxy 1.8.25 segfault
Hi Willy, On Tue, May 26, 2020 at 4:43 PM Willy Tarreau wrote: > On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote: > > We are getting segfaults with haproxy 1.8.25 > > By the way, does this mean you didn't get them with a previous version > (presumably 1.8.24) ? There aren't that many fixes between 1.8.24 and > 1.8.25, only 23. > Yes, it started happening recently for some reason def on 1.8.25 only: # zgrep -i segfault /var/log/syslog.*.gz /var/log/syslog.4.gz:May 23 00:36:52 ip-172-31-37-74 kernel: [30284682.620567] haproxy[14736]: segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 error 4 in libc-2.19.so [7f1b9388e000+1be000] /var/log/syslog.5.gz:May 22 01:18:55 ip-172-31-37-74 kernel: [30200805.498707] haproxy[7361]: segfault at 5575725c8fff ip 7f7d35bd4c10 sp 7b58a078 error 4 in libc-2.19.so [7f7d35b3a000+1be000] /var/log/syslog.5.gz:May 22 12:15:55 ip-172-31-37-74 kernel: [30240225.673643] haproxy[15054]: segfault at 555f41a03fff ip 7f594d00fc10 sp 7ffeac111c98 error 4 in libc-2.19.so [7f594cf75000+1be000] /var/log/syslog.6.gz:May 21 12:08:13 ip-172-31-37-74 kernel: [30153363.801627] haproxy[28398]: segfault at 55b4b43ccfff ip 7f5f33b53c10 sp 7ffe7fc290e8 error 4 in libc-2.19.so [7f5f33ab9000+1be000] /var/log/syslog.7.gz:May 20 16:04:54 ip-172-31-37-74 kernel: [30081165.011057] haproxy[4830]: segfault at 563a387fafff ip 7fa11e6f2c10 sp 7fffaacd82c8 error 4 in libc-2.19.so [7fa11e658000+1be000] > > The only one among them that I'm seeing capable of possibly having a > side effet in unclear code parts would be this one: > >3d69a6029 ("BUG/MINOR: lua: Ignore the reserve to know if a channel is > full or not") > > Do you use some Lua code which would involve the is_full() attribute on > a channel ? > > Willy > Unfortunately (in context of figuring out the issue) we are not using lua :-/ One thing I noticed though was that there was an OCSP file that had landed by mistake inside the SSL directory HAP is loading the certificates from. Do you think something like that can cause this to happen over the course of time? I don't but thought worth mentioning since that was the only diff I could see from our standard config elsewhere. Thanks, Igor
Re: Haproxy 1.8.25 segfault
Hi Willy, On Tue, May 26, 2020 at 4:31 PM Willy Tarreau wrote: > Hi Igor, > > On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote: > > Hi guys, > > > > We are getting segfaults with haproxy 1.8.25 and thought I would ask if > > this rings any bell: > > > > segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 error 4 > in > > libc-2.19.so[7f1b9388e000+1be000] > > At this point, no unfortunately. This could be a memcpy() on a NULL > pointer or a use after free for example. > > > It is running on Ubuntu-14.04.2 (kernel 4.4.0-144-generic) and is > happening > > only on this particular one out of many dozens we have on Ubuntu-14.04 > and > > 16.04 > > > > I have attached strace so more details upon the next crash. > > I doubt you'll see much more using strace. You'd rather attach gdb to > it and let it run. This way when it crashes again you can issue "bt full" > and see the whole trace. > > Done. Hopefully I get something useful on the next segfault. > It is even possible to force a core to be dumped from gdb for later > inspection using "generate-core-file". Some people also know how to script > it so that it automatically dumps and detaches upon crash, and limits the > service interruption time, but I never remember how to do this, and the > help embedded in it is next to inexistent :-/ > Nice, good to know thanks will dig around for details. > > Regards, > Willy > Cheers, Igor
stable-bot: Bugfixes waiting for a release 2.1 (52), 2.0 (45)
Hi, This is a friendly bot that watches fixes pending for the next haproxy-stable release! One such e-mail is sent periodically once patches are waiting in the last maintenance branch, and an ideal release date is computed based on the severity of these fixes and their merge date. Responses to this mail must be sent to the mailing list. Last release 2.1.4 was issued on 2020-04-02. There are currently 52 patches in the queue cut down this way: - 1 MAJOR, first one merged on 2020-05-20 - 20 MEDIUM, first one merged on 2020-05-01 - 31 MINOR, first one merged on 2020-04-02 Thus the computed ideal release date for 2.1.5 would be 2020-04-30, which was four weeks ago. Last release 2.0.14 was issued on 2020-04-02. There are currently 45 patches in the queue cut down this way: - 1 MAJOR, first one merged on 2020-05-22 - 18 MEDIUM, first one merged on 2020-05-07 - 26 MINOR, first one merged on 2020-04-02 Thus the computed ideal release date for 2.0.15 would be 2020-04-30, which was four weeks ago. The current list of patches in the queue is: - 2.0 - MAJOR : stream-int: always detach a faulty endpoint on connect failure - 2.1 - MAJOR : mux-fcgi: Stop sending loop if FCGI stream is blocked for any reason - 2.0, 2.1 - MEDIUM : lua: Fix dumping of stick table entries for STD_T_DICT - 2.0, 2.1 - MEDIUM : shctx: bound the number of loops that can happen around the lock - 2.1 - MEDIUM : h1: Don't compare host and authority if only h1 headers are parsed - 2.0, 2.1 - MEDIUM : streams: Remove SF_ADDR_SET if we're retrying due to L7 retry. - 2.0, 2.1 - MEDIUM : http: the "unique-id" sample fetch could crash without a steeam - 2.0 - MEDIUM : backend: don't access a non-existing mux from a previous connection - 2.0, 2.1 - MEDIUM : http_ana: make the detection of NTLM variants safer - 2.0, 2.1 - MEDIUM : http: the "http_first_req" sample fetch could crash without a steeam - 2.0, 2.1 - MEDIUM : http-ana: Handle NTLM messages correctly. - 2.0, 2.1 - MEDIUM : shctx: really check the lock's value while waiting - 2.0, 2.1 - MEDIUM : capture: capture-req/capture-res converters crash without a stream - 2.0, 2.1 - MEDIUM : capture: capture.{req,res}.* crash without a stream - 2.0 - MEDIUM : checks: Always initialize checks before starting them - 2.0, 2.1 - MEDIUM : server/checks: Init server check during config validity check - 2.1 - MEDIUM : mux-fcgi: Fix wrong test on FCGI_CF_KEEP_CONN in fcgi_detach() - 2.1 - MEDIUM : ring: write-lock the ring while attaching/detaching - 2.0, 2.1 - MEDIUM : sample: make the CPU and latency sample fetches check for a stream - 2.1 - MEDIUM : mux_fcgi: Free the FCGI connection at the end of fcgi_release() - 2.0, 2.1 - MEDIUM : connections: force connections cleanup on server changes - 2.0, 2.1 - MEDIUM : listener: mark the thread as not stuck inside the loop - 2.0, 2.1 - MEDIUM : ssl: fix the id length check within smp_fetch_ssl_fc_session_id() - 2.0, 2.1 - MEDIUM : stream: Only allow L7 retries when using HTTP. - 2.0, 2.1 - MINOR : checks: Respect check-ssl param when a port or an addr is specified - 2.0, 2.1 - MINOR : checks: Remove a warning about http health checks - 2.0, 2.1 - MINOR : obj_type: Handle stream object in obj_base_ptr() function - 2.0, 2.1 - MINOR : checks/server: use_ssl member must be signed - 2.0, 2.1 - MINOR : connection: make sure to correctly tag local PROXY connections" - 2.0, 2.1 - MINOR : checks: Respect the no-check-ssl option - 2.0, 2.1 - MINOR : pollers: remove uneeded free in global init - 2.0, 2.1 - MINOR : checks: Compute the right HTTP request length for HTTP health checks - 2.0, 2.1 - MINOR : soft-stop: always wake up waiting threads on stopping - 2.0, 2.1 - MINOR : ssl: default settings for ssl server options are not used - 2.0, 2.1 - MINOR : sample: Set the correct type when a binary is converted to a string - 2.0, 2.1 - MINOR : tools: fix the i386 version of the div64_32 function - 2.0, 2.1 - MINOR : cfgparse: Abort parsing the current line if an invalid \x sequence is encountered - 2.0, 2.1 - MINOR : threads: fix multiple use of argument inside HA_ATOMIC_UPDATE_{MIN,MAX}() - 2.1 - MINOR : ssl:
Re: [PATCH] cleanup coverity findging (make it silent)
вт, 26 мая 2020 г. в 12:02, Willy Tarreau : > Hi Ilya, > > On Sat, May 23, 2020 at 03:47:58PM +0500, ??? wrote: > > From: Ilya Shipitsin > > Date: Sat, 23 May 2020 15:35:36 +0500 > > Subject: [PATCH] CLEANUP: src/checks.c: ignore return value using > DISGUISE(..) > > > > we do not want to check status of extchk_setenv, but static analyzsers > > like Coverity are not happy about it. Let calm coverity down. > > Are you really sure we don't want to check them ? I'm seeing that > prepare_external_check() uses EXTCHK_SETENV() to purposely add checks > there, so it's unclear to me why we want to silently fail here. Maybe > the calls should instead be changed to have a check and a jump to an > error label doing the exit(). > > I don't know if anyone has an opinion on this, I'm not using external > checks :-/ > well, I meant to keep current behaviour, but also silence coverity warning. ok, we can investigate and discuss would it be better to change current behaviour or to keep it. > > Willy >
Re: [PR] The agent-check fail state is represented as "fail"
Hi Jack, On Wed, May 20, 2020 at 05:23:08PM +0200, PR Bot wrote: > Author: Jack Neely > Number of patches: 1 > > This is an automated relay of the Github pull request: >The agent-check fail state is represented as "fail" > > Patch title(s): >The agent-check fail state is represented as "fail" > > Link: >https://github.com/haproxy/haproxy/pull/642 > > Edit locally: >wget https://github.com/haproxy/haproxy/pull/642.patch && vi 642.patch > > Apply locally: >curl https://github.com/haproxy/haproxy/pull/642.patch | git am - > > Description: >Documentation has stated the string is "failed" and this doesn't match >the source code. An agent-check returning "failed" causes HAProxy to >not make state changes. Very interesting catch! However I don't agree with only changing the doc because the doc is a spec that developers use to create their code. Someone might have written an agent that works well against the doc and not necessarily against haproxy and we can't change that afterwards. But looking at this part, I'm seeing that the initial spec did mention "fail", just like the code and comments everywhere, and it's just when I reworded that part to extend the agent's language in 1.5-dev26 in 2013 (commit 81f5d94a0be) that I mistakenly wrote "failed" instead of "fail" in the doc. So now the situation is: - the code has always exclusively checked for "fail" - the doc prior to 1.5-dev26 used to mention "fail" - the doc since 1.5-dev26 has always mentioned "failed" There definitely are agents that pre-date this extension, and it's almost certain that some people wrote their own agents since 2013 without having an immediate access to haproxy for a test. Thus what I'd suggest would be the following plan: 1) we modify the code to also validate "failed" as a valid keyword (there's a single "strcasecmp" invocation to update) and we put a comment there referencing this discussion for the background. 2) we update the doc to document "fail" as you did with no mention of "failed" so that those who figure this potential incompatibility are encouraged to update their code (and test it), and are not encouraged to keep the old keyword that will not work with any of the supported versions in field. 3) we backport this to all branches. Please also have a look at CONTRIBUTING for your next patch. Do not hesitate to create an issue so as not to lose it if you don't have time to rework your patch right now. Thanks! Willy
Re: [PATCH] BUG/MEDIUM: contrib/spoa: do not register python3.8 if --embed fail
Hi Bertrand, On Fri, May 22, 2020 at 05:03:46PM +0100, Bertrand Jacquin wrote: > spoa server fails to build when python3.8 is not available. If > python3-config --embed fails, the output of the command is registered in > check_python_config. However when it's later used to define > PYTHON_DEFAULT_INC and PYTHON_DEFAULT_LIB it's content does not match > and fallback to python2.7 > > Content of check_python_config when building with python3.6: > > Usage: bin/python3-config > --prefix|--exec-prefix|--includes|--libs|--cflags|--ldflags|--extension-suffix|--help|--abiflags|--configdir > python3 > > As we are only looking for return code, this commit ensure we always > ignore the output of python3-config or hash commands. Makes sense, now applied, thank you! Willy
Re: Redefine 401 error page
Le 26/05/2020 à 08:56, Willy Tarreau a écrit : Hi Joao, On Thu, May 21, 2020 at 09:41:19PM -0300, Joao Morais wrote: Hello list, the 401 is one of the http status code haproxy generates itself: https://github.com/haproxy/haproxy/blob/v2.1.0/doc/configuration.txt#L363 This cannot however be overwritten using the errorfile keyword as stated in the doc: https://github.com/haproxy/haproxy/blob/v2.1.0/doc/configuration.txt#L3558 and also testing myself: [WARNING] 142/002731 (1) : parsing [/tmp/haproxy/h.cfg:9] : status code 401 not handled by 'errorfile', error customization will be ignored. I'm aware that a Lua script can generate a custom page and an arbitrary http status code which could work around this: core.register_service("send-401", "http", function(applet) send(applet, 401, [[ My custom 401 page ]]) end) ... but is there a way to, instead, customize the output of `http-request auth`? I had a look in the code and not, it's not possible right now, the message is constructed in function http_reply_40x_unauthorized(). It's sad because what this function does is only a subset of what is possible to do with the new configurable error files. But it takes one extra argument that needs to be emitted and that makes it impossible to express as a regular error file. In an ideal world this function should call what is used normally to build an error message, then just concatenate the addition of the "basic realm=" and auth_realm into either www-authenticate or proxy-authenticate and that should work. I'm CCing Christopher in case he sees a trivial way to do this. Digging into this also made me realize that all but two of the HTTP_xxx messages defined in http.c are not used anymore and could be removed. I'm even suspecting the remaining ones (302 and 303) might be accidental leftovers. In HAProxy 2.2, I guess 401/407 responses may be generated using an http-request return rule, making http-request auth rule more or less deprecated. The only mess is to handle 2 different responses depending on the request path when the http-use-proxy-header option is used. In addition, http_reply_40x_unauthorized() is also called for the stats page authentication. This part cannot be replaced by an http-request return rule. So maybe a good solution is to use error files (customizable since 2.2). And just add the realm at the right place, as Willy said. I will investigate. For HTTP_30X templates, you're right, they should be removed. And HTTP_10X too. -- Christopher Faulet
Re: [PATCH] cleanup coverity findging (make it silent)
Hi Ilya, On Sat, May 23, 2020 at 03:47:58PM +0500, ??? wrote: > From: Ilya Shipitsin > Date: Sat, 23 May 2020 15:35:36 +0500 > Subject: [PATCH] CLEANUP: src/checks.c: ignore return value using DISGUISE(..) > > we do not want to check status of extchk_setenv, but static analyzsers > like Coverity are not happy about it. Let calm coverity down. Are you really sure we don't want to check them ? I'm seeing that prepare_external_check() uses EXTCHK_SETENV() to purposely add checks there, so it's unclear to me why we want to silently fail here. Maybe the calls should instead be changed to have a check and a jump to an error label doing the exit(). I don't know if anyone has an opinion on this, I'm not using external checks :-/ Willy
Re: Redefine 401 error page
Hi Joao, On Thu, May 21, 2020 at 09:41:19PM -0300, Joao Morais wrote: > > Hello list, the 401 is one of the http status code haproxy generates itself: > > https://github.com/haproxy/haproxy/blob/v2.1.0/doc/configuration.txt#L363 > > This cannot however be overwritten using the errorfile keyword as stated in > the doc: > > https://github.com/haproxy/haproxy/blob/v2.1.0/doc/configuration.txt#L3558 > > and also testing myself: > > [WARNING] 142/002731 (1) : parsing [/tmp/haproxy/h.cfg:9] : status code > 401 not > handled by 'errorfile', error customization will be ignored. > > I'm aware that a Lua script can generate a custom page and an arbitrary http > status code which could work around this: > > core.register_service("send-401", "http", function(applet) > send(applet, 401, [[ > My custom 401 page > ]]) > end) > > ... but is there a way to, instead, customize the output of `http-request > auth`? I had a look in the code and not, it's not possible right now, the message is constructed in function http_reply_40x_unauthorized(). It's sad because what this function does is only a subset of what is possible to do with the new configurable error files. But it takes one extra argument that needs to be emitted and that makes it impossible to express as a regular error file. In an ideal world this function should call what is used normally to build an error message, then just concatenate the addition of the "basic realm=" and auth_realm into either www-authenticate or proxy-authenticate and that should work. I'm CCing Christopher in case he sees a trivial way to do this. Digging into this also made me realize that all but two of the HTTP_xxx messages defined in http.c are not used anymore and could be removed. I'm even suspecting the remaining ones (302 and 303) might be accidental leftovers. Willy
Re: Haproxy 1.8.25 segfault
Hi Igor, On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote: > Hi guys, > > We are getting segfaults with haproxy 1.8.25 and thought I would ask if > this rings any bell: > > segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 error 4 in > libc-2.19.so[7f1b9388e000+1be000] At this point, no unfortunately. This could be a memcpy() on a NULL pointer or a use after free for example. > It is running on Ubuntu-14.04.2 (kernel 4.4.0-144-generic) and is happening > only on this particular one out of many dozens we have on Ubuntu-14.04 and > 16.04 > > I have attached strace so more details upon the next crash. I doubt you'll see much more using strace. You'd rather attach gdb to it and let it run. This way when it crashes again you can issue "bt full" and see the whole trace. It is even possible to force a core to be dumped from gdb for later inspection using "generate-core-file". Some people also know how to script it so that it automatically dumps and detaches upon crash, and limits the service interruption time, but I never remember how to do this, and the help embedded in it is next to inexistent :-/ Regards, Willy