Re: Haproxy 1.8.25 segfault

2020-05-26 Thread Willy Tarreau
On Wed, May 27, 2020 at 11:48:05AM +1000, Igor Cicimov wrote:
> Hi Willy,
> 
> On Tue, May 26, 2020 at 4:43 PM Willy Tarreau  wrote:
> 
> > On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote:
> > > We are getting segfaults with haproxy 1.8.25
> >
> > By the way, does this mean you didn't get them with a previous version
> > (presumably 1.8.24) ? There aren't that many fixes between 1.8.24 and
> > 1.8.25, only 23.
> >
> 
> Yes, it started happening recently for some reason def on 1.8.25 only:

OK, very useful!

> Unfortunately (in context of figuring out the issue) we are not using lua
> :-/

So I'm hardly seeing a candidate here, unless we have another latent bug
that was woken up by one of the recent fixes.

> One thing I noticed though was that there was an OCSP file that had landed
> by mistake inside the SSL directory HAP is loading the certificates from.
> Do you think something like that can cause this to happen over the course
> of time? I don't but thought worth mentioning since that was the only diff
> I could see from our standard config elsewhere.

I don't think so, but thanks for mentioning it, we never know and any track
should be considered!

Cheers,
Willy



Re: Haproxy 1.8.25 segfault

2020-05-26 Thread Igor Cicimov
Hi Willy,

On Tue, May 26, 2020 at 4:43 PM Willy Tarreau  wrote:

> On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote:
> > We are getting segfaults with haproxy 1.8.25
>
> By the way, does this mean you didn't get them with a previous version
> (presumably 1.8.24) ? There aren't that many fixes between 1.8.24 and
> 1.8.25, only 23.
>

Yes, it started happening recently for some reason def on 1.8.25 only:

# zgrep -i segfault /var/log/syslog.*.gz
/var/log/syslog.4.gz:May 23 00:36:52 ip-172-31-37-74 kernel:
[30284682.620567] haproxy[14736]: segfault at 5609a853 ip
7f1b93928c10 sp 7ffd5e731fd8 error 4 in libc-2.19.so
[7f1b9388e000+1be000]
/var/log/syslog.5.gz:May 22 01:18:55 ip-172-31-37-74 kernel:
[30200805.498707] haproxy[7361]: segfault at 5575725c8fff ip
7f7d35bd4c10 sp 7b58a078 error 4 in libc-2.19.so
[7f7d35b3a000+1be000]
/var/log/syslog.5.gz:May 22 12:15:55 ip-172-31-37-74 kernel:
[30240225.673643] haproxy[15054]: segfault at 555f41a03fff ip
7f594d00fc10 sp 7ffeac111c98 error 4 in libc-2.19.so
[7f594cf75000+1be000]
/var/log/syslog.6.gz:May 21 12:08:13 ip-172-31-37-74 kernel:
[30153363.801627] haproxy[28398]: segfault at 55b4b43ccfff ip
7f5f33b53c10 sp 7ffe7fc290e8 error 4 in libc-2.19.so
[7f5f33ab9000+1be000]
/var/log/syslog.7.gz:May 20 16:04:54 ip-172-31-37-74 kernel:
[30081165.011057] haproxy[4830]: segfault at 563a387fafff ip
7fa11e6f2c10 sp 7fffaacd82c8 error 4 in libc-2.19.so
[7fa11e658000+1be000]


>
> The only one among them that I'm seeing capable of possibly having a
> side effet in unclear code parts would be this one:
>
>3d69a6029 ("BUG/MINOR: lua: Ignore the reserve to know if a channel is
> full or not")
>
> Do you use some Lua code which would involve the is_full() attribute on
> a channel ?
>
> Willy
>

Unfortunately (in context of figuring out the issue) we are not using lua
:-/
One thing I noticed though was that there was an OCSP file that had landed
by mistake inside the SSL directory HAP is loading the certificates from.
Do you think something like that can cause this to happen over the course
of time? I don't but thought worth mentioning since that was the only diff
I could see from our standard config elsewhere.

Thanks,
Igor


Re: Haproxy 1.8.25 segfault

2020-05-26 Thread Igor Cicimov
Hi Willy,

On Tue, May 26, 2020 at 4:31 PM Willy Tarreau  wrote:

> Hi Igor,
>
> On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote:
> > Hi guys,
> >
> > We are getting segfaults with haproxy 1.8.25 and thought I would ask if
> > this rings any bell:
> >
> > segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 error 4
> in
> > libc-2.19.so[7f1b9388e000+1be000]
>
> At this point, no unfortunately. This could be a memcpy() on a NULL
> pointer or a use after free for example.
>
> > It is running on Ubuntu-14.04.2 (kernel 4.4.0-144-generic) and is
> happening
> > only on this particular one out of many dozens we have on Ubuntu-14.04
> and
> > 16.04
> >
> > I have attached strace so more details upon the next crash.
>
> I doubt you'll see much more using strace. You'd rather attach gdb to
> it and let it run. This way when it crashes again you can issue "bt full"
> and see the whole trace.
>
>
Done. Hopefully I get something useful on the next segfault.

> It is even possible to force a core to be dumped from gdb for later
> inspection using "generate-core-file". Some people also know how to script
> it so that it automatically dumps and detaches upon crash, and limits the
> service interruption time, but I never remember how to do this, and the
> help embedded in it is next to inexistent :-/
>

Nice, good to know thanks will dig around for details.

>
> Regards,
> Willy
>

Cheers,
Igor


stable-bot: Bugfixes waiting for a release 2.1 (52), 2.0 (45)

2020-05-26 Thread stable-bot
Hi,

This is a friendly bot that watches fixes pending for the next haproxy-stable 
release!  One such e-mail is sent periodically once patches are waiting in the 
last maintenance branch, and an ideal release date is computed based on the 
severity of these fixes and their merge date.  Responses to this mail must be 
sent to the mailing list.


Last release 2.1.4 was issued on 2020-04-02.  There are currently 52 patches in 
the queue cut down this way:
- 1 MAJOR, first one merged on 2020-05-20
- 20 MEDIUM, first one merged on 2020-05-01
- 31 MINOR, first one merged on 2020-04-02

Thus the computed ideal release date for 2.1.5 would be 2020-04-30, which was 
four weeks ago.

Last release 2.0.14 was issued on 2020-04-02.  There are currently 45 patches 
in the queue cut down this way:
- 1 MAJOR, first one merged on 2020-05-22
- 18 MEDIUM, first one merged on 2020-05-07
- 26 MINOR, first one merged on 2020-04-02

Thus the computed ideal release date for 2.0.15 would be 2020-04-30, which was 
four weeks ago.

The current list of patches in the queue is:
 - 2.0   - MAJOR   : stream-int: always detach a faulty 
endpoint on connect failure
 - 2.1   - MAJOR   : mux-fcgi: Stop sending loop if FCGI 
stream is blocked for any reason
 - 2.0, 2.1  - MEDIUM  : lua: Fix dumping of stick table 
entries for STD_T_DICT
 - 2.0, 2.1  - MEDIUM  : shctx: bound the number of loops that 
can happen around the lock
 - 2.1   - MEDIUM  : h1: Don't compare host and authority 
if only h1 headers are parsed
 - 2.0, 2.1  - MEDIUM  : streams: Remove SF_ADDR_SET if we're 
retrying due to L7 retry.
 - 2.0, 2.1  - MEDIUM  : http: the "unique-id" sample fetch 
could crash without a steeam
 - 2.0   - MEDIUM  : backend: don't access a non-existing 
mux from a previous connection
 - 2.0, 2.1  - MEDIUM  : http_ana: make the detection of NTLM 
variants safer
 - 2.0, 2.1  - MEDIUM  : http: the "http_first_req" sample 
fetch could crash without a steeam
 - 2.0, 2.1  - MEDIUM  : http-ana: Handle NTLM messages 
correctly.
 - 2.0, 2.1  - MEDIUM  : shctx: really check the lock's value 
while waiting
 - 2.0, 2.1  - MEDIUM  : capture: capture-req/capture-res 
converters crash without a stream
 - 2.0, 2.1  - MEDIUM  : capture: capture.{req,res}.* crash 
without a stream
 - 2.0   - MEDIUM  : checks: Always initialize checks 
before starting them
 - 2.0, 2.1  - MEDIUM  : server/checks: Init server check 
during config validity check
 - 2.1   - MEDIUM  : mux-fcgi: Fix wrong test on 
FCGI_CF_KEEP_CONN in fcgi_detach()
 - 2.1   - MEDIUM  : ring: write-lock the ring while 
attaching/detaching
 - 2.0, 2.1  - MEDIUM  : sample: make the CPU and latency 
sample fetches check for a stream
 - 2.1   - MEDIUM  : mux_fcgi: Free the FCGI connection at 
the end of fcgi_release()
 - 2.0, 2.1  - MEDIUM  : connections: force connections cleanup 
on server changes
 - 2.0, 2.1  - MEDIUM  : listener: mark the thread as not stuck 
inside the loop
 - 2.0, 2.1  - MEDIUM  : ssl: fix the id length check within 
smp_fetch_ssl_fc_session_id()
 - 2.0, 2.1  - MEDIUM  : stream: Only allow L7 retries when 
using HTTP.
 - 2.0, 2.1  - MINOR   : checks: Respect check-ssl param when a 
port or an addr is specified
 - 2.0, 2.1  - MINOR   : checks: Remove a warning about http 
health checks
 - 2.0, 2.1  - MINOR   : obj_type: Handle stream object in 
obj_base_ptr() function
 - 2.0, 2.1  - MINOR   : checks/server: use_ssl member must be 
signed
 - 2.0, 2.1  - MINOR   : connection: make sure to correctly tag 
local PROXY connections"
 - 2.0, 2.1  - MINOR   : checks: Respect the no-check-ssl option
 - 2.0, 2.1  - MINOR   : pollers: remove uneeded free in global 
init
 - 2.0, 2.1  - MINOR   : checks: Compute the right HTTP request 
length for HTTP health checks
 - 2.0, 2.1  - MINOR   : soft-stop: always wake up waiting 
threads on stopping
 - 2.0, 2.1  - MINOR   : ssl: default settings for ssl server 
options are not used
 - 2.0, 2.1  - MINOR   : sample: Set the correct type when a 
binary is converted to a string
 - 2.0, 2.1  - MINOR   : tools: fix the i386 version of the 
div64_32 function
 - 2.0, 2.1  - MINOR   : cfgparse: Abort parsing the current 
line if an invalid \x sequence is encountered
 - 2.0, 2.1  - MINOR   : threads: fix multiple use of argument 
inside HA_ATOMIC_UPDATE_{MIN,MAX}()
 - 2.1   - MINOR   : ssl: 

Re: [PATCH] cleanup coverity findging (make it silent)

2020-05-26 Thread Илья Шипицин
вт, 26 мая 2020 г. в 12:02, Willy Tarreau :

> Hi Ilya,
>
> On Sat, May 23, 2020 at 03:47:58PM +0500,  ??? wrote:
> > From: Ilya Shipitsin 
> > Date: Sat, 23 May 2020 15:35:36 +0500
> > Subject: [PATCH] CLEANUP: src/checks.c: ignore return value using
> DISGUISE(..)
> >
> > we do not want to check status of extchk_setenv, but static analyzsers
> > like Coverity are not happy about it. Let calm coverity down.
>
> Are you really sure we don't want to check them ? I'm seeing that
> prepare_external_check() uses EXTCHK_SETENV() to purposely add checks
> there, so it's unclear to me why we want to silently fail here. Maybe
> the calls should instead be changed to have a check and a jump to an
> error label doing the exit().
>
> I don't know if anyone has an opinion on this, I'm not using external
> checks :-/
>

well, I meant to keep current behaviour, but also silence coverity warning.

ok, we can investigate and discuss would it be better to change current
behaviour or to keep it.


>
> Willy
>


Re: [PR] The agent-check fail state is represented as "fail"

2020-05-26 Thread Willy Tarreau
Hi Jack,

On Wed, May 20, 2020 at 05:23:08PM +0200, PR Bot wrote:
> Author: Jack Neely 
> Number of patches: 1
> 
> This is an automated relay of the Github pull request:
>The agent-check fail state is represented as "fail"
> 
> Patch title(s): 
>The agent-check fail state is represented as "fail"
> 
> Link:
>https://github.com/haproxy/haproxy/pull/642
> 
> Edit locally:
>wget https://github.com/haproxy/haproxy/pull/642.patch && vi 642.patch
> 
> Apply locally:
>curl https://github.com/haproxy/haproxy/pull/642.patch | git am -
> 
> Description:
>Documentation has stated the string is "failed" and this doesn't match
>the source code.  An agent-check returning "failed" causes HAProxy to
>not make state changes.

Very interesting catch!

However I don't agree with only changing the doc because the doc is a
spec that developers use to create their code. Someone might have
written an agent that works well against the doc and not necessarily
against haproxy and we can't change that afterwards.

But looking at this part, I'm seeing that the initial spec did mention
"fail", just like the code and comments everywhere, and it's just when
I reworded that part to extend the agent's language in 1.5-dev26 in 2013
(commit 81f5d94a0be) that I mistakenly wrote "failed" instead of "fail"
in the doc.

So now the situation is:
  - the code has always exclusively checked for "fail"
  - the doc prior to 1.5-dev26 used to mention "fail"
  - the doc since 1.5-dev26 has always mentioned "failed"

There definitely are agents that pre-date this extension, and it's almost
certain that some people wrote their own agents since 2013 without having
an immediate access to haproxy for a test.

Thus what I'd suggest would be the following plan:
  1) we modify the code to also validate "failed" as a valid keyword
 (there's a single "strcasecmp" invocation to update) and we put
 a comment there referencing this discussion for the background.

  2) we update the doc to document "fail" as you did with no mention
 of "failed" so that those who figure this potential incompatibility
 are encouraged to update their code (and test it), and are not
 encouraged to keep the old keyword that will not work with any
 of the supported versions in field.

  3) we backport this to all branches.

Please also have a look at CONTRIBUTING for your next patch. Do not
hesitate to create an issue so as not to lose it if you don't have
time to rework your patch right now.

Thanks!
Willy



Re: [PATCH] BUG/MEDIUM: contrib/spoa: do not register python3.8 if --embed fail

2020-05-26 Thread Willy Tarreau
Hi Bertrand,

On Fri, May 22, 2020 at 05:03:46PM +0100, Bertrand Jacquin wrote:
> spoa server fails to build when python3.8 is not available. If
> python3-config --embed fails, the output of the command is registered in
> check_python_config.  However when it's later used to define
> PYTHON_DEFAULT_INC and PYTHON_DEFAULT_LIB it's content does not match
> and fallback to python2.7
> 
> Content of check_python_config when building with python3.6:
> 
>   Usage: bin/python3-config 
> --prefix|--exec-prefix|--includes|--libs|--cflags|--ldflags|--extension-suffix|--help|--abiflags|--configdir
>  python3
> 
> As we are only looking for return code, this commit ensure we always
> ignore the output of python3-config or hash commands.

Makes sense, now applied, thank you!
Willy



Re: Redefine 401 error page

2020-05-26 Thread Christopher Faulet

Le 26/05/2020 à 08:56, Willy Tarreau a écrit :

Hi Joao,

On Thu, May 21, 2020 at 09:41:19PM -0300, Joao Morais wrote:


Hello list, the 401 is one of the http status code haproxy generates itself:

 https://github.com/haproxy/haproxy/blob/v2.1.0/doc/configuration.txt#L363

This cannot however be overwritten using the errorfile keyword as stated in the 
doc:

 https://github.com/haproxy/haproxy/blob/v2.1.0/doc/configuration.txt#L3558

and also testing myself:

 [WARNING] 142/002731 (1) : parsing [/tmp/haproxy/h.cfg:9] : status code 
401 not
 handled by 'errorfile', error customization will be ignored.

I'm aware that a Lua script can generate a custom page and an arbitrary http 
status code which could work around this:

 core.register_service("send-401", "http", function(applet)
 send(applet, 401, [[
 My custom 401 page
 ]])
 end)

... but is there a way to, instead, customize the output of `http-request auth`?


I had a look in the code and not, it's not possible right now, the
message is constructed in function http_reply_40x_unauthorized(). It's
sad because what this function does is only a subset of what is possible
to do with the new configurable error files. But it takes one extra
argument that needs to be emitted and that makes it impossible to
express as a regular error file.

In an ideal world this function should call what is used normally to
build an error message, then just concatenate the addition of the
"basic realm=" and auth_realm into either www-authenticate or
proxy-authenticate and that should work. I'm CCing Christopher in case
he sees a trivial way to do this.

Digging into this also made me realize that all but two of the HTTP_xxx
messages defined in http.c are not used anymore and could be removed.
I'm even suspecting the remaining ones (302 and 303) might be accidental
leftovers.



In HAProxy 2.2, I guess 401/407 responses may be generated using an http-request 
return rule, making http-request auth rule more or less deprecated. The only 
mess is to handle 2 different responses depending on the request path when the 
http-use-proxy-header option is used. In addition, http_reply_40x_unauthorized() 
is also called for the stats page authentication. This part cannot be replaced 
by an http-request return rule. So maybe a good solution is to use error files 
(customizable since 2.2). And just add the realm at the right place, as Willy 
said. I will investigate.


For HTTP_30X templates, you're right, they should be removed. And HTTP_10X too.

--
Christopher Faulet



Re: [PATCH] cleanup coverity findging (make it silent)

2020-05-26 Thread Willy Tarreau
Hi Ilya,

On Sat, May 23, 2020 at 03:47:58PM +0500,  ??? wrote:
> From: Ilya Shipitsin 
> Date: Sat, 23 May 2020 15:35:36 +0500
> Subject: [PATCH] CLEANUP: src/checks.c: ignore return value using DISGUISE(..)
> 
> we do not want to check status of extchk_setenv, but static analyzsers
> like Coverity are not happy about it. Let calm coverity down.

Are you really sure we don't want to check them ? I'm seeing that
prepare_external_check() uses EXTCHK_SETENV() to purposely add checks
there, so it's unclear to me why we want to silently fail here. Maybe
the calls should instead be changed to have a check and a jump to an
error label doing the exit().

I don't know if anyone has an opinion on this, I'm not using external
checks :-/

Willy



Re: Redefine 401 error page

2020-05-26 Thread Willy Tarreau
Hi Joao,

On Thu, May 21, 2020 at 09:41:19PM -0300, Joao Morais wrote:
> 
> Hello list, the 401 is one of the http status code haproxy generates itself:
> 
> https://github.com/haproxy/haproxy/blob/v2.1.0/doc/configuration.txt#L363
> 
> This cannot however be overwritten using the errorfile keyword as stated in 
> the doc:
> 
> https://github.com/haproxy/haproxy/blob/v2.1.0/doc/configuration.txt#L3558
> 
> and also testing myself:
> 
> [WARNING] 142/002731 (1) : parsing [/tmp/haproxy/h.cfg:9] : status code 
> 401 not
> handled by 'errorfile', error customization will be ignored.
> 
> I'm aware that a Lua script can generate a custom page and an arbitrary http 
> status code which could work around this:
> 
> core.register_service("send-401", "http", function(applet)
> send(applet, 401, [[
> My custom 401 page
> ]])
> end)
> 
> ... but is there a way to, instead, customize the output of `http-request 
> auth`?

I had a look in the code and not, it's not possible right now, the
message is constructed in function http_reply_40x_unauthorized(). It's
sad because what this function does is only a subset of what is possible
to do with the new configurable error files. But it takes one extra
argument that needs to be emitted and that makes it impossible to
express as a regular error file.

In an ideal world this function should call what is used normally to
build an error message, then just concatenate the addition of the
"basic realm=" and auth_realm into either www-authenticate or
proxy-authenticate and that should work. I'm CCing Christopher in case
he sees a trivial way to do this.

Digging into this also made me realize that all but two of the HTTP_xxx
messages defined in http.c are not used anymore and could be removed.
I'm even suspecting the remaining ones (302 and 303) might be accidental
leftovers.

Willy



Re: Haproxy 1.8.25 segfault

2020-05-26 Thread Willy Tarreau
Hi Igor,

On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote:
> Hi guys,
> 
> We are getting segfaults with haproxy 1.8.25 and thought I would ask if
> this rings any bell:
> 
> segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 error 4 in
> libc-2.19.so[7f1b9388e000+1be000]

At this point, no unfortunately. This could be a memcpy() on a NULL
pointer or a use after free for example.

> It is running on Ubuntu-14.04.2 (kernel 4.4.0-144-generic) and is happening
> only on this particular one out of many dozens we have on Ubuntu-14.04 and
> 16.04
> 
> I have attached strace so more details upon the next crash.

I doubt you'll see much more using strace. You'd rather attach gdb to
it and let it run. This way when it crashes again you can issue "bt full"
and see the whole trace.

It is even possible to force a core to be dumped from gdb for later
inspection using "generate-core-file". Some people also know how to script
it so that it automatically dumps and detaches upon crash, and limits the
service interruption time, but I never remember how to do this, and the
help embedded in it is next to inexistent :-/

Regards,
Willy