Re: [PATCH] BUG/MEDIUM: sink: fix crash when null sink is used in __do_send_log

2020-06-22 Thread Christopher Faulet

Le 22/06/2020 à 10:56, Emeric Brun a écrit :

Hi Daniel, Willy,

On 6/19/20 9:22 PM, Willy Tarreau wrote:

Hi Daniel,

On Thu, Jun 18, 2020 at 12:35:29AM -0400, Daniel Corbett wrote:

Hello,


When using a ring log in combination with proto fcgi, it was possible
to cause a crash by sending a request for a non-existent fastcgi file
to php-fpm, causing it to produce the error "Primary script unknown".
When php-fpm produces this error for some reason a sink is not able to be
established and __do_send_log gets passed a null sink.

This commit verifies that the sink exists in __do_send_log before attempting
to use it.


Thanks for the fix and the detailed report, that's very useful! However
the problem is somewhere else, and I suspect is slightly harder to solve.
It's normally not possible to have a null sink on a log server if its
type is a buffer, so we have an inconsistency somewhere that we must
address.

What I'm suspecting (but I could be wrong) is that the fcgi duplicates
the global log settings before they are resolved, then the global ones
are resolved, and the copy in the fcgi part keeps an incompletely
initialized log server. I haven't looked at this area yet since we
support post-check resolution of sink names, but I suspect that it's
applied to frontends only, while as can be seen in your examples there
seem to be other areas requiring log resolution (I wasn't even aware
that we had other log references). So this means that right now we
have to make sure they're all properly addressed, and that in the
future it might be worth keeping a pointer to the global log servers
instead of copying them.


Indeed, we currently resolve ring names doing a loop on proxies to initialize 
their
server lists but I ignored that in the case of fastcgi-app, there is a 
log-server list
not related to a proxy, I've just talk to Christopher and he told me he already
planned to submit a fix for this bug, initializing correctly those server lists 
on
fastcgi-app's configuration post parsing .



I pushed a fix (commit c97406f7). Thanks Daniel !

--
Christopher Faulet



Re: [PATCH] BUG/MEDIUM: sink: fix crash when null sink is used in __do_send_log

2020-06-22 Thread Emeric Brun
Hi Daniel, Willy,

On 6/19/20 9:22 PM, Willy Tarreau wrote:
> Hi Daniel,
> 
> On Thu, Jun 18, 2020 at 12:35:29AM -0400, Daniel Corbett wrote:
>> Hello,
>>
>>
>> When using a ring log in combination with proto fcgi, it was possible
>> to cause a crash by sending a request for a non-existent fastcgi file
>> to php-fpm, causing it to produce the error "Primary script unknown".
>> When php-fpm produces this error for some reason a sink is not able to be
>> established and __do_send_log gets passed a null sink.
>>
>> This commit verifies that the sink exists in __do_send_log before attempting
>> to use it.
> 
> Thanks for the fix and the detailed report, that's very useful! However
> the problem is somewhere else, and I suspect is slightly harder to solve.
> It's normally not possible to have a null sink on a log server if its
> type is a buffer, so we have an inconsistency somewhere that we must
> address.
> 
> What I'm suspecting (but I could be wrong) is that the fcgi duplicates
> the global log settings before they are resolved, then the global ones
> are resolved, and the copy in the fcgi part keeps an incompletely
> initialized log server. I haven't looked at this area yet since we
> support post-check resolution of sink names, but I suspect that it's
> applied to frontends only, while as can be seen in your examples there
> seem to be other areas requiring log resolution (I wasn't even aware
> that we had other log references). So this means that right now we
> have to make sure they're all properly addressed, and that in the
> future it might be worth keeping a pointer to the global log servers
> instead of copying them.

Indeed, we currently resolve ring names doing a loop on proxies to initialize 
their
server lists but I ignored that in the case of fastcgi-app, there is a 
log-server list
not related to a proxy, I've just talk to Christopher and he told me he already
planned to submit a fix for this bug, initializing correctly those server lists 
on
fastcgi-app's configuration post parsing .

R,
Emeric



Re: [PATCH] BUG/MEDIUM: sink: fix crash when null sink is used in __do_send_log

2020-06-20 Thread Daniel Corbett

Hello,


On 6/19/20 3:22 PM, Willy Tarreau wrote:

Hi Daniel,
...
Thanks for the fix and the detailed report, that's very useful! However
the problem is somewhere else, and I suspect is slightly harder to solve.
It's normally not possible to have a null sink on a log server if its
type is a buffer, so we have an inconsistency somewhere that we must
address.



I suspected this to be the case but could not really trace the root of 
the issue and chose the path of least resistance :)




I'm CCing Emeric so that he can have a look, and am keeping the rest
intact below.



If I can provide any more info please let me know.


Thanks,

-- Daniel





Re: [PATCH] BUG/MEDIUM: sink: fix crash when null sink is used in __do_send_log

2020-06-19 Thread Willy Tarreau
Hi Daniel,

On Thu, Jun 18, 2020 at 12:35:29AM -0400, Daniel Corbett wrote:
> Hello,
> 
> 
> When using a ring log in combination with proto fcgi, it was possible
> to cause a crash by sending a request for a non-existent fastcgi file
> to php-fpm, causing it to produce the error "Primary script unknown".
> When php-fpm produces this error for some reason a sink is not able to be
> established and __do_send_log gets passed a null sink.
> 
> This commit verifies that the sink exists in __do_send_log before attempting
> to use it.

Thanks for the fix and the detailed report, that's very useful! However
the problem is somewhere else, and I suspect is slightly harder to solve.
It's normally not possible to have a null sink on a log server if its
type is a buffer, so we have an inconsistency somewhere that we must
address.

What I'm suspecting (but I could be wrong) is that the fcgi duplicates
the global log settings before they are resolved, then the global ones
are resolved, and the copy in the fcgi part keeps an incompletely
initialized log server. I haven't looked at this area yet since we
support post-check resolution of sink names, but I suspect that it's
applied to frontends only, while as can be seen in your examples there
seem to be other areas requiring log resolution (I wasn't even aware
that we had other log references). So this means that right now we
have to make sure they're all properly addressed, and that in the
future it might be worth keeping a pointer to the global log servers
instead of copying them.

I'm CCing Emeric so that he can have a look, and am keeping the rest
intact below.

Thanks,
Willy

---
> Sample below to cause the crash. Note, it requires that you have a php-fpm
> server running and make a request for a file not found.
> 
> global
>     log ring@requests0 local1 info
> 
> ring requests0
>     description "request logs"
>     format rfc3164
>     maxlen 1200
>     size 32764
>     timeout connect 5s
>     timeout server 10s
>     server request-log 127.0.0.1:6514
> 
> fcgi-app php-fpm
>     log-stderr global
>     option keep-conn
>     docroot /var/www/html
>     index app.php
>     path-info ^(/.+\.php)(/.*)?$
> 
> 
> listen be_dynamic
>     mode http
>     bind :80
>     use-fcgi-app php-fpm
>     option httpchk
>     http-check connect proto fcgi
>     http-check send meth GET uri /ping
>     http-check expect string pong
>     server php-fpm1 172.31.31.151:18081 proto fcgi
> 
> 
> 
> $ curl 192.168.1.25/404.php
> curl: (52) Empty reply from server
> 
> 0014:be_dynamic.accept(0006)=0010 from [76.110.41.73:51892] ALPN=
> 0014:be_dynamic.clireq[0010:]: GET /404.php HTTP/1.1
> 0014:be_dynamic.clihdr[0010:]: host: demo.haproxy.net
> 0014:be_dynamic.clihdr[0010:]: user-agent: curl/7.58.0
> 0014:be_dynamic.clihdr[0010:]: accept: */*
> Segmentation fault (core dumped)
> 
> 
> 
> Backtrace:
> 
> (gdb) bt
> #0  0x5586ffea472b in sink_write (sink=0x0, msg=0x7ffcd94951b0, nmsg=1,
> level=3, facility=17, tag=0x7ffcd94951c0, pid=0x7ffcd94951d0,
> sd=0x7ffcd94951e0) at include/haproxy/sink.h:55
> #1  0x5586ffea8fe2 in __do_send_log (logsrv=0x5587008e9f30, nblogger=1,
> pid_str=0x7f7d48069e7c "21018", pid_size=5, level=3, message=0x558700916340
> "Primary script unknown\n\n", size=22,
>     sd=0x55870025e6c6  "- ", sd_size=2,
> tag_str=0x558700917d70 "php-fpm", tag_size=7) at src/log.c:1742
> #2  0x5586ffea9717 in __send_log (logsrvs=0x558700917ed8,
> tag=0x7ffcd94954e0, level=3, message=0x558700916340 "Primary script
> unknown\n\n", size=24, sd=0x55870025e6c6  "-
> ",
>     sd_size=2) at src/log.c:1851
> #3  0x5586ffeaea21 in app_log (logsrvs=0x558700917ed8,
> tag=0x7ffcd94954e0, level=3, format=0x5586fffb3006 "%s") at src/log.c:3097
> #4  0x5586ffda3ed8 in fcgi_strm_handle_stderr (fconn=0x5587009a4200,
> fstrm=0x558700987e50) at src/mux_fcgi.c:2376
> #5  0x5586ffda5b33 in fcgi_process_demux (fconn=0x5587009a4200) at
> src/mux_fcgi.c:2563
> #6  0x5586ffda8eb4 in fcgi_process (fconn=0x5587009a4200) at
> src/mux_fcgi.c:2977
> #7  0x5586ffda8b37 in fcgi_io_cb (t=0x558700987e00, ctx=0x5587009a4200,
> status=0) at src/mux_fcgi.c:2948
> #8  0x5586fff5e870 in run_tasks_from_list (list=0x5587003893f0
> , max=67) at src/task.c:345
> #9  0x5586fff5ecfb in process_runnable_tasks () at src/task.c:446
> #10 0x5586ffefa415 in run_poll_loop () at src/haproxy.c:2884
> #11 0x00005586ffefa956 in run_thread_poll_loop (data=0x0) at
> src/haproxy.c:3056
> #12 0x5586ffefc622 in main (argc=4, argv=0x7ffcd94960a8) at
> src/haproxy.c:3758
> 
&g

Re: [PATCH] BUG/MEDIUM: sink: fix crash when null sink is used in __do_send_log

2020-06-17 Thread Daniel Corbett

Hello,


On 6/18/20 12:35 AM, Daniel Corbett wrote:

Hello,


When using a ring log in combination with proto fcgi, it was possible
to cause a crash by sending a request for a non-existent fastcgi file
to php-fpm, causing it to produce the error "Primary script unknown".
When php-fpm produces this error for some reason a sink is not able to be
established and __do_send_log gets passed a null sink.

This commit verifies that the sink exists in __do_send_log before 
attempting

to use it.



My apologies, I mentioned the wrong function in the commit message, the 
fix was applied in sink_write and not __do_send_log. Attached is an 
amended commit.



Thanks,

-- Daniel


>From 90e1a21db447da3053bd80cede5b45628004117f Mon Sep 17 00:00:00 2001
From: Daniel Corbett 
Date: Thu, 18 Jun 2020 00:10:17 -0400
Subject: [PATCH] BUG/MEDIUM: sink: fix crash when null sink is used in
 sink_write

When using a ring log in combination with proto fcgi, it was possible
to cause a crash by sending a request for a non-existent fastcgi file
to php-fpm, causing it to produce the error "Primary script unknown".
When php-fpm produces this error for some reason a sink is not able to be
established and sink_write gets passed a null sink.

This commit verifies that the sink exists in sink_write before attempting
to use it.
---
 include/haproxy/sink.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/haproxy/sink.h b/include/haproxy/sink.h
index 025fa4185..b290d2189 100644
--- a/include/haproxy/sink.h
+++ b/include/haproxy/sink.h
@@ -49,6 +49,9 @@ static inline ssize_t sink_write(struct sink *sink, const struct ist msg[], size
 {
 	ssize_t sent;
 
+	if (!sink)
+		goto fail;
+
 	if (unlikely(sink->ctx.dropped > 0)) {
 		/* We need to take an exclusive lock so that other producers
 		 * don't do the same thing at the same time and above all we
-- 
2.17.1



[PATCH] BUG/MEDIUM: sink: fix crash when null sink is used in __do_send_log

2020-06-17 Thread Daniel Corbett

Hello,


When using a ring log in combination with proto fcgi, it was possible
to cause a crash by sending a request for a non-existent fastcgi file
to php-fpm, causing it to produce the error "Primary script unknown".
When php-fpm produces this error for some reason a sink is not able to be
established and __do_send_log gets passed a null sink.

This commit verifies that the sink exists in __do_send_log before attempting
to use it.


Sample below to cause the crash. Note, it requires that you have a 
php-fpm server running and make a request for a file not found.


global
    log ring@requests0 local1 info

ring requests0
    description "request logs"
    format rfc3164
    maxlen 1200
    size 32764
    timeout connect 5s
    timeout server 10s
    server request-log 127.0.0.1:6514

fcgi-app php-fpm
    log-stderr global
    option keep-conn
    docroot /var/www/html
    index app.php
    path-info ^(/.+\.php)(/.*)?$


listen be_dynamic
    mode http
    bind :80
    use-fcgi-app php-fpm
    option httpchk
    http-check connect proto fcgi
    http-check send meth GET uri /ping
    http-check expect string pong
    server php-fpm1 172.31.31.151:18081 proto fcgi



$ curl 192.168.1.25/404.php
curl: (52) Empty reply from server

0014:be_dynamic.accept(0006)=0010 from [76.110.41.73:51892] ALPN=
0014:be_dynamic.clireq[0010:]: GET /404.php HTTP/1.1
0014:be_dynamic.clihdr[0010:]: host: demo.haproxy.net
0014:be_dynamic.clihdr[0010:]: user-agent: curl/7.58.0
0014:be_dynamic.clihdr[0010:]: accept: */*
Segmentation fault (core dumped)



Backtrace:

(gdb) bt
#0  0x5586ffea472b in sink_write (sink=0x0, msg=0x7ffcd94951b0, 
nmsg=1, level=3, facility=17, tag=0x7ffcd94951c0, pid=0x7ffcd94951d0, 
sd=0x7ffcd94951e0) at include/haproxy/sink.h:55
#1  0x5586ffea8fe2 in __do_send_log (logsrv=0x5587008e9f30, 
nblogger=1, pid_str=0x7f7d48069e7c "21018", pid_size=5, level=3, 
message=0x558700916340 "Primary script unknown\n\n", size=22,
    sd=0x55870025e6c6  "- ", sd_size=2, 
tag_str=0x558700917d70 "php-fpm", tag_size=7) at src/log.c:1742
#2  0x5586ffea9717 in __send_log (logsrvs=0x558700917ed8, 
tag=0x7ffcd94954e0, level=3, message=0x558700916340 "Primary script 
unknown\n\n", size=24, sd=0x55870025e6c6  
"- ",

    sd_size=2) at src/log.c:1851
#3  0x5586ffeaea21 in app_log (logsrvs=0x558700917ed8, 
tag=0x7ffcd94954e0, level=3, format=0x5586fffb3006 "%s") at src/log.c:3097
#4  0x5586ffda3ed8 in fcgi_strm_handle_stderr (fconn=0x5587009a4200, 
fstrm=0x558700987e50) at src/mux_fcgi.c:2376
#5  0x5586ffda5b33 in fcgi_process_demux (fconn=0x5587009a4200) at 
src/mux_fcgi.c:2563
#6  0x5586ffda8eb4 in fcgi_process (fconn=0x5587009a4200) at 
src/mux_fcgi.c:2977
#7  0x5586ffda8b37 in fcgi_io_cb (t=0x558700987e00, 
ctx=0x5587009a4200, status=0) at src/mux_fcgi.c:2948
#8  0x5586fff5e870 in run_tasks_from_list (list=0x5587003893f0 
, max=67) at src/task.c:345

#9  0x5586fff5ecfb in process_runnable_tasks () at src/task.c:446
#10 0x5586ffefa415 in run_poll_loop () at src/haproxy.c:2884
#11 0x5586ffefa956 in run_thread_poll_loop (data=0x0) at 
src/haproxy.c:3056
#12 0x5586ffefc622 in main (argc=4, argv=0x7ffcd94960a8) at 
src/haproxy.c:3758




Thanks,

-- Daniel


>From c5793a0ab7372329b7bc69e823b4bb960311a58a Mon Sep 17 00:00:00 2001
From: Daniel Corbett 
Date: Thu, 18 Jun 2020 00:10:17 -0400
Subject: [PATCH] BUG/MEDIUM: sink: fix crash when null sink is used in 
 __do_send_log

When using a ring log in combination with proto fcgi, it was possible
to cause a crash by sending a request for a non-existent fastcgi file
to php-fpm, causing it to produce the error "Primary script unknown".
When php-fpm produces this error for some reason a sink is not able to be
established and __do_send_log gets passed a null sink.

This commit verifies that the sink exists in __do_send_log before attempting
to use it.
---
 include/haproxy/sink.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/haproxy/sink.h b/include/haproxy/sink.h
index 025fa4185..b290d2189 100644
--- a/include/haproxy/sink.h
+++ b/include/haproxy/sink.h
@@ -49,6 +49,9 @@ static inline ssize_t sink_write(struct sink *sink, const struct ist msg[], size
 {
 	ssize_t sent;
 
+	if (!sink)
+		goto fail;
+
 	if (unlikely(sink->ctx.dropped > 0)) {
 		/* We need to take an exclusive lock so that other producers
 		 * don't do the same thing at the same time and above all we
-- 
2.17.1