Re: ap_proxy_acquire_connection crash? (was Re: Broken: apache/httpd#1739 (trunk - f087735))

2021-07-12 Thread Stefan Eissing



> Am 12.07.2021 um 12:34 schrieb Yann Ylavic :
> 
> On Mon, Jul 12, 2021 at 10:01 AM Joe Orton  wrote:
>> 
>> Thread 1 (Thread 0x7f101cff1700 (LWP 3534)):
>> 3622#0  ap_proxy_acquire_connection 
>> (proxy_function=proxy_function@entry=0x7f105467de9f "http", 
>> conn=conn@entry=0x7f101cff0a88, worker=worker@entry=0x55e9deb5c710, 
>> s=0x55e9def94a58) at proxy_util.c:2484
>> 3623#1  0x7f105467bb5f in proxy_http_handler (r=0x7f0ff40234a0, 
>> worker=0x55e9deb5c710, conf=0x55e9decd0e98, url=0x7f0ff400d1c0 
>> "http://localhost:8529/";, proxyname=0x0, proxyport=) at 
>> mod_proxy_http.c:1920
>> 3624#2  0x7f1054c9be0c in proxy_run_scheme_handler 
>> (r=r@entry=0x7f0ff40234a0, worker=0x55e9deb5c710, 
>> conf=conf@entry=0x55e9decd0e98, url=0x7f0ff400d1c0 "http://localhost:8529/";, 
>> proxyhost=proxyhost@entry=0x0, proxyport=proxyport@entry=0) at 
>> mod_proxy.c:3479
>> 3625#3  0x7f1054c9de7e in proxy_handler (r=0x7f0ff40234a0) at 
>> mod_proxy.c:1521
>> 3626#4  0x55e9dcff8480 in ap_run_handler (r=r@entry=0x7f0ff40234a0) at 
>> config.c:169
>> 
>> this is new?
> 
> Should be fixed in r1891477, I can't reproduce anymore with this patch 
> applied.

Thanks, Yann!

> 
> Cheers;
> Yann.



Re: ap_proxy_acquire_connection crash? (was Re: Broken: apache/httpd#1739 (trunk - f087735))

2021-07-12 Thread Yann Ylavic
On Mon, Jul 12, 2021 at 10:01 AM Joe Orton  wrote:
>
> Thread 1 (Thread 0x7f101cff1700 (LWP 3534)):
> 3622#0  ap_proxy_acquire_connection 
> (proxy_function=proxy_function@entry=0x7f105467de9f "http", 
> conn=conn@entry=0x7f101cff0a88, worker=worker@entry=0x55e9deb5c710, 
> s=0x55e9def94a58) at proxy_util.c:2484
> 3623#1  0x7f105467bb5f in proxy_http_handler (r=0x7f0ff40234a0, 
> worker=0x55e9deb5c710, conf=0x55e9decd0e98, url=0x7f0ff400d1c0 
> "http://localhost:8529/";, proxyname=0x0, proxyport=) at 
> mod_proxy_http.c:1920
> 3624#2  0x7f1054c9be0c in proxy_run_scheme_handler 
> (r=r@entry=0x7f0ff40234a0, worker=0x55e9deb5c710, 
> conf=conf@entry=0x55e9decd0e98, url=0x7f0ff400d1c0 "http://localhost:8529/";, 
> proxyhost=proxyhost@entry=0x0, proxyport=proxyport@entry=0) at 
> mod_proxy.c:3479
> 3625#3  0x7f1054c9de7e in proxy_handler (r=0x7f0ff40234a0) at 
> mod_proxy.c:1521
> 3626#4  0x55e9dcff8480 in ap_run_handler (r=r@entry=0x7f0ff40234a0) at 
> config.c:169
>
> this is new?

Should be fixed in r1891477, I can't reproduce anymore with this patch applied.

Cheers;
Yann.


Re: ap_proxy_acquire_connection crash? (was Re: Broken: apache/httpd#1739 (trunk - f087735))

2021-07-12 Thread Yann Ylavic
On Mon, Jul 12, 2021 at 11:11 AM Stefan Eissing
 wrote:
>
> > Am 12.07.2021 um 11:07 schrieb Yann Ylavic :
> >
> > On Mon, Jul 12, 2021 at 10:13 AM Stefan Eissing
> >  wrote:
> >>
> >> Is this the "worker->cp" being NULL?
> >
> > Is there a discussion about this already Stefan?
>
> I have not analyzed all possible paths to this. I just saw that there are 
> several places where worker->cp seems to be lazy initialized.

OK thanks, I thought it was some kind of restart issue but it was not
the case in my backtrace.


Re: ap_proxy_acquire_connection crash? (was Re: Broken: apache/httpd#1739 (trunk - f087735))

2021-07-12 Thread Yann Ylavic
On Mon, Jul 12, 2021 at 10:27 AM Ruediger Pluem  wrote:
>
> Two questions:
>
> 1. If I have a look at the GDB output, I see the following warning:
>
> Does it make sense that we create a /home/travis/.gdbinit whish sets and 
> appropriate auto-load safe-path?

+1

>
> 2. Should we apply the below patch to get more information?

+1 (even if in this case "bt full" wouldn't help much, it may be the
case next time..)


Regards;
Yann.


Re: ap_proxy_acquire_connection crash? (was Re: Broken: apache/httpd#1739 (trunk - f087735))

2021-07-12 Thread Stefan Eissing



> Am 12.07.2021 um 11:07 schrieb Yann Ylavic :
> 
> On Mon, Jul 12, 2021 at 10:13 AM Stefan Eissing
>  wrote:
>> 
>> Is this the "worker->cp" being NULL?
> 
> Is there a discussion about this already Stefan?

I have not analyzed all possible paths to this. I just saw that there are 
several places where worker->cp seems to be lazy initialized.

> 
> It's indeed worker->cp == NULL, I can reproduce on 2.4.x + apr-trunk with:
> $ while t/TEST t/modules/proxy_balancer.t; do :; done
> ...
> (gdb) bt
> #0  0x7f4eaa06716c in ap_proxy_acquire_connection
> (proxy_function=0x7f4eaa03e015 "http", conn=0x7f4e31ffaa90,
> worker=0x55ee904554a0, s=0x55ee90649260) at proxy_util.c:2417
> #1  0x7f4eaa03d28d in proxy_http_handler (r=0x7f4e14002c10,
> worker=0x55ee904554a0, conf=0x55ee90421aa0, url=0x7f4e14007538
> "http://localhost:8529/";, proxyname=0x0, proxyport=0) at
> mod_proxy_http.c:1878
> #2  0x7f4eaa0606fc in proxy_run_scheme_handler (r=0x7f4e14002c10,
> worker=0x55ee904554a0, conf=0x55ee90421aa0, url=0x7f4e14007538
> "http://localhost:8529/";, proxyhost=0x0, proxyport=0) at
> mod_proxy.c:3183
> #3  0x7f4eaa05a65b in proxy_handler (r=0x7f4e14002c10) at mod_proxy.c:1262
> #4  0x55ee8f18a2e5 in ap_run_handler (r=0x7f4e14002c10) at config.c:169
> #5  0x55ee8f18ae1f in ap_invoke_handler (r=0x7f4e14002c10) at config.c:443
> #6  0x55ee8f1a8bea in ap_process_async_request (r=0x7f4e14002c10)
> at http_request.c:452
> #7  0x55ee8f1a4bf8 in ap_process_http_async_connection
> (c=0x7f4e1f98) at http_core.c:155
> #8  0x55ee8f1a4df5 in ap_process_http_connection
> (c=0x7f4e1f98) at http_core.c:246
> #9  0x55ee8f19865a in ap_run_process_connection (c=0x7f4e1f98)
> at connection.c:42
> #10 0x7f4eaa6f5dde in process_socket (thd=0x7f4e84020eb8,
> p=0x7f4e1c68, sock=0x7f4e1cf0, cs=0x7f4e1ef0,
> my_child_num=0, my_thread_num=22) at event.c:1038
> #11 0x7f4eaa6f7f93 in worker_thread (thd=0x7f4e84020eb8,
> dummy=0x7f4e84031cf0) at event.c:2070
> #12 0x7f4eaa770662 in dummy_worker (opaque=0x7f4e84020eb8) at
> threadproc/unix/thread.c:148
> #13 0x7f4eaa6cdea7 in start_thread (arg=) at
> pthread_create.c:477
> #14 0x7f4eaa5fddef in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> (gdb) l
> 2412 proxy_function, worker->s->hostname_ex);
> 2413return HTTP_SERVICE_UNAVAILABLE;
> 2414}
> 2415}
> 2416
> 2417if (worker->s->hmax && worker->cp->res) {
> 2418rv = apr_reslist_acquire(worker->cp->res, (void **)conn);
> 2419}
> 2420else {
> 2421/* create the new connection if the previous was destroyed */
> (gdb) p *worker
> $5 = {hash = {def = 917659709, fnv = 2489242135}, local_status =
> 1970041715, cp = 0x0, s = 0x7f4ea99dd020, balancer = 0x55ee90422128,
> tmutex = 0x0, context = 0x0, section_config = 0x55ee90455490}



Re: ap_proxy_acquire_connection crash? (was Re: Broken: apache/httpd#1739 (trunk - f087735))

2021-07-12 Thread Yann Ylavic
On Mon, Jul 12, 2021 at 10:13 AM Stefan Eissing
 wrote:
>
> Is this the "worker->cp" being NULL?

Is there a discussion about this already Stefan?

It's indeed worker->cp == NULL, I can reproduce on 2.4.x + apr-trunk with:
$ while t/TEST t/modules/proxy_balancer.t; do :; done
...
(gdb) bt
#0  0x7f4eaa06716c in ap_proxy_acquire_connection
(proxy_function=0x7f4eaa03e015 "http", conn=0x7f4e31ffaa90,
worker=0x55ee904554a0, s=0x55ee90649260) at proxy_util.c:2417
#1  0x7f4eaa03d28d in proxy_http_handler (r=0x7f4e14002c10,
worker=0x55ee904554a0, conf=0x55ee90421aa0, url=0x7f4e14007538
"http://localhost:8529/";, proxyname=0x0, proxyport=0) at
mod_proxy_http.c:1878
#2  0x7f4eaa0606fc in proxy_run_scheme_handler (r=0x7f4e14002c10,
worker=0x55ee904554a0, conf=0x55ee90421aa0, url=0x7f4e14007538
"http://localhost:8529/";, proxyhost=0x0, proxyport=0) at
mod_proxy.c:3183
#3  0x7f4eaa05a65b in proxy_handler (r=0x7f4e14002c10) at mod_proxy.c:1262
#4  0x55ee8f18a2e5 in ap_run_handler (r=0x7f4e14002c10) at config.c:169
#5  0x55ee8f18ae1f in ap_invoke_handler (r=0x7f4e14002c10) at config.c:443
#6  0x55ee8f1a8bea in ap_process_async_request (r=0x7f4e14002c10)
at http_request.c:452
#7  0x55ee8f1a4bf8 in ap_process_http_async_connection
(c=0x7f4e1f98) at http_core.c:155
#8  0x55ee8f1a4df5 in ap_process_http_connection
(c=0x7f4e1f98) at http_core.c:246
#9  0x55ee8f19865a in ap_run_process_connection (c=0x7f4e1f98)
at connection.c:42
#10 0x7f4eaa6f5dde in process_socket (thd=0x7f4e84020eb8,
p=0x7f4e1c68, sock=0x7f4e1cf0, cs=0x7f4e1ef0,
my_child_num=0, my_thread_num=22) at event.c:1038
#11 0x7f4eaa6f7f93 in worker_thread (thd=0x7f4e84020eb8,
dummy=0x7f4e84031cf0) at event.c:2070
#12 0x7f4eaa770662 in dummy_worker (opaque=0x7f4e84020eb8) at
threadproc/unix/thread.c:148
#13 0x7f4eaa6cdea7 in start_thread (arg=) at
pthread_create.c:477
#14 0x7f4eaa5fddef in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) l
2412 proxy_function, worker->s->hostname_ex);
2413return HTTP_SERVICE_UNAVAILABLE;
2414}
2415}
2416
2417if (worker->s->hmax && worker->cp->res) {
2418rv = apr_reslist_acquire(worker->cp->res, (void **)conn);
2419}
2420else {
2421/* create the new connection if the previous was destroyed */
(gdb) p *worker
$5 = {hash = {def = 917659709, fnv = 2489242135}, local_status =
1970041715, cp = 0x0, s = 0x7f4ea99dd020, balancer = 0x55ee90422128,
tmutex = 0x0, context = 0x0, section_config = 0x55ee90455490}


Re: ap_proxy_acquire_connection crash? (was Re: Broken: apache/httpd#1739 (trunk - f087735))

2021-07-12 Thread Ruediger Pluem



On 7/12/21 10:00 AM, Joe Orton wrote:
> On Fri, Jul 09, 2021 at 06:00:26PM +, Travis CI wrote:
>> Build Update for apache/httpd
>> -
>>
>> Build: #1739
>> Status: Broken
>>
>> Duration: 21 mins and 57 secs
>> Commit: f087735 (trunk)
>> Author: Rich Bowen
>> Message: s/freenode/libera/
>>
>> git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@1891404 
>> 13f79535-47bb-0310-9956-ffa450edef68
>>
>> View the changeset: 
>> https://github.com/apache/httpd/compare/bc06af472169...f087735b4d6e
>>
>> View the full build log and details: 
>> https://app.travis-ci.com/github/apache/httpd/builds/232281086?utm_medium=notification&utm_source=email
> 
> I'm guessing this is the same crash seen in 2.4.x builds recently, but 
> in the trunk build it has a useful backtrace:
> 
> Thread 1 (Thread 0x7f101cff1700 (LWP 3534)):
> 3622#0  ap_proxy_acquire_connection 
> (proxy_function=proxy_function@entry=0x7f105467de9f "http", 
> conn=conn@entry=0x7f101cff0a88, worker=worker@entry=0x55e9deb5c710, 
> s=0x55e9def94a58) at proxy_util.c:2484
> 3623#1  0x7f105467bb5f in proxy_http_handler (r=0x7f0ff40234a0, 
> worker=0x55e9deb5c710, conf=0x55e9decd0e98, url=0x7f0ff400d1c0 
> "http://localhost:8529/";, proxyname=0x0, proxyport=) at 
> mod_proxy_http.c:1920
> 3624#2  0x7f1054c9be0c in proxy_run_scheme_handler 
> (r=r@entry=0x7f0ff40234a0, worker=0x55e9deb5c710, 
> conf=conf@entry=0x55e9decd0e98, url=0x7f0ff400d1c0 "http://localhost:8529/";, 
> proxyhost=proxyhost@entry=0x0, proxyport=proxyport@entry=0) at 
> mod_proxy.c:3479
> 3625#3  0x7f1054c9de7e in proxy_handler (r=0x7f0ff40234a0) at 
> mod_proxy.c:1521
> 3626#4  0x55e9dcff8480 in ap_run_handler (r=r@entry=0x7f0ff40234a0) at 
> config.c:169
> 
> this is new?

Two questions:

1. If I have a look at the GDB output, I see the following warning:

warning: File "/home/travis/build/apache/httpd/.gdbinit" auto-loading has been 
declined by your `auto-load safe-path' set to
"$debugdir:$datadir/auto-load".

To enable execution of this file add

add-auto-load-safe-path /home/travis/build/apache/httpd/.gdbinit

line to your configuration file "/home/travis/.gdbinit".

To completely disable this security protection add

set auto-load safe-path /

line to your configuration file "/home/travis/.gdbinit".


Does it make sense that we create a /home/travis/.gdbinit whish sets and 
appropriate auto-load safe-path?

2. Should we apply the below patch to get more information?

Index: test/travis_run_linux.sh
===
--- test/travis_run_linux.sh(revision 1891472)
+++ test/travis_run_linux.sh(working copy)
@@ -186,7 +186,7 @@

 shopt -s nullglob
 for core in test/perl-framework/t/core* ; do
-gdb -ex 'thread apply all backtrace' -batch ./httpd "$core"
+gdb -ex 'thread apply all backtrace full' -batch ./httpd "$core"
 RV=5
 done

Regards

RĂ¼diger

> 
> 


Re: ap_proxy_acquire_connection crash? (was Re: Broken: apache/httpd#1739 (trunk - f087735))

2021-07-12 Thread Stefan Eissing



> Am 12.07.2021 um 10:00 schrieb Joe Orton :
> 
> On Fri, Jul 09, 2021 at 06:00:26PM +, Travis CI wrote:
>> Build Update for apache/httpd
>> -
>> 
>> Build: #1739
>> Status: Broken
>> 
>> Duration: 21 mins and 57 secs
>> Commit: f087735 (trunk)
>> Author: Rich Bowen
>> Message: s/freenode/libera/
>> 
>> git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@1891404 
>> 13f79535-47bb-0310-9956-ffa450edef68
>> 
>> View the changeset: 
>> https://github.com/apache/httpd/compare/bc06af472169...f087735b4d6e
>> 
>> View the full build log and details: 
>> https://app.travis-ci.com/github/apache/httpd/builds/232281086?utm_medium=notification&utm_source=email
> 
> I'm guessing this is the same crash seen in 2.4.x builds recently, but 
> in the trunk build it has a useful backtrace:
> 
> Thread 1 (Thread 0x7f101cff1700 (LWP 3534)):
> 3622#0  ap_proxy_acquire_connection 
> (proxy_function=proxy_function@entry=0x7f105467de9f "http", 
> conn=conn@entry=0x7f101cff0a88, worker=worker@entry=0x55e9deb5c710, 
> s=0x55e9def94a58) at proxy_util.c:2484
> 3623#1  0x7f105467bb5f in proxy_http_handler (r=0x7f0ff40234a0, 
> worker=0x55e9deb5c710, conf=0x55e9decd0e98, url=0x7f0ff400d1c0 
> "http://localhost:8529/";, proxyname=0x0, proxyport=) at 
> mod_proxy_http.c:1920
> 3624#2  0x7f1054c9be0c in proxy_run_scheme_handler 
> (r=r@entry=0x7f0ff40234a0, worker=0x55e9deb5c710, 
> conf=conf@entry=0x55e9decd0e98, url=0x7f0ff400d1c0 "http://localhost:8529/";, 
> proxyhost=proxyhost@entry=0x0, proxyport=proxyport@entry=0) at 
> mod_proxy.c:3479
> 3625#3  0x7f1054c9de7e in proxy_handler (r=0x7f0ff40234a0) at 
> mod_proxy.c:1521
> 3626#4  0x55e9dcff8480 in ap_run_handler (r=r@entry=0x7f0ff40234a0) at 
> config.c:169
> 
> this is new?

Is this the "worker->cp" being NULL?