Bug#1036213: apache2: frequent SIGSEGV in mod_http2.so (purge_consumed_buckets)

2023-05-17 Thread Bastien Durel
Package: apache2
Version: 2.4.56-1~deb11u2
Severity: important

Dear Maintainer,

I see many segmentation faults in apache2, for exemple in the last 24h I got:

Tue 2023-05-16 13:40:59 CEST 7757403333  11 present   /usr/sbin/apache2
Tue 2023-05-16 13:52:44 CEST 7983293333  11 present   /usr/sbin/apache2
Tue 2023-05-16 16:15:46 CEST 8107093333  11 present   /usr/sbin/apache2
Tue 2023-05-16 16:28:55 CEST 8174833333  11 present   /usr/sbin/apache2
Tue 2023-05-16 17:59:23 CEST 8231293333  11 present   /usr/sbin/apache2
Tue 2023-05-16 18:35:50 CEST 8269743333  11 present   /usr/sbin/apache2
Tue 2023-05-16 18:44:15 CEST 8319743333  11 present   /usr/sbin/apache2
Tue 2023-05-16 18:44:56 CEST 8361743333  11 present   /usr/sbin/apache2
Tue 2023-05-16 18:54:56 CEST 8226183333  11 present   /usr/sbin/apache2
Tue 2023-05-16 21:12:28 CEST 8362463333  11 present   /usr/sbin/apache2
Tue 2023-05-16 21:21:10 CEST 8539593333  11 present   /usr/sbin/apache2
Tue 2023-05-16 22:04:42 CEST 8587493333  11 present   /usr/sbin/apache2
Tue 2023-05-16 23:26:09 CEST 8666103333  11 present   /usr/sbin/apache2
Wed 2023-05-17 00:08:42 CEST 8659683333  11 present   /usr/sbin/apache2
Wed 2023-05-17 00:24:04 CEST 8748073333  11 present   /usr/sbin/apache2
Wed 2023-05-17 00:47:25 CEST 8786753333  11 present   /usr/sbin/apache2
Wed 2023-05-17 01:42:14 CEST 8775803333  11 present   /usr/sbin/apache2
Wed 2023-05-17 09:21:02 CEST 9497813333  11 present   /usr/sbin/apache2
Wed 2023-05-17 09:50:49 CEST 9547843333  11 present   /usr/sbin/apache2

All crashes I looked into are at the same function: purge_consumed_buckets at 
h2_bucket_beam.c:159

Here is the output of the "bt full" command from the core:

#0  0x7ffb03778981 in purge_consumed_buckets 
(beam=beam@entry=0x7ffae452c0a0) at h2_bucket_beam.c:159
b = 0x7ffae45ea108
#1  0x7ffb03778aaf in beam_shutdown (how=APR_SHUTDOWN_READWRITE, 
beam=) at h2_bucket_beam.c:258
No locals.
#2  beam_shutdown (how=APR_SHUTDOWN_READWRITE, beam=0x7ffae452c0a0) at 
h2_bucket_beam.c:242
No locals.
#3  beam_cleanup (data=0x7ffae452c0a0) at h2_bucket_beam.c:265
beam = 0x7ffae452c0a0
#4  0x7ffb03e6780e in run_cleanups (cref=0x7ffae452c098) at 
./memory/unix/apr_pools.c:2629
c = 
c = 
#5  apr_pool_destroy (pool=0x7ffae452c028) at ./memory/unix/apr_pools.c:987
active = 
allocator = 
#6  0x7ffb03e6782d in apr_pool_destroy (pool=0x7ffae4530028) at 
./memory/unix/apr_pools.c:997
active = 
allocator = 
#7  0x7ffb03e6782d in apr_pool_destroy (pool=0x7ffae4551028) at 
./memory/unix/apr_pools.c:997
active = 
allocator = 
#8  0x7ffb03e6782d in apr_pool_destroy (pool=0x7ffae45a1028) at 
./memory/unix/apr_pools.c:997
active = 
allocator = 
#9  0x7ffb03e6782d in apr_pool_destroy (pool=0x7ffae4606028) at 
./memory/unix/apr_pools.c:997
active = 
allocator = 
#10 0x7ffb037914c5 in h2_session_pre_close (session=0x7ffae46060a0, 
async=) at h2_session.c:1988
status = 0
#11 0x7ffb0377b745 in h2_c1_pre_close (ctx=, c=) at h2_c1.c:180
status = 
conn_ctx = 
#12 0x56438478c9b0 in ap_run_pre_close_connection 
(c=c@entry=0x7ffae4614360) at connection.c:44
pHook = 
n = 0
rv = 0
#13 0x56438478cade in ap_prep_lingering_close (c=0x7ffae4614360) at 
connection.c:101
No locals.
#14 ap_start_lingering_close (c=0x7ffae4614360) at connection.c:127
csd = 0x7ffae46140b0
#15 0x7ffb03b08abe in process_lingering_close (cs=0x7ffae46142b0) at 
event.c:1500
csd = 0x7ffae46140b0
dummybuf = 
"\027\003\003\000\023\067\020\251\027\003\215Re\345\310{\f8\312X\332N\310\375", 
'\000' ...
nbytes = 0
rv = 
q = 
#16 0x7ffb03b0a512 in process_socket (thd=thd@entry=0x7ffb037345c8, 
p=, sock=, cs=, 
my_child_num=my_child_num@entry=3, my_thread_num=my_thread_num@entry=16) at 
event.c:1238
c = 
conn_id = 
clogging = 
rv = 
rc = 
#17 0x7ffb03b0b125 in worker_thread (thd=0x7ffb037345c8, dummy=) at event.c:2199
csd = 0x7ffae46140b0
cs = 0x7ffae46142b0
te = 0x0
ptrans = 0x0
ti = 
process_slot = -855667096
thread_slot = 16
rv = 
is_idle = 0
#18 0x7ffb03e2aea7 in start_thread (arg=) at 
pthread_create.c:477
ret = 
pd = 
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140715157853952, 
-1517716079030320448, 140715846122926, 140715846122927, 140715157852032, 
8396800, 1520638580441989824, 1520521782042673856}, mask_was_saved = 0}}, priv 
= {pad = {0x0, 
  0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 
0}}}
not_first_call = 0
#19 0x7ffb03d4aa2f in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S

Bug#1036213: apache2: frequent SIGSEGV in mod_http2.so (purge_consumed_buckets)

2023-05-18 Thread Bastien Durel
9ba06430a0, 
thread=thread@entry=0x7f9bafdfb0b0, worker_id=) at h2_c2.c:658
conn_ctx = 0x7f9ba06434b0
#18 0x7f9bb1187567 in slot_run (thread=0x7f9bafdfb0b0, wctx=0x7f9bb0511850) 
at h2_workers.c:299
slot = 0x7f9bb0511850
workers = 0x7f9bb05110a0
c = 0x7f9ba06430a0
rv = 
#19 0x7f9bb1844ea7 in start_thread (arg=) at 
pthread_create.c:477
ret = 
pd = 
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140306500486912, 
3999667409471690328, 140722438679950, 140722438679951, 140306500484992, 
8396800, -3983747864113949096, -3983768835883903400}, mask_was_saved = 0}}, 
priv = {pad = {
  0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, 
canceltype = 0}}}
not_first_call = 0
#20 0x7f9bb1764a2f in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95


The request itself seems legit (I can run it without crashing the
server).

The last log entry before 21:29:30 for this query is a few minutes
before the crash :

/var/log/apache2/access.log:37.98.109.173 - - [17/May/2023:21:26:57 +0200] "GET 
/dns-query?dns=AAABAAABDHNhZmVicm93c2luZwpnb29nbGVhcGlzA2NvbQAAHAAB 
HTTP/2.0" 200 103 "-" "-"

At this time, I have 23 ESTABLISHED (no TIME_WAIT) connections from
this IP, with no requests in access log since yesterday ; so I guess
it's a mis-behaving client

Regards,

Le mercredi 17 mai 2023 à 13:24 +0200, Stefan Eissing a écrit :
> Sorry to hear about this. I think the recent change that could be
> relevant here is
> the addition of:
> 
> h2_mplx.c#515:    c1_purge_streams(m);
> 
> as seen in Apache httpd trunk and at https://github.com/icing/mod_h2.
> 
> This is intended to assure that streams and their requests are
> destroyed
> in the right order when the connection is shut down.
> 
> Connection shutdown can happen at any time during request processing
> and
> this makes it hard to reproduce issues in test cases. We have load
> tests
> with well-behaving clients. Tests with mis-behaving ones are the
> tricky part.
> 
> It would be helpful if you could try
> https://github.com/icing/mod_h2/releases/tag/v2.0.15 
> on your system, to see how that is faring.
> 
> Kind Regards,
> Stefan
> 
> > Am 17.05.2023 um 12:24 schrieb Bastien Durel :
> > 
> > Package: apache2
> > Version: 2.4.56-1~deb11u2
> > Severity: important
> > 
> > Dear Maintainer,
> > 
> > I see many segmentation faults in apache2, for exemple in the last
> > 24h I got:
> > 
> > Tue 2023-05-16 13:40:59 CEST 775740    33    33  11 present  
> > /usr/sbin/apache2
> > Tue 2023-05-16 13:52:44 CEST 798329    33    33  11 present  
> > /usr/sbin/apache2
> > Tue 2023-05-16 16:15:46 CEST 810709    33    33  11 present  
> > /usr/sbin/apache2
> > Tue 2023-05-16 16:28:55 CEST 817483    33    33  11 present  
> > /usr/sbin/apache2
> > Tue 2023-05-16 17:59:23 CEST 823129    33    33  11 present  
> > /usr/sbin/apache2
> > Tue 2023-05-16 18:35:50 CEST 826974    33    33  11 present  
> > /usr/sbin/apache2
> > Tue 2023-05-16 18:44:15 CEST 831974    33    33  11 present  
> > /usr/sbin/apache2
> > Tue 2023-05-16 18:44:56 CEST 836174    33    33  11 present  
> > /usr/sbin/apache2
> > Tue 2023-05-16 18:54:56 CEST 822618    33    33  11 present  
> > /usr/sbin/apache2
> > Tue 2023-05-16 21:12:28 CEST 836246    33    33  11 present  
> > /usr/sbin/apache2
> > Tue 2023-05-16 21:21:10 CEST 853959    33    33  11 present  
> > /usr/sbin/apache2
> > Tue 2023-05-16 22:04:42 CEST 858749    33    33  11 present  
> > /usr/sbin/apache2
> > Tue 2023-05-16 23:26:09 CEST 866610    33    33  11 present  
> > /usr/sbin/apache2
> > Wed 2023-05-17 00:08:42 CEST 865968    33    33  11 present  
> > /usr/sbin/apache2
> > Wed 2023-05-17 00:24:04 CEST 874807    33    33  11 present  
> > /usr/sbin/apache2
> > Wed 2023-05-17 00:47:25 CEST 878675    33    33  11 present  
> > /usr/sbin/apache2
> > Wed 2023-05-17 01:42:14 CEST 877580    33    33  11 present  
> > /usr/sbin/apache2
> > Wed 2023-05-17 09:21:02 CEST 949781    33    33  11 present  
> > /usr/sbin/apache2
> > Wed 2023-05-17 09:50:49 CEST 954784    33    33  11 present  
> > /usr/sbin/apache2
> > 
> > All crashes I looked into are at the same function:
> > purge_consumed_buckets at h2_bucket_beam.c:159
> > 
> > Here is the output of the "bt full" command from the core:
> > 
> > #0  0x7ffb03778981 in purge_consumed_buckets
> > (beam=beam@entry=0x7ffae452c0a0) at h2_bucket_beam.c:159
> >    b = 0x7ffae45ea108
> > #1  0x7ffb03778aaf in beam_shutdown
> > (h

Bug#1036213: apache2: frequent SIGSEGV in mod_http2.so (purge_consumed_buckets)

2023-05-18 Thread Bastien Durel

Le 18/05/2023 à 14:41, Stefan Eissing a écrit :

Did you have an warning message like "AH03516: unexpected NN streams in hold" 
at that time in out error log?


No (grepping AH03516 in *.log returns nothing (nor does "streams in hold"))

--
Bastien Durel



Bug#1036213: apache2: frequent SIGSEGV in mod_http2.so (purge_consumed_buckets)

2023-05-19 Thread Bastien Durel
Le jeudi 18 mai 2023 à 15:04 +0200, Bastien Durel a écrit :
> Le 18/05/2023 à 14:41, Stefan Eissing a écrit :
> > Did you have an warning message like "AH03516: unexpected NN
> > streams in hold" at that time in out error log?
> 
> No (grepping AH03516 in *.log returns nothing (nor does "streams in
> hold"))
> 
Hello. I got another one in ap_save_brigade()

-- 
Bastien


output.log.bz2
Description: application/bzip


Bug#1036213: apache2: frequent SIGSEGV in mod_http2.so (purge_consumed_buckets)

2023-05-19 Thread Bastien Durel

Le 19/05/2023 à 18:11, Stefan Eissing a écrit :

Hi Bastien,


Hello


I am looking into the failures of mod_proxy_http2 and seeing an issue with 
connection reuse and subsequent handling of requests.

Going back to the original issue, do you still see those with mod_h2 v2.0.15? 
That would be interesting to know.

No, I had only 2 segfaults since I upgraded to 2.0.15, and none is in 
purge_consumed_buckets :


- proxy_pass_brigade
- ap_save_brigade


I do not know if just relying on mod_proxy_http in your setup would be 
possible. But it could help us isolating issues.


I don't think so ... The knot-resolver I forward DOH requests to seems 
to only support HTTP/2 ; if I change it to https://, dog fails to 
resolve on it :


Error [http]: Nameserver returned HTTP 502 (Proxy Error)

Regards,

--
Bastien Durel



Bug#1036213: apache2: frequent SIGSEGV in mod_http2.so (purge_consumed_buckets)

2023-05-22 Thread Bastien Durel
Le lundi 22 mai 2023 à 16:36 +0200, Stefan Eissing a écrit :
> 
> 
> > Am 19.05.2023 um 18:50 schrieb Bastien Durel :
> > 
> > Le 19/05/2023 à 18:11, Stefan Eissing a écrit :
> > > Hi Bastien,
> > Hello
> > 
> > > I am looking into the failures of mod_proxy_http2 and seeing an
> > > issue with connection reuse and subsequent handling of requests.
> > > Going back to the original issue, do you still see those with
> > > mod_h2 v2.0.15? That would be interesting to know.
> > No, I had only 2 segfaults since I upgraded to 2.0.15, and none is
> > in purge_consumed_buckets :
> > 
> > - proxy_pass_brigade
> > - ap_save_brigade
> > 
> > > I do not know if just relying on mod_proxy_http in your setup
> > > would be possible. But it could help us isolating issues.
> > 
> > I don't think so ... The knot-resolver I forward DOH requests to
> > seems to only support HTTP/2 ; if I change it to https://,
> > dog fails to resolve on it :
> > 
> > Error [http]: Nameserver returned HTTP 502 (Proxy Error)
> 
> I just released v2.0.16
> <https://github.com/icing/mod_h2/releases/tag/v2.0.16>. It contains a
> fix in mod_proxy_http2 retry handling. Not sure if this is related to
> your crashes as I was so far unsuccessful to reproduce those locally.
> But maybe you'd like to give this a try.

Hello,

Thanks. I've compiled and installed v2.0.16.

I "only" had 2 crashes([1][2]) since friday, so I guess I'll have to
monitor a few days to report back.

Thanks for your work,


[1]
#0  0x7f9bb04650b3 in ssl_io_filter_output (f=0x7f9ba03c68d8, 
bb=0x7f9ba04a83b8) at ssl_engine_io.c:2008
#1  0x7f9bb0597999 in proxy_pass_brigade (flush=1, bb=, 
origin=, p_conn=0x7f9ba03c80a0, bucket_alloc=) at 
h2_proxy_session.c:218
#2  raw_send (ngh2=, data=, length=70, 
flags=, user_data=0x7f9ba04a8190) at h2_proxy_session.c:244
#3  0x7f9bb14a11f9 in nghttp2_session_send () from 
/usr/lib/x86_64-linux-gnu/libnghttp2.so.14
#4  0x7f9bb059a9b9 in send_loop (session=0x7f9ba04a8190) at 
h2_proxy_session.c:1517
#5  h2_proxy_session_process (session=0x7f9ba04a8190) at h2_proxy_session.c:1553
#6  0x7f9bb059db44 in ctx_run (ctx=0x7f9ba03524d0) at mod_proxy_http2.c:258

[2]
#0  0x7f9bb04650c5 in ssl_io_filter_output (f=0x7f9ba01388d8, 
bb=0x7f9ba05613b8) at ssl_engine_io.c:2008
#1  0x7f9bb0597999 in proxy_pass_brigade (flush=1, bb=, 
origin=, p_conn=0x7f9ba013a0a0, bucket_alloc=) at 
h2_proxy_session.c:218
#2  raw_send (ngh2=, data=, length=74, 
flags=, user_data=0x7f9ba0561190) at h2_proxy_session.c:244
#3  0x7f9bb14a11f9 in nghttp2_session_send () from 
/usr/lib/x86_64-linux-gnu/libnghttp2.so.14
#4  0x7f9bb059a9b9 in send_loop (session=0x7f9ba0561190) at 
h2_proxy_session.c:1517
#5  h2_proxy_session_process (session=0x7f9ba0561190) at h2_proxy_session.c:1553
#6  0x7f9bb059db44 in ctx_run (ctx=0x7f9ba014c4d0) at mod_proxy_http2.c:258

-- 
Bastien



Bug#1036213: apache2: frequent SIGSEGV in mod_http2.so (purge_consumed_buckets)

2023-05-24 Thread Bastien Durel
Le mercredi 24 mai 2023 à 14:50 +0200, Stefan Eissing a écrit :
> I continue to improve mod_proxy_http2:
> https://github.com/icing/mod_h2/releases/tag/v2.0.17
> 
> Added more edge case tests for the module, fixed observed bugs. But
> have not replicated your crashes which look weird. Sorry.

Hello,

I've put it in use on my server.

Do you need the configuration I use to serve these requests ?

Thanks,

-- 
Bastien



Bug#1036213: apache2: frequent SIGSEGV in mod_http2.so (purge_consumed_buckets)

2023-05-31 Thread Bastien Durel

Le 31/05/2023 à 13:15, Stefan Eissing a écrit :

Hi Bastien,

I was finally able to reproduce here what looks like the crashes you see with 
mod_proxy_http2 (notice the careful wording). And I fixed it in 
https://github.com/icing/mod_h2/releases/tag/v2.0.18

Looking forward to hear how it fares on your system.


Hello,

Thanks for your work. I've put it on my system, I'll report any new 
crash (last week I got approximately one every 2 days)


Regards,

--
Bastien Durel



Bug#1036213: apache2: frequent SIGSEGV in mod_http2.so (purge_consumed_buckets)

2023-06-08 Thread Bastien Durel
Le mercredi 31 mai 2023 à 13:15 +0200, Stefan Eissing a écrit :
> Hi Bastien,
> 
> I was finally able to reproduce here what looks like the crashes you
> see with mod_proxy_http2 (notice the careful wording). And I fixed it
> in https://github.com/icing/mod_h2/releases/tag/v2.0.18
> 
> Looking forward to hear how it fares on your system.
> 
Hello,

After a week running on my system, I did not got any crash at all.

So maybe you effectively found and fix the problem I saw :)

Regards,

-- 
Bastien