Bug#1036213: apache2: frequent SIGSEGV in mod_http2.so (purge_consumed_buckets)
Package: apache2
Version: 2.4.56-1~deb11u2
Severity: important

Dear Maintainer,

I see many segmentation faults in apache2; for example, in the last 24h I got:

Tue 2023-05-16 13:40:59 CEST 775740 33 33 11 present /usr/sbin/apache2
Tue 2023-05-16 13:52:44 CEST 798329 33 33 11 present /usr/sbin/apache2
Tue 2023-05-16 16:15:46 CEST 810709 33 33 11 present /usr/sbin/apache2
Tue 2023-05-16 16:28:55 CEST 817483 33 33 11 present /usr/sbin/apache2
Tue 2023-05-16 17:59:23 CEST 823129 33 33 11 present /usr/sbin/apache2
Tue 2023-05-16 18:35:50 CEST 826974 33 33 11 present /usr/sbin/apache2
Tue 2023-05-16 18:44:15 CEST 831974 33 33 11 present /usr/sbin/apache2
Tue 2023-05-16 18:44:56 CEST 836174 33 33 11 present /usr/sbin/apache2
Tue 2023-05-16 18:54:56 CEST 822618 33 33 11 present /usr/sbin/apache2
Tue 2023-05-16 21:12:28 CEST 836246 33 33 11 present /usr/sbin/apache2
Tue 2023-05-16 21:21:10 CEST 853959 33 33 11 present /usr/sbin/apache2
Tue 2023-05-16 22:04:42 CEST 858749 33 33 11 present /usr/sbin/apache2
Tue 2023-05-16 23:26:09 CEST 866610 33 33 11 present /usr/sbin/apache2
Wed 2023-05-17 00:08:42 CEST 865968 33 33 11 present /usr/sbin/apache2
Wed 2023-05-17 00:24:04 CEST 874807 33 33 11 present /usr/sbin/apache2
Wed 2023-05-17 00:47:25 CEST 878675 33 33 11 present /usr/sbin/apache2
Wed 2023-05-17 01:42:14 CEST 877580 33 33 11 present /usr/sbin/apache2
Wed 2023-05-17 09:21:02 CEST 949781 33 33 11 present /usr/sbin/apache2
Wed 2023-05-17 09:50:49 CEST 954784 33 33 11 present /usr/sbin/apache2

All crashes I looked into are at the same function: purge_consumed_buckets at h2_bucket_beam.c:159

Here is the output of the "bt full" command from the core:

#0  0x7ffb03778981 in purge_consumed_buckets (beam=beam@entry=0x7ffae452c0a0) at h2_bucket_beam.c:159
        b = 0x7ffae45ea108
#1  0x7ffb03778aaf in beam_shutdown (how=APR_SHUTDOWN_READWRITE, beam=<optimized out>) at h2_bucket_beam.c:258
No locals.
#2  beam_shutdown (how=APR_SHUTDOWN_READWRITE, beam=0x7ffae452c0a0) at h2_bucket_beam.c:242
No locals.
#3  beam_cleanup (data=0x7ffae452c0a0) at h2_bucket_beam.c:265
        beam = 0x7ffae452c0a0
#4  0x7ffb03e6780e in run_cleanups (cref=0x7ffae452c098) at ./memory/unix/apr_pools.c:2629
        c = <optimized out>
        c = <optimized out>
#5  apr_pool_destroy (pool=0x7ffae452c028) at ./memory/unix/apr_pools.c:987
        active = <optimized out>
        allocator = <optimized out>
#6  0x7ffb03e6782d in apr_pool_destroy (pool=0x7ffae4530028) at ./memory/unix/apr_pools.c:997
        active = <optimized out>
        allocator = <optimized out>
#7  0x7ffb03e6782d in apr_pool_destroy (pool=0x7ffae4551028) at ./memory/unix/apr_pools.c:997
        active = <optimized out>
        allocator = <optimized out>
#8  0x7ffb03e6782d in apr_pool_destroy (pool=0x7ffae45a1028) at ./memory/unix/apr_pools.c:997
        active = <optimized out>
        allocator = <optimized out>
#9  0x7ffb03e6782d in apr_pool_destroy (pool=0x7ffae4606028) at ./memory/unix/apr_pools.c:997
        active = <optimized out>
        allocator = <optimized out>
#10 0x7ffb037914c5 in h2_session_pre_close (session=0x7ffae46060a0, async=<optimized out>) at h2_session.c:1988
        status = 0
#11 0x7ffb0377b745 in h2_c1_pre_close (ctx=<optimized out>, c=<optimized out>) at h2_c1.c:180
        status = <optimized out>
        conn_ctx = <optimized out>
#12 0x56438478c9b0 in ap_run_pre_close_connection (c=c@entry=0x7ffae4614360) at connection.c:44
        pHook = <optimized out>
        n = 0
        rv = 0
#13 0x56438478cade in ap_prep_lingering_close (c=0x7ffae4614360) at connection.c:101
No locals.
#14 ap_start_lingering_close (c=0x7ffae4614360) at connection.c:127
        csd = 0x7ffae46140b0
#15 0x7ffb03b08abe in process_lingering_close (cs=0x7ffae46142b0) at event.c:1500
        csd = 0x7ffae46140b0
        dummybuf = "\027\003\003\000\023\067\020\251\027\003\215Re\345\310{\f8\312X\332N\310\375", '\000' ...
        nbytes = 0
        rv = <optimized out>
        q = <optimized out>
#16 0x7ffb03b0a512 in process_socket (thd=thd@entry=0x7ffb037345c8, p=<optimized out>, sock=<optimized out>, cs=<optimized out>, my_child_num=my_child_num@entry=3, my_thread_num=my_thread_num@entry=16) at event.c:1238
        c = <optimized out>
        conn_id = <optimized out>
        clogging = <optimized out>
        rv = <optimized out>
        rc = <optimized out>
#17 0x7ffb03b0b125 in worker_thread (thd=0x7ffb037345c8, dummy=<optimized out>) at event.c:2199
        csd = 0x7ffae46140b0
        cs = 0x7ffae46142b0
        te = 0x0
        ptrans = 0x0
        ti = <optimized out>
        process_slot = -855667096
        thread_slot = 16
        rv = <optimized out>
        is_idle = 0
#18 0x7ffb03e2aea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
        ret = <optimized out>
        pd = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140715157853952, -1517716079030320448, 140715846122926, 140715846122927, 140715157852032, 8396800, 1520638580441989824, 1520521782042673856}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = 0
#19 0x7ffb03d4aa2f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S
9ba06430a0, thread=thread@entry=0x7f9bafdfb0b0, worker_id=<optimized out>) at h2_c2.c:658
        conn_ctx = 0x7f9ba06434b0
#18 0x7f9bb1187567 in slot_run (thread=0x7f9bafdfb0b0, wctx=0x7f9bb0511850) at h2_workers.c:299
        slot = 0x7f9bb0511850
        workers = 0x7f9bb05110a0
        c = 0x7f9ba06430a0
        rv = <optimized out>
#19 0x7f9bb1844ea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
        ret = <optimized out>
        pd = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140306500486912, 3999667409471690328, 140722438679950, 140722438679951, 140306500484992, 8396800, -3983747864113949096, -3983768835883903400}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = 0
#20 0x7f9bb1764a2f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The request itself seems legit (I can run it without crashing the server). The last log entry before 21:29:30 for this query is a few minutes before the crash:

/var/log/apache2/access.log:37.98.109.173 - - [17/May/2023:21:26:57 +0200] "GET /dns-query?dns=AAABAAABDHNhZmVicm93c2luZwpnb29nbGVhcGlzA2NvbQAAHAAB HTTP/2.0" 200 103 "-" "-"

At this time, I have 23 ESTABLISHED (no TIME_WAIT) connections from this IP, with no requests in the access log since yesterday, so I guess it's a misbehaving client.

Regards,

On Wednesday 17 May 2023 at 13:24 +0200, Stefan Eissing wrote:
> Sorry to hear about this. I think the recent change that could be relevant here is the addition of:
>
> h2_mplx.c#515: c1_purge_streams(m);
>
> as seen in Apache httpd trunk and at https://github.com/icing/mod_h2.
>
> This is intended to ensure that streams and their requests are destroyed in the right order when the connection is shut down.
>
> Connection shutdown can happen at any time during request processing, and this makes it hard to reproduce issues in test cases. We have load tests with well-behaving clients. Tests with misbehaving ones are the tricky part.
> It would be helpful if you could try
> https://github.com/icing/mod_h2/releases/tag/v2.0.15
> on your system, to see how that is faring.
>
> Kind Regards,
> Stefan
On 18/05/2023 at 14:41, Stefan Eissing wrote:
> Did you have a warning message like "AH03516: unexpected NN streams in hold" at that time in your error log?

No (grepping for AH03516 in *.log returns nothing, nor does "streams in hold").

-- 
Bastien Durel
On Thursday 18 May 2023 at 15:04 +0200, Bastien Durel wrote:
> On 18/05/2023 at 14:41, Stefan Eissing wrote:
> > Did you have a warning message like "AH03516: unexpected NN streams in hold" at that time in your error log?
>
> No (grepping for AH03516 in *.log returns nothing, nor does "streams in hold").

Hello. I got another one, in ap_save_brigade().

-- 
Bastien

Attachment: output.log.bz2 (application/bzip)
On 19/05/2023 at 18:11, Stefan Eissing wrote:
> Hi Bastien,

Hello

> I am looking into the failures of mod_proxy_http2 and seeing an issue with connection reuse and subsequent handling of requests. Going back to the original issue, do you still see those with mod_h2 v2.0.15? That would be interesting to know.

No, I have had only 2 segfaults since I upgraded to 2.0.15, and neither is in purge_consumed_buckets:

- proxy_pass_brigade
- ap_save_brigade

> I do not know if just relying on mod_proxy_http in your setup would be possible. But it could help us isolate issues.

I don't think so... The knot-resolver I forward DoH requests to seems to only support HTTP/2; if I change it to https://, dog fails to resolve on it:

Error [http]: Nameserver returned HTTP 502 (Proxy Error)

Regards,

-- 
Bastien Durel
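For context, a setup like the one described would look roughly as follows. This is a hypothetical sketch, not the reporter's actual configuration; the backend address, port, and path are assumptions. The h2:// scheme (handled by mod_proxy_http2) is what lets Apache talk to an HTTP/2-only backend, whereas https:// would go through mod_proxy_http.

```apache
# Hypothetical sketch of a DoH reverse proxy like the one described.
# Backend host/port are assumptions; knot-resolver's DoH listener
# speaks only HTTP/2, hence h2:// (mod_proxy_http2) rather than
# https:// (mod_proxy_http).
Protocols h2 http/1.1
<Location "/dns-query">
    ProxyPass "h2://127.0.0.1:44353/dns-query"
</Location>
```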
On Monday 22 May 2023 at 16:36 +0200, Stefan Eissing wrote:
> I just released v2.0.16 <https://github.com/icing/mod_h2/releases/tag/v2.0.16>. It contains a fix in mod_proxy_http2 retry handling. Not sure if this is related to your crashes, as I was so far unsuccessful in reproducing those locally. But maybe you'd like to give this a try.

Hello,

Thanks. I've compiled and installed v2.0.16. I "only" had 2 crashes ([1][2]) since Friday, so I guess I'll have to monitor for a few days before reporting back.
Thanks for your work,

[1]
#0  0x7f9bb04650b3 in ssl_io_filter_output (f=0x7f9ba03c68d8, bb=0x7f9ba04a83b8) at ssl_engine_io.c:2008
#1  0x7f9bb0597999 in proxy_pass_brigade (flush=1, bb=<optimized out>, origin=<optimized out>, p_conn=0x7f9ba03c80a0, bucket_alloc=<optimized out>) at h2_proxy_session.c:218
#2  raw_send (ngh2=<optimized out>, data=<optimized out>, length=70, flags=<optimized out>, user_data=0x7f9ba04a8190) at h2_proxy_session.c:244
#3  0x7f9bb14a11f9 in nghttp2_session_send () from /usr/lib/x86_64-linux-gnu/libnghttp2.so.14
#4  0x7f9bb059a9b9 in send_loop (session=0x7f9ba04a8190) at h2_proxy_session.c:1517
#5  h2_proxy_session_process (session=0x7f9ba04a8190) at h2_proxy_session.c:1553
#6  0x7f9bb059db44 in ctx_run (ctx=0x7f9ba03524d0) at mod_proxy_http2.c:258

[2]
#0  0x7f9bb04650c5 in ssl_io_filter_output (f=0x7f9ba01388d8, bb=0x7f9ba05613b8) at ssl_engine_io.c:2008
#1  0x7f9bb0597999 in proxy_pass_brigade (flush=1, bb=<optimized out>, origin=<optimized out>, p_conn=0x7f9ba013a0a0, bucket_alloc=<optimized out>) at h2_proxy_session.c:218
#2  raw_send (ngh2=<optimized out>, data=<optimized out>, length=74, flags=<optimized out>, user_data=0x7f9ba0561190) at h2_proxy_session.c:244
#3  0x7f9bb14a11f9 in nghttp2_session_send () from /usr/lib/x86_64-linux-gnu/libnghttp2.so.14
#4  0x7f9bb059a9b9 in send_loop (session=0x7f9ba0561190) at h2_proxy_session.c:1517
#5  h2_proxy_session_process (session=0x7f9ba0561190) at h2_proxy_session.c:1553
#6  0x7f9bb059db44 in ctx_run (ctx=0x7f9ba014c4d0) at mod_proxy_http2.c:258

-- 
Bastien
On Wednesday 24 May 2023 at 14:50 +0200, Stefan Eissing wrote:
> I continue to improve mod_proxy_http2:
> https://github.com/icing/mod_h2/releases/tag/v2.0.17
>
> Added more edge-case tests for the module, fixed observed bugs. But I have not replicated your crashes, which look weird. Sorry.

Hello,

I've put it in use on my server. Do you need the configuration I use to serve these requests?

Thanks,

-- 
Bastien
On 31/05/2023 at 13:15, Stefan Eissing wrote:
> Hi Bastien,
>
> I was finally able to reproduce here what looks like the crashes you see with mod_proxy_http2 (notice the careful wording). And I fixed it in https://github.com/icing/mod_h2/releases/tag/v2.0.18
>
> Looking forward to hearing how it fares on your system.

Hello,

Thanks for your work. I've put it on my system and will report any new crash (last week I got approximately one every 2 days).

Regards,

-- 
Bastien Durel
On Wednesday 31 May 2023 at 13:15 +0200, Stefan Eissing wrote:
> I was finally able to reproduce here what looks like the crashes you see with mod_proxy_http2 (notice the careful wording). And I fixed it in https://github.com/icing/mod_h2/releases/tag/v2.0.18

Hello,

After a week running on my system, I did not get any crash at all. So maybe you effectively found and fixed the problem I saw :)

Regards,

-- 
Bastien