Not my forte. Yann probably has a better idea here. Let's see if he has time to look at it tomorrow.
> On 22.01.2017 at 17:21, Stefan Priebe - Profihost AG
> <s.pri...@profihost.ag> wrote:
>
> Hi Stefan,
>
> No, I was mistaken - the customer isn't using mod_proxy. But I think this
> is the patch causing me problems:
> https://github.com/apache/httpd/commit/a61a4bd02e483fb45d433343740d0130ee3d8d5d
>
> What do you think?
>
> Greets,
> Stefan
>
> On 22.01.2017 at 17:17, Stefan Eissing wrote:
>>
>>> On 22.01.2017 at 17:14, Stefan Priebe - Profihost AG
>>> <s.pri...@profihost.ag> wrote:
>>>
>>> *arg* It's just mod_proxy - I just saw thread safety and apr bucket alloc.
>>
>> ??? Can you elaborate? Is your finding the known hcheck bug or something
>> else?
>>
>>> Stefan
>>>
>>> On 22.01.2017 at 17:06, Stefan Priebe - Profihost AG wrote:
>>>> Looks like others have the same crashes too:
>>>> https://bz.apache.org/bugzilla/show_bug.cgi?id=60071
>>>> and
>>>> https://github.com/apache/httpd/commit/8e63c3c9372cd398f57357099aa941cbba695758
>>>>
>>>> So it looks like mod_http2 is running fine now. Thanks a lot, Stefan.
>>>>
>>>> Yann, I think I can start testing your mpm patch again once the
>>>> segfaults in the 2.4 branch are fixed.
>>>>
>>>> Greets,
>>>> Stefan
>>>>
>>>> On 22.01.2017 at 13:16, Stefan Priebe wrote:
>>>>> Hi,
>>>>>
>>>>> and a new one, but also in ap_start_lingering_close:
>>>>>
>>>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>>> #0  apr_palloc (pool=pool@entry=0x7f455805e138, in_size=in_size@entry=32)
>>>>>     at memory/unix/apr_pools.c:684
>>>>> #1  0x00007f456bc5d8b4 in apr_brigade_create (p=0x7f455805e138,
>>>>>     list=0x7f45040034e8) at buckets/apr_brigade.c:61
>>>>> #2  0x000055e165efa319 in ap_shutdown_conn (c=c@entry=0x7f455805e458,
>>>>>     flush=flush@entry=1) at connection.c:76
>>>>> #3  0x000055e165efa40d in ap_flush_conn (c=0x7f455805e458) at connection.c:95
>>>>> #4  ap_start_lingering_close (c=0x7f455805e458) at connection.c:145
>>>>> #5  0x000055e165f942dd in start_lingering_close_blocking (cs=<optimized out>)
>>>>>     at event.c:876
>>>>> #6  process_socket (my_thread_num=<optimized out>,
>>>>>     my_child_num=<optimized out>, cs=0x7f455805e3c8, sock=<optimized out>,
>>>>>     p=<optimized out>, thd=<optimized out>) at event.c:1153
>>>>> #7  worker_thread (thd=0x7f455805e138, dummy=0x20) at event.c:2001
>>>>> #8  0x00007f456b80a0a4 in start_thread ()
>>>>>     from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>> #9  0x00007f456b53f62d in clone () from /lib/x86_64-linux-gnu/libc.so.6
>>>>>
>>>>> Stefan
>>>>>
>>>>> On 21.01.2017 at 19:31, Stefan Priebe wrote:
>>>>>> All the latest traces come from the event MPM, in
>>>>>> process_lingering_close / ap_push_pool, but they end in different
>>>>>> functions. It looks like a race somewhere; it just strikes in a
>>>>>> different function each time between connection close and pool clear.
>>>>>>
>>>>>> Might there be two places where the same pool gets cleared?
>>>>>>
>>>>>> Stefan
>>>>>>
>>>>>> On 21.01.2017 at 19:07, Stefan Priebe wrote:
>>>>>>> Hi Stefan,
>>>>>>>
>>>>>>> thanks. No crashes where h2 comes up. But I still have these, and no
>>>>>>> idea how to find out where and why they're crashing.
>>>>>>>
>>>>>>> Core was generated by `/usr/local/apache2/bin/httpd -k start'.
>>>>>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>>>>> #0  allocator_free (node=0x0, allocator=0x7f6e08066540)
>>>>>>>     at memory/unix/apr_pools.c:381
>>>>>>> #1  apr_pool_clear (pool=0x7f6e0808d238) at memory/unix/apr_pools.c:793
>>>>>>> #2  0x00000000004fe528 in ap_push_pool (queue_info=0x0,
>>>>>>>     pool_to_recycle=0x7f6e08066548) at fdqueue.c:234
>>>>>>> #3  0x00000000004fa2c8 in process_lingering_close (cs=0x7f6e0808d4c8,
>>>>>>>     pfd=0x1d3bf98) at event.c:1439
>>>>>>> #4  0x00000000004fd410 in listener_thread (thd=0x1d3cb70,
>>>>>>>     dummy=0x7f6e0808d4c8) at event.c:1704
>>>>>>> #5  0x00007f6e1aed20a4 in start_thread ()
>>>>>>>     from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>>> #6  0x00007f6e1aa0362d in clone () from /lib/x86_64-linux-gnu/libc.so.6
>>>>>>>
>>>>>>> Core was generated by `/usr/local/apache2/bin/httpd -k start'.
>>>>>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>>>>> #0  allocator_free (node=0x0, allocator=0x7f6e08053ae0)
>>>>>>>     at memory/unix/apr_pools.c:381
>>>>>>> #1  apr_pool_clear (pool=0x7f6e08076bb8) at memory/unix/apr_pools.c:793
>>>>>>> #2  0x00000000004fe528 in ap_push_pool (queue_info=0x0,
>>>>>>>     pool_to_recycle=0x7f6e08053ae8) at fdqueue.c:234
>>>>>>> #3  0x00000000004fa2c8 in process_lingering_close (cs=0x7f6e08076e48,
>>>>>>>     pfd=0x1d3bf98) at event.c:1439
>>>>>>> #4  0x00000000004fd410 in listener_thread (thd=0x1d3cb70,
>>>>>>>     dummy=0x7f6e08076e48) at event.c:1704
>>>>>>> #5  0x00007f6e1aed20a4 in start_thread ()
>>>>>>>     from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>>> #6  0x00007f6e1aa0362d in clone () from /lib/x86_64-linux-gnu/libc.so.6
>>>>>>>
>>>>>>> Stefan
>>>>>>>
>>>>>>> On 21.01.2017 at 17:03, Stefan Eissing wrote:
>>>>>>>> Stefan,
>>>>>>>>
>>>>>>>> I made a release at https://github.com/icing/mod_h2/releases/tag/v1.8.9
>>>>>>>> with all the patches and (hopefully) improved on them a bit. If you
>>>>>>>> dare to drop that into your installation, that'd be great.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> Stefan
>>>>>>>>
>>>>>>>>> On 21.01.2017 at 15:25, Stefan Priebe <s.pri...@profihost.ag> wrote:
>>>>>>>>>
>>>>>>>>> and I got another crash here:
>>>>>>>>>
>>>>>>>>> 2346  static void run_cleanups(cleanup_t **cref)
>>>>>>>>> 2347  {
>>>>>>>>> 2348      cleanup_t *c = *cref;
>>>>>>>>> 2349
>>>>>>>>> 2350      while (c) {
>>>>>>>>> 2351          *cref = c->next;
>>>>>>>>> 2352          (*c->plain_cleanup_fn)((void *)c->data);   <== here
>>>>>>>>> 2353          c = *cref;
>>>>>>>>> 2354
>>>>>>>>>
>>>>>>>>> which looks similar to the other crash.
>>>>>>>>>
>>>>>>>>> #0  0x00007fe4bbd33e1b in run_cleanups (cref=<optimized out>)
>>>>>>>>>     at memory/unix/apr_pools.c:2352
>>>>>>>>> #1  apr_pool_clear (pool=0x7fe4a804dac8) at memory/unix/apr_pools.c:772
>>>>>>>>> #2  0x00000000004feb38 in ap_push_pool (queue_info=0x6d616e79642d3733,
>>>>>>>>>     pool_to_recycle=0x2) at fdqueue.c:234
>>>>>>>>> #3  0x00000000004fa8d8 in process_lingering_close (cs=0x7fe4a804dd58,
>>>>>>>>>     pfd=0x25d3f98) at event.c:1439
>>>>>>>>>
>>>>>>>>> Details:
>>>>>>>>> (gdb) print c
>>>>>>>>> $1 = (cleanup_t *) 0x7fe4a804e9f0
>>>>>>>>> (gdb) print *c
>>>>>>>>> $2 = {next = 0x7fe4a804e870, data = 0x6d616e79642d3733,
>>>>>>>>>   plain_cleanup_fn = 0x392d3734322e6369,
>>>>>>>>>   child_cleanup_fn = 0x617465722e722d35}
>>>>>>>>> (gdb) print *c->data
>>>>>>>>> Attempt to dereference a generic pointer.
>>>>>>>>> (gdb) print *c->plain_cleanup_fn
>>>>>>>>> Cannot access memory at address 0x392d3734322e6369
>>>>>>>>>
>>>>>>>>> Stefan
>>>>>>>>>
>>>>>>>>> On 21.01.2017 at 15:18, Stefan Priebe wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> #0  apr_pool_cleanup_kill (p=0x7fe4a8072358,
>>>>>>>>>>     data=data@entry=0x7fe4a80723e0,
>>>>>>>>>>     cleanup_fn=cleanup_fn@entry=0x7fe4bbd38a40 <socket_cleanup>)
>>>>>>>>>>     at memory/unix/apr_pools.c:2276
>>>>>>>>>>
>>>>>>>>>> It crashes here in apr:
>>>>>>>>>> 2276          if (c->data == data && c->plain_cleanup_fn == cleanup_fn) {
>>>>>>>>>>
>>>>>>>>>> A few lines earlier, c is set:
>>>>>>>>>> 2264      c = p->cleanups;
>>>>>>>>>>
>>>>>>>>>> p is:
>>>>>>>>>> (gdb) print *p
>>>>>>>>>> $1 = {parent = 0x256f138, child = 0x7fe46c0751c8, sibling = 0x7fe4a8096888,
>>>>>>>>>>   ref = 0x7fe4a8069fe8, cleanups = 0x7fe478159748,
>>>>>>>>>>   free_cleanups = 0x7fe478159788, allocator = 0x7fe4a803b490,
>>>>>>>>>>   subprocesses = 0x0, abort_fn = 0x43da00 <abort_on_oom>,
>>>>>>>>>>   user_data = 0x0, tag = 0x502285 "transaction", active = 0x7fe478158d70,
>>>>>>>>>>   self = 0x7fe4a8072330,
>>>>>>>>>>   self_first_avail = 0x7fe4a80723d0 "X#\a\250\344\177",
>>>>>>>>>>   pre_cleanups = 0x7fe4a8072ab8}
>>>>>>>>>>
>>>>>>>>>> Wouldn't the error mean that p->cleanups is NULL?
>>>>>>>>>>
>>>>>>>>>> (gdb) print *p->cleanups
>>>>>>>>>> $2 = {next = 0x7fe478159628, data = 0x7fe478159648,
>>>>>>>>>>   plain_cleanup_fn = 0x7fe4bbd2ffd0 <apr_unix_file_cleanup>,
>>>>>>>>>>   child_cleanup_fn = 0x7fe4bbd2ff70 <apr_unix_child_file_cleanup>}
>>>>>>>>>>
>>>>>>>>>> So p->cleanups->data is 0x7fe478159648 and data is 0x7fe4a80723e0?
>>>>>>>>>>
>>>>>>>>>> I don't get why it's segfaulting.
>>>>>>>>>>
>>>>>>>>>> Stefan
>>>>>>>>>>
>>>>>>>>>> On 21.01.2017 at 09:50, Yann Ylavic wrote:
>>>>>>>>>>> Hi Stefan,
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Jan 21, 2017 at 9:45 AM, Stefan Priebe
>>>>>>>>>>> <s.pri...@profihost.ag> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> After running the whole night, these are the only ones still
>>>>>>>>>>>> happening. Should I revert the mpm patch to check whether it's
>>>>>>>>>>>> the source?
>>>>>>>>>>>
>>>>>>>>>>> Yes please, we need to determine...
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Yann.

Stefan Eissing

<green/>bytes GmbH
Hafenstrasse 16
48155 Münster
www.greenbytes.de
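A note on frame #2 of the listener-thread traces above: what follows is a simplified sketch of the pool recycling that ap_push_pool performs, written from the general shape of fdqueue.c rather than copied from it, so treat the names and details as approximations. The point is that the queue element linking a recycled pool into the list is allocated from the pool itself, right after the clear; so if a pool is pushed while another thread is still using it, or pushed twice, the next apr_pool_clear walks already-recycled allocator nodes, which fits the allocator_free (node=0x0) crashes.

#include <apr_pools.h>

/* Simplified sketch of the event MPM's pool recycling (not the actual
 * fdqueue.c code; the real version pushes with an atomic CAS loop). */
struct recycled_pool {
    apr_pool_t *pool;
    struct recycled_pool *next;
};

static struct recycled_pool *recycled_pools; /* head of the recycle list */

static void push_pool_sketch(apr_pool_t *pool_to_recycle)
{
    struct recycled_pool *e;

    if (!pool_to_recycle)
        return;
    /* Frame #1 in the traces: clearing frees everything allocated from
     * the pool, so nobody else may still be using it at this point. */
    apr_pool_clear(pool_to_recycle);
    /* The list element itself lives in the just-cleared pool ... */
    e = apr_palloc(pool_to_recycle, sizeof(*e));
    e->pool = pool_to_recycle;
    /* ... so a double push, or a push of a still-live pool, corrupts
     * both this list and the pool's allocator free list. */
    e->next = recycled_pools;
    recycled_pools = e;
}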
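On the corrupted cleanup_t from the 15:25 crash: the values gdb printed for data, plain_cleanup_fn and child_cleanup_fn are not pointers at all but ASCII text. Decoding their little-endian bytes (a quick standalone check, below) yields "37-dynam", "ic.247-9" and "5-r.reta" - a contiguous hostname-like string across the three fields. That suggests the cleanup node's memory was freed, reused and overwritten with string data before run_cleanups walked the list: a use-after-free, not a NULL p->cleanups.

#include <stdio.h>
#include <stdint.h>

/* Print the little-endian bytes of a 64-bit value as characters. */
static void dump(uint64_t v)
{
    int i;
    for (i = 0; i < 8; i++)      /* x86-64 stores the low byte first */
        putchar((int)((v >> (8 * i)) & 0xff));
    putchar('\n');
}

int main(void)
{
    dump(0x6d616e79642d3733ULL); /* c->data             -> "37-dynam" */
    dump(0x392d3734322e6369ULL); /* c->plain_cleanup_fn -> "ic.247-9" */
    dump(0x617465722e722d35ULL); /* c->child_cleanup_fn -> "5-r.reta" */
    return 0;
}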
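Finally, on the question from the 19:31 mail ("Might there be two places where the same pool gets cleared?"): APR pools are not thread-safe, so it does not even take two clears - one thread clearing a connection pool while another still allocates from it is enough. A minimal standalone sketch (an illustration of the failure mode under stock APR, not httpd code) that typically dies with the same apr_palloc/allocator_free signatures:

/* Build e.g.: cc race.c $(apr-1-config --includes --link-ld) -o race */
#include <apr_general.h>
#include <apr_pools.h>
#include <apr_thread_proc.h>
#include <apr_time.h>

static apr_pool_t *conn_pool; /* deliberately shared without locking */

/* Listener side of the race: recycle (clear) the "connection" pool. */
static void * APR_THREAD_FUNC clearer(apr_thread_t *thd, void *data)
{
    for (;;)
        apr_pool_clear(conn_pool);
    return NULL; /* not reached */
}

/* Worker side of the race: keep allocating, as apr_brigade_create does. */
static void * APR_THREAD_FUNC allocator(apr_thread_t *thd, void *data)
{
    for (;;)
        apr_palloc(conn_pool, 32);
    return NULL; /* not reached */
}

int main(void)
{
    apr_pool_t *root;
    apr_thread_t *t1, *t2;

    apr_initialize();
    apr_pool_create(&root, NULL);
    apr_pool_create(&conn_pool, root);

    apr_thread_create(&t1, NULL, clearer, NULL, root);
    apr_thread_create(&t2, NULL, allocator, NULL, root);

    apr_sleep(apr_time_from_sec(10)); /* usually crashes long before this */
    return 0;
}

With APR built with --enable-pool-debug the misuse is reported immediately; in production the process only crashes once a freed node is actually reused, which would explain why the traces end in a different function each time.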