Perrin Harkins wrote:
> Honestly, the person who has done the most work on debugging thread
> crashes is Torsten.  His advice on how to debug it will be better than
> mine.  It does seem like people usually solve them by using backtrace
> analysis though.

Getting back to this, I've now had time to install debug builds of mod_perl and perl on the production server, and to run backtraces on a core dump. Unfortunately I cannot run a debug build of Apache right now, but hopefully that won't be needed.

Looking at the backtraces for each thread, there are a couple of things that I think I can see:
- The segfault appears to happen when Apache attempts to recycle the child process (ap_graceful_stop_signalled in thread #1)
- Threads 4 and 9 appear to be running perl, and those are also the only threads that show anything interesting (at least to me)
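For anyone who wants to repeat this kind of triage on a longer dump, here is a minimal sketch in Python. It is not part of gdb or mod_perl; the names parse_threads and threads_in are made up for illustration, and the sample input is a shortened excerpt of the backtraces further down. It assumes the "[Switching to thread N ...]" and "#N 0x... in function (" layout shown in this post and simply lists which threads have mod_perl frames and which ones received a signal.

```python
import re

# Matches the line gdb prints when switching threads, e.g.
# "[Switching to thread 4 (process 2133)]"
THREAD_RE = re.compile(r"\[Switching to thread (\d+) \(")
# Matches a frame such as "#1  0x0808b855 in ap_mpm_pod_check ()".
# Frames shown as "?? ()" are deliberately not matched.
FRAME_RE = re.compile(r"#(\d+)\s+(?:0x[0-9a-f]+ in )?([\w@.]+) \(")

SAMPLE = """\
(gdb) btt 1
[Switching to thread 1 (process 2094)]#0 0xb7cb2401 in __read_nocancel () from /lib/tls/libpthread.so.0
#0  0xb7cb2401 in __read_nocancel () from /lib/tls/libpthread.so.0
#1  0x0808b855 in ap_mpm_pod_check ()
(gdb) btt 4
[Switching to thread 4 (process 2133)]#0 0xb77eb15a in modperl_mgv_as_string (my_perl=0x8662c58, symbol=0x8178190, p=0x8938438,
    package=0) at modperl_mgv.c:399
#4  0xb77feb63 in modperl_fixup_handler (r=0x8938470) at modperl_hooks.c:57
(gdb) btt 9
[Switching to thread 9 (process 2125)]#0 0xb7b99d51 in kill () from /lib/tls/libc.so.6
#1  0x0807c9bb in ap_fatal_signal_child_setup ()
#2  <signal handler called>
#3 0xb77eb15a in modperl_mgv_as_string (my_perl=0x85bc758, symbol=0x8178190, p=0x88ad7f8, package=0) at modperl_mgv.c:399
"""


def parse_threads(gdb_output):
    """Return {thread_id: {"frames": {frame_no: function}, "signalled": bool}}."""
    threads = {}
    current = None
    for line in gdb_output.splitlines():
        switch = THREAD_RE.search(line)
        if switch:
            current = threads.setdefault(
                int(switch.group(1)), {"frames": {}, "signalled": False}
            )
        if current is None:
            continue
        if "<signal handler called>" in line:
            current["signalled"] = True
        # Keying by frame number means the repeated frame #0 is stored once,
        # and several frames merged onto one line are all picked up.
        for number, function in FRAME_RE.findall(line):
            current["frames"][int(number)] = function
    return threads


def threads_in(threads, prefix):
    """Thread ids that have at least one frame whose function starts with prefix."""
    return sorted(
        tid
        for tid, info in threads.items()
        if any(name.startswith(prefix) for name in info["frames"].values())
    )


if __name__ == "__main__":
    parsed = parse_threads(SAMPLE)
    print("threads in mod_perl:", threads_in(parsed, "modperl_"))
    print("threads that took a signal:",
          sorted(t for t, i in parsed.items() if i["signalled"]))
```

Frames are keyed by number rather than appended to a list because gdb prints frame #0 twice after a thread switch, and a plain list would count it twice.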

As per the instructions in the mod_perl documentation, I analyzed the perl threads a bit more closely. For both threads, 'curinfo' prints '536870923Cannot access memory at address 0x4040004'.

If anyone has any ideas/thoughts on how to further debug the problem based on the backtraces, please let me know.

Backtraces:

(gdb) btt 1
[Switching to thread 1 (process 2094)]#0 0xb7cb2401 in __read_nocancel () from /lib/tls/libpthread.so.0
#0  0xb7cb2401 in __read_nocancel () from /lib/tls/libpthread.so.0
#1  0x0808b855 in ap_mpm_pod_check ()
#2  0x080894d5 in ap_graceful_stop_signalled ()
#3  0x08089656 in ap_graceful_stop_signalled ()
#4  0x08089715 in ap_graceful_stop_signalled ()
#5  0x0808a807 in ap_mpm_run ()
#6  0x0806232f in main ()
(gdb) btt 2
[Switching to thread 2 (process 2135)]#0 0xb7cb25fe in accept () from /lib/tls/libpthread.so.0
#0  0xb7cb25fe in accept () from /lib/tls/libpthread.so.0
#1  0xb7d0c5bd in apr_socket_accept () from /usr/lib/libapr-1.so.0
#2  0x0808b98c in unixd_accept ()
#3  0x08088b09 in ap_graceful_stop_signalled ()
#4  0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#5  0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#6  0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 3
[Switching to thread 3 (process 2134)]#0 0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#0  0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#1  0xb7d10145 in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#2  0xb7d0b6b3 in apr_socket_recv () from /usr/lib/libapr-1.so.0
#3  0x0807b756 in ap_lingering_close ()
#4  0x08089153 in ap_graceful_stop_signalled ()
#5  0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#6  0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#7  0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 4
[Switching to thread 4 (process 2133)]#0 0xb77eb15a in modperl_mgv_as_string (my_perl=0x8662c58, symbol=0x8178190, p=0x8938438,
    package=0) at modperl_mgv.c:399
399     modperl_mgv.c: No such file or directory.
        in modperl_mgv.c
#0  0xb77eb15a in modperl_mgv_as_string (my_perl=0x8662c58, symbol=0x8178190, p=0x8938438, package=0) at modperl_mgv.c:399
#1  0xb77df146 in modperl_callback (my_perl=0x8662c58, handler=0x8939ee8, p=0x8938438, r=0x8938470, s=0x80a88c8, args=0x870c29c)
    at modperl_callback.c:85
#2 0xb77dfb79 in modperl_callback_run_handlers (idx=5, type=4, r=0x8938470, c=0x0, s=0x80a88c8, pconf=0x0, plog=0x0, ptemp=0x0,
    run_mode=MP_HOOK_RUN_ALL) at modperl_callback.c:263
#3 0xb77e00aa in modperl_callback_per_dir (idx=5, r=0x8938470, run_mode=MP_HOOK_RUN_ALL) at modperl_callback.c:370
#4  0xb77feb63 in modperl_fixup_handler (r=0x8938470) at modperl_hooks.c:57
#5  0x0806ff29 in ap_run_fixups ()
#6  0x08084528 in ap_process_request ()
#7  0x080817de in ap_register_input_filter ()
#8  0x0807b507 in ap_run_process_connection ()
#9  0x0808914b in ap_graceful_stop_signalled ()
#10 0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#11 0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#12 0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 5
[Switching to thread 5 (process 2132)]#0 0xb7cafc01 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
#0  0xb7cafc01 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
#1  0xb7d053da in apr_thread_cond_wait () from /usr/lib/libapr-1.so.0
#2  0x0808b2b3 in ap_queue_pop ()
#3  0x08088fc5 in ap_graceful_stop_signalled ()
#4  0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#5  0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#6  0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 6
[Switching to thread 6 (process 2131)]#0 0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#0  0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#1  0xb7d10145 in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#2  0xb7d0b6b3 in apr_socket_recv () from /usr/lib/libapr-1.so.0
#3  0x0807b756 in ap_lingering_close ()
#4  0x08089153 in ap_graceful_stop_signalled ()
#5  0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#6  0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#7  0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 7
[Switching to thread 7 (process 2130)]#0 0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#0  0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#1  0xb7d10145 in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#2  0xb7d0b6b3 in apr_socket_recv () from /usr/lib/libapr-1.so.0
#3 0xb7ef1967 in apr_bucket_socket_create () from /usr/lib/libaprutil-1.so.0
#4  0xb7ef357a in apr_brigade_split_line () from /usr/lib/libaprutil-1.so.0
#5  0x0807407b in ap_core_input_filter ()
#6  0x080810ce in ap_register_input_filter ()
#7  0x08068c2e in ap_rgetline_core ()
#8  0x08069557 in ap_read_request ()
#9  0x08081768 in ap_register_input_filter ()
#10 0x0807b507 in ap_run_process_connection ()
#11 0x0808914b in ap_graceful_stop_signalled ()
#12 0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#13 0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#14 0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 8
[Switching to thread 8 (process 2129)]#0 0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#0  0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#1  0xb7d10145 in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#2  0xb7d0b247 in apr_socket_sendfile () from /usr/lib/libapr-1.so.0
#3  0x080734d7 in ap_core_output_filter ()
#4  0x08085418 in ap_http_header_filter ()
#5  0x08069d24 in ap_content_length_filter ()
#6  0x08086855 in ap_byterange_filter ()
#7  0xb781b120 in ?? () from /usr/lib/apache2/modules/mod_cache.so
#8  0xb1309398 in ?? ()
#9  0xb1312e08 in ?? ()
#10 0x00000007 in ?? ()
#11 0x00000000 in ?? ()
(gdb) btt 9
[Switching to thread 9 (process 2125)]#0 0xb7b99d51 in kill () from /lib/tls/libc.so.6
#0  0xb7b99d51 in kill () from /lib/tls/libc.so.6
#1  0x0807c9bb in ap_fatal_signal_child_setup ()
#2  <signal handler called>
#3  0xb77eb15a in modperl_mgv_as_string (my_perl=0x85bc758, symbol=0x8178190, p=0x88ad7f8, package=0) at modperl_mgv.c:399
#4  0xb77df146 in modperl_callback (my_perl=0x85bc758, handler=0x88af2a8, p=0x88ad7f8, r=0x88ad830, s=0x80a88c8, args=0x86eaf0c)
    at modperl_callback.c:85
#5 0xb77dfb79 in modperl_callback_run_handlers (idx=5, type=4, r=0x88ad830, c=0x0, s=0x80a88c8, pconf=0x0, plog=0x0, ptemp=0x0,
    run_mode=MP_HOOK_RUN_ALL) at modperl_callback.c:263
#6 0xb77e00aa in modperl_callback_per_dir (idx=5, r=0x88ad830, run_mode=MP_HOOK_RUN_ALL) at modperl_callback.c:370
#7  0xb77feb63 in modperl_fixup_handler (r=0x88ad830) at modperl_hooks.c:57
#8  0x0806ff29 in ap_run_fixups ()
#9  0x08084528 in ap_process_request ()
#10 0x080817de in ap_register_input_filter ()
#11 0x0807b507 in ap_run_process_connection ()
#12 0x0808914b in ap_graceful_stop_signalled ()
#13 0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#14 0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#15 0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 10
[Switching to thread 10 (process 2124)]#0 0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#0  0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#1  0xb7d10145 in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#2  0xb7d0b6b3 in apr_socket_recv () from /usr/lib/libapr-1.so.0
#3  0x0807b756 in ap_lingering_close ()
#4  0x08089153 in ap_graceful_stop_signalled ()
#5  0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#6  0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#7  0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 11
[Switching to thread 11 (process 2123)]#0 0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#0  0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#1  0xb7d10145 in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#2  0xb7d0b058 in apr_socket_sendv () from /usr/lib/libapr-1.so.0
#3  0x08073145 in ap_bucket_eoc_create ()
#4  0x080739d7 in ap_core_output_filter ()
#5  0x08069d24 in ap_content_length_filter ()
#6  0xb781b4fe in ?? () from /usr/lib/apache2/modules/mod_cache.so
#7  0x0890f298 in ?? ()
#8  0x08905298 in ?? ()
#9  0x08905298 in ?? ()
#10 0x08905298 in ?? ()
#11 0x00000000 in ?? ()
(gdb) btt 12
[Switching to thread 12 (process 2122)]#0 0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#0  0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#1  0xb7d10145 in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#2  0xb7d0b6b3 in apr_socket_recv () from /usr/lib/libapr-1.so.0
#3  0x0807b756 in ap_lingering_close ()
#4  0x08089153 in ap_graceful_stop_signalled ()
#5  0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#6  0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#7  0xb7c3c9ee in clone () from /lib/tls/libc.so.6
