Perrin Harkins wrote:
> Honestly, the person who has done the most work on debugging thread
> crashes is Torsten. His advice on how to debug it will be better than
> mine. It does seem like people usually solve them by using backtrace
> analysis though.
Getting back to this: I've now had time to install debug builds of
mod_perl and perl on the production server and to run backtraces on
a core dump. Unfortunately I can't run a debug build of Apache right
now, but hopefully that won't be needed.
Looking at the backtraces for each thread, there are a couple of things
that I think I can see:
- The segfault appears to happen when Apache attempts to recycle the
child process (ap_graceful_stop_signalled in thread #1)
- Threads 4 and 9 appear to be running perl, and those are also the only
threads that show anything interesting (at least to me)
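For anyone who wants to repeat the triage on their own core dump, here
is a minimal sketch of how threads could be picked out of saved gdb
output automatically. It assumes the output has the same shape as the
backtraces in this post ("[Switching to thread N ..." followed by "#n"
frame lines). The names parse_backtraces and flag_threads and the
SAMPLE excerpt are made up for illustration and are not part of gdb or
mod_perl.

```python
import re

# Header gdb prints when switching threads, e.g. "[Switching to thread 9 (process 2125)]"
THREAD_RE = re.compile(r"\[Switching to thread (\d+) ")
# A frame line: "#3 0xb77eb15a in modperl_mgv_as_string (" or "#2 <signal handler called>"
FRAME_RE = re.compile(r"#\d+\s+(?:0x[0-9a-fA-F]+ in )?(<[^>]+>|[^\s(]+)")

# Short excerpt copied from the backtraces in this post.
SAMPLE = """\
(gdb) btt 1
[Switching to thread 1 (process 2094)]#0 0xb7cb2401 in __read_nocancel
() from /lib/tls/libpthread.so.0
#0 0xb7cb2401 in __read_nocancel () from /lib/tls/libpthread.so.0
#1 0x0808b855 in ap_mpm_pod_check ()
(gdb) btt 9
[Switching to thread 9 (process 2125)]#0 0xb7b99d51 in kill () from
/lib/tls/libc.so.6
#0 0xb7b99d51 in kill () from /lib/tls/libc.so.6
#1 0x0807c9bb in ap_fatal_signal_child_setup ()
#2 <signal handler called>
#3 0xb77eb15a in modperl_mgv_as_string (my_perl=0x85bc758,
symbol=0x8178190, p=0x88ad7f8, package=0) at modperl_mgv.c:399
"""


def parse_backtraces(text):
    """Return {thread number: [frame names, innermost first]}."""
    threads = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = THREAD_RE.search(line)
        if header:
            # The header line repeats frame #0, which gdb prints again
            # on its own line, so the header itself is skipped.
            current = int(header.group(1))
            threads[current] = []
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            threads[current].append(frame.group(1))
    return threads


def flag_threads(threads):
    """Return {thread number: [tags]} for threads worth a closer look."""
    flagged = {}
    for tid, frames in threads.items():
        tags = []
        if any(name.startswith("modperl_") for name in frames):
            tags.append("mod_perl")
        if "<signal handler called>" in frames:
            tags.append("signal handler")
        if tags:
            flagged[tid] = tags
    return flagged


if __name__ == "__main__":
    for tid, tags in sorted(flag_threads(parse_backtraces(SAMPLE)).items()):
        print(f"thread {tid}: {', '.join(tags)}")
```

The "signal handler" tag is the useful one here: a thread whose stack
contains "<signal handler called>" is presumably the one that received
the fatal signal, and the frame right after it is where it was executing
at the time.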
As per the instructions in the mod_perl documentation, I analyzed the
perl threads a bit more closely. For both threads, 'curinfo' prints
'536870923Cannot access memory at address 0x4040004'.
If anyone has any ideas/thoughts on how to further debug the problem
based on the backtraces, please let me know.
Backtraces:
(gdb) btt 1
[Switching to thread 1 (process 2094)]#0 0xb7cb2401 in __read_nocancel
() from /lib/tls/libpthread.so.0
#0 0xb7cb2401 in __read_nocancel () from /lib/tls/libpthread.so.0
#1 0x0808b855 in ap_mpm_pod_check ()
#2 0x080894d5 in ap_graceful_stop_signalled ()
#3 0x08089656 in ap_graceful_stop_signalled ()
#4 0x08089715 in ap_graceful_stop_signalled ()
#5 0x0808a807 in ap_mpm_run ()
#6 0x0806232f in main ()
(gdb) btt 2
[Switching to thread 2 (process 2135)]#0 0xb7cb25fe in accept () from
/lib/tls/libpthread.so.0
#0 0xb7cb25fe in accept () from /lib/tls/libpthread.so.0
#1 0xb7d0c5bd in apr_socket_accept () from /usr/lib/libapr-1.so.0
#2 0x0808b98c in unixd_accept ()
#3 0x08088b09 in ap_graceful_stop_signalled ()
#4 0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#5 0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#6 0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 3
[Switching to thread 3 (process 2134)]#0 0xb7c32ef9 in poll () from
/lib/tls/libc.so.6
#0 0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#1 0xb7d10145 in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#2 0xb7d0b6b3 in apr_socket_recv () from /usr/lib/libapr-1.so.0
#3 0x0807b756 in ap_lingering_close ()
#4 0x08089153 in ap_graceful_stop_signalled ()
#5 0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#6 0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#7 0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 4
[Switching to thread 4 (process 2133)]#0 0xb77eb15a in
modperl_mgv_as_string (my_perl=0x8662c58, symbol=0x8178190, p=0x8938438,
package=0) at modperl_mgv.c:399
399 modperl_mgv.c: No such file or directory.
in modperl_mgv.c
#0 0xb77eb15a in modperl_mgv_as_string (my_perl=0x8662c58,
symbol=0x8178190, p=0x8938438, package=0) at modperl_mgv.c:399
#1 0xb77df146 in modperl_callback (my_perl=0x8662c58,
handler=0x8939ee8, p=0x8938438, r=0x8938470, s=0x80a88c8, args=0x870c29c)
at modperl_callback.c:85
#2 0xb77dfb79 in modperl_callback_run_handlers (idx=5, type=4,
r=0x8938470, c=0x0, s=0x80a88c8, pconf=0x0, plog=0x0, ptemp=0x0,
run_mode=MP_HOOK_RUN_ALL) at modperl_callback.c:263
#3 0xb77e00aa in modperl_callback_per_dir (idx=5, r=0x8938470,
run_mode=MP_HOOK_RUN_ALL) at modperl_callback.c:370
#4 0xb77feb63 in modperl_fixup_handler (r=0x8938470) at modperl_hooks.c:57
#5 0x0806ff29 in ap_run_fixups ()
#6 0x08084528 in ap_process_request ()
#7 0x080817de in ap_register_input_filter ()
#8 0x0807b507 in ap_run_process_connection ()
#9 0x0808914b in ap_graceful_stop_signalled ()
#10 0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#11 0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#12 0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 5
[Switching to thread 5 (process 2132)]#0 0xb7cafc01 in
pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
#0 0xb7cafc01 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib/tls/libpthread.so.0
#1 0xb7d053da in apr_thread_cond_wait () from /usr/lib/libapr-1.so.0
#2 0x0808b2b3 in ap_queue_pop ()
#3 0x08088fc5 in ap_graceful_stop_signalled ()
#4 0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#5 0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#6 0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 6
[Switching to thread 6 (process 2131)]#0 0xb7c32ef9 in poll () from
/lib/tls/libc.so.6
#0 0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#1 0xb7d10145 in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#2 0xb7d0b6b3 in apr_socket_recv () from /usr/lib/libapr-1.so.0
#3 0x0807b756 in ap_lingering_close ()
#4 0x08089153 in ap_graceful_stop_signalled ()
#5 0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#6 0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#7 0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 7
[Switching to thread 7 (process 2130)]#0 0xb7c32ef9 in poll () from
/lib/tls/libc.so.6
#0 0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#1 0xb7d10145 in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#2 0xb7d0b6b3 in apr_socket_recv () from /usr/lib/libapr-1.so.0
#3 0xb7ef1967 in apr_bucket_socket_create () from
/usr/lib/libaprutil-1.so.0
#4 0xb7ef357a in apr_brigade_split_line () from /usr/lib/libaprutil-1.so.0
#5 0x0807407b in ap_core_input_filter ()
#6 0x080810ce in ap_register_input_filter ()
#7 0x08068c2e in ap_rgetline_core ()
#8 0x08069557 in ap_read_request ()
#9 0x08081768 in ap_register_input_filter ()
#10 0x0807b507 in ap_run_process_connection ()
#11 0x0808914b in ap_graceful_stop_signalled ()
#12 0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#13 0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#14 0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 8
[Switching to thread 8 (process 2129)]#0 0xb7c32ef9 in poll () from
/lib/tls/libc.so.6
#0 0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#1 0xb7d10145 in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#2 0xb7d0b247 in apr_socket_sendfile () from /usr/lib/libapr-1.so.0
#3 0x080734d7 in ap_core_output_filter ()
#4 0x08085418 in ap_http_header_filter ()
#5 0x08069d24 in ap_content_length_filter ()
#6 0x08086855 in ap_byterange_filter ()
#7 0xb781b120 in ?? () from /usr/lib/apache2/modules/mod_cache.so
#8 0xb1309398 in ?? ()
#9 0xb1312e08 in ?? ()
#10 0x00000007 in ?? ()
#11 0x00000000 in ?? ()
(gdb) btt 9
[Switching to thread 9 (process 2125)]#0 0xb7b99d51 in kill () from
/lib/tls/libc.so.6
#0 0xb7b99d51 in kill () from /lib/tls/libc.so.6
#1 0x0807c9bb in ap_fatal_signal_child_setup ()
#2 <signal handler called>
#3 0xb77eb15a in modperl_mgv_as_string (my_perl=0x85bc758,
symbol=0x8178190, p=0x88ad7f8, package=0) at modperl_mgv.c:399
#4 0xb77df146 in modperl_callback (my_perl=0x85bc758,
handler=0x88af2a8, p=0x88ad7f8, r=0x88ad830, s=0x80a88c8, args=0x86eaf0c)
at modperl_callback.c:85
#5 0xb77dfb79 in modperl_callback_run_handlers (idx=5, type=4,
r=0x88ad830, c=0x0, s=0x80a88c8, pconf=0x0, plog=0x0, ptemp=0x0,
run_mode=MP_HOOK_RUN_ALL) at modperl_callback.c:263
#6 0xb77e00aa in modperl_callback_per_dir (idx=5, r=0x88ad830,
run_mode=MP_HOOK_RUN_ALL) at modperl_callback.c:370
#7 0xb77feb63 in modperl_fixup_handler (r=0x88ad830) at modperl_hooks.c:57
#8 0x0806ff29 in ap_run_fixups ()
#9 0x08084528 in ap_process_request ()
#10 0x080817de in ap_register_input_filter ()
#11 0x0807b507 in ap_run_process_connection ()
#12 0x0808914b in ap_graceful_stop_signalled ()
#13 0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#14 0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#15 0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 10
[Switching to thread 10 (process 2124)]#0 0xb7c32ef9 in poll () from
/lib/tls/libc.so.6
#0 0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#1 0xb7d10145 in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#2 0xb7d0b6b3 in apr_socket_recv () from /usr/lib/libapr-1.so.0
#3 0x0807b756 in ap_lingering_close ()
#4 0x08089153 in ap_graceful_stop_signalled ()
#5 0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#6 0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#7 0xb7c3c9ee in clone () from /lib/tls/libc.so.6
(gdb) btt 11
[Switching to thread 11 (process 2123)]#0 0xb7c32ef9 in poll () from
/lib/tls/libc.so.6
#0 0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#1 0xb7d10145 in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#2 0xb7d0b058 in apr_socket_sendv () from /usr/lib/libapr-1.so.0
#3 0x08073145 in ap_bucket_eoc_create ()
#4 0x080739d7 in ap_core_output_filter ()
#5 0x08069d24 in ap_content_length_filter ()
#6 0xb781b4fe in ?? () from /usr/lib/apache2/modules/mod_cache.so
#7 0x0890f298 in ?? ()
#8 0x08905298 in ?? ()
#9 0x08905298 in ?? ()
#10 0x08905298 in ?? ()
#11 0x00000000 in ?? ()
(gdb) btt 12
[Switching to thread 12 (process 2122)]#0 0xb7c32ef9 in poll () from
/lib/tls/libc.so.6
#0 0xb7c32ef9 in poll () from /lib/tls/libc.so.6
#1 0xb7d10145 in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#2 0xb7d0b6b3 in apr_socket_recv () from /usr/lib/libapr-1.so.0
#3 0x0807b756 in ap_lingering_close ()
#4 0x08089153 in ap_graceful_stop_signalled ()
#5 0xb7d10316 in apr_proc_detach () from /usr/lib/libapr-1.so.0
#6 0xb7cad0bd in start_thread () from /lib/tls/libpthread.so.0
#7 0xb7c3c9ee in clone () from /lib/tls/libc.so.6