I've been invited (http://twitter.com/dsp_/status/5350037022) to post my question here, so here goes.
I've got 2009.06 running on an EC2 instance. From time to time, my Apache processes just freeze.

This sucks. Sucks badly enough that my client (insightcruises.com) is saying "no more EC2 - no more opensolaris". I had been hosting him for five years on OpenBSD, so he wants to move back. Sucks.

After various attempts at writing a deadman timer that would try to hit the site every 15 minutes to see if it was broken, I finally managed to capture this "pstack `pgrep http`" output before needing (immediately) to "pkill http" to wake it back up. If anyone looking at this has any clues about what I can try next to get it unstuck, please let me know.

==================================================

Looks like most kids are in the mutex wait... and 1141 and 29743 are the active processes. Not sure which one was talking to port 80.

1140:   /usr/apache2/2.2/bin/httpd -k start
 cf591337 fcntl    (12, 23, cf6f5330)
 cf57d3cc fcntl    (12, 23, cf6f5330, cf6d5f02) + 104
 cf6d5f23 proc_mutex_fcntl_acquire (9847fd0, 0, 2, 0, 98b7930, 1) + 2f
 cf6d6265 apr_proc_mutex_lock (9847fd0, 2, 0, 8091666) + 15
 08091537 child_main (3, 8090fac, 8047c38, 8091801) + 24f
 08091846 make_child (80c8128, 3, 8047c70, 80c6230) + 86
 08091d64 ap_mpm_run (80c6230, 80f42e8, 80c8128, 8070831) + 410
 0807083e main     (3, 8047de0, 8047df0, 8047d9c) + 812
 0806f9fd _start   (3, 8047e90, 8047eab, 8047eae, 0, 8047eb4) + 7d
1106:   /usr/apache2/2.2/bin/httpd -k start
 cf591337 fcntl    (12, 23, cf6f5330)
 cf57d3cc fcntl    (12, 23, cf6f5330, cf6d5f02) + 104
 cf6d5f23 proc_mutex_fcntl_acquire (9847fd0, 0, 2, 0, 98b7930, 1) + 2f
 cf6d6265 apr_proc_mutex_lock (9847fd0, 2, 0, 8091666) + 15
 08091537 child_main (2, 8090fac, 8047c38, 8091801) + 24f
 08091846 make_child (80c8128, 2, 8047c70, 80c6230) + 86
 08091d64 ap_mpm_run (80c6230, 80f42e8, 80c8128, 8070831) + 410
 0807083e main     (3, 8047de0, 8047df0, 8047d9c) + 812
 0806f9fd _start   (3, 8047e90, 8047eab, 8047eae, 0, 8047eb4) + 7d
815:    /usr/apache2/2.2/bin/httpd -k start
 cf591337 fcntl    (12, 23, cf6f5330)
 cf57d3cc fcntl    (12, 23, cf6f5330, cf6d5f02) + 104
 cf6d5f23 proc_mutex_fcntl_acquire (9847fd0, 0, 2, 0, 98b7930, 1) + 2f
 cf6d6265 apr_proc_mutex_lock (9847fd0, 2, 0, 8091666) + 15
 08091537 child_main (6, 8090fac, 8047c38, 8091801) + 24f
 08091846 make_child (80c8128, 6, 8047c70, 80c6230) + 86
 08091d64 ap_mpm_run (80c6230, 80f42e8, 80c8128, 8070831) + 410
 0807083e main     (3, 8047de0, 8047df0, 8047d9c) + 812
 0806f9fd _start   (3, 8047e90, 8047eab, 8047eae, 0, 8047eb4) + 7d
1141:   /usr/apache2/2.2/bin/httpd -k start
 cf591027 portfs   (6, 13, 98aba98, 2, 1, 8047b78)
 cf6dd302 apr_pollset_poll (98aba50, 989680, 0, 8047bd4, 8047bd8, 2) + 126
 08091611 child_main (4, 8090fac, 8047c38, 8091801) + 329
 08091846 make_child (80c8128, 4, 8047c70, 80c6230) + 86
 08091d64 ap_mpm_run (80c6230, 80f42e8, 80c8128, 8070831) + 410
 0807083e main     (3, 8047de0, 8047df0, 8047d9c) + 812
 0806f9fd _start   (3, 8047e90, 8047eab, 8047eae, 0, 8047eb4) + 7d
1108:   /usr/apache2/2.2/bin/httpd -k start
 cf591337 fcntl    (12, 23, cf6f5330)
 cf57d3cc fcntl    (12, 23, cf6f5330, cf6d5f02) + 104
 cf6d5f23 proc_mutex_fcntl_acquire (9847fd0, 0, 2, 0, 98b7930, 1) + 2f
 cf6d6265 apr_proc_mutex_lock (9847fd0, 2, 0, 8091666) + 15
 08091537 child_main (0, 8090fac, 8047c38, 8091801) + 24f
 08091846 make_child (80c8128, 0, 8047c70, 80c6230) + 86
 08091d64 ap_mpm_run (80c6230, 80f42e8, 80c8128, 8070831) + 410
 0807083e main     (3, 8047de0, 8047df0, 8047d9c) + 812
 0806f9fd _start   (3, 8047e90, 8047eab, 8047eae, 0, 8047eb4) + 7d
1039:   /usr/apache2/2.2/bin/httpd -k start
 cf591337 fcntl    (12, 23, cf6f5330)
 cf57d3cc fcntl    (12, 23, cf6f5330, cf6d5f02) + 104
 cf6d5f23 proc_mutex_fcntl_acquire (9847fd0, 0, 2, 0, 98b7930, 1) + 2f
 cf6d6265 apr_proc_mutex_lock (9847fd0, 2, 0, 8091666) + 15
 08091537 child_main (1, 8090fac, 8047c38, 8091801) + 24f
 08091846 make_child (80c8128, 1, 8047c70, 80c6230) + 86
 08091d64 ap_mpm_run (80c6230, 80f42e8, 80c8128, 8070831) + 410
 0807083e main     (3, 8047de0, 8047df0, 8047d9c) + 812
 0806f9fd _start   (3, 8047e90, 8047eab, 8047eae, 0, 8047eb4) + 7d
29743:  /usr/apache2/2.2/bin/httpd -k start
 cf591687 pollsys  (8047b40, 0, 8047bb8, 0)
 cf53ce01 pselect  (0, 0, 0, 0, 8047bb8, 0) + 199
 cf53d1d6 select   (0, 0, 0, 0, 8047bf8, 0) + 78
 cf6e2cad apr_sleep (f4240, 0, 9848b20, 9) + 51
 08088783 ap_wait_or_timeout (8047c68, 8047c6c, 8047c70, 80c6230) + 5f
 08091b9d ap_mpm_run (80c6230, 80f42e8, 80c8128, 8070831) + 249
 0807083e main     (3, 8047de0, 8047df0, 8047d9c) + 812
 0806f9fd _start   (3, 8047e90, 8047eab, 8047eae, 0, 8047eb4) + 7d

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn at stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
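
P.S. For anyone who wants to reproduce the capture: the deadman timer described above could be sketched in Python roughly as follows. This is only an illustration, not the script actually in use. SITE_URL, TIMEOUT, the log path, and the function names are all hypothetical; the only things taken from the post are the 15-minute interval and the pgrep/pstack/pkill commands. It saves the stacks first and only then kills the processes, so the evidence is not lost.

```python
import http.client
import subprocess
import time
import urllib.request

SITE_URL = "http://localhost/"   # hypothetical; substitute the real site
CHECK_INTERVAL = 15 * 60         # seconds; the 15-minute cadence from the post
TIMEOUT = 30                     # assumed: seconds before a probe counts as hung


def site_is_up(url, timeout=TIMEOUT, opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        # Covers URLError/HTTPError, socket timeouts, and malformed replies.
        return False


def capture_stacks(pattern="http", run=subprocess.run):
    """Return pstack output for every process that pgrep matches."""
    pids = run(["pgrep", pattern], capture_output=True, text=True).stdout.split()
    if not pids:
        return ""
    return run(["pstack", *pids], capture_output=True, text=True).stdout


def check_once(url=SITE_URL, log_path="/var/tmp/httpd-pstack.log",
               probe=site_is_up, run=subprocess.run):
    """Probe once; on failure, log the stacks and then pkill the processes."""
    if probe(url):
        return True
    stacks = capture_stacks(run=run)
    with open(log_path, "a") as log:
        log.write(time.strftime("%Y-%m-%d %H:%M:%S") + "\n" + stacks + "\n")
    run(["pkill", "http"])  # same wake-up step as in the post
    return False


def watch():
    """Run the check forever at the configured interval."""
    while True:
        check_once()
        time.sleep(CHECK_INTERVAL)
```

Calling watch() from a long-running session, or check_once() from cron, would give the same effect. The probe and run parameters exist only so the logic can be exercised without a live server.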
