System:
=======
Red Hat Enterprise Linux WS release 3 (Taroon Update 8)
RT 3.6.1
Apache v2.0.59
Perl 5.8.7
mod_fcgi-2.4.2
Postgres 8.1.4

Approximately 80,000 tickets.

Problem:
========
RT/Apache suddenly becomes unavailable/hangs (normally once a day) and requires an Apache restart before RT works again.

We are not sure what causes the problem. If others have had similar problems, we would be glad to hear about it!

List of processes running and load on server:
=============================================
# ps aux | grep apache
root     22066  0.0  0.0  7964 3360 ?        S    Feb01   0:08 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   13898  0.0  0.0  7964 3320 ?        S    03:59   0:00 /local/opt/apache2/bin/fcgi- -k start -DSSL
nobody   21699  0.0  0.0  8216 3832 ?        S    14:02   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22628  0.0  0.0  8216 3840 ?        S    14:14   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22648  0.0  0.0  8216 3840 ?        S    14:15   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22650  0.0  0.0  8216 3820 ?        S    14:15   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22939  0.0  0.0  8252 3756 ?        S    14:18   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22941  0.0  0.0  8216 3848 ?        S    14:18   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22945  0.0  0.0  8216 3804 ?        S    14:18   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22953  0.0  0.0  8216 3756 ?        S    14:18   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22955  0.0  0.0  8216 3796 ?        S    14:18   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22959  0.0  0.0  8216 3812 ?        S    14:19   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22961  0.0  0.0  8216 3800 ?        S    14:19   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22962  0.0  0.0  8216 3788 ?        S    14:19   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22965  0.0  0.0  8236 3804 ?        S    14:19   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22966  0.0  0.0  8216 3788 ?        S    14:19   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22967  0.0  0.0  8216 3812 ?        S    14:19   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23217  0.0  0.0  8228 3792 ?        S    14:21   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23218  0.0  0.0  8228 3744 ?        S    14:21   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23219  0.0  0.0  8244 3740 ?        S    14:21   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23224  0.0  0.0  8232 3768 ?        S    14:21   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23225  0.0  0.0  8216 3752 ?        S    14:21   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23230  0.0  0.0  8228 3776 ?        S    14:22   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23240  0.0  0.0  8220 3780 ?        S    14:22   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23241  0.0  0.0  8220 3740 ?        S    14:22   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23242  0.0  0.0  8248 3728 ?        S    14:22   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23250  0.0  0.0  8216 3732 ?        S    14:22   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23254  0.0  0.0  8216 3744 ?        S    14:22   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23255  0.0  0.0  8216 3732 ?        S    14:22   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23286  0.0  0.0  8216 3772 ?        S    14:22   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23290  0.0  0.0  8216 3760 ?        S    14:23   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23292  0.0  0.0  8248 3724 ?        S    14:23   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23294  0.0  0.0  8216 3764 ?        S    14:23   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23299  0.0  0.0  8108 3696 ?        S    14:23   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23326  0.0  0.0  8108 3672 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23327  0.0  0.0  8108 3708 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23328  0.0  0.0  8216 3696 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23336  0.1  0.0  8248 3744 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23337  0.0  0.0  8108 3692 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23338  0.0  0.0  8108 3680 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23339  0.0  0.0  8108 3692 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23352  0.0  0.0  8236 3712 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23353  0.0  0.0  8236 3720 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23354  0.0  0.0  8236 3712 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23355  0.0  0.0  8236 3712 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23356  0.0  0.0  8236 3716 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23358  0.0  0.0  8236 3716 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23557  0.1  0.0  8108 3708 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23558  0.0  0.0  8108 3656 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23559  0.0  0.0  8108 3656 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23560  0.0  0.0  8108 3692 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23561  0.0  0.0  8108 3672 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23562  0.0  0.0  8108 3660 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23563  0.0  0.0  8108 3660 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
root     23569  0.0  0.0  1616  468 pts/0    S    14:26   0:00 grep apache



# uptime
 14:26:08  up 45 days, 22:31,  2 users,  load average: 8.14, 8.23, 8.01

It normally spawns 5 Apache processes when starting. Here there are unusually many processes.
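
To see when the process count starts to climb, the `ps aux` output can be summarized by a small script. The following is only a minimal sketch in Python, not something from the original setup: the function names, the `bin/httpd` match string and the sample lines (a few rows copied from the listing above) are assumptions chosen for illustration.

```python
from collections import Counter

EXPECTED_CHILDREN = 5  # baseline from the post: about 5 processes at startup


def parse_ps_aux(text):
    """Parse `ps aux` lines into dicts; the header and short lines are skipped."""
    procs = []
    for line in text.splitlines():
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)  # COMMAND may itself contain spaces
        if len(fields) < 11 or not fields[1].isdigit():
            continue
        procs.append({
            "user": fields[0],
            "pid": int(fields[1]),
            "rss_kb": int(fields[5]),
            "start": fields[8],
            "command": fields[10],
        })
    return procs


def summarize_httpd(procs, needle="bin/httpd"):
    """Return the number of httpd processes and a count per START value."""
    httpd = [p for p in procs if needle in p["command"]]
    return len(httpd), Counter(p["start"] for p in httpd)


# A few rows copied from the listing above, for illustration only.
SAMPLE = """\
root     22066  0.0  0.0  7964 3360 ?        S    Feb01   0:08 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   22628  0.0  0.0  8216 3840 ?        S    14:14   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23557  0.1  0.0  8108 3708 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
nobody   23558  0.0  0.0  8108 3656 ?        S    14:25   0:00 /local/opt/apache2/bin/httpd -k start -DSSL
root     23569  0.0  0.0  1616  468 pts/0    S    14:26   0:00 grep apache
"""

if __name__ == "__main__":
    count, by_start = summarize_httpd(parse_ps_aux(SAMPLE))
    print(f"{count} httpd processes (baseline {EXPECTED_CHILDREN})")
    for start, n in sorted(by_start.items()):
        print(f"  started {start}: {n}")
```

Run from cron every minute against live `ps aux` output (for example via `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout`), a script like this would show at what time the children start piling up, which can then be compared with the Apache and RT logs for the same minute.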

Apache keeps logging messages as before, and even rt.log keeps logging as if nothing has happened.

Apache is normally restarted once a night because of a memory leak that Mason/Perl/FastCGI is responsible for in some strange way. But this should not be the problem here.

I noticed that there was a mail loop caused by a spam message, which looped in the same time frame as the server suddenly stopped. But I cannot establish any connection between the two problems. I cannot find anything in the logs that points directly to problems or alerts with Apache. It just hangs and needs a restart.

How can I debug this and find out what's wrong? Is there some kind of loosely specified search in RT that can cause a hang (a search bug) and that may have been fixed in 3.6.3? The release of 3.6.3 came quite soon after 3.6.2.
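
One possible starting point for debugging is Apache's mod_status, whose machine-readable output (`server-status?auto`) reports busy and idle workers. The sketch below is hedged: it assumes mod_status is enabled on this server, the URL `http://localhost/server-status?auto` is a guess, and the sample text and the helper names are hypothetical values for illustration, not output from this machine.

```python
import urllib.request

# Assumption: mod_status is enabled and reachable at this URL.
STATUS_URL = "http://localhost/server-status?auto"


def fetch_status(url=STATUS_URL, timeout=5):
    """Fetch the status page; a timeout here is itself a sign of a hang."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("ascii", errors="replace")


def parse_status_auto(text):
    """Parse `Key: value` lines from the ?auto output into a dict of strings."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            status[key.strip()] = value.strip()
    return status


def looks_saturated(status, min_idle=1):
    """True when workers are busy and fewer than `min_idle` are idle."""
    busy = int(status.get("BusyWorkers", 0))
    idle = int(status.get("IdleWorkers", 0))
    return busy > 0 and idle < min_idle


# Hypothetical sample, not taken from the server in this post.
SAMPLE = """\
Total Accesses: 1234
BusyWorkers: 50
IdleWorkers: 0
Scoreboard: WWWWWWWWWW....
"""

if __name__ == "__main__":
    status = parse_status_auto(SAMPLE)  # swap in fetch_status() on a live server
    print("busy:", status.get("BusyWorkers"), "idle:", status.get("IdleWorkers"))
    print("saturated:", looks_saturated(status))
```

If every worker shows as busy while the CPU columns in `ps` stay at 0.0, the children are probably waiting on something else (FastCGI or the database, for instance) rather than doing work, which would narrow down where to look next.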

Sincerely,
Tomas

--
________________________________________________________________________
Tomas A. P. Olaj, email: [EMAIL PROTECTED], web: folk.uio.no/tomaso
 University of Oslo / USIT (Center for Information Technology Services)
   System- and Application Management / Applications Management Group
_______________________________________________
http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users

