Mongrel 0.3.13.4
Mongrel Cluster 0.2.0
Ruby 1.8.4
Rails 1.1.6
Apache 2.2.2
RHEL 4
The symptom is that we are getting frequent application 500 errors. Monitoring the mongrel cluster shows some of the servers in Err status at any given moment. We run 40 mongrel instances in the cluster and only a few of them are in Err status at a time. Does a mongrel instance in Err status return an application 500 error? I have not found documentation on the cluster monitor.
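One way to see whether particular instances are the ones returning 500s, without restarting anything, would be to request a page from each mongrel directly and bypass Apache. This is only a minimal sketch: the host, the first port, and the names fetch_status and survey are hypothetical and not taken from our setup or from mongrel_cluster.

```ruby
require 'net/http'

# Hypothetical values: adjust to the cluster's actual address and port range.
HOST       = '127.0.0.1'
FIRST_PORT = 8000
INSTANCES  = 40

# Return the HTTP status code of one instance as a string,
# or a short error label if the instance cannot be reached.
def fetch_status(host, port, path = '/')
  http = Net::HTTP.new(host, port)
  http.open_timeout = 2
  http.read_timeout = 5
  http.get(path).code
rescue StandardError => e
  "ERR(#{e.class})"
end

# Group ports by the status each one returned.
# The fetcher can be replaced with a stub for a dry run.
def survey(ports, fetcher = nil)
  fetcher ||= lambda { |port| fetch_status(HOST, port) }
  results = Hash.new { |hash, key| hash[key] = [] }
  ports.each { |port| results[fetcher.call(port)] << port }
  results
end

# Example use (commented out so that loading the file makes no requests):
# ports = (FIRST_PORT...(FIRST_PORT + INSTANCES)).to_a
# survey(ports).each { |status, list| puts "#{status}: #{list.join(', ')}" }
```

If the ports that answer with 500 line up with the instances shown in Err status, that would suggest the two are related; if they do not, the errors are probably coming from somewhere else in the chain.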
The Apache config setup is standard, taken from Coda Hale's blog.
The Ruby on Rails code is straightforward. Very few gems are used, and the memory footprint is only 35MB for each mongrel instance. I have not been able to find any error messages in the production log. What types of things would be major no-no's? I don't use sessions. The web application mainly uses AJAX to coordinate messaging between a server and the client.
I have not been able to put the servers into debug mode for the past couple of days since they are running in production at our colo, which is the only place we've seen these application errors occur. Restarting them causes chats to get dropped, as well as searches our users are performing.
The Linux kernel's TCP/IP settings have been tweaked to the following, although we were noticing the application 500 errors before these tweaks, and removing these tweaks did not seem to make a difference:
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_window_scaling = 0
net.ipv4.tcp_dsack = 0
net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0
net.ipv4.icmp_echo_ignore_broadcasts = 0
net.ipv4.inet_peer_threshold = 16536
net.ipv4.ipfrag_high_thresh = 5000000
net.ipv4.ipfrag_low_thresh = 3000000
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.lo.rp_filter = 0
net.ipv4.conf.eth0.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.core.netdev_max_backlog = 2500
net.core.optmem_max = 102400
net.core.rmem_default = 262141
net.core.rmem_max = 262141
net.core.wmem_default = 262141
net.core.wmem_max = 262141
net.ipv4.route.gc_interval = 5
net.ipv4.route.gc_elasticity = 3
net.ipv4.route.gc_min_interval = 1
net.ipv4.route.gc_timeout = 30
net.ipv4.route.max_size = 65536
net.ipv4.route.gc_thresh = 256
fs.file-max = 32768
net.ipv4.ip_local_port_range = 1024 65535
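To confirm that the settings above are actually in effect on a given server, the list could be compared with the output of `sysctl -a`. A rough sketch follows; parse_sysctl and sysctl_diff are hypothetical helper names, and the file name in the usage comment is an assumption.

```ruby
# Parse "key = value" lines (sysctl.conf or `sysctl -a` style) into a hash.
# Comments and blank lines are skipped; runs of whitespace in values are collapsed.
def parse_sysctl(text)
  settings = {}
  text.each_line do |line|
    line = line.sub(/#.*/, '').strip
    next if line.empty?
    key, value = line.split('=', 2)
    next if value.nil?
    settings[key.strip] = value.strip.split.join(' ')
  end
  settings
end

# Return { key => [expected, actual] } for every expected key
# whose live value differs or is missing.
def sysctl_diff(expected, actual)
  diff = {}
  expected.each do |key, want|
    have = actual[key]
    diff[key] = [want, have] unless want == have
  end
  diff
end

# Example use (file name is an assumption):
# expected = parse_sysctl(File.read('tweaks.conf'))
# actual   = parse_sysctl(`sysctl -a 2>/dev/null`)
# sysctl_diff(expected, actual).each { |k, (w, h)| puts "#{k}: want #{w}, have #{h.inspect}" }
```

Running this on both the server that misbehaves and the one we switch to would show whether the two differ in any of these values.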
Switching servers seems to resolve the problem for a day or so.
On 10/30/06, Zed A. Shaw <[EMAIL PROTECTED]> wrote:
On Mon, 30 Oct 2006 10:02:29 -0500
"Jared Brown" <[EMAIL PROTECTED]> wrote:
> Configuration:
>
> (2) Dual Core Opterons
> 8GB RAM
> Apache used to balance 40 mongrel instances
>
> We receive Application 500 Errors. Nothing suspect appears in the log, so we
> are at a loss as to what to do next.
>
> Any advice would be welcome and/or an explanation of what types of things
> cause Application 500 Errors in mongrel.
Jared, you write software right? What would you do if someone came running into your office babbling about some kind of bug, but couldn't tell you important information you needed to know to fix the bug?
As a programmer I expect other programmers to treat me as they want to be treated. If you ask a question on the list also report: the versions of your software, how you're using it, if you're doing anything odd, operating systems used, etc.
C'mon, you know how to do this right.
--
Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu
http://www.zedshaw.com/
http://safari.oreilly.com/0321483502 -- The Mongrel Book
http://mongrel.rubyforge.org/
http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.
_______________________________________________
Mongrel-users mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/mongrel-users
