Sure. Note that I have done no real investigation of my own yet, though I
certainly plan to. Monit makes me a bit lazy...
Here's my existing Monit rule:
check process mongrel-8001 with pidfile /var/www/apps/myapp/current/log/mongrel.8001.pid
  start program = "/usr/bin/ruby /usr/bin/mongrel_rails start -d -e production -p 8001 -a 127.0.0.1 -l /var/www/apps/myapp/shared/log -P log/mongrel.8001.pid -c /var/www/apps/myapp/current -B --user myuser --group mygroup"
  stop program = "/usr/bin/ruby /usr/bin/mongrel_rails stop -P /var/www/apps/myapp/shared/log/mongrel.8001.pid"
  if totalmem > 100.0 MB for 5 cycles then restart
  if failed port 8001 protocol http
    with timeout 10 seconds
    then restart
  group mongrel
I'll add something like: if cpu usage > 99% for 5 cycles then restart
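As a rough, untested sketch, the tests for the 8001 instance would then read something like the following. The CPU line is the only addition. It assumes Monit treats "usage" as an ignorable noise word, as its manual describes, and the 99% / 5 cycles thresholds are only the guess from above, not tuned values:

```
check process mongrel-8001 with pidfile /var/www/apps/myapp/current/log/mongrel.8001.pid
  # start program / stop program lines stay exactly as in the existing rule
  if totalmem > 100.0 MB for 5 cycles then restart
  # proposed addition: restart when the process stays pegged on CPU
  if cpu usage > 99% for 5 cycles then restart
  if failed port 8001 protocol http
    with timeout 10 seconds
    then restart
  group mongrel
```

Any restart triggered by the CPU test should then show up in the Monit log separately from the protocol-test failures.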
Here's a snippet from today's Monit log. Note that there was no load
on the application at all at 2am or 6am.
[EDT Sep 26 02:10:06] error : HTTP: error receiving data -- Resource temporarily unavailable
[EDT Sep 26 02:10:06] error : 'mongrel-8002' failed protocol test [HTTP] at INET[localhost:8002] via TCP
[EDT Sep 26 02:10:06] info : 'mongrel-8002' trying to restart
[EDT Sep 26 02:10:06] info : 'mongrel-8002' start: /usr/bin/ruby
[EDT Sep 26 02:12:10] info : 'mongrel-8002' connection passed to INET[localhost:8002] via TCP
[EDT Sep 26 06:50:59] error : HTTP: error receiving data -- Resource temporarily unavailable
[EDT Sep 26 06:50:59] error : 'mongrel-8001' failed protocol test [HTTP] at INET[localhost:8001] via TCP
[EDT Sep 26 06:50:59] info : 'mongrel-8001' trying to restart
[EDT Sep 26 06:50:59] info : 'mongrel-8001' start: /usr/bin/ruby
[EDT Sep 26 06:53:03] info : 'mongrel-8001' connection passed to INET[localhost:8001] via TCP
[EDT Sep 26 15:08:50] error : 'mongrel-8002' process is not running
[EDT Sep 26 15:08:50] info : 'mongrel-8002' trying to restart
[EDT Sep 26 15:08:50] info : 'mongrel-8002' start: /usr/bin/ruby
[EDT Sep 26 15:10:58] info : 'mongrel-8002' process is running with pid 31565
I'm running RedHat EL4 (Linux eis3 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:30:39 EST 2005 i686 i686 i386 GNU/Linux)
and Mongrel 0.3.13.3 behind Apache 2.2 with mod_proxy.
Erik
On Sep 26, 2006, at 1:35 PM, Zed A. Shaw wrote:
> On Tue, 26 Sep 2006 10:32:20 -0400
> Erik Morton <[EMAIL PROTECTED]> wrote:
>
>> I have a very similar stack to you and I noticed Mongrels dying once
>> or twice a day. Now I'm using Monit to watch each individual Mongrel
>> in the cluster and I've noticed that each Mongrel gets restarted once
>> a day on average. I haven't got around to figure out the exact cause
>> yet, but with Monit there is always a full cluster available.
>>
>
> Can you turn on CPU usage monitoring with Monit and tell me if
> Monit has to restart mongrel due to CPU usage? Thanks.
>
> --
> Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu
> http://www.zedshaw.com/
> http://mongrel.rubyforge.org/
> http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.
> _______________________________________________
> Mongrel-users mailing list
> [email protected]
> http://rubyforge.org/mailman/listinfo/mongrel-users
>