Thanks Matt, I checked it out; it's definitely the same PID, so the process is not getting restarted.
However... There's a new development here. I found that someone using Monit was having a similar problem, and that it got fixed when they changed the path to the mongrel pid files in mongrel_cluster.yml from a relative path (tmp/pids/mongrel.pid) to a full one (/home/myuser/web/site/tmp/pids/mongrel.pid). I did this, and I now get different results, so it seems to have "worked". Now, when the conditions are met, god manages to stop the mongrel process... but apparently fails to start it again. It complains that there's no pid file to clean... I'm not sure if that could be what's stopping it.

Here's what the god log looks like:

I [2008-11-27 06:02:59] INFO: mongrel-8002 [ok] process is running (ProcessRunning)
I [2008-11-27 06:03:00] INFO: mongrel-8002 [ok] http response nominal [200, 200, 200, 200, 200] (HttpResponseCode)
I [2008-11-27 06:03:00] INFO: mongrel-8002 [trigger] memory out of bounds [40632kb, 40632kb, *42440kb, *42536kb, *42464kb] (MemoryUsage)
I [2008-11-27 06:03:00] INFO: mongrel-8002 move 'up' to 'restart'
I [2008-11-27 06:03:00] INFO: mongrel-8002 restart: mongrel_rails cluster::restart -C /home/btiadmin/web/bti-usa/config/mongrel_cluster.yml --clean --only 8002
I [2008-11-27 06:03:11] INFO: mongrel-8002 [trigger] process is not running (ProcessRunning)
I [2008-11-27 06:03:11] INFO: mongrel-8002 move 'up' to 'start'
I [2008-11-27 06:03:11] INFO: mongrel-8002 before_start: no pid file to delete (CleanPidFile)
I [2008-11-27 06:03:11] INFO: mongrel-8002 start: mongrel_rails cluster::start -C /home/btiadmin/web/bti-usa/config/mongrel_cluster.yml --clean --only 8002
I [2008-11-27 06:03:21] INFO: mongrel-8002 moved 'up' to 'up'
I [2008-11-27 06:03:21] INFO: mongrel-8002 [trigger] process is not running (ProcessRunning)
I [2008-11-27 06:03:21] INFO: mongrel-8002 move 'up' to 'start'
I [2008-11-27 06:03:21] INFO: mongrel-8002 before_start: no pid file to delete (CleanPidFile)
I [2008-11-27 06:03:21] INFO: mongrel-8002 start: mongrel_rails cluster::start -C /home/btiadmin/web/bti-usa/config/mongrel_cluster.yml --clean --only 8002
I [2008-11-27 06:03:32] INFO: mongrel-8002 moved 'up' to 'up'
I [2008-11-27 06:03:32] INFO: mongrel-8002 [trigger] process is not running (ProcessRunning)
I [2008-11-27 06:03:32] INFO: mongrel-8002 move 'up' to 'start'
I [2008-11-27 06:03:32] INFO: mongrel-8002 before_start: no pid file to delete (CleanPidFile)
I [2008-11-27 06:03:32] INFO: mongrel-8002 start: mongrel_rails cluster::start -C /home/btiadmin/web/bti-usa/config/mongrel_cluster.yml --clean --only 8002
I [2008-11-27 06:03:42] INFO: mongrel-8002 [trigger] process is not running (ProcessRunning)
I [2008-11-27 06:03:42] INFO: mongrel-8002 move 'up' to 'start'
I [2008-11-27 06:03:42] INFO: mongrel-8002 before_start: no pid file to delete (CleanPidFile)
I [2008-11-27 06:03:42] INFO: mongrel-8002 start: mongrel_rails cluster::start -C /home/btiadmin/web/bti-usa/config/mongrel_cluster.yml --clean --only 8002
I [2008-11-27 06:03:53] INFO: mongrel-8002 moved 'up' to 'up'
I [2008-11-27 06:03:53] INFO: mongrel-8002 [trigger] process is not running (ProcessRunning)
I [2008-11-27 06:03:53] INFO: mongrel-8002 move 'up' to 'start'
I [2008-11-27 06:03:53] INFO: mongrel-8002 before_start: no pid file to delete (CleanPidFile)
I [2008-11-27 06:03:53] INFO: mongrel-8002 start: mongrel_rails cluster::start -C /home/btiadmin/web/bti-usa/config/mongrel_cluster.yml --clean --only 8002
I [2008-11-27 06:03:54] INFO: mongrel-8002 auto-reenable monitoring in 600 seconds

So that's better, but also terrible: each mongrel that god stops stays down, so it just slowly kills the website. Does anyone have suggestions for where to go from here?

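For what it's worth, here is a minimal Ruby sketch of one possible reason the relative pid path mattered. This is an assumption, not something confirmed in this thread: a relative path is resolved against the working directory of whichever process reads it, so two processes started from different directories can end up looking for the pid file in different places. The helper name resolve_pid_path and the use of "/" as a daemon's working directory are purely illustrative.

```ruby
# Hypothetical helper (not part of god or mongrel_cluster): shows where a
# pid file path ends up depending on the working directory of the process
# that resolves it.
def resolve_pid_path(pid_path, working_dir)
  File.expand_path(pid_path, working_dir)
end

relative = "tmp/pids/mongrel.pid"
absolute = "/home/myuser/web/site/tmp/pids/mongrel.pid"

# Resolved from the application directory, the relative path lands in the app.
puts resolve_pid_path(relative, "/home/myuser/web/site")

# Resolved from "/" (assumed here as a daemon's working directory), it does not.
puts resolve_pid_path(relative, "/")

# An absolute path resolves to the same file no matter where it is read from.
puts resolve_pid_path(absolute, "/")
```

With the full path in mongrel_cluster.yml, the working directory no longer plays a part in where the pid file is looked up.
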
On Thu, Nov 27, 2008 at 9:58 AM, Matt Davies <[EMAIL PROTECTED]> wrote:
> Hi John
>
> My guess would be this command, your restart command
>
> restart: mongrel_rails cluster::restart -C /home/myuser/web/site/config/mongrel_cluster.yml --clean --only 8012
>
> Is not stopping the mongrel, check the PID of the running mongrel in
> question and see if changes after the god log tells you it's restarted.
>
> 2008/11/26 John <[EMAIL PROTECTED]>
>
>> Hi Vanderkerkoff, thanks to you for responding as well.
>>
>> Here's what the god log looks like when the restart is triggered (and
>> apparently fails).
>>
>> I [2008-11-26 11:48:25] INFO: mongrel-8012 [ok] process is running (ProcessRunning)
>> I [2008-11-26 11:48:31] INFO: mongrel-8012 [ok] http response nominal [200, 200, 200] (HttpResponseCode)
>> I [2008-11-26 11:48:31] INFO: mongrel-8012 [trigger] memory out of bounds [*41508kb, *41364kb, *42996kb] (MemoryUsage)
>> I [2008-11-26 11:48:31] INFO: mongrel-8012 move 'up' to 'restart'
>> I [2008-11-26 11:48:31] INFO: mongrel-8012 restart: mongrel_rails cluster::restart -C /home/myuser/web/site/config/mongrel_cluster.yml --clean --only 8012
>> I [2008-11-26 11:48:41] INFO: mongrel-8012 moved 'up' to 'up'
>> I [2008-11-26 11:48:41] INFO: mongrel-8012 [ok] process is running (ProcessRunning)
>> I [2008-11-26 11:48:41] INFO: mongrel-8012 [ok] http response nominal [200] (HttpResponseCode)
>> I [2008-11-26 11:48:41] INFO: mongrel-8012 [ok] memory within bounds [*42996kb] (MemoryUsage)
>> I [2008-11-26 11:48:41] INFO: mongrel-8012 [ok] cpu within bounds [0.286313867754517%] (CpuUsage)
>> I [2008-11-26 11:48:46] INFO: mongrel-8012 [ok] process is running (ProcessRunning)
>>
>> When this happens in god, nothing happens at the same time in
>> mongrel.log, and mongrel.8012.log is empty. So... I'm not sure what
>> to think. Any other ideas?
>>
>> On Nov 26, 12:12 pm, vanderkerkoff <[EMAIL PROTECTED]> wrote:
>> > Check the mongrel logs John, sometimes my mongrels aren't getting shut
>> > down in time, and then the restart mongrel is fired off prior to the
>> > old ones getting shut down and the error will be that the port is
>> > already in use.
>> >
>> > On Nov 26, 9:36 am, "Matt Davies" <[EMAIL PROTECTED]> wrote:
>> >
>> > > Hi John
>> >
>> > > Can you stick your god config file up in pastie so we can have a look?
>> >
>> > > I had a similar problem myself.
>> >
>> > > Matt
>> >
>> > > 2008/11/25 John <[EMAIL PROTECTED]>
>> >
>> > > > I have an Ubuntu 7.04 web server running a reasonably busy site using
>> > > > Apache and Mongrel Cluster. It behaved really well for about a year
>> > > > and a half, and then suddenly I began having Mongrel processes hang,
>> > > > each one that hangs taking up 100% of one of the eight cores of the
>> > > > server (when two or three of these get going at the same time, the
>> > > > site becomes virtually unusable). I'm trying to find out what went
>> > > > wrong, but in the meantime I'm trying to use God to keep the mongrels
>> > > > in check.
>> >
>> > > > I have God installed, and the watches are working; when I look at the
>> > > > god log, it's clear that God can see what's happening with the
>> > > > mongrels and is trying to restart them when the restart conditions are
>> > > > met. Just one problem: although God seems to think that it has
>> > > > restarted a given Mongrel using the restart command, it doesn't
>> > > > actually restart the process. The memory usage is the same, and, if
>> > > > it's a hung process, it's still hung.
>> >
>> > > > The restart command works when I run it manually as root, and God
>> > > > should be running as root... so... I'm not sure what could be the
>> > > > problem here. Please let me know if you have an idea where to look
>> > > > next.
>> >
>> > > > Thanks!
>> >
>> > > > -John
