This patch to mongrel cluster adds a check wait between each start/stop: http://rubyforge.org/tracker/download.php/1306/5147/15427/2761/rolling-restart.patch
"check wait" is defined as: (1) stop the port (2) check if it's really dead (3) wait 1 second and check again if it's not (4) wait 10 seconds if (3) fails and send a force quit It also does a "rolling restart" stopping and restarting each mongrel one at a time rather than taking down the whole batch (good for lb setups). I've been using in production on 20 servers (160 mongrels total) via a monkey patched mongrel_rails script for awhile now with good effect (ymmv). However, it's not been accepted into any mongrel cluster releases yet because I've heard they're revamping the whole package. On Jan 22, 2008 7:25 AM, Dave Cheney <[EMAIL PROTECTED]> wrote: > We run into this problem a lot as well. The problem can be exacerbated > when a mongrel has a backlog of work, or has bloated to a point that > it is heavily swapped. The mongrels always get the shutdown signal, > but they don't act on it fast enough to clear their pid file by the > time the start is actioned. > > In our case those mongrels will eventually quit and monit will restart > them, but its not ideal. > > If cluster::restart supported a --delay parameter that would go some > way to fixing the problem. > > Cheers > > Dave > > On 22/01/2008, at 6:05 AM, John Joseph Bachir wrote: > > > Is my problem typical? Is there a solution? Seems like mongrel_rails > > and/or the capistrano recipes should wait for the processes to stop > > before attempting to restart them. > > > _______________________________________________ > Mongrel-users mailing list > Mongrel-users@rubyforge.org > http://rubyforge.org/mailman/listinfo/mongrel-users > _______________________________________________ Mongrel-users mailing list Mongrel-users@rubyforge.org http://rubyforge.org/mailman/listinfo/mongrel-users