Hello - I need help.

For the past 11 days, one of the two mongrel processes on my railsmachine VPS has been crashing intermittently. It has crashed about 10 times, with increasing frequency in the past few days. Unfortunately, after many, many hours I still have not been able to reproduce this problem in a controlled way, neither on my production railsmachine nor on my development machine.

As far as I can tell, I have followed these suggestions from Bradley and Zed and Zed's Mongrel book:

lsof -i -P | grep CLOSE_WAIT shows nothing.
Neither mongrel process is pegged at 99% CPU. CPU usage is never above 5%, and usually at 0%, both while a process is crashed and while both are running.

A memory leak seems impossible: %MEM for both processes is never above 15%, both when crashed and when running.

-B (debug) logging on my development machine shows that no object is steadily increasing its memory consumption - garbage collection seems to be working fine.

-B (debug) logging on my development machine also shows no leaking files. The number of open files is stable (at 6).

Traffic is minuscule (< 100 requests / hour).

Inserting ActiveRecord::Base.verification_timeout = 14400 in environments/production.rb had no effect.

Upgrading to the pre-release mongrel had no effect:

    sudo gem install mongrel --source=http://mongrel.rubyforge.org/releases

My application is butt simple, and supported by oodles of unit, functional and integration test code. There is no transaction processing in the application and no opportunity for a jammed request due to using shared resources without proper locking. The application does not use RMagick, does not explicitly manipulate any files, and though I am not sure what you mean by 'shared resources', I suspect I am not using any. The only external libraries (external to rails) are three gems which access geo-coding services - but these were not in play when the processes crashed.

killall -USR1 mongrel_rails has been in effect now through the last two crashes. The rails action which held things up was different in each case, and is butt simple in both cases. Here is the mongrel.log in the vicinity of those two crashes:

Thu Nov 02 13:07:16 PST 2006: 0 threads sync_waiting for /, 1 still active in Mongrel.
Thu Nov 02 13:07:19 PST 2006: 0 threads sync_waiting for /login, 1 still active in Mongrel.
Thu Nov 02 13:07:27 PST 2006: 0 threads sync_waiting for /login, 1 still active in Mongrel.
Thu Nov 02 13:07:33 PST 2006: 0 threads sync_waiting for /admin/list_vote, 1 still active in Mongrel.
Thu Nov 02 13:07:42 PST 2006: 0 threads sync_waiting for /admin/mark_reviewed, 1 still active in Mongrel.
Thu Nov 02 13:08:17 PST 2006: 0 threads sync_waiting for /admin/mark_reviewed, 1 still active in Mongrel.
Thu Nov 02 13:08:26 PST 2006: 0 threads sync_waiting for /admin/mark_reviewed, 3 still active in Mongrel.
Thu Nov 02 13:08:37 PST 2006: 0 threads sync_waiting for /admin/mark_reviewed, 3 still active in Mongrel.
Thu Nov 02 13:09:08 PST 2006: 1 threads sync_waiting for /admin/mark_reviewed, 4 still active in Mongrel.
Thu Nov 02 13:09:35 PST 2006: Error calling Dispatcher.dispatch #<Sync_m::Err::UnknownLocker: Thread(#<Thread:0xb7234be4 aborting>) not locked.>
/usr/lib/ruby/1.8/sync.rb:57:in `Fail'
/usr/lib/ruby/1.8/sync.rb:63:in `Fail'

and

Thu Nov 02 00:05:29 PST 2006: 0 threads sync_waiting for /berkeley/downzoning/comments, 1 still active in Mongrel.
Thu Nov 02 00:05:37 PST 2006: 0 threads sync_waiting for /berkeley/downzoning/comments, 1 still active in Mongrel.
Thu Nov 02 00:06:11 PST 2006: 0 threads sync_waiting for /berkeley/downzoning/comments, 1 still active in Mongrel.
Thu Nov 02 00:07:07 PST 2006: 0 threads sync_waiting for /berkeley/downzoning/comments, 1 still active in Mongrel.
Thu Nov 02 00:07:27 PST 2006: 0 threads sync_waiting for /email_updates, 1 still active in Mongrel.
Thu Nov 02 00:07:27 PST 2006: 0 threads sync_waiting for /email_updates_edit, 1 still active in Mongrel.
Thu Nov 02 00:07:53 PST 2006: 0 threads sync_waiting for /berkeley/bus_rapid_transit/page/brtqanda, 1 still active in Mongrel.
Thu Nov 02 00:08:11 PST 2006: 0 threads sync_waiting for /email_updates_edit, 2 still active in Mongrel.
Thu Nov 02 00:08:39 PST 2006: 0 threads sync_waiting for /robots.txt, 1 still active in Mongrel.
Thu Nov 02 00:08:39 PST 2006: 1 threads sync_waiting for /email_updates_edit, 3 still active in Mongrel.
Thu Nov 02 00:09:50 PST 2006: 0 threads sync_waiting for /howitworks.php, 1 still active in Mongrel.
Thu Nov 02 00:09:50 PST 2006: 3 threads sync_waiting for /email_updates_edit, 5 still active in Mongrel.

So - as you can tell, I am a newbie at wit's end, hoping you guys can 1) help me fix the problem, and 2) help me implement a temporary workaround so I can stop checking every few hours to see if I need to cap -a restart_app (which, so far, has always worked...)

Thanks for your careful attention.

Cheers
Robert Vogel

> -------- Original Message --------
> Subject: Re: [Mongrel] Problems with mongrel dying
> From: "Zed A. Shaw" <[EMAIL PROTECTED]>
> Date: Tue, October 31, 2006 2:36 pm
> To: mongrel-users@rubyforge.org
>
> On Tue, 31 Oct 2006 12:48:02 -0700
> Robert Vogel <[EMAIL PROTECTED]> wrote:
>
> > Hi
> >
> > One of the two mongrel processes has died in the middle of the night
> > four times in the past 9 days, and I need help debugging this.
> >
> > Each time the symptoms are the same:
>
> Really, quick, but upgrade to the pre-release and then tell me if you still
> get these:
>
> sudo gem install mongrel --source=http://mongrel.rubyforge.org/releases
>
> If it does not fix the problem (remember, it's random so let it run in
> production for a while), then turn on USR1 logging and watch for the rails
> action that is blocking things:
>
> sudo killall -USR1 mongrel_rails
>
> Otherwise, keep in mind that many many people use Mongrel without blocking
> problems, so you need to rule out anything non-standard you're using that can
> cause problems. RMagick, frequent DNS calls, working with files or shared
> resources, are all main culprits.
>
> --
> Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu
> http://www.zedshaw.com/
> http://safari.oreilly.com/0321483502 -- The Mongrel Book
> http://mongrel.rubyforge.org/
> http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.
> _______________________________________________
> Mongrel-users mailing list
> Mongrel-users@rubyforge.org
> http://rubyforge.org/mailman/listinfo/mongrel-users
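
The USR1 lines in the mongrel.log excerpts above follow a fixed pattern, so the entries where requests start queuing can be picked out mechanically rather than by eye. Here is a minimal Ruby sketch, assuming exactly the line format shown in those excerpts. The names LINE_PATTERN and waiting_entries are made up for illustration and are not part of Mongrel.

```ruby
# Matches USR1 debug lines of the form seen in the log excerpts, e.g.
#   Thu Nov 02 13:09:08 PST 2006: 1 threads sync_waiting for /admin/mark_reviewed, 4 still active in Mongrel.
# Assumption: the format is exactly as it appears in this post.
LINE_PATTERN = /\A(.+?): (\d+) threads sync_waiting for (\S+), (\d+) still active in Mongrel\.\z/

# Returns one hash per log line on which at least `min_waiting` threads
# were waiting. Lines that do not match (such as the Dispatcher error and
# its backtrace) are skipped.
def waiting_entries(lines, min_waiting = 1)
  found = []
  lines.each do |line|
    match = LINE_PATTERN.match(line.strip)
    next unless match
    waiting = match[2].to_i
    next if waiting < min_waiting
    found << {
      :time    => match[1],
      :waiting => waiting,
      :path    => match[3],
      :active  => match[4].to_i
    }
  end
  found
end

# Two lines copied from the first excerpt, used here as sample input.
sample = [
  "Thu Nov 02 13:08:37 PST 2006: 0 threads sync_waiting for /admin/mark_reviewed, 3 still active in Mongrel.",
  "Thu Nov 02 13:09:08 PST 2006: 1 threads sync_waiting for /admin/mark_reviewed, 4 still active in Mongrel."
]

waiting_entries(sample).each do |entry|
  puts "#{entry[:time]}  #{entry[:path]}  waiting=#{entry[:waiting]} active=#{entry[:active]}"
end
```

To run it against a real log, the sample array could be replaced with File.readlines on the log file (the path depends on the deployment). Raising min_waiting limits the output to the moments just before a process stops responding.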