Hi,

I have an application which is dying horrible deaths (i.e. segmentation faults) in mid-flight, in production... And of course, I should fix it. But while hunting down the bugs, I found something that I think should behave differently. I can work on submitting a patch, as it is quite simple, but I might be missing something in my rationale.
When Mongrel segfaults, it obviously does not get to clean up after itself, so it does not remove its PID files. As an example:

  $ sudo /etc/init.d/mongrel-cluster start
  Starting mongrel-cluster: Starting all mongrel_clusters...
  mongrel-cluster.
  $ sudo cat tmp/pids/mongrel.8203.pid | xargs kill -9
  $ sudo /etc/init.d/mongrel-cluster status
  (...)
  found pid_file: tmp/pids/mongrel.8203.pid
  missing mongrel_rails: port 8203
  (...)
  $ sudo /etc/init.d/mongrel-cluster restart
  Restarting mongrel-cluster: Restarting all mongrel_clusters...
  ** !!! PID file tmp/pids/mongrel.8203.pid already exists. Mongrel could be running already. Check your log/mongrel.8203.log for errors.
  ** !!! Exiting with error. You must stop mongrel and clear the .pid before I'll attempt a start.
  mongrel-cluster.

So, what's the solution? I must manually do:

  $ sudo rm tmp/pids/mongrel.8203.pid
  $ sudo /etc/init.d/mongrel-cluster restart

And now it works.

What should happen? Well, 'status' already found that there is a stale PID file. Of course, the 'status' action means exactly that: get the status, do nothing else. But the 'stop' action should remove the PID files whose processes no longer exist, and the 'start' action should check whether the process with the recorded PID is alive, and ignore the PID file if it is not. At the very least, this behaviour should be selectable via the configuration file.

What do you think?

-- 
Gunnar Wolf - [EMAIL PROTECTED] - (+52-55)5623-0154 / 1451-2244
PGP key 1024D/8BB527AF 2001-10-23
Fingerprint: 0C79 D2D1 2C4E 9CE4 5973 F800 D80E F35A 8BB5 27AF
_______________________________________________
Mongrel-users mailing list
Mongrel-users@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-users
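
The liveness check proposed above for 'start' and 'stop' could be sketched roughly as follows. This is only an illustration in plain shell, not code from mongrel_cluster: the function names pid_file_is_stale and remove_stale_pid_file are hypothetical, and the sketch assumes it runs as root (as the init script does), since `kill -0` also fails for live processes that the caller is not allowed to signal. It cannot tell whether a live PID has been reused by an unrelated process.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of mongrel_cluster.

# Succeeds (exit status 0) when the PID file exists but does not
# name a live process. Fails when there is no PID file or when the
# recorded process is still running.
pid_file_is_stale() {
    pid_file=$1
    [ -f "$pid_file" ] || return 1      # no PID file, so nothing is stale

    pid=$(cat "$pid_file" 2>/dev/null)
    case $pid in
        ''|*[!0-9]*) return 0 ;;        # assumption: empty or garbled content counts as stale
    esac

    # kill -0 delivers no signal; it only checks that the PID can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        return 1                        # process is alive
    fi
    return 0
}

# Removes the PID file only when it is stale, and says so.
remove_stale_pid_file() {
    if pid_file_is_stale "$1"; then
        echo "Removing stale PID file: $1"
        rm -f "$1"
    fi
}
```

With helpers like these, 'start' and 'stop' could call `remove_stale_pid_file tmp/pids/mongrel.8203.pid` before doing anything else, while 'status' would keep reporting without touching the file.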