Good stuff, Dan. I was not aware of the differences between how the reboot and shutdown commands handle the reboot process.
Turns out that we're doing a reboot -f, which explains why I have orphaned PID files lying around. I'm going to make the call right now that the fight to have 'reboot -f' changed to the plays-more-nicely-with-others "shutdown -r" is already lost, and I'm going to work around that in code.

Thanks for helping clarify this.

It's weird... when I run nagios and kill it with -9, it leaves the pid file intact, but when I restart it, it zeroes out the pid file and starts just fine. When I just kill it with the default kill signal, it removes the pid file.

In any case, I now know what the issues are and how to address this.

Thanks again very much for your help, guys. You are a feature of Nagios.

Eric

> -----Original Message-----
> From: Daniel Wittenberg [mailto:[email protected]]
> Sent: Tuesday, December 21, 2010 9:23 AM
> To: Nagios Users List
> Subject: Re: [Nagios-users] Nagios kept from restarting after
> reboot by lockfile
>
> So are you using the actual "reboot" command, not "shutdown -r now",
> which is a little friendlier? The standard nagios shutdown script
> should take care of cleaning those up for you. Otherwise putting
> something like:
>     rm -f <lockfile>; service nagios start
> in your rc.local would take care of it. But when you mention pid file,
> are you saying the PID file is still there, or the lock file? Since
> they are different things. Again though, if nagios is shut down
> properly you shouldn't be seeing that.
>
> Dan
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]
> Sent: Monday, December 20, 2010 6:59 PM
> To: [email protected]
> Subject: Re: [Nagios-users] Nagios kept from restarting after reboot by
> lockfile
>
> We reboot all of our hosts on a weekly basis. I used to pride myself on
> keeping my boxes up as long as possible, but having spent years now
> supporting mission-critical financial production applications, I'm on
> board with the weekly reboots. Lets you know early if some system or
> app change is problematic.
>
> Reboot is being done via a standard reboot command.
>
> I've looked around for rc scripts that might address this issue, but
> haven't found any. Got any pointers?
>
> Regarding the rc.local solution, a) I'd prefer to solve the problem, not
> just address the symptoms, and b) elsewhere in this thread I've
> described the roadblocks that we have to doing anything at a system
> level. Yep, that's right, boys, we survive in the app developer layer,
> within which we do not have root on these boxes. It's a tedious,
> time-consuming, frustrating, productivity-killing endeavor to do just
> about anything you can't do yourself.
>
> So... got any sample RC scripts, or command line params to nagios to
> make it smart enough to know that the PID that is in its PID file isn't
> an active process?
>
> Thanks.
>
> Eric
>
> > -----Original Message-----
> > From: Daniel Wittenberg [mailto:[email protected]]
> > Sent: Monday, December 20, 2010 11:56 AM
> > To: Nagios Users List
> > Subject: Re: [Nagios-users] Nagios kept from restarting after
> > reboot by lockfile
> >
> > Couple questions
> > 1) Why do you have to reboot your monitoring server weekly?
> > 2) How is the reboot being done?
> >
> > Reason I ask 2) is because the standard rc script will remove the
> > lockfile when nagios is told to stop. So if you are having this
> > problem it sounds like you are not doing a clean shutdown and
> > something could be wrong.
> >
> > Either way, I guess worst case one way to check for this would be put
> > something like this in your /etc/rc.d/rc.local:
> >     rm -f /var/lock/subsys/nagios
> >
> > Assuming that's where your lockfile is.
> >
> > Dan
> >
> >
> > -----Original Message-----
> > From: [email protected]
> > [mailto:[email protected]]
> > Sent: Monday, December 20, 2010 10:16 AM
> > To: [email protected]; [email protected]
> > Subject: Re: [Nagios-users] Nagios kept from restarting after
> > reboot by lockfile
> >
> > Alternatively, could you recommend a good system/resource monitoring
> > tool that would be able to let me know if nagios is down and restart
> > it automatically?
> >
> > _____________________________________________
> > From: Berg, Eric: IT (NYK)
> > Sent: Monday, December 20, 2010 11:03 AM
> > To: '[email protected]'
> > Subject: Nagios kept from restarting after reboot by lock file
> >
> > Gee, this seems like an annoying newbie problem, but if Nagios crashes
> > or is killed (as on system reboot), it leaves a lock file around that
> > prevents it from starting again until the lock file is manually
> > removed.
> >
> > I see this on Monday mornings after weekend reboots on a Red Hat Linux
> > box:
> >
> > nagios: Lockfile '/home/nagios/nagios/var/nagios.lock' looks like its
> > already held by another instance of Nagios (PID 0). Bailing out...
> >
> > Does anyone know if there's a config option or something else that
> > obviates the need to write a wrapper script to check to see if Nagios
> > is really running and remove the lock file (looks like Nagios already
> > knows it's not running by virtue of the value of the PID in this very
> > message!) so that it can cleanly start up again?
> >
> > Thanks.
> >
> > Eric
> >
> > _______________________________________________
> >
> > This e-mail may contain information that is confidential, privileged
> > or otherwise protected from disclosure. If you are not an intended
> > recipient of this e-mail, do not duplicate or redistribute it by any
> > means. Please delete it and any attachments and notify the sender that
> > you have received it in error.
> > Unless specifically indicated, this e-mail is not an offer to buy or
> > sell or a solicitation to buy or sell any securities, investment
> > products or other financial product or service, an official
> > confirmation of any transaction, or an official statement of Barclays.
> > Any views or opinions presented are solely those of the author and do
> > not necessarily represent those of Barclays. This e-mail is subject to
> > terms available at the following link: www.barcap.com/emaildisclaimer.
> > By messaging with Barclays you consent to the foregoing. Barclays
> > Capital is the investment banking division of Barclays Bank PLC, a
> > company registered in England (number 1026167) with its registered
> > office at 1 Churchill Place, London, E14 5HP. This email may relate to
> > or be sent from other members of the Barclays Group.
> > _______________________________________________

_______________________________________________
Nagios-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
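
For anyone landing on this thread with the same problem: the "work around that in code" idea from the top message (check whether the PID recorded in the lock file belongs to a live process, and remove the file only if it does not) might look roughly like the sketch below. It is not taken from the Nagios distribution. It assumes the lock file holds a single PID as text, which fits the "(PID 0)" in the error message but has not been checked against every Nagios version. The function names lock_is_stale and clear_stale_lock are made up for illustration.

```shell
#!/bin/sh
# Sketch only: decide whether a Nagios lock file is stale and, if so, remove it.
# Assumption: the lock file contains one PID as plain text.

# Returns 0 (stale) when the file is missing, empty, non-numeric, holds PID 0,
# or names a PID that cannot be signalled; returns 1 when the PID is alive.
lock_is_stale() {
    stale_lockfile=$1
    [ -f "$stale_lockfile" ] || return 0

    # First line of the file with all whitespace stripped.
    stale_pid=$(head -n 1 "$stale_lockfile" 2>/dev/null | tr -d '[:space:]')

    case "$stale_pid" in
        ''|*[!0-9]*) return 0 ;;        # empty or not a number
    esac
    [ "$stale_pid" -gt 0 ] || return 0  # PID 0, as in the error message

    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    # Caveats: it also fails for a live process owned by another user, and a
    # recycled PID that now belongs to some other program will look alive.
    if kill -0 "$stale_pid" 2>/dev/null; then
        return 1
    fi
    return 0
}

# Removes the lock file when it is stale; otherwise leaves it and returns 1.
clear_stale_lock() {
    if lock_is_stale "$1"; then
        rm -f -- "$1"
        return 0
    fi
    echo "lock file $1 names a live process; leaving it in place" >&2
    return 1
}
```

A wrapper that does not need root could then call clear_stale_lock /home/nagios/nagios/var/nagios.lock and start nagios only when that call succeeds. The lock file path is the one from the error message in this thread; the start command itself depends on the local install and is left out here.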
