Andreas Ericsson wrote:
> On 11/18/2010 03:48 PM, Tim Palmer wrote:
>
>> Good morning, or whatever as the case may be...
>>
>> I have a Nagios 3.2.1 install which is showing a problem I'm unsure how
>> to troubleshoot further. It's either something simple I'm missing, or a
>> deeper, more difficult problem. Or a transient to be perhaps put on a
>> shelf until it happens again.
>>
>> First, the questions:
>> - Is the notifications log absolute?
>> - Meaning, if a notification is shown in this log, it has passed all
>> filters (notification options etc) and Nagios believes it was submitted
>> to the MTA.
>>
>
> Yes.
>
Excellent, thank you. That's the critical bit for me regarding Nagios.

>> - Is there anywhere besides the MTA's log, status.dat and nagios.log to
>> look for clues to mail problems?
>>
>
> The receiving end comes to mind, or any server(s) in between.
>
>> ==============
>> Details
>> - Running on FreeBSD 7.0, using stock sendmail on localhost.
>> - In general, everything is working fine. 125 hosts, 1600 ish services.
>> This system has been up and stable for a few months.
>>
>> Host and service notifications of all kinds go out properly all the time.
>>
>> Last night, I had a host go down. Notification got to my cell phone and
>> the other contacts it's configured to just fine. This morning, I dealt
>> with the problem host and Nagios showed it back up. But no Host up
>> notification to any of the configured contacts. The Notifications log
>> shows the host up notifications as having been sent. There's nothing in
>> /var/log/maillog for the time Nagios says the notifications were sent.
>> In status.dat, the record for my cell contact has a
>> "last_host_notification" line with the epoch time version of the exact
>> second the notification was in theory sent. Host and template records
>> included at the bottom of this email. I've included one contact def, but
>> there were 4 contacts, using 2 different scripts that should have
>> received the notification.
>>
>> As far as I can see, there is nothing in the host configuration or
>> related templates that would keep a host up notification from being sent.
>>
>> We use custom host-notify scripts which log actions, and again, no
>> entries for the specific problem, but lots of other notifications before
>> and after. These scripts could be the problem, but I want to rule out
>> other issues first.
>>
>
> Notifications are a pretty integral part of what makes Nagios worth
> anything at all.
> Since you're using homebrewed scripts and no one else
> has reported any problems with them, I suggest you first debug your
> own scripts, or enable debug-logging for notifications. The docs will
> tell you how to do that. It won't help for this occurrence of the
> failed notifications, but it will definitely help you in the future
> if it ever happens again.
>

Agreed on all counts. Now that you've confirmed the final-ness of the
notifications log, I am comfortable looking outside Nagios to the
scripts, system and sendmail. I'm sure there's a reasonable, logical
explanation for a small subset of mail not getting from Nagios to the
local MTA...

Thank you
Tim

_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
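
For anyone chasing the same symptom, here is a minimal sketch of the correlation step described in the thread: pulling each contact's "last_host_notification" epoch out of status.dat and turning it into a readable timestamp to search for in the MTA log. It assumes status.dat uses "contactstatus { key=value ... }" blocks, as Nagios 3.x appears to write them. The function names, the SAMPLE text and the contact name "example-cell" are hypothetical and only there for illustration. Times are rendered in UTC, while /var/log/maillog is normally in local time, so adjust before comparing.

```python
from datetime import datetime, timezone


def last_host_notifications(status_text):
    """Map contact_name -> last_host_notification epoch.

    Assumes status.dat-style blocks: a 'contactstatus {' opener,
    key=value lines, and a closing '}' on its own line.
    """
    results = {}
    block_type = None
    fields = {}
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.endswith("{") and "=" not in line:
            # Start of a new block, e.g. 'contactstatus {'
            block_type = line[:-1].strip()
            fields = {}
        elif line == "}":
            # End of block: keep only contact records
            if block_type == "contactstatus" and "contact_name" in fields:
                epoch = int(fields.get("last_host_notification", "0"))
                results[fields["contact_name"]] = epoch
            block_type = None
        elif block_type is not None and "=" in line:
            key, _, value = line.partition("=")
            fields[key] = value
    return results


def to_utc(epoch):
    """Render an epoch timestamp as 'YYYY-MM-DD HH:MM:SS' in UTC."""
    moment = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# Hypothetical sample record, not taken from the original report.
SAMPLE = """\
contactstatus {
\tcontact_name=example-cell
\tlast_host_notification=1290091200
\t}
"""

if __name__ == "__main__":
    for name, epoch in last_host_notifications(SAMPLE).items():
        print(name, epoch, to_utc(epoch))
```

To run it against a real installation, read the actual status.dat file into a string and pass that in place of SAMPLE. Where the file lives depends on the status_file setting in nagios.cfg.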