Hi All, For about a week Nagios had gone quiet, and although I should have twigged that something suspicious had happened, I had no previous reason for concern, until a user alerted me that they were getting no alerts even though they knew their network device had gone offline.
I quickly checked and indeed the nagios service status was shown as stopped! Nagios is installed on CentOS and looking at the logs I saw just this:

============================================
[1200807131] SERVICE ALERT: router1.XXXXXXXXX.com;Camera;CRITICAL;HARD;1;CRITICAL: - failed: A temporary error occurred on an authoritative name server. Try again later.
[1200807301] SERVICE ALERT: router1.XXXXXXXXX.com;WAP;CRITICAL;SOFT;1;(Service Check Timed Out)
[1200807381] SERVICE ALERT: router1.XXXXXXXX.com;Router;CRITICAL;SOFT;1; (Service Check Timed Out)
[1200807461] HOST ALERT: router2.XXXXXXXXX.com;DOWN;SOFT;1;(No output returned from host check)
[1200807461] SERVICE ALERT: router2.XXXXXXXXX.com;WAP;CRITICAL;SOFT;1;(Service Check Timed Out)
[1200807551] SERVICE ALERT: router2.XXXXXXXXX.com;Camera;CRITICAL;HARD;1;CRITICAL: - failed: A temporary error occurred on an authoritative name server. Try again later.
[1200807551] SERVICE ALERT: router2.XXXXXXXXX.com;Router;CRITICAL;HARD;1;CRITICAL: - failed: A temporary error occurred on an authoritative name server. Try again later.
[1200807551] SERVICE ALERT: router1.XXXXXXXXX.com;WAP;CRITICAL;HARD;1;(Service Check Timed Out)
[1200807551] SERVICE ALERT: router1.XXXXXXXXX.com;Camera;CRITICAL;HARD;1; (Service Check Timed Out)
[1200807551] SERVICE ALERT: router1.XXXXXXXXX.com;Router;CRITICAL;HARD;1; (Service Check Timed Out)
[1200807551] SERVICE ALERT: router2.XXXXXXXXX.com;WAP;CRITICAL;HARD;1;(Service Check Timed Out)
[1200807591] HOST ALERT: router2.XXXXXXXXX.com;DOWN;SOFT;2;(Host Check Timed Out)
[1200807691] HOST ALERT: router2.XXXXXXXXX.com;DOWN;HARD;3;(Host Check Timed Out)
[1201469001] Nagios 3.0a3 starting... (PID=20058
============================================

The last line is when I restarted the Nagios service. I can't see anything else in the system logs. Any ideas as to what might have happened?
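A side note on reading the excerpt above: the bracketed numbers are Unix epoch timestamps, so the length of the silent period can be worked out from the last alert and the restart line. The following is only a rough Python sketch for doing that; the helper name `parse_nagios_line` is made up for illustration and is not part of Nagios, and the two sample lines are copied from the log above.

```python
import re
from datetime import datetime, timezone

# Matches "[<epoch seconds>] <rest of the entry>" as seen in the excerpt.
LOG_LINE = re.compile(r"^\[(\d+)\]\s+(.*)$")


def parse_nagios_line(line):
    """Hypothetical helper: return (UTC datetime, message) or None if no match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    ts = datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)
    return ts, m.group(2)


# Last alert before the silence, and the restart entry, both from the log above.
last_alert = parse_nagios_line(
    "[1200807691] HOST ALERT: router2.XXXXXXXXX.com;DOWN;HARD;3;(Host Check Timed Out)"
)
restart = parse_nagios_line("[1201469001] Nagios 3.0a3 starting... (PID=20058")

gap = restart[0] - last_alert[0]
print("last alert:", last_alert[0].isoformat())
print("restart:   ", restart[0].isoformat())
print("gap:       ", gap)
```

Run against these two entries, the gap comes out at a little over seven and a half days, which matches the "about a week" of silence. The times are printed in UTC; the server's local time zone is not known from the log.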
Is the message about the DNS server something that could have caused Nagios to implode? (I don't think so, but I'm asking for your experience here.) BTW, the server receives the usual unwanted attention from script-kiddies trying to crack their way in. Fail2ban tries its best to keep them out. Could it be that the server has been compromised? A Nagios vulnerability? What else could I look at, or test? -- Regards, Mick
_______________________________________________ Nagios-users mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null
