I’m running Nagios 2.0 (stable) on Red Hat 9.0 in a distributed environment. I’m using NSCA for checks, and everything appears to be working properly.

 

I’m running into several issues that seem to have started all of a sudden.

 

1)       On my distributed server, I no longer see syslog messages, with the exception of “INITIAL SERVICE STATE” messages. Syslog itself is working, and nagios.cfg contains “use_syslog=1”. I used to see all the check messages, etc. To the best of my knowledge, nothing in the configuration has changed.

 

2)       Nagios appears to “hang” on the remote sensor. Once I receive notifications that network devices are down, I never see a recovery for those devices, even though they have recovered. The workaround is to restart Nagios with “service nagios restart”, which sometimes takes multiple tries.

 

3)       When I have a massive network outage, I receive the appropriate alerts, but I receive multiple “PROBLEM” notifications. I’m only using service checks (currently just check_ping) with notification_interval set to “0”, which according to the documentation should limit the number of problem notifications to one, unless I’m using service escalations, which I am not at this time. I am not receiving multiple notifications for “OK” messages, which is what I would expect.
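
For item 3, the service definitions look roughly like the sketch below. This is an illustration rather than a paste of my configuration: the host name, contact group, time period, check intervals, and ping thresholds are placeholders. Only check_ping and notification_interval 0 reflect what I described above.

```
define service{
        host_name               example-router          ; placeholder host name
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%  ; placeholder thresholds
        max_check_attempts      3                       ; placeholder
        normal_check_interval   5                       ; placeholder
        retry_check_interval    1                       ; placeholder
        check_period            24x7                    ; placeholder time period
        notification_interval   0                       ; 0 = do not re-notify for an ongoing problem
        notification_period     24x7                    ; placeholder time period
        notification_options    w,u,c,r
        contact_groups          example-admins          ; placeholder contact group
        }
```

No serviceescalation definitions exist for these services.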

 

 

Sorry about the novel, but these issues have frustrated me into drinking lots of beer.

 

Mike

 
