This is probably going to be a silly question as my first to the list, so apologies in advance if so...
Caveat: I have no prior experience with Nagios (et al), only commercial systems like What's Up Gold. I got the latest (1.10) and have configured it on Ubuntu Server 13.10, using Nagios as the core. It's running and monitoring. I've scripted host discovery with nmap and have it nicely discovering the LAN and inventorying services. I'm starting to experiment with various options and configurations (e.g. a distributed polling site soon, and some WAN tests).

One really minor thing is bugging me, and I've wasted most of the day looking for it. When a ping-only host goes down or comes back up, I get two messages. For example, I just got both:

PING RECOVERY, CRITICAL -> OK (RECOVERY), Service PING RECOVERY, DOWN->UP (RECOVERY, state OK)

I get that the node has a state as well as the service, but this doesn't happen for SNMP-polled devices or for Check_MK agent devices. There I get just the state RECOVERY. These hosts have the tags "lan|prod|ping", and I have not changed (at least not knowingly) any of the out-of-the-box settings for the checks.

Both messages are correct, of course: it can't ping, and it is down. But it seems that, to be consistent with the SNMP and Check_MK agent devices, it should send only the state message. Is that correct? I can't find any description of how it determines states relative to these tests and knows not to continue checking services when the object state is down (I haven't gone as far as browsing the code, yet).

I suspect I'm also going to have a problem with the first non-pingable devices that I can poll via SNMP (etc.). Not sure; I'll cross that bridge later.

Thanks in advance for any pointers,

Linwood

PS. If it matters, notifications are going out via email.
_______________________________________________
omd-users mailing list
[email protected]
http://lists.mathias-kettner.de/mailman/listinfo/omd-users
