This is probably going to be a silly question as my first to the list, so 
apologies in advance if so...

Caveat: I have no prior experience with Nagios (et al), only commercial systems 
like What's Up Gold.

I got the latest (1.10) and have configured it on Ubuntu Server 13.10, using 
Nagios as the core.  It's running and monitoring.  I've scripted host discovery 
with nmap and have it nicely discovering the LAN and inventorying services.  I'm 
starting to experiment with various options and configurations (e.g. a 
distributed polling site soon, and some WAN tests).

One really minor thing is bugging me, and I've wasted most of the day looking 
for it.

When I have a ping-only host that goes down, or comes back up, I get two 
messages.  For example, I just got both:

PING RECOVERY, CRITICAL -> OK (RECOVERY), Service PING
RECOVERY, DOWN->UP (RECOVERY, state

OK, I get that the node has a state as well as the service, but this doesn't 
happen for SNMP-polled devices or for Check_MK agent devices.  There I get 
just the state RECOVERY.

These hosts have tags "lan|prod|ping", and I have not changed (at least 
knowingly) any of the out of the box settings for the checks.

Both messages are correct of course - it can't ping, and it is down.  But it 
seems like, consistent with SNMP/Check_MK agent devices, it should only send 
the state message.  Is that correct?

But I can't find any description (I haven't gone as far as browsing the code - 
yet) of how it determines states relative to these tests, and how it knows not 
to keep checking services once the object state is down.  I suspect I'm also 
going to have a problem with the first non-pingable devices that I can poll via 
SNMP (etc.)?  Not sure, will cross that bridge later.

Thanks in advance for any pointers,

Linwood

PS. If it matters, notifications are going out via email.




_______________________________________________
omd-users mailing list
[email protected]
http://lists.mathias-kettner.de/mailman/listinfo/omd-users
