Hugo van der Kooij wrote:
> On Tue, 19 Dec 2006, Andreas Ericsson wrote:
>
>> Yes, for reasons stated above. It gets slightly worse if you have a
>> largely linear network (many hosts only have one child), since it also
>> has to check parent hosts until it finds the "closest" possible "up" to
>> determine where a possible network outage is happening.
>
> Just curious. How will this work if you have something like 5 hosts in
> line in a parent-child relation?
>
> The fastest way would be to start from nagios and work your way to the
> downed host, as the average latency of a check on a live host is much
> lower than the timeout you get on downed hosts.
>
> Consider the map as shown on
> http://hvdkooij.xs4all.nl/statusmap-20061219.png
>
> If nagios detects the ipv6 router in the lab to be down and it has to work
> its way up, it has to deal with the timeouts on nlams04 and nlams05.
>
> If it starts polling the other way around, it only has to deal with the
> host check latency of the switch and the timeout of nlams05.
>
In the case you posted on your map, it would indeed be faster to start
walking in -> out. However, if the closest parent had been up it would
have been the other way around. Anyway, I *think* nagios checks in -> out.

Either way, it's important for a host check to return OK *immediately*
when it finds that the host it's checking actually *is* ok, which is why I
wrote check_icmp and gave it a check_host mode which does just that. The
original default host check (I think it's still the default, btw) would
wait a minimum of 5 seconds even if the first ping came back ok after 5ms.
Since all other checks are stopped while a host check runs, this causes
quite a bit of slowdown.

When I think about it, it would indeed (especially with check_host) be
faster to start unreachability checks with the root hosts and then follow
the children down to the targeted host, since that way we would encounter
a minimum of host check timeouts. It's programmatically slightly trickier
though, as you'd have to walk backwards from the problem host, push each
parent host onto a stack, and then pop them from that stack when you do
the actual checking.

I'll look into this and see if a patch is necessary and, if so, whether I
can come up with one.

-- 
Andreas Ericsson [EMAIL PROTECTED]
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

_______________________________________________
Nagios-users mailing list
https://lists.sourceforge.net/lists/listinfo/nagios-users
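The push-then-pop walk described in the reply could look roughly like the sketch below. This is only an illustration in Python, not Nagios code: the function names, the single-parent `PARENTS` map, and the `is_up` callback are all made up for the example, and the topology (switch -> nlams05 -> nlams04 -> ipv6-router) is an assumption read off the description of the status map. Real Nagios hosts can have several parents, which this ignores.

```python
def root_first_path(problem_host, parents):
    """Return the chain of hosts from the root down to problem_host.

    `parents` maps each host to its single parent (None for a root host).
    This single-parent model is an assumption made for the sketch.
    """
    stack = []
    host = problem_host
    # Walk backwards from the problem host, pushing each host on a stack.
    while host is not None:
        stack.append(host)
        host = parents.get(host)
    # Popping the stack yields the hosts in root-first order.
    path = []
    while stack:
        path.append(stack.pop())
    return path


def find_outage(problem_host, parents, is_up):
    """Check hosts root-first and stop at the first one that is down.

    Returns (first_down_host_or_None, list_of_hosts_actually_checked).
    `is_up` is a hypothetical callback standing in for a real host check.
    """
    checked = []
    for host in root_first_path(problem_host, parents):
        checked.append(host)
        if not is_up(host):
            return host, checked
    return None, checked


# Assumed topology and outage, loosely modelled on the example in the mail.
PARENTS = {
    "switch": None,
    "nlams05": "switch",
    "nlams04": "nlams05",
    "ipv6-router": "nlams04",
}
DOWN = {"nlams05", "nlams04", "ipv6-router"}

outage, checked = find_outage("ipv6-router", PARENTS, lambda h: h not in DOWN)
print(outage, checked)
```

Under these assumptions only the switch (a fast reply) and nlams05 (one timeout) get checked before the walk stops, which matches the ordering Hugo suggested.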
