On Apr 23, 2010, at 6:36 AM, Albert Shih wrote:

> My problem is : 
> 
>       I have a lot of services/hosts under nagios (~1500), and when some
>       building loses power (so lots of hosts/services go down), the
>       nagios go wrong because of the load on the server. 

What do you mean by 'nagios go wrong'?

> The first thing I think about is the parents/children but as you say that's
> not going to change anything.

Parent/child relationships are the monitoring solution to this kind of outage.
Nagios will continue to try to check all hosts at the site, but it will report
every device behind the blocking device (the top switch in your case, it seems)
as unreachable rather than down. You can choose to receive (or not)
notifications about 'unreachable' devices separately from the 'down' switch.
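
As a rough sketch of that notification choice: in a host definition, the
notification_options directive lists the states that trigger notifications
(d = down, u = unreachable, r = recovery). The host name, address, and the
generic-host template below are placeholders assumed for illustration, not
taken from your setup.

```cfg
# Hypothetical server behind the top switch (name and address are made up).
define host {
    use                     generic-host     ; assumes a template with this name exists
    host_name               bldg-a-server01
    address                 192.0.2.11
    notification_options    d,r              ; notify on down and recovery, skip 'u' (unreachable)
}
```

Adding u back (d,u,r) would turn unreachable notifications on again. The same
letters are accepted by host_notification_options in a contact definition, if
you would rather filter per contact than per host.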

> The second idea is to : 
> 
>       Use dependencies and make a set of hosts/servers (like all servers
>       in a building) depend on the top switch. 

Dependencies are not a good solution to this problem as you've stated it. What 
benefit do you believe there is with dependencies vs. parents?


>       And when the power goes out, I manually set the top switch
>       to down.

If Nagios is monitoring this switch, its next check result would just override
whatever state you set by hand, unless you also disabled checks of the switch.

>       What do you think? 

I think that you should set parents for the hosts at this site such that every 
host -> parent relationship eventually leads to the top-level switch. If the 
top-level switch goes out, it will be shown as down and anything behind it will 
be shown as unreachable.

host -- (parents) --> top-level switch    or
host -- (parents) --> some mid-level switch -- (parents) --> top-level switch
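
In object-configuration terms, the two chains above might look roughly like
this. All host names and addresses are invented for illustration, and the
generic-switch and generic-host templates are assumed to exist in your
configuration (they are the names used in the sample configs, but yours may
differ).

```cfg
# Top-level switch: no parents directive here, on the assumption that the
# Nagios server can reach it without passing through another monitored host.
define host {
    use         generic-switch
    host_name   bldg-a-top-switch
    address     192.0.2.1
}

# Mid-level switch sits behind the top-level switch.
define host {
    use         generic-switch
    host_name   bldg-a-floor2-switch
    address     192.0.2.2
    parents     bldg-a-top-switch
}

# A server behind the mid-level switch.
define host {
    use         generic-host
    host_name   bldg-a-server01
    address     192.0.2.11
    parents     bldg-a-floor2-switch
}
```

With definitions along these lines, a failure of bldg-a-top-switch should
leave that switch as the only host reported down, with the floor switch and
the server reported unreachable.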

--
Marc


------------------------------------------------------------------------------
_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null
