Title: Message

Hi Mark,

 

I too had submitted an e-mail asking whether passing the host properties along to the services would have some detrimental impact, but I didn’t receive a reply. I agree that it would certainly make the services.cfg file much simpler to maintain if you didn’t have to set up multiple services in order to create different views for various user groups. For instance, all my hosts are set up for PING. If I want notifications sent out to specific users for specific hosts, I have to have multiple iterations of PING in my services.cfg file. If I try to add all contacts to PING, then they see every host in the interface. This is not desirable, since I only want them to see the hosts and services in which they have a stake. I think this is a greater design question that may be best answered by Ethan, but if anyone else has thoughts on the matter I’d be happy to hear them.
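
To make the duplication concrete, here is a minimal sketch of what the repeated PING entries look like. The host names, contact group names, and the generic-service template are hypothetical and only there for illustration; required directives such as check_period and max_check_attempts are assumed to come from that template.

```cfg
# Hypothetical names throughout. The same PING check is defined twice,
# only so that each contact group sees and is notified about its own hosts.
define service{
        use                     generic-service        ; assumed template
        host_name               web01,web02
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        contact_groups          web-admins
        }

define service{
        use                     generic-service        ; assumed template
        host_name               db01,db02
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        contact_groups          db-admins
        }
```

Every additional user group that needs its own view adds another copy of the same definition.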

 

Best regards,

 

Todd

 

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Garringer, Mark
Sent: Monday, February 06, 2006 9:07 PM
To: 'Nagios-users@lists.sourceforge.net'
Subject: [Nagios-users] Nagios Behaviors

 

This may be rather long, so please bear with me.

I am currently running Nagios 2.0rc2 with 451 hosts and 1733 services. I've recently upgraded from Netsaint (0.78b I think). During this upgrade I took the time to reexamine my configurations to take advantage of the grouping logic and regular expression features in Nagios. This helped make the setup much easier, faster, and more consistent than it ever was under Netsaint. Kudos!

All of the following assumes that I'm doing things to the best of my understanding, and if they are incorrect or misguided in some way, I'd be more than happy to discuss the 'correct' way of implementing these features.

I'm starting to run into a few operational issues that I'd like to raise. One of the most annoying is that RECOVERY should actually have a state for EACH type of RECOVERY that can exist. That way, I'm not getting RECOVERY notices for WARNINGS that I didn't get in the first place (because I don't care about WARNINGS in most cases). I should only ever get a RECOVERY for states whose original alert I've already indicated I want to receive.

The other is that HOST level notification/check_periods should be inherited by any service being checked on that HOST unless explicitly overridden.

For example, I have a host with 10 services on it (checking disk space in various file systems, checking for certain named processes to be running) and it has a HOST level notification_period (maintenance window) from Saturday at 1800 till Sunday at 1000, every week. I don't want or need to know about service changes during these times. I'm the user, and I'm requesting a blackout of notifications during this time. If there are (still) problems after 1000 on Sunday, I do need to know about them or take action. The best answers I can find seem to be workarounds that will cause me more overall upkeep and/or run counter to the whole idea of using grouping logic to begin with.

Creating a separate, non-dynamic cron offshoot to take care of this for me, while functional, requires me to basically go and define another list of items to deal with (and keep up to date), and is completely external to Nagios.

Defining each service individually, with only 1 host assigned to it, so that each can have a different maintenance window as needed is totally against the grouping logic and again makes it much easier to diverge from a set standard. Having 25 Linux machines all in a Linux host group because I want to check 10 common services rocks. However, not all 25 servers will have the same (or even overlapping) maintenance windows, so creating 25*10 = 250 separate service entries instead of just 10 sucks eggs. Maintaining the maintenance window at the host level makes perfect sense to me.
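
The setup described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the time period, host, host group, template, and check command names are all hypothetical, and the address is a documentation placeholder. The time period allows notifications at all hours except Saturday 18:00 to Sunday 10:00.

```cfg
# Hypothetical time period: everything except Sat 18:00 to Sun 10:00.
define timeperiod{
        timeperiod_name         no-weekend-maint
        alias                   All hours except Sat 18:00 to Sun 10:00
        sunday                  10:00-24:00
        monday                  00:00-24:00
        tuesday                 00:00-24:00
        wednesday               00:00-24:00
        thursday                00:00-24:00
        friday                  00:00-24:00
        saturday                00:00-18:00
        }

# The maintenance window is attached to the host ...
define host{
        use                     generic-host           ; assumed template
        host_name               linux01
        alias                   Linux server 01
        address                 192.0.2.10             ; placeholder address
        notification_period     no-weekend-maint
        }

define hostgroup{
        hostgroup_name          linux-servers
        alias                   Linux Servers
        members                 linux01
        }

# ... but a service defined once for the whole host group carries its
# own notification_period, which is shared by every host in the group.
define service{
        use                     generic-service        ; assumed template
        hostgroup_name          linux-servers
        service_description     Root Disk
        check_command           check_disk_root        ; hypothetical command
        notification_period     24x7
        }
```

With a single group-wide service definition there is one notification_period for all 25 hosts, which is why per-host windows end up forcing the service to be split into per-host copies.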

Creating more host groups doesn't seem like any better of a solution and will just make things more difficult to find under the Hostgroup sections of the web display.

Am I misunderstanding something and trying to use this in a way that was never intended? This seems like a pretty logical feature to me, of course. Can anyone offer me advice?

Thanks!
