Hi folks,
I would prefer a completely different approach:
Set up both Nagios servers independently and have them both send
notifications to an email address that hands the mail over to a little
script. All this script does is check whether the same notification has
already been sent from the other host and, if so, delete it.
The advantage of this approach is that you can add other functionality
to this script over time (e.g. filtering out mass notifications from
Nagios, which can occur even if you use topological dependencies). You
also do not have to mess around with layers of service definitions. You
can even use rsync to copy config changes from one Nagios instance to
the other.
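
To make the idea concrete, here is a minimal sketch of the check such a script might perform. It is only an illustration: the class name, the (host, service, state) key and the 300 second window are assumptions of mine, and parsing the incoming mail is left out.

```python
import time


class NotificationDeduplicator:
    """Suppress a notification if the other Nagios host already sent the same one.

    Hypothetical helper: the key fields and the time window are assumptions.
    """

    def __init__(self, window_seconds=300):
        self.window_seconds = window_seconds
        # (host, service, state) -> (sender, timestamp of last forwarded mail)
        self._seen = {}

    def should_forward(self, sender, host, service, state, now=None):
        now = time.time() if now is None else now
        key = (host, service, state)
        previous = self._seen.get(key)
        if previous is not None:
            prev_sender, prev_time = previous
            # Same alert, other instance, recent enough: treat as a duplicate.
            if prev_sender != sender and now - prev_time <= self.window_seconds:
                return False
        self._seen[key] = (sender, now)
        return True
```

A real script would also need to persist the table between invocations, since a mail alias normally starts a fresh process for every message.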
Any ideas pro/against this approach?
Dirk
John P. Rouillard wrote:
In message <[EMAIL PROTECTED]>,
"Marc Powell" writes:
From: On Behalf Of John P. Rouillard
Sent: Wednesday, March 29, 2006 4:45 PM
In message <[EMAIL PROTECTED]>,
"Marc Powell" writes:
-----Original Message-----
From: On Behalf Of Philip Hallstrom
Sent: Wednesday, March 29, 2006 3:54 PM
I'm wondering if two nagios instances can be set up to monitor the
same hosts/services and have to agree with each other before
sending a notification?
[chop]
For an off-the-cuff suggestion, if you used multiple retries and didn't
specifically require that both servers see the state as HARD you could
embed that logic in your notification script.
- NagiosA always sends notifications.
If you have a redundant setup, only one server, A or B, would have to
send notifications for the service B.
I presume that you're referring to this from your previous e-mail --
"On both nagios 1 and 2 create service B that does notify (and poll)
that uses check_cluster to require that both be in error condition to
generate an error notification."
Correct.
How would you prevent duplicate notifications? Nagios 1 wouldn't know
that Nagios 2 had already sent a notification and vice-versa unless you
kept track of that externally.
The site where it was set up originally had the second server as a
backup notifier. If it lost connectivity to the primary server it
switched on notifications.
Later a separate SEC process on the second server monitored the
primary's notifications and would release notifications queued up by
the second nagios process (keyed by host, service, severity) if the
notifications from the first and second didn't come through within 5
minutes of each other. It worked and made sure that alerts weren't
delayed more than 5 minutes, but frankly the original setup with the
second server not notifying unless it lost heartbeat on the original
server (or the original server detected it couldn't get pages out) had
a lot fewer issues. Then again I didn't have to work there.
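
The queue-and-release behaviour described above could look roughly like the following. This is a Python sketch of the logic only, not the SEC rules that site used; the class and method names are made up for illustration, and the timeout is the 5 minutes mentioned above.

```python
class QueuedNotificationReleaser:
    """Hold the secondary's notifications until the primary has had its chance.

    Hypothetical helper; entries are keyed by (host, service, severity).
    """

    def __init__(self, timeout_seconds=300):
        self.timeout_seconds = timeout_seconds
        self._queued = {}        # key -> time the secondary queued it
        self._primary_seen = {}  # key -> time the primary last notified

    def primary_sent(self, host, service, severity, now):
        key = (host, service, severity)
        self._primary_seen[key] = now
        # The primary got the alert out, so the queued copy is not needed.
        self._queued.pop(key, None)

    def secondary_queued(self, host, service, severity, now):
        key = (host, service, severity)
        seen = self._primary_seen.get(key)
        if seen is not None and now - seen <= self.timeout_seconds:
            return  # the primary already notified for this alert
        self._queued.setdefault(key, now)

    def release_due(self, now):
        """Return (and forget) queued alerts the primary never matched in time."""
        due = [k for k, t in self._queued.items()
               if now - t >= self.timeout_seconds]
        for key in due:
            del self._queued[key]
        return sorted(due)
```

Calling `release_due` from a periodic job bounds the extra delay at the timeout plus one polling interval.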
- ServiceX on HostY reaches hard state.
- NagiosA initiates notification for ServiceX on HostY
- Notification script searches status.log on NagiosB or performs HTTP
screen scrape on NagiosB to determine state of ServiceX on HostY as
seen from there.
- If NagiosB shows CRITICAL, send notification
- If only one shows critical do nothing(?)
- repeat at regular intervals in case NagiosB was slow to pick up the
state (or use the vice-versa logic to also send notifications from
NagiosB)
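
The steps above can be sketched as a small function. This is only an outline under assumptions: `fetch_remote_state` stands in for whatever reads status.log or scrapes the web interface on NagiosB, `send` stands in for the real notification command, and the retry count and interval are arbitrary.

```python
import time


def notify_if_both_critical(local_state, fetch_remote_state, send,
                            attempts=3, interval_seconds=60,
                            sleep=time.sleep):
    """Send a notification only if NagiosB also reports CRITICAL.

    fetch_remote_state and send are hypothetical callables supplied by
    the caller. Returns True if a notification was sent.
    """
    if local_state != "CRITICAL":
        return False
    for attempt in range(attempts):
        if fetch_remote_state() == "CRITICAL":
            send()
            return True
        # NagiosB may be slow to pick up the state, so wait and look again.
        if attempt < attempts - 1:
            sleep(interval_seconds)
    return False  # only one side saw CRITICAL: do nothing
```

Passing `sleep` in as a parameter keeps the function easy to exercise without actually waiting.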
Neat idea, however you would need to handle the case where nagios B
isn't properly updating the service (and therefore isn't providing
valid data).
Looking at Last Update should cover that scenario.
True.
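
A staleness check along those lines might be as simple as the sketch below. The function name and the 10 minute limit are assumptions; the timestamp would be the Last Update value NagiosB reports for the service, converted to epoch seconds.

```python
def trusted_remote_state(state, last_update_epoch, now_epoch,
                         max_age_seconds=600):
    """Return NagiosB's state, or None if its data is too old to trust.

    Hypothetical helper; max_age_seconds is an arbitrary example value.
    """
    age = now_epoch - last_update_epoch
    if age > max_age_seconds:
        return None  # NagiosB has stopped updating this service
    return state
```

The caller then has to decide what a `None` means: notifying on NagiosA's word alone is probably safer than staying silent.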
There are probably pitfalls but I think that's how I would approach it
at first.
Yeah. It's a bit dicey regardless of how you slice it.
Agreed. Interesting problem though.
Yup. Then again, so is automatically rewriting the nagios config files
and correcting the parent links so they can be used on a redundant
host.
-- rouilj
John Rouillard
===========================================================================
My employers don't acknowledge my existence much less my opinions.
-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null