Re: [IM-Talk] Best practice on Group Probe notifications

Michael Graziano Wed, 02 Sep 2009 09:02:01 -0700

Your solution is definitely one way to go, and I think it most closely resembles the pre-probe group model. I'll toss out another method which I'm using and might work for you too: The group notifier goes to people who need to know "everything", with the sub-notifiers going to specific teams.

In my scenario I have a job queue which occasionally backs up, so JobQ length is checked by one of several probes on the database server. If a backlog develops I need to tell the admin team (so we can fix it) and the support team (so they can tell customers we're working on it if they get calls).


Our structure (simplified):

DB Server Probe Group -> Notify: Admin Team
- SNMP HR             -> Notify: NONE (Control Probe)
- IPMI/Health Check   -> Notify: NONE
- DB TCP              -> Notify: NONE
- Job Queue Length    -> Notify: Support Team

The net effect is exactly what I need: Admins get paged whenever the machine is unhappy, and the support team gets notified when the job queue backs up (but doesn't get harassed about dead fans or other stuff they can't fix), and nobody gets more than one email about any given issue.
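
To make the routing concrete, here is a minimal Python sketch of the structure above. It is a toy model, not InterMapper configuration or API: the names GROUP_NOTIFIER, PROBE_NOTIFIERS and recipients_for are made up for illustration, and it assumes that a failing member probe triggers the group's notifier plus that probe's own notifier, if it has one.

```python
# Toy model of the probe group above; hypothetical names, not InterMapper's API.
GROUP_NOTIFIER = "Admin Team"

# Notifier attached to each member probe, or None where the probe has none.
PROBE_NOTIFIERS = {
    "SNMP HR": None,              # control probe
    "IPMI/Health Check": None,
    "DB TCP": None,
    "Job Queue Length": "Support Team",
}


def recipients_for(failed_probe):
    """Return the teams notified when one member probe fails (assumed model)."""
    teams = [GROUP_NOTIFIER]
    own = PROBE_NOTIFIERS[failed_probe]
    # Each team appears once, so nobody is told twice about the same issue.
    if own is not None and own not in teams:
        teams.append(own)
    return teams


if __name__ == "__main__":
    for probe in PROBE_NOTIFIERS:
        print(f"{probe}: {', '.join(recipients_for(probe))}")
```

Under that assumption, only a Job Queue Length failure reaches the support team; every other failure goes to the admin team alone.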



-MG

On Sep 2, 2009, at 11:13 AM, Michael Luz wrote:

(forgive this wordy post, but as I type it out, I'm starting to figure out solutions...)
Problem:
We are monitoring a server with several different probes, and thus created a probe group for that server. We're having some issues now with duplicate notifications going out, and I was wondering if you all can look at what I'm currently doing, and perhaps suggest a better way of setting it up.
Currently, I have a notification set up for (1) the probe group itself, (2) snmp resources, which is the control probe, (3) port 80, (4) port 443, and (5) port 25. When this server had problems recently, a director received separate emails for the probe group, port 80 going down, and port 443 down. I think this is accurate, because the control probe (snmp) never went down, so that's why separate alerts went out for 80 and 443, but the group probe alert seems a bit redundant.
My question is, can I reduce the number of notifications going out somehow on one device?
Do I actually need a notification on the probe group if I have notifications on each probe inside? Or... can I turn off all the probes inside, and let the probe group notification handle everything? (If the probe group is the only notifier, what if it turns red for port 80, then later 443 goes down, will another alert go out? In that instance we WANT another to go out...)
Also, regardless of the above questions, I should make sure that my control probe (usually SNMP) polls more frequently than the other probes to reduce the chance of multiple notifications going out if the server goes down... correct?
So I'm thinking of setting up my notifications as follows:

SERVER1 Probe Group - No notifications
--SERVER1 SNMP Probe - 2 minute poll, no delay on notify
--SERVER1 TCP Port 80 Probe - 2 minutes, 2 minute delay on notify
--SERVER1 TCP Port 443 - 2 minutes, 2 minute delay on notify
--SERVER1 TCP Port 25 - 2 minutes, 2 minute delay on notify
This should reduce the number of duplicates (w/ the probe group notifier) and also by having the delay on the "other" probes, it gives the control probe a chance to actually trigger first and prevent the other notifiers...
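
The timing argument can be checked with a rough Python sketch. Everything here is an assumption made for illustration, not InterMapper's actual behavior or API: each probe polls on a fixed schedule with its own offset, a notifier fires `delay` minutes after the probe first sees the outage, and a dependent notifier is suppressed if the control probe has already seen the outage by then. The offsets are picked as a bad case where SNMP polls just before the server dies and the port probes poll just after.

```python
import math


def first_poll_at_or_after(t, interval, offset):
    """Time of the first poll at or after t, for polls at offset + k * interval."""
    k = max(0, math.ceil((t - offset) / interval))
    return offset + k * interval


def notifications(t_down, probes, control):
    """Names of probes whose notifiers fire under the assumed suppression rule."""
    ctl = probes[control]
    ctl_detect = first_poll_at_or_after(t_down, ctl["poll"], ctl["offset"])
    fired = [control]
    for name, p in probes.items():
        if name == control:
            continue
        fire_at = first_poll_at_or_after(t_down, p["poll"], p["offset"]) + p["delay"]
        # Assumption: once the control probe is down, dependents stay quiet.
        if fire_at < ctl_detect:
            fired.append(name)
    return fired


def make_probes(delay):
    """Hypothetical SERVER1 layout: 2 minute polls, given delay on the TCP probes."""
    return {
        "SNMP": {"poll": 2, "offset": 0.0, "delay": 0},
        "TCP 80": {"poll": 2, "offset": 0.5, "delay": delay},
        "TCP 443": {"poll": 2, "offset": 1.0, "delay": delay},
        "TCP 25": {"poll": 2, "offset": 1.5, "delay": delay},
    }


if __name__ == "__main__":
    # Server dies at t = 0.1 minutes, just after the SNMP poll at t = 0.
    print("no delay:    ", notifications(0.1, make_probes(0), "SNMP"))
    print("2 min delay: ", notifications(0.1, make_probes(2), "SNMP"))
```

In this model the zero-delay case sends four notifications and the two-minute-delay case sends only the SNMP one, because a delay equal to the control probe's poll interval always covers the gap until its next poll.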
Good idea? Am I on the right track? Thanks for your input!

Michael
____________________________________________________________________
List archives:
http://www.mail-archive.com/intermapper-talk%40list.dartware.com/
To unsubscribe: send email to: [email protected]

