Re: monitoring parameters

David Nolan Thu, 23 Feb 2006 06:37:50 -0800

--On Wednesday, February 22, 2006 16:46:59 -0600 Nate Reed <[EMAIL PROTECTED]> wrote:

I'm not sure if I have set the monitoring parameters correctly for what I
want to do.

First question: is the monitoring "interval" the frequency that mon runs
the monitor, or does it define something else?


I hate to quote the documentation, but from the manual:
interval timeval

The keyword interval followed by a time value specifies the frequency that a monitor script will be triggered.


So 'interval 30s' means that mon will run the monitor test every 30 seconds.
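
As a rough sketch, the interval sits inside a service definition like this. The hostgroup, host, and service names are placeholders, and fping.monitor is only a stand-in for whatever monitor you actually run:

```
# Hypothetical hostgroup and host name, for illustration only
hostgroup myservers server1.example.com

watch myservers
    service ping
        # run the monitor test every 30 seconds
        interval 30s
        # stand-in monitor; substitute your own
        monitor fping.monitor
```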

It seems like MON is "forgetting" about the previous alert after the
monitoring interval has elapsed (MON_FIRST_FAILURE and MON_LAST_FAILURE
are equal even though there were numerous failures).  Is that what's
supposed to happen?

First and last failure should be the same in certain cases, depending on how long the failure has been happening. First failure indicates when the current failure started; last failure indicates when the most recent monitor test was run. So if your interval is 5 minutes, first and last will be the same for the five minutes immediately following the first detection of a failure.

Ideally, my monitor would run very frequently (every few seconds), but
the monitoring "interval" would be longer, like 30 minutes.  Upon a
second failure during the monitoring interval, my alert script will try
to take a different action than on the first failure.  Is this possible
through Mon's configuration (without building this logic in my script)?

You can do this. The interval setting configures the testing behavior, the alert period definitions configure the alerts (actions) that will occur. You can have multiple periods with different behaviors for different failure lengths or different times of day.


For example, look at these two periods:

period first_action: wd{Sun-Sat}
 alertafter 1
 alert some.alert.script -some -arguments
 numalerts 1
period second_action: wd{Sun-Sat}
 alertafter 30m
 alert some.other.alert.script -some -arguments
 alertevery 30m

Those would run some.alert.script immediately whenever a failure occurs, and some.other.alert.script after the failure has been continuous for half an hour and every half hour after that.
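
Combined with a short interval, a complete stanza might look roughly like the following. This is only a sketch: the hostgroup, host, and service names are made up, fping.monitor stands in for your own monitor, and 10s is an assumed value for "every few seconds".

```
# Hypothetical hostgroup and host name, for illustration only
hostgroup myservers server1.example.com

watch myservers
    service myservice
        # assumed value: test every 10 seconds
        interval 10s
        # stand-in monitor; substitute your own
        monitor fping.monitor
        period first_action: wd{Sun-Sat}
            alertafter 1
            alert some.alert.script -some -arguments
            numalerts 1
        period second_action: wd{Sun-Sat}
            alertafter 30m
            alert some.other.alert.script -some -arguments
            alertevery 30m
```

The interval only controls how often the test runs; the two periods decide which script fires and when.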

See the manual for full information on all the alert control semantics that are available.


-David Nolan
Network Software Designer
Computing Services
Carnegie Mellon University

_______________________________________________
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon
