On Fri, 19 Nov 2004, Joubin Moshrefzadeh wrote:

> host1 goes down - 1 alert sent
> then host2 goes down - 2 alerts sent
> then host3 goes down - 3 alerts sent
> etc...
>
> so total alerts sent is 1+2+3...+10?
>
> is the latter correct? I've only tested it up to two hosts going down
> consecutively :)

it's correct depending on how you configure mon. this is the default
behavior, but you can change it. i noticed the man page needed some
updating, so i did so and checked in the changes to the cvs tree on the
mon-1-0-0pre1 branch. the part which affects this behavior is the
"alertevery" parameter. here's a summary:

ALERT DECISION LOGIC
       Upon a non-zero or zero exit status, the associated alert or
       upalert program (respectively) is started, pending the following
       conditions:

       If an alert for a specific service is disabled, do not send an
       alert.

       If dep_behavior is set to 'a', and a parent dependency is
       failing, then suppress the alert.

       If the alert has previously been acknowledged, do not send the
       alert, unless it is an upalert.

       If an alert is not within the specified period, record the
       failure via syslog(3) and do not send an alert.

       If the failure does not fall within a defined period, do not
       send an alert.

       No upalerts are sent without corresponding down alerts, unless
       no_comp_alerts is defined in the period section. An upalert will
       only be sent if the previous state is a failure.

       If an alert was already sent within the last alertevery interval
       and the monitor has continued to report a nonzero exit status
       for a time period less than that interval, do not send another
       alert, unless the summary output from the most recent monitor
       process differs from the previous. Otherwise, send an alert
       using each alert program listed for that period. The
       observe_detail argument to alertevery affects this behavior by
       observing the changes in the detail part of the output in
       addition to the summary line. If a monitor has successive
       failures and the summary output changes in each of them,
       alertevery will not suppress multiple consecutive alerts. The
       reasoning is that if the summary output changes, then a
       significant event occurred and the user should be alerted.

       The "ignore_summary" option will suppress all successive alerts
       while the service continues to fail, even if the summary output
       changes.

       If the "strict" alertevery option is used, then behave the same
       as if "ignore_summary" was set, but do not reset the alertevery
       timer when the monitor exits with a zero status. For example,
       "alertevery 24h strict" will only send out an alert once every
       24 hours, regardless of whether the monitor output changes, or
       if the service stops and then starts failing.

...

       alertevery timeval [observe_detail | ignore_summary | strict ]
              The alertevery keyword (within a period definition) takes
              the same type of argument as the interval variable, and
              limits the number of times an alert is sent when the
              service continues to fail. For example, if the interval
              is "1h", then the alerts in the period section will only
              be triggered once every hour as the service continues to
              fail. The alertevery interval timer will be reset if the
              monitor stops exiting with a nonzero exit status (i.e. it
              reports a success). If the alertevery keyword is omitted
              in a period entry, an alert will be sent out every time a
              failure is detected. By default, if the summary output of
              two successive failures changes, then the alertevery
              interval is overridden, and an alert will be sent. The
              "ignore_summary" argument suppresses this behavior. If
              the string "observe_detail" is the last argument, then
              both the summary and detail output lines will be
              considered when comparing the output of successive
              failures. If the string "strict" is the last argument,
              then the output of the monitor or the state change of the
              service will have no effect on when alerts are sent. That
              is, "alertevery 24h strict" will send only one alert
              every 24 hours, no matter what. Please refer to the ALERT
              DECISION LOGIC section for a detailed explanation of how
              alerts are suppressed.

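
To make the alertevery rules above concrete, here is a rough Python sketch
of how the four variants might behave. It is only a model of the text in
the man page excerpt, not mon's actual code: the AlertEveryGate class, the
alert_times helper and the sample host names are all made up for
illustration, and it assumes that an alert sent because the output changed
also restarts the alertevery timer, which the excerpt does not state.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# One monitor result: (time in seconds, failed?, summary line, detail text)
Result = Tuple[float, bool, str, str]


@dataclass
class AlertEveryGate:
    """Hypothetical model of one period's alertevery setting (not mon's code)."""
    interval: float                  # alertevery timeval, in seconds
    mode: str = "default"            # default | observe_detail | ignore_summary | strict
    last_alert: Optional[float] = None
    last_output: Optional[Tuple[str, str]] = None

    def on_result(self, now: float, failed: bool,
                  summary: str = "", detail: str = "") -> bool:
        """Return True if a down alert would be sent for this result."""
        if not failed:
            # A success resets the interval timer, except under "strict".
            if self.mode != "strict":
                self.last_alert = None
            self.last_output = None
            return False

        # Only observe_detail compares the detail text as well as the summary.
        output = (summary, detail if self.mode == "observe_detail" else "")
        changed = self.last_output is not None and output != self.last_output
        self.last_output = output

        within = self.last_alert is not None and now - self.last_alert < self.interval
        overrides = changed and self.mode in ("default", "observe_detail")
        if within and not overrides:
            return False             # suppressed by alertevery

        # Assumption: an alert sent because the output changed restarts the timer.
        self.last_alert = now
        return True


def alert_times(mode: str, results: List[Result], interval: float = 3600) -> List[float]:
    """Replay results through a fresh gate and return the times alerts were sent."""
    gate = AlertEveryGate(interval=interval, mode=mode)
    return [t for (t, failed, summary, detail) in results
            if gate.on_result(t, failed, summary, detail)]


# Ten hosts in one group fail a minute apart; the summary lists every failed host.
growing = [(i * 60, True, " ".join(f"host{n}" for n in range(1, i + 2)), "")
           for i in range(10)]
# A service fails, recovers, and fails again with the same summary.
flapping = [(0, True, "host1", ""), (60, False, "", ""), (120, True, "host1", "")]

for mode in ("default", "ignore_summary", "strict"):
    print(mode, len(alert_times(mode, growing)), alert_times(mode, flapping))
```

With a one hour interval, the model sends ten alerts for the growing
failure under the default setting (the summary changes every time), but
only one under "ignore_summary" or "strict". For the flapping service,
"default" and "ignore_summary" alert at both failures because the success
in between resets the timer, while "strict" alerts only once.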