Su Ralph created EAGLE-464:
------------------------------
Summary: StateCheck: multiple stage of definition in single policy
Key: EAGLE-464
URL: https://issues.apache.org/jira/browse/EAGLE-464
Project: Eagle
Issue Type: New Feature
Affects Versions: v0.5.0
Reporter: Su Ralph
Assignee: Su Ralph
Fix For: v0.5.0
The requirement for alert states and transitions comes from two real customer
needs.
Alert de-duplication
"IMO, eagle should do state checks for all services. Eagle should not alert in
the first attempt itself. Instead it should change the state to SOFT for 2
tries and then if it is the same state, change the state to HARD and then send
the alert." - Aroop
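The SOFT/HARD behaviour requested in the quote can be sketched as a small state tracker. This is only an illustration, not Eagle's API: the class name `SoftHardCheck`, the `observe` method and the `soft_tries` parameter are hypothetical, and it assumes "SOFT for 2 tries" means the first two consecutive failed checks stay SOFT and the third escalates to HARD.

```python
from dataclasses import dataclass


@dataclass
class SoftHardCheck:
    """Hypothetical per-service state tracker; not part of Eagle."""
    soft_tries: int = 2   # failed checks tolerated in SOFT before escalating (assumed reading of the quote)
    failures: int = 0     # consecutive failed checks seen so far
    state: str = "OK"     # one of "OK", "SOFT", "HARD"

    def observe(self, failed: bool) -> bool:
        """Record one check result; return True only when an alert should be sent."""
        if not failed:
            # Any healthy check resets the tracker.
            self.failures = 0
            self.state = "OK"
            return False
        self.failures += 1
        if self.failures <= self.soft_tries:
            self.state = "SOFT"
            return False
        # Alert only on the SOFT -> HARD transition, not on every later failure.
        alert = self.state != "HARD"
        self.state = "HARD"
        return alert


check = SoftHardCheck()
results = [check.observe(failed) for failed in [True, True, True, True, False, True]]
print(results)  # [False, False, True, False, False, False]
```

Only the third consecutive failure sends an alert; the healthy check resets the tracker, so the final failure starts again in SOFT.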
Currently, Eagle's alert engine (and also that of UMP) uses a simple
deduplication spec: a time-based redundancy check (dedupIntervalMin of
Publishment). This deduplication is not flexible enough to reflect what alerts
need.
A common request is to hold an alert/policy state (an alert state is basically
the policy state for a given partition value; more on this later) and to
trigger an alert when the state changes. The state change could be defined as
> The same alert triggers again within time interval M
> N alerts occur within a given time interval M.
NOTE: In this de-duplication mode, no change to the policy itself is
required.
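The two state-change patterns listed above can both be expressed as "N raw alerts within a window of M", with the first pattern being the N = 2 case. The sketch below is one possible shape for such a check, not Eagle's implementation: `WindowedDedup`, `threshold` and `window_sec` are hypothetical names, state is keyed per (policy, partition value) as described above, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class WindowedDedup:
    """Hypothetical helper: emit once N raw alerts fall inside an M-second window."""

    def __init__(self, threshold: int, window_sec: float):
        self.threshold = threshold      # N
        self.window_sec = window_sec    # M, in seconds
        # One timestamp queue per (policy id, partition value) pair.
        self.events = defaultdict(deque)

    def observe(self, policy_id: str, partition: str, ts: float) -> bool:
        """Record one raw alert; return True when an alert should be emitted."""
        q = self.events[(policy_id, partition)]
        q.append(ts)
        # Drop timestamps that have slid out of the window ending at ts.
        while q and ts - q[0] >= self.window_sec:
            q.popleft()
        if len(q) >= self.threshold:
            q.clear()  # start counting afresh after emitting
            return True
        return False


dedup = WindowedDedup(threshold=3, window_sec=300)
fired = [dedup.observe("policy-1", "host1", t) for t in [0, 100, 400, 450, 500]]
print(fired)  # [False, False, False, False, True]
```

The raw alerts at 0 and 100 have expired by the time the one at 400 arrives, so the count only reaches three at 500. Because this logic sits after policy evaluation, the policy definition itself stays untouched, in line with the note above.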
Alert policy defined on transition
One example is the missingblock policy we encountered (alert only when the
missingblock number changes). There is a more general case with a minor
difference: given a metric (or a field of a given stream), define value
ranges, where each range indicates a different state. For example, for
perfmon.latency.avg.perpool, define the value range states as
metric: perfmon.latency.avg.perpool.5min

value range        state      alert trigger
3000 - Unlimited   FATAL      always (every 5min until FATAL fixed or
                              alert muted explicitly)
1000 - 3000        CRITICAL   on dual transition
100 - 1000         WARN       on dual transition
10 - 50            NORMAL     on worse transition
0 - 10             GOOD       on worse transition
The alert should then be triggered when the state changes, except for FATAL,
which always triggers.
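A rough sketch of how the range table above might drive alerting follows. Several points are assumptions rather than anything stated in this issue: ranges are treated as half-open [low, high); the trigger rule of the newly entered state decides; "dual transition" is read as "alert on any change into this state", "worse transition" as "alert only when the new state is more severe than the previous one"; and the first sample only establishes a baseline. The 50 - 100 range is left unassigned exactly as in the table. All function names are hypothetical.

```python
import math

# (low, high, state, trigger) rows copied from the table, least severe first.
# Half-open [low, high) bounds are an assumption; 50-100 is unassigned in the source.
RANGES = [
    (0, 10, "GOOD", "worse"),
    (10, 50, "NORMAL", "worse"),
    (100, 1000, "WARN", "dual"),
    (1000, 3000, "CRITICAL", "dual"),
    (3000, math.inf, "FATAL", "always"),
]
SEVERITY = {row[2]: rank for rank, row in enumerate(RANGES)}
TRIGGER = {row[2]: row[3] for row in RANGES}


def classify(value):
    """Map a metric value to a state name, or None if no range covers it."""
    for low, high, state, _ in RANGES:
        if low <= value < high:
            return state
    return None


def should_alert(prev_state, new_state):
    """Decide whether entering new_state from prev_state triggers an alert."""
    if new_state is None:
        return False
    trigger = TRIGGER[new_state]
    if trigger == "always":
        return True                      # FATAL alerts on every evaluation
    if prev_state is None or prev_state == new_state:
        return False                     # no baseline yet, or no transition
    if trigger == "dual":
        return True                      # any change into this state (assumed meaning)
    return SEVERITY[new_state] > SEVERITY[prev_state]  # "worse" (assumed meaning)


def run(values):
    """Evaluate a sequence of samples; return (state, alert) per sample."""
    prev, out = None, []
    for v in values:
        state = classify(v)
        out.append((state, should_alert(prev, state)))
        if state is not None:
            prev = state
    return out


alerts = [alert for _, alert in run([5, 20, 500, 20, 5, 3500, 3500, 1500])]
print(alerts)  # [False, True, True, False, False, True, True, True]
```

Under these assumptions, recovering from WARN to NORMAL and from NORMAL to GOOD stays silent, while FATAL fires on every sample until the value leaves that range.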
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)