Ok,
This probably has more to do with me not "getting it", sorry.
Here's what I have so far:
# Match IP SLA events
# Jun 3 11:39:08 10.48.36.39 334031: 334645: Jun 3 11:39:08.208 BST: %RTT-4-OPER_TIMEOUT: condition occurred, entry number = 3550
# Jun 3 11:39:33 10.48.36.39 334037: 334651: Jun 3 11:39:33.232 BST: %RTT-4-OPER_TIMEOUT: condition cleared, entry number = 3550
#
# The two rules below should watch for a "condition occurred" event and,
# if no matching "condition cleared" event comes in for the same IP with the
# same probe number BEFORE a new "condition occurred" event comes in for that
# device/probe #, then we need to trigger an alert
# (note that probe # is the "entry number =" in the example above)
type = single
desc = email when an alert is seen while one is asserted
continue = takenext
context = alert_$1_$2
ptype = regexp
rem = $1 is ip address, $2 is entry number
pattern = ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*occurred, entry number = (\d+)
action = pipe '$0' /usr/bin/mail -s "IP SLA - Condition occurred without clear" cdu...@cisco.com; delete alert_$1_$2; reset +1 match the alert for host $1 and event $2

type = pair
desc = match the alert for host $1 and event $2
ptype = regexp
rem = same pattern as in single rule
pattern = ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*cleared, entry number = (\d+)
action = create alert_$1_$2
desc2 = match the clear for host $1 and event $2
ptype2 = regexp
pattern2 = $1.*occurred, entry number = ($2)
rem = %1 and %2 are used to reference $1 and $2 from the first pattern
rem = since this is run after pattern2 is matched, $1 and $2 are
rem = reassigned by the pattern2 match
action2 = delete alert_%1_%2; pipe '$0' /usr/bin/mail -s "IP SLA - something happened" cdu...@cisco.com
time = 0
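
As a sanity check outside SEC, the two regular expressions from the rules
above can be tried against the sample lines in Python. This is only a rough
sketch: it assumes Python's re module treats these particular patterns the
same way Perl does, and the names OCCURRED, CLEARED and captures are made up
for this illustration. It shows what ends up in $1 and $2, nothing about the
correlation logic.

```python
import re

# Patterns copied from the single rule and the pair rule above.
IP = r"([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})"
OCCURRED = re.compile(IP + r".*occurred, entry number = (\d+)")
CLEARED = re.compile(IP + r".*cleared, entry number = (\d+)")

# Sample messages from the comments at the top of the config, unwrapped.
occurred_line = (
    "Jun 3 11:39:08 10.48.36.39 334031: 334645: Jun 3 11:39:08.208 BST: "
    "%RTT-4-OPER_TIMEOUT: condition occurred, entry number = 3550"
)
cleared_line = (
    "Jun 3 11:39:33 10.48.36.39 334037: 334651: Jun 3 11:39:33.232 BST: "
    "%RTT-4-OPER_TIMEOUT: condition cleared, entry number = 3550"
)


def captures(regex, line):
    """Return the captured groups (what SEC would call $1, $2) or None."""
    match = regex.search(line)
    return match.groups() if match else None


print(captures(OCCURRED, occurred_line))  # ('10.48.36.39', '3550')
print(captures(CLEARED, cleared_line))    # ('10.48.36.39', '3550')
print(captures(OCCURRED, cleared_line))   # None
```

So both patterns pick up the IP address and the entry number from the lines
they are meant for, and neither matches the other kind of message.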
So, if I run these rules using /var/log/syslog as input:
sec -conf /etc/sec.conf -debug 10 -input=/var/log/syslog
I (eventually) get this:
Creating context 'alert_10.48.36.34_3820'
Creating context 'alert_10.48.36.40_3712'
Creating context 'alert_10.48.36.40_3747'
...snip
Creating context 'alert_10.48.36.34_3652'
Feeding event 'Jun 4 19:23:21 10.48.36.40 188461: 188997: *Jun 4 19:23:17.318 BST: %RTT-4-OPER_TIMEOUT: condition occurred, entry number = 3712' to shell command '/usr/bin/mail -s "IP SLA - Condition occurred without clear" cdu...@cisco.com'
Child 730 created for command '/usr/bin/mail -s "IP SLA - Condition occurred without clear" cdu...@cisco.com'
Deleting context 'alert_10.48.36.40_3712'
Context 'alert_10.48.36.40_3712' deleted
Cancelling the correlation operation with key '/etc/sec.conf | 1 | match the alert for host 10.48.36.40 and event 3712'
Creating context 'alert_10.48.36.41_3208'
But, when I look at my syslog.log, I see the matching events (in order), so
I don't know why it triggered an alert:
Jun 4 19:07:51 10.48.36.40 188373: 188909: *Jun 4 19:07:47.297 BST: %RTT-4-OPER_TIMEOUT: condition occurred, entry number = 3712
Jun 4 19:08:16 10.48.36.40 188377: 188913: *Jun 4 19:08:12.285 BST: %RTT-4-OPER_TIMEOUT: condition cleared, entry number = 3712
Jun 4 19:23:21 10.48.36.40 188461: 188997: *Jun 4 19:23:17.318 BST: %RTT-4-OPER_TIMEOUT: condition occurred, entry number = 3712
Jun 4 19:23:46 10.48.36.40 188464: 189000: *Jun 4 19:23:42.303 BST: %RTT-4-OPER_TIMEOUT: condition cleared, entry number = 3712
So here we have one "occurred" followed by one "cleared", and then the same
pair again.
If the incoming messages had looked like this instead:
Jun 4 19:07:51 10.48.36.40 188373: 188909: *Jun 4 19:07:47.297 BST: %RTT-4-OPER_TIMEOUT: condition occurred, entry number = 3712
Jun 4 19:23:21 10.48.36.40 188461: 188997: *Jun 4 19:23:17.318 BST: %RTT-4-OPER_TIMEOUT: condition occurred, entry number = 3712
Jun 4 19:23:46 10.48.36.40 188464: 189000: *Jun 4 19:23:42.303 BST: %RTT-4-OPER_TIMEOUT: condition cleared, entry number = 3712
That's when I need an email to be triggered, because a second "occurred"
came in before any "cleared" event for that device/probe.
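
To make the intent concrete, here is a rough Python sketch of the behaviour
I'm after. It is not SEC and not something I'm running; the names EVENT,
PREFIX and find_repeated_occurred are made up for this illustration. It keeps
a set of device/probe pairs with an uncleared "occurred" and flags any
"occurred" that arrives while its pair is still in the set.

```python
import re

# One regex for both event types: IP address, occurred/cleared, entry number.
EVENT = re.compile(
    r"([0-9]{1,3}(?:\.[0-9]{1,3}){3})"
    r".*condition (occurred|cleared), entry number = (\d+)"
)


def find_repeated_occurred(lines):
    """Return an (ip, entry) pair for each 'occurred' that arrives while an
    earlier 'occurred' for the same ip/entry has not been cleared yet."""
    pending = set()  # (ip, entry) pairs with an uncleared 'occurred'
    alerts = []
    for line in lines:
        match = EVENT.search(line)
        if match is None:
            continue  # not an IP SLA timeout message
        ip, state, entry = match.groups()
        key = (ip, entry)
        if state == "occurred":
            if key in pending:
                alerts.append(key)  # the case that should send the email
            pending.add(key)
        else:
            pending.discard(key)  # 'cleared' ends the pending state
    return alerts


PREFIX = "%RTT-4-OPER_TIMEOUT: condition "

# What my syslog actually contains: occurred, cleared, occurred, cleared.
normal = [
    "Jun 4 19:07:51 10.48.36.40 188373: 188909: *Jun 4 19:07:47.297 BST: " + PREFIX + "occurred, entry number = 3712",
    "Jun 4 19:08:16 10.48.36.40 188377: 188913: *Jun 4 19:08:12.285 BST: " + PREFIX + "cleared, entry number = 3712",
    "Jun 4 19:23:21 10.48.36.40 188461: 188997: *Jun 4 19:23:17.318 BST: " + PREFIX + "occurred, entry number = 3712",
    "Jun 4 19:23:46 10.48.36.40 188464: 189000: *Jun 4 19:23:42.303 BST: " + PREFIX + "cleared, entry number = 3712",
]

# The hypothetical stream with the first "cleared" missing.
missing_clear = [normal[0], normal[2], normal[3]]

print(find_repeated_occurred(normal))         # []
print(find_repeated_occurred(missing_clear))  # [('10.48.36.40', '3712')]
```

In other words, the first stream should stay quiet and only the second one
should produce an email.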
I don't know if the pattern tests in my rules are correct - do I look for
"occurred", then "cleared"? Or do I look for two "occurred" events?
Sorry for my noobness, I tell all of my customers they should be using SEC,
but I don't get many opportunities to use it myself since I'm usually
"advising" :-)
Hopefully I haven't confused you more :-)
On Fri, Jun 4, 2010 at 10:54 AM, John P. Rouillard <rou...@cs.umb.edu> wrote:
>
> In message <aanlktikm18pi-oixtb9q9r1fjphfxpoukxe2qxmqr...@mail.gmail.com>,
> Clayton Dukes writes:
> >Any insight you can give would be greatly appreciated, I would love to get
> >this working :-)
> >
>
> Sorry but which non-matching message is it triggering on? What is
> your input stream?
>
> Given the input you show (in the comments at the top of the config
> file is your input stream) a fast glance at the output looks like it's
> operating as expected and emails an SLA clear message. However looking
> at the rule set I can't see how it would ever match the pattern2 in
> the pair rule (there is no "SLAs" in the clear message) so I assume
> you either pasted something that you aren't running or your input
> stream is different and not supplied.
>
> So afraid not much help here as I can't see what you are doing and all
> my attempts to make sense of the info you supplied are resulting in
> nonsense.
>
> I'm afraid in your attempt to be brief you have created much
> confusion here. Maybe less brief and more detailed in what you are
> actually using for input, ruleset, and output and your analysis of
> what you expected to see and why would be helpful.
>
> >On Thu, Jun 3, 2010 at 11:06 AM, Clayton Dukes <cdu...@gmail.com> wrote:
> >
> >> Thanks for the *awesome* response John!
> >>
> >> Here's what I've set up.
> >> If I run the rule below using 'sec -conf /etc/sec.conf -debug 10
> >> -input=/var/log/syslog'
> >> I get:
> >> SEC (Simple Event Correlator) 2.4.2
> >> Reading configuration from /etc/sec.conf
> >> 2 rules loaded from /etc/sec.conf
> >> Creating context 'alert_10.48.36.42_4087'
> >> Deleting context 'alert_10.48.36.42_4087'
> >> Context 'alert_10.48.36.42_4087' deleted
> >> Feeding event 'Jun 3 16:00:05 10.48.36.42 187792: 188237: Jun 3 16:00:04.214 BST: %RTT-3-IPSLATHRESHOLD: IP SLAs(4087): Threshold Cleared for timeout' to shell command '/usr/bin/mail -s "IP SLA - Cleared" cdu...@cisco.com'
> >> Child 26406 created for command '/usr/bin/mail -s "IP SLA - Cleared" cdu...@cisco.com'
> >
> >> So, the question is:
> >> Why does it trigger on a non-matching message?
> >> See notes in the config below for what I was trying to accomplish,
> >> hopefully a bit more succinct this time :-)
> >>
> >>
> >>
> >>
> >> # Match IP SLA events
> >> # Jun 3 11:39:08 10.48.36.39 334031: 334645: Jun 3 11:39:08.208 BST: %RTT-4-OPER_TIMEOUT: condition occurred, entry number = 3550
> >> # Jun 3 11:39:33 10.48.36.39 334037: 334651: Jun 3 11:39:33.232 BST: %RTT-4-OPER_TIMEOUT: condition cleared, entry number = 3550
> >> #
> >> # The two rules below should watch for a "condition occurred" event and,
> >> # if no matching "condition cleared" event comes in for the same IP with
> >> # the same probe number BEFORE a new "condition occurred" event comes in
> >> # for that device/probe #, then we need to trigger an alert
> >> # (note that probe # is the "entry number =" in the example above)
> >>
> >> type = single
> >> desc = email when an alert is seen while one is asserted
> >> continue = takenext
> >> context = alert_$1_$2
> >> ptype = regexp
> >> rem = $1 is ip address, $2 is entry number
> >> pattern = ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*condition occurred, entry number = (\d+)
> >> action = pipe '$0' /usr/bin/mail -v "Dropped syslog entry. Found alert while it was pending" cdu...@cisco.com; delete alert_$1_$2; reset +1 match the alert for host $1 and event $2
> >>
> >> type = pair
> >> desc = match the alert for host $1 and event $2
> >> ptype = regexp
> >> rem = same pattern as in single rule
> >> pattern = ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*condition cleared, entry number = (\d+)
> >> action = create alert_$1_$2
> >> desc2 = match the clear for host $1 and event $2
> >> ptype2 = regexp
> >> pattern2 = $1.*SLAs\($2\)
> >> rem = %1 and %2 are used to reference $1 and $2 from the first pattern
> >> rem = since this is run after pattern2 is matched, $1 and $2 are
> >> rem = reassigned by the pattern2 match
> >> action2 = delete alert_%1_%2; pipe '$0' /usr/bin/mail -s "IP SLA - Cleared" cdu...@cisco.com
> >> time = 0
> >>
> >> #; shellcmd do something else useful
>
> --
> -- rouilj
> John Rouillard
> ===========================================================================
> My employers don't acknowledge my existence much less my opinions.
>
--
______________________________________________________________
Clayton Dukes
______________________________________________________________
_______________________________________________
Simple-evcorr-users mailing list
Simple-evcorr-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/simple-evcorr-users