Hi Risto,

Thank you! I am going to try the solution now.

best,

Yuheng


On Sat, Jul 26, 2014 at 10:08 AM, Risto Vaarandi <risto.vaara...@gmail.com>
wrote:

>
>
>
> 2014-07-25 21:49 GMT+03:00 Yuheng Du <yuhe...@clemson.edu>:
>
>> > From your previous mails, I got an impression that you just want to match
>> > two consecutive events. Do you actually want to have a rule for detecting
>> > if a machine fails to send a keepalive message after 10 seconds from
>> > previous message?
>>
>> Yes. I want a rule for detecting if a machine fails to send a keepalive
>> message after 10s from previous message. The machine is identified by its
>> "deploymentId".
>>
>> > Do you want to get the notification when the keepalive is missing, or
>> > also for *every* successfully received keepalive?
>>
>> I want to get notified only when keepalive is missing.
>>
>
> if you have many nodes and you want the shortest and simplest solution,
> you can try the following:
>
> type=single
> ptype=regexp
> pattern=\"deploymentId\"\s+=>\s+(\S+)deployment#(\S+)\",
> desc=match keepalive
> action=create KEEPALIVE_$2 10 ( write - keepalive for $2 not received )
>
> Each time a message comes in from some node, a context is set up which
> exists for 10 seconds. If the context expires (this happens when more than
> 10 seconds have elapsed since its creation), a warning message is written
> to standard output. The context can only expire for the node if no messages
> have been received from this node for more than 10 seconds, since each
> message recreates the context, setting its lifetime to 10 seconds from the
> current moment. If the interval between messages can occasionally be a few
> seconds longer (due to message transmission lags, for example), you can set
> the context lifetime to a somewhat larger value (like 15 seconds).
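
For anyone who wants to reason about the rule without running sec, here is a
minimal Python sketch of the mechanism described above. It is not how sec is
implemented; it only replays timestamped lines and reports nodes whose context
would have expired. The function name missing_keepalives, the sample lines and
the timestamps are made up for illustration, and the sample lines assume an
opening quote before "deployment#", which the regular expression requires.

```python
import re

# Same regular expression as in the Single rule above.
PATTERN = re.compile(r'\"deploymentId\"\s+=>\s+(\S+)deployment#(\S+)\",')
LIFETIME = 10  # context lifetime in seconds, as in the rule


def missing_keepalives(events, end_time, lifetime=LIFETIME):
    """Replay (timestamp, line) pairs sorted by time.

    Returns a list of (node, last_seen) tuples, one for every context
    that outlived its lifetime without being recreated.
    """
    contexts = {}  # node id -> creation time of its current context
    alerts = []

    def expire(now):
        # A context expires once more than `lifetime` seconds have elapsed.
        for node, created in sorted(contexts.items(), key=lambda kv: kv[1]):
            if now - created > lifetime:
                alerts.append((node, created))
                del contexts[node]

    for timestamp, line in events:
        expire(timestamp)
        match = PATTERN.search(line)
        if match:
            # Each keepalive recreates the context for that node only.
            contexts[match.group(2)] = timestamp
    expire(end_time)
    return alerts


# Hypothetical input: srb_4 goes silent after t=3, srb_2 after t=16.
SAMPLE_EVENTS = [
    (0, '"deploymentId" => "deployment#srb_2",'),
    (3, '"deploymentId" => "deployment#srb_4",'),
    (8, '"deploymentId" => "deployment#srb_2",'),
    (16, '"deploymentId" => "deployment#srb_2",'),
]

print(missing_keepalives(SAMPLE_EVENTS, end_time=30))
# prints [('srb_4', 3), ('srb_2', 16)]
```

Because the state is keyed by the second capture group, a message from srb_2
never refreshes the timer for srb_4, which is the per-node behaviour the
KEEPALIVE_$2 context name gives you in the rule.
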
>
> The rule above has a few drawbacks - for example, if an important node has
> stopped transmitting messages before you start sec, you will never know
> about it. To fix this problem, you could trigger keepalive checks
> explicitly from a Calendar rule, for example:
>
> type=calendar
> time=* * * * *
> desc=trigger keepalive check for critical nodes
> action=event keepalive_check_srb_2; \
>        event 10 keepalive_check_srb_2; \
>        event 20 keepalive_check_srb_2; \
>        event 30 keepalive_check_srb_2; \
>        event 40 keepalive_check_srb_2; \
>        event 50 keepalive_check_srb_2
>
> type=pairwithwindow
> ptype=regexp
> pattern=keepalive_check_(\S+)
> desc=keepalive check for $1
> action=write - keepalive for $1 not received
> ptype2=regexp
> pattern2=\"deploymentId\"\s+=>\s+\S+deployment#$1\",
> desc2=keepalive received for %1
> action2=none
> window=10
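
The idea behind the two rules above can be sketched in Python as follows. This
is a simplified model under stated assumptions, not sec's actual behaviour:
every check is treated independently, so it ignores how sec handles a check
event that arrives while an earlier operation for the same node is still open.
The function name missed_checks and the timestamps are hypothetical.

```python
def missed_checks(check_times, keepalive_times, window=10):
    """Return the check times that no keepalive answered within `window` seconds.

    check_times models the synthetic keepalive_check events from the
    Calendar rule; keepalive_times models the keepalive messages of one node.
    """
    keepalives = sorted(keepalive_times)
    missed = []
    for check in check_times:
        # A check is answered by any keepalive that follows it within the window.
        answered = any(check < t <= check + window for t in keepalives)
        if not answered:
            missed.append(check)
    return missed


# Checks at seconds 0, 10, ..., 50 of a minute, as generated by the Calendar rule.
CHECKS = list(range(0, 60, 10))

# A node that stops after t=22 misses the later checks.
print(missed_checks(CHECKS, [4, 13, 22]))
# prints [30, 40, 50]

# A node that was already silent before startup misses every check.
print(missed_checks(CHECKS, []))
# prints [0, 10, 20, 30, 40, 50]
```

The second call is the case the Single rule alone cannot catch: with no
keepalive ever seen, no context is created, while the explicit checks still
raise a warning.
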
>
> Another question which is left open is how to keep the keepalive state
> across sec restarts. If you take the previous Single rule which sets up
> contexts, you can save/restore these contexts at shutdowns/restarts. Here
> is the relevant FAQ entry:
> http://simple-evcorr.sourceforge.net/FAQ.html#15
>
> hope this helps,
> risto
>
>
>> Thanks.
>>
>> best,
>>
>> Yuheng
>>
>>
>> On Fri, Jul 25, 2014 at 2:20 PM, Risto Vaarandi <risto.vaara...@gmail.com>
>> wrote:
>>
>>> 2014-07-24 19:13 GMT+03:00 Yuheng Du <yuhe...@clemson.edu>:
>>>
>>>> Hi guys,
>>>>
>>>> I want to correlate events so that if I hear/do not hear a message
>>>> coming from the same machine within 10s, I get notified.
>>>>
>>>
>>> From your previous mails, I got an impression that you just want to
>>> match two consecutive events. Do you actually want to have a rule for
>>> detecting if a machine fails to send a keepalive message after 10 seconds
>>> from previous message? Do you want to get the notification when the
>>> keepalive is missing, or also for *every* successfully received keepalive?
>>> BR,
>>> risto
>>>
>>>
>>>> I am using an EventGroup rule to do this:
>>>>
>>>> type=EventGroup
>>>> ptype=RegExp
>>>> thresh=2
>>>> window=10
>>>> pattern=\"deploymentId\"\s+=>\s+(\S+)deployment#(\S+)\",
>>>> desc=CHECK_INTERVAL_$2
>>>> action=assign %deploymentId $2;\
>>>>        create deploymentId_$2;\
>>>>        create DEPLOYMENTID_CONTEXT;\
>>>> write - $2 heart beats heard within 10s.
>>>> slide=reset 0 %s;
>>>> end=write - $2 not heard for 10s since last receive event.;\
>>>>     create $2_HEARTBEAT_TIMEOUT;\
>>>>     event $2 not heard for 10s.
>>>>
>>>> However, the pattern can only identify messages coming from ANY
>>>> deploymentId, while I want it to identify any messages coming from a
>>>> SPECIFIC deploymentId.
>>>> like in:
>>>>
>>>> "deploymentId" => deployment#srb_2",
>>>> "deploymentId" => deployment#srb_4",
>>>> "deploymentId" => deployment#srb_2",
>>>>
>>>> I only want to correlate messages coming from srb_2 alone or srb_4
>>>> alone.
>>>>
>>>> Does anyone have a suggestion for how I can do it with an EventGroup rule?
>>>>
>>>> Or should I just switch to the single/singlewiththreshold method as John
>>>> suggested on the list
>>>> http://sourceforge.net/p/simple-evcorr/mailman/message/32640664/ ?
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Want fast and easy access to all the code in your enterprise? Index and
>>>> search up to 200,000 lines of code with a free copy of Black Duck
>>>> Code Sight - the same software that powers the world's largest code
>>>> search on Ohloh, the Black Duck Open Hub! Try it now.
>>>> http://p.sf.net/sfu/bds
>>>> _______________________________________________
>>>> Simple-evcorr-users mailing list
>>>> Simple-evcorr-users@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/simple-evcorr-users
>>>>
>>>>
>>>
>>
>