Hi Risto,

Basically, we are developing a tool with the help of SEC for log event correlation. We mostly work on log inputs that are a week old or more. All these alarm and syslog events follow certain patterns, so if a correlation pattern matches the logs, we can conclude that the issue is probably a particular known one. We mostly use keywords from the inputs to match the events.

Our actual intention is to run one correlation file containing 3 to 5 rules, accept the inputs from the command line, and store the matched events: matches from syslog only in the syslog monitor file, and matches from the alarm file only in the alarm monitor file. The input events should be checked within a certain time interval of 1-2 hours, which needs to be passed on the command line.

The pattern you gave is working well. Thank you for that.

I'm passing the inputs in the following way (I'm in the sec directory, which contains all these monitor files along with the input files):

perl sec -conf=ntp.conf --input=alamon.me --input=mon.me -debug=4

On another terminal, I'm executing the following commands (also under the sec directory):

cat alarms-TCU >> alamon.me ; cat syslog-TCU >> mon.me

I see the output on the console where SEC is running, but when I open the alamon.me file, it just contains all the input lines from alarms-TCU.

Could you please help me with how to execute the same correlation file with two inputs as described above?

Thanks & Regards,
Karthik

On Tue, Jun 16, 2015 at 7:45 PM, Risto Vaarandi <risto.vaara...@gmail.com> wrote:

> hi Karthik,
> you could try the following rule:
>
> type=Pair
> ptype=RegExp
> pattern=ALARM RAISE SP=70307.*Threshold=lnr
> desc=alarm raise
> action=none
> ptype2=RegExp
> pattern2=^(\w{3}\s+\d{1,2}\s+[\d.:]+) .* logf\[\d+\]: logf started
> desc2=write - logf matched at $1, restart is within window
> action2=write - %s
> window=70
>
> Instead of perlfunc patterns, regular expressions are much simpler for
> recognizing the input events, since as I have understood, you only need to
> identify few words in input events.
>
> I also noticed that the timestamps of your events are more than three
> months old. Are you using sec for correlating events in real-time, or is
> your focus on processing past events which are several weeks/months old? If
> you are doing post-factum analysis of old event logs, the above rule would
> still work. However, the window of 70 seconds can probably be replaced with
> a much smaller value, since the value no longer indicates a real-time
> window between incoming events, but the scanning time of an already
> existing data set.
>
> kind regards,
> risto
>
> 2015-06-16 12:35 GMT+03:00 Rajesh M <rajesh68.embed...@gmail.com>:
>
>> Hi Risto,
>>
>> Thanks lot for your valuable comment.
>>
>> The below is the example i'm trying to correlate:
>>
>> type=Pair
>> ptype=perlfun
>> pattern=sub {if($_[0] =~ ("RAISE" and 70307 and "lnr")) { \
>> return defined($_[1]) ? $_[1] : "RAISE";} return 0;}
>> desc=70307 is matched
>> action=write -.Window for restart...
>> #ptype2=RegExp
>> #pattern2=Mar+\s+\d+\s+(\S+)+\s+\w+\s+(\S+)+\s+logf[\d]:logf started
>> #desc2=$0
>> #action2=write - logf matched at $0. restart is within window!
>> #window=80
>>
>> Here in the above example,
>>
>> 1. Based on the RAISE alarm event 70307 and keyword "lnr" as first rule
>> and followed by the restart which uses the keyword "logf" as second rule...
>> I am trying to correlate these two with the input as alarm file and syslog
>> file. Here our intention is to associate the events with the RAISE and
>> 70307 and lnr and logf. We are in search of these logs as command output.
>>
>> Here there is a time gap of 70sec between the RAISE event and logf
>> started event.
>>
>> Here are the logs which is concerned to this:
>>
>> Alarm:
>>
>>
>> Mar 13 15:58:26.444097 info CLA-0 /opt/nokiasiemens/S[13619]: ALARM RAISE
>> SP=70307
>> MO=fshwPIUId=piu-5,fshwEquipmentHolderId=chassis-2,fshwEquipmentHolderId=cabinet-1,fsFragmentId=HW,fsClusterId=ClusterRoot
>> AP=/HPIMonitor SE=2 IINFO="Unit={ADSP1-B} Position=/chassis-2/slot-5
>> Sensor={number=44,Name=DSP-TI-0 +3.3V,Threshold=lnr}" NINFO="0.392"
>> TIME=1426242506444 UTCSHIFT=330
>>
>>
>> Syslog:
>>
>>
>> Mar 13 15:59:31.327684 info TCU-12 logf[722]: logf started
>>
>>
>> For the same, how much ever I am trying to execute it is not happening at
>> all. Please help/provide me with the right pattern file for this.
>>
>>
>> Thanks & Regards,
>>
>> Karthik
>>
>> On Tue, Jun 16, 2015 at 2:06 PM, Risto Vaarandi <risto.vaara...@gmail.com> wrote:
>>
>>> hi Karthik,
>>> before starting developing an event correlation rule for these events,
>>> the following questions need to be answered:
>>>
>>> 1) do you want to detect the situations where the "NTPMonitorTask
>>> executeCB(): sysPeer not chosen" is *followed* by the "ALARM RAISE" event,
>>> or can the order of events vary?
>>>
>>> 2) do these two events have specific fields which must have identical
>>> values for both events? (For example, is CLA-0 a hostname in your example
>>> events which can take many different values, and would you actually like to
>>> associate two events based on their hostname?)
>>>
>>> 3) what is the maximum number of seconds between the occurrence times of
>>> those two events (is it 5 seconds, 60 seconds, or something else?)
>>>
>>> The following rule is a simple example which assumes that
>>> "NTPMonitorTask executeCB(): sysPeer not chosen" is followed by "ALARM
>>> RAISE" within 60 seconds, and that these two events do not have any
>>> specific fields with identical values:
>>>
>>> type=Pair
>>> ptype=RegExp
>>> pattern=CLA-0 NTPMonitor\[\d+\]: NTPMonitorTask executeCB\(\): sysPeer
>>> not chosen for \s*\d+ times Reporting Critical Out of Sync Alarm
>>> desc=NTPMonitorTask critical alarm
>>> action=none
>>> ptype2=RegExp
>>> pattern2=ALARM RAISE SP=\d+ MO=/CLA-0/FSClusterNTPServer/NTPMonitor
>>> AP=/CLA-0/FSClusterNTPServer/NTPMonitor SE=2 IINFO="Clock Sync"
>>> NINFO="sysPeer not chosen " TIME=\d+ UTCSHIFT=\d+
>>> desc2=NTPMonitorTask critical alarm was followed by ALARM RAISE
>>> action2=write - %s
>>> window=60
>>>
>>> After the sequence of these two events has been observed, the rule
>>> writes the string "NTPMonitorTask critical alarm was followed by ALARM
>>> RAISE" to standard output.
>>>
>>> Like David has already mentioned, if your events are not always arriving
>>> in this particular order, you might need to use contexts for setting up a
>>> correlation scheme. As an alternative, you could also take advantage of the
>>> EventGroup2 rule.
>>>
>>> Assuming that your syslog events are written to /var/log/messages and
>>> ALARM RAISE messages are logged to /var/log/alarms.log, the command line
>>> for monitoring these two log files simultaneously could be the following:
>>>
>>> sec --conf=/etc/sec//test.karthik --input=/var/log/messages
>>> --input=/var/log/alarms.log
>>>
>>> In other words, you can repeat the --input command line option several
>>> times for specifying more than one input source.
>>>
>>> hope this helps,
>>> risto
>>>
>>>
>>> 2015-06-15 18:38 GMT+03:00 Rajesh M <rajesh68.embed...@gmail.com>:
>>>
>>>> Hello All,
>>>>
>>>> It would be very appreciate if you would help me to get through the
>>>> below scenario.
>>>>
>>>> 1. I have the following message from the alarm file as active alarm
>>>> raise event:
>>>>
>>>> 2015 Mar 16 19:08:57 ALARM RAISE SP=70377
>>>> MO=/CLA-0/FSClusterNTPServer/NTPMonitor
>>>> AP=/CLA-0/FSClusterNTPServer/NTPMonitor SE=2 IINFO="Clock Sync"
>>>> NINFO="sysPeer not chosen " TIME=1426504137696 UTCSHIFT=480
>>>>
>>>> 2. I have one more log called syslog which also contains the info
>>>> related to these alarm raise event.
>>>>
>>>> Mar 16 19:08:57.696843 warn CLA-0 NTPMonitor[3561]: NTPMonitorTask
>>>> executeCB(): sysPeer not chosen for 40 times Reporting Critical Out of
>>>> Sync Alarm
>>>>
>>>> Mar 16 19:08:57.697347 info CLA-0 NTPMonitor[3561]: ALARM RAISE
>>>> SP=70377 MO=/CLA-0/FSClusterNTPServer/NTPMonitor
>>>> AP=/CLA-0/FSClusterNTPServer/NTPMonitor SE=2 IINFO="Clock Sync"
>>>> NINFO="sysPeer not chosen " TIME=1426504137696 UTCSHIFT=480
>>>>
>>>> 3. I need to correlate the alarm raise event in alarm file to the
>>>> syslog "NTP Monitor" info along with the same alarm in syslog file around
>>>> the same time stamps.
>>>>
>>>> Our way of idea/implementation is if EVENT-1 occurs in alarm EVENT-2
>>>> will follow in the syslog. So joining of these two events as One
>>>> Correlation Rule for monitoring.
>>>>
>>>> Please provide us about your valuable references and examples in doing
>>>> the same :) .
>>>>
>>>>
>>>> Thanks & Regards,
>>>> Karthik
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>>
>>>> _______________________________________________
>>>> Simple-evcorr-users mailing list
>>>> Simple-evcorr-users@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/simple-evcorr-users
>>>>
>>>>
>>>
>>
>