Re: [Simple-evcorr-users] "multi-line" and multi-file logs - out of box

Richard Ostrochovský Sat, 30 Nov 2019 13:16:55 -0800

hi Risto,

thank you for sharing. You are undoubtedly right: commonly the "delimiter"
pattern is the beginning of the message itself (which cannot be said of
"\n", as it is not part of the beginning). Delimiting the end of a message
is then optional in cases where detecting the beginning of the next message
is sufficient. So my original "specification" was not as useful as it could
have been.
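
To make the point about start patterns concrete, here is a minimal sketch in
Perl. It is not part of SEC: the function name join_messages and the
date-based start pattern are assumptions made up for illustration. A line
matching the start pattern opens a new message, and the end of a message is
implied by the start of the next one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Group physical lines into logical messages (hypothetical helper).
# A line matching $start_re opens a new message; any other line is
# appended to the message currently being collected. Lines seen before
# the first start line are discarded.
sub join_messages {
    my ($start_re, @lines) = @_;
    my (@messages, $current);
    for my $line (@lines) {
        chomp(my $text = $line);
        if ($text =~ $start_re) {
            # The start of a new message implicitly ends the previous one.
            push @messages, $current if defined $current;
            $current = $text;
        } elsif (defined $current) {
            # Continuation line: drop leading whitespace, join with a space.
            (my $cont = $text) =~ s/^\s+//;
            $current .= " $cont";
        }
    }
    # Flush the last message once the input is exhausted.
    push @messages, $current if defined $current;
    return @messages;
}

# Assumed sample input: every message starts with a YYYY-MM-DD date.
my @input = (
    "stray line\n",
    "2019-11-30 first message\n",
    "  continued\n",
    "2019-11-30 second message\n",
);
print "$_\n" for join_messages(qr/^\d{4}-\d{2}-\d{2} /, @input);
```

The drawback of relying only on the start pattern is visible in the final
flush: on a live log, the last message can be emitted only when the next one
begins (or the input ends), so an explicit end pattern or a timeout would
still be needed where that delay matters.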


Being a Perl developer myself, I could surely implement something more or
less along the lines you suggested (and I probably will). But I also consider
it important to explain why I am asking such questions.

First of all:

   - I like what SEC does
   - I like how SEC does it (although it is not for "everybody", as
   described below)

I work as a monitoring consultant and developer. My job is to implement
monitoring solutions and hand them over, documented, to the "operations
guys" for further administration. These people are often not developers, but
SEC configuration is more like programming than the plain configuration that
"regular" administrators are used to (as I see it). At first sight this is
good for me: when they need to change something and configuring SEC is
outside their area of expertise, they depend on me, and I have my next job.
But this is not how I think. I really try to build solutions that "regular"
administrators can also understand and maintain, by keeping them as simple
as possible. The time I save by not doing this low-level configuration for
them repeatedly I can then spend on higher-level solutions with more added
value (I prefer teaching people to fish over selling them fish).

There is a lot of staff turnover in IT, and SEC training for an
administrator's replacement sounds like science fiction to me. Therefore I am
suggesting that the most common things should be doable out of the box, from
command-line options or some basic configuration, without the need to
write rules for everything (yes, I know, it is neither easy nor clear to
tell which things are "most common"; as you said, the possibilities are
infinite :) ). This also relates to my next post (which I am currently
writing); maybe you will like those other ideas more than these ones
(multi-line & multi-file).

In this light, your suggestion with RegExpN is also interesting, as it can
reduce the number of rules (fewer rules => less complexity) and the scripting
complexity, in cases where it is applicable.

Thank you again, and for making SEC available in the first place :).

Richard

On Thu, 28 Nov 2019 at 11:33, Risto Vaarandi <risto.vaara...@gmail.com>
wrote:

> hi Richard,
>
> just one followup thought -- have you considered sec native multi-line
> patterns such as RegexpN for handling multi-line logs? Of course, there are
> scenarios where the value of N (max number of lines in a multi-line event)
> can be very large and is difficult to predict, and for such cases an event
> converter (like the one from the previous post) is probably the best
> approach. However, if you know the value of N and events can be matched
> with regular expressions, RegexpN pattern type can be used for this task.
> For example, if events have the brace separated format described in the
> previous post, and events can contain up to 20 lines, one could utilize the
> following Regexp20 pattern for matching:
>
> type=Single
> ptype=RegExp20
> pattern=(?s)^(?:.+\n)?\{\n(.+\n)?\}$
> desc=match a multiline event between braces
> action=write - $1
>
> Also, if you want to convert such multi-line events into a single-line
> format with builtin features, sec 'rewrite' action allows for that. In the
> following example, the first rule takes the multi-line data between braces
> and replaces each newline with a space character, and resulting single-line
> string (with a prefix "Converted event: ") is used for overwriting sec
> event buffer. The second rule is now able to match such converted events:
>
> type=Single
> ptype=RegExp20
> pattern=(?s)^(?:.+\n)?\{\n(?:(.+)\n)?\}$
> continue=TakeNext
> desc=convert a multiline event between braces to single-line format
> action=lcall %ret $1 -> ( sub { my($t) = $_[0]; $t =~ s/\n/ /g; return $t; } ); \
>        rewrite 20 Converted event: %ret
>
> type=Single
> ptype=RegExp
> pattern=Converted event: (.*)
> desc=match any event
> action=write - $1
>
> Maybe the above examples are helpful for getting additional insights into
> different ways of processing multi-line events.
>
> kind regards,
> risto
>
>
> hi Richard,
>>>>
>>> ...
>>>
>>>> In the current code base, identifying the end of each line is done with
>>>> a simple search for newline character. The newline is searched not with a
>>>> regular expression, but rather with index() function which is much faster.
>>>> It is of course possible to change the code, so that a regular expression
>>>> pattern is utilized instead, but that would introduce a noticeable
>>>> performance penalty. For example, I made a couple of quick tests with
>>>> replacing the index() function with a regular expression that identifies
>>>> the newline separator, and when testing modified sec code against log files
>>>> of 4-5 million events, cpu time consumption increased by 25%.
>>>>
>>>
>>> Hmm, this is interesting. A question of principle came to my mind:
>>> could this penalty be reduced if the regular newline characters ("\n")
>>> were replaced, and the "line" endings matched with a regexp, through
>>> rules (or in some other, more external way) before further processing
>>> by subsequent rules, instead of through a potential built-in feature
>>> (used optionally on particular logfiles)?
>>>
>>>
>> Perhaps I can add a few thoughts here. Since the number of multi-line
>> formats is essentially infinite, converting a multi-line format into a
>> single-line representation externally (i.e., outside sec) offers the most
>> flexibility. For instance, in many cases there is no delimiter as such
>> between messages, but beginning and end of the message contain different
>> character sequences that are part of the message. In addition, any lines
>> that are not between valid beginning and end should be discarded. It is
>> clear that using one regular expression for matching delimiters is not
>> addressing this scenario properly. Also, one can imagine many other
>> multi-line formats, and coming up with a single builtin approach for all of
>> them is not possible. On the other hand, a custom external converter allows
>> for addressing a given event format exactly as we like. For example,
>> suppose we are dealing with the following format, where multi-line event
>> starts with a lone opening brace on a separate line, and ends with a lone
>> closing brace:
>>
>> {
>>   line1
>>   line2
>>   ...
>> }
>>
>> For converting such events into a single line format, the following
>> simple wrapper could be utilized (written in 10 minutes):
>>
>> #!/usr/bin/perl -w
>> # the name of this wrapper is test.pl
>>
>> if (scalar(@ARGV) != 1) { die "Usage: $0 <file>\n"; }
>> $file = $ARGV[0];
>> if (!open(FILE, "tail -F $file |")) { die "Can't start tail for $file\n"; }
>> $| = 1;
>>
>> while (<FILE>) {
>>   chomp;
>>   if (/^\{$/) { $message = $_; }
>>   elsif (/^}$/ && defined($message)) {
>>     $message .= $_;
>>     print $message, "\n";
>>     $message = undef;
>>   }
>>   elsif (defined($message)) {
>>     $message .= $_;
>>   }
>> }
>>
>> If this wrapper is then started from sec with 'spawn' or 'cspawn' action,
>> multi-line events from monitored file will appear as single-line synthetic
>> events for sec. For example:
>>
>> type=Single
>> ptype=RegExp
>> pattern=^(?:SEC_STARTUP|SEC_RESTART)$
>> context=SEC_INTERNAL_EVENT
>> desc=fork the converter when sec is started or restarted
>> action=spawn ./test.pl my.log
>>
>> type=Single
>> ptype=RegExp
>> pattern=\{importantmessage\}
>> desc=test
>> action=write - important message was received
>>
>> The second rule fires if the following 4-line event is written into
>> my.log:
>>
>> {
>> important
>> message
>> }
>>
>> My apologies if the above example is a bit laconic, but hopefully it
>> conveys the overall idea of how to set up an event converter. And writing a
>> suitable converter often does not take that much time, plus you get
>> something which is tailored exactly to your needs :)
>>
>> kind regards,
>> risto
>>
>>
>>

