Re: [rsyslog] feature request

Kendall Green Tue, 30 Dec 2014 10:37:28 -0800

Hello, and thank you for the example configuration for reparse(), which
would help normalization efforts when parsing Windows event messages for
different event IDs.

How would the refactoring of reparse affect the new RainerScript functions
wrap() and replace(), and mmnormalize, which can now also work on a
variable? Are there any example configurations for these functions? The
re_extract() function gives a warning message that it is being deprecated,
so I resorted to using legacy template syntax from the regex generator
tool and assigning the result to a variable with the exec_template()
function. This allows a regex to be used without putting it into the
mmnormalize rulebase, but different regexes would be necessary to reparse
subsections of the messages, depending on the contents or on the event.

Another challenge is that the spacing depends on the contents of the
message, or on event states that change the formatting. These messages
currently come through syslog agents that lose the structure, but it could
possibly be restored with wrap() and replace() to reformat the message
parts, and then mmjsonparse or reparse could turn the key/value pairs into
a different format: key=value, JSON, CEE, CEF, or CSV. A rulebase is able
to define most sections of the message, but the "iptables" type for
key=value pairs is limited. This awesome feature would be far more capable
if one could specify the separators, or some way to define the key:value
wrappers.

Can anyone speak to example use cases for the new features of mmnormalize?
Would directing the output to separate json paths logically circumvent the
caveats regarding unknown results from executing multiple instances of
different rulebases against a message, and how might reparse() take that
into account?

Another odd thing when working with dynafiles and variables is that $!vars
appear to work only with lowercase letters, so rulebase variables that are
uppercase and used in the omfile name need to be copied to another,
lowercase variable, or the output file name is not populated. When dealing
with tens of thousands of clients, there is little room to change anything
about what comes into the central logging service. The raw data output by
Windows has nested structures defined by tabs, carriage returns, and
newlines, which are replaced with 4, 3, or 2 spaces:

"An account was logged off.\r\n\r\nSubject:\r\n\tSecurity
ID:\t\tS-1-5-21-1343760832-931058557-1943201436-1000\r\n\tAccount
Name:\t\tkgreen\r\n\tAccount Domain:\t\tdell\r\n\tLogon
ID:\t\t0x86c35c\r\n\r\nLogon Type:\t\t\t7\r\n\r\nThis event is generated
when a logon session is destroyed. It may be positively correlated with a
logon event using the Logon ID value. Logon IDs are only unique between
reboots on the same computer."

Windows Syslog Agent sends as:
"An account was logged off.    Subject:   Security ID:
S-1-5-21-1343760832-931058557-1943201436-1000   Account Name:  kgreen
Account Domain:  dell   Logon ID:  0x86c35c    Logon Type:   7    This
event is generated when a logon session is destroyed. It may be positively
correlated with a logon event using the Logon ID value. Logon IDs are only
unique between reboots on the same computer."
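
To illustrate the idea outside of rsyslog, here is a minimal Python sketch
(not rsyslog configuration) of how the flattened text could be split back
into sections and key/value pairs. It rests on assumptions that are mine and
not confirmed for any agent: a gap of 4 or more spaces marks a former blank
line and ends the current section, any gap of 2 or more spaces separates
tokens, and a token ending in ":" is a key or a section header. The name
parse_flattened_event and the shortened sample string are hypothetical.

```python
import re

# Hypothetical sample: the flattened event, with the spacing assumed above
# and the closing explanatory text shortened.
FLAT = (
    "An account was logged off.    Subject:   Security ID:  "
    "S-1-5-21-1343760832-931058557-1943201436-1000   Account Name:  kgreen"
    "   Account Domain:  dell   Logon ID:  0x86c35c    Logon Type:   7"
    "    This event is generated when a logon session is destroyed."
)


def parse_flattened_event(text):
    """Rebuild sections and key/value pairs from a space-flattened event."""
    # Split on runs of 2+ spaces, keeping the runs so their width is known.
    parts = re.split(r"( {2,})", text.strip())
    tokens = parts[0::2]                   # the text chunks
    gaps = [len(g) for g in parts[1::2]]   # gaps[i] follows tokens[i]
    result = {"text": []}
    section = None
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        gap_before = gaps[i - 1] if i > 0 else 0
        if gap_before >= 4:
            section = None  # assumed former blank line: leave the section
        if tok.endswith(":") and i + 1 < len(tokens):
            key = tok[:-1]
            nxt = tokens[i + 1]
            if nxt.endswith(":"):
                # A key directly followed by another key is a section header.
                section = key
                result[section] = {}
                i += 1
                continue
            target = result[section] if section else result
            target[key] = nxt
            i += 2
            continue
        result["text"].append(tok)  # free text such as the description
        i += 1
    return result


if __name__ == "__main__":
    parsed = parse_flattened_event(FLAT)
    print(parsed["Subject"])
    print(parsed["Logon Type"])
```

The gap widths carry the only remaining hint of the original nesting, which
is why the sketch keeps the separators instead of discarding them. If an
agent collapses all whitespace to a single width, this approach no longer
works and per-event rules would be needed.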

A disadvantage of mmnormalize is that the msg object is still included in
the json structure after its contents have been parsed into fields,
essentially duplicating the data: the structured result also contains the
unstructured mess of a log.
This goes back to a previous message topic about which properties to
include. I think there should be an option to define, in a template, all
the input and output properties and constants for mmnormalize to use, so
that it works only on that message rather than on msg or userawmsg. I
assume the new features to define a json path, and for templates to output
to a field name, are related to this. Any illustration is appreciated.

I am looking into a workaround for outputting only the json paths related
to the parsed logs without needing to know the name of every field to
define, or for any option to omit a field (%msg/$msg) from the $! path, or
to output $!all!objects. It makes sense to have the json path be
%$!Subject!Security ID% %$!Subject!Account Name%, for example, output by
%!Subject%. I want to experiment more with the json paths and variables,
as well as literal types in the rulebase, to find the best way to go about
Windows event log normalization, but most of all, feedback and community
insight on this topic would be appreciated.
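
As a rough picture of the behaviour I am after, here is a small Python
sketch (again not rsyslog configuration, and not a description of how
rsyslog resolves variables). It shows a "!"-separated path lookup into a
nested structure, plus output of everything except a named field. The
helper names resolve and without, and the sample event, are hypothetical.

```python
import json

# Hypothetical parsed event: nested fields plus the leftover msg text.
event = {
    "msg": "An account was logged off.    Subject:   Security ID: ...",
    "Subject": {
        "Security ID": "S-1-5-21-1343760832-931058557-1943201436-1000",
        "Account Name": "kgreen",
    },
    "Logon Type": "7",
}


def resolve(tree, path):
    """Look up a '!'-separated path such as '!Subject!Account Name'."""
    node = tree
    for part in path.strip("!").split("!"):
        node = node[part]  # raises KeyError if the path does not exist
    return node


def without(tree, *names):
    """Return a copy of the top level with the named fields left out."""
    return {key: value for key, value in tree.items() if key not in names}


if __name__ == "__main__":
    print(resolve(event, "!Subject!Account Name"))
    print(json.dumps(resolve(event, "!Subject"), sort_keys=True))
    print(json.dumps(without(event, "msg"), sort_keys=True))
```

Selecting a whole subtree by its parent path and dropping msg by name are
the two operations that would remove the need to list every field.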

Thank you!

Best Regards,
Kendall Green

On Fri, Dec 19, 2014 at 8:49 AM, Rainer Gerhards <[email protected]>
wrote:

> David,
>
> I had hoped to get to this before departing, but it looks bad now. It's a more
> complex request, as probably the complete workflow around parsers needs to
> be refactored. Currently, the design is that the parser stage is run *once*
> *before* message processing begins.
>
> It would be great if you could open a github issue tracker, so that we
> don't forget about this case (piles of mail tend to be not good for
> tracking, at least for me ;)).
>
> Rainer
>
> 2014-12-06 0:27 GMT+01:00 David Lang <[email protected]>:
> >
> > On Sat, 6 Dec 2014, singh.janmejay wrote:
> >
> >> David, can you please elaborate using a pseudo message and pseudo
> >> config?
> >> I think I get what you are saying, and was wondering about this a little
> >> while ago myself, but this will ensure all of us are on the same page.
> >>
> >
> > so, for the example from today's messages
> >
> > If the format sent is:
> >
> > <pri>syslogtag message
> >
> > this will get parsed in interesting ways depending on if syslogtag looks
> > like a valid hostname or not.
> >
> > the fix I suggested was
> >
> > $format, fixup1="<%pri%>%fromhost% %hostname% %syslogtag%%msg%\n"
> > if $fromhost-ip == '1.2.3.4' then {
> >   @remote;fixup1
> >   stop
> > }
> >
> > if you want to output to a local file as well, this becomes:
> >
> > $format, fixup1="<%pri%>%timestamp% %fromhost% %hostname%
> > %syslogtag%%msg%\n"
> > $format, fixup2="%timestamp% %fromhost% %hostname% %syslogtag%%msg%\n"
> > if $fromhost-ip == '1.2.3.4' then {
> >   @remote;fixup1
> >   /var/log/messages;fixup2
> >   stop
> > }
> >
> > If you have multiple different types of fixups, or are doing complicated
> > things with the logs, this gets _really_ ugly.
> >
> > And if you can't have multiple outputs to the same thing you end up
> > needing to do something like:
> >
> > set $.mymessage = format();
> >
> > in many places, then
> >
> > $format myformat="$.mymessage"
> >
> > and then use myformat in your outputs
> >
> > Instead I am suggesting allowing
> >
> > if $fromhost-ip == '1.2.3.4' then {
> >   set $.myraw = format("<%pri%>%timestamp% %fromhost% %hostname%
> > %syslogtag%%msg%\n");
> >   reparse($.myraw)
> > }
> >
> > This would take the contents of $.myraw (crafted by the admin), put it in
> > $raw, clear $. and $! and then run the parser stack against the 'new' raw
> > message (correctly populating the derived properties)
> >
> > so after the reparse() call, $hostname would have the data previously in
> > $fromhost, $syslogtag would have the data previously in $hostname, etc.
> >
> > is this clearer?
> >
> > David Lang
> >
> >
> >
> >  On Fri, Dec 5, 2014 at 11:35 PM, David Lang <[email protected]> wrote:
> >>
> >>> the question about how to fix up a message prompted a thought. This is a
> >>> pretty common problem, and it can be dealt with by creating a custom
> >>> parsing module, or a custom message modification module, but most of the
> >>> time the fixups that are needed are pretty simple.
> >>>
> >>> so how about adding a reparse($!var) function that would take the
> >>> contents
> >>> of $!var, put it in $rawmsg and run the parser stack against it?
> >>>
> >>> This would allow people to do a lot of the common fixups with a few
> >>> normal
> >>> rsyslog commands and then let the normal parsers populate all the
> >>> variables.
> >>>
> >>> With this approach, there would be a fixup section at the top of the
> >>> config that would clean up the messages, and then clean logic to output
> >>> the
> >>> messages. Currently when you have this sort of thing, you end up with a
> >>> bunch of sections to handle individual broken types of messages with a
> >>> bunch of custom templates, so outputs end up getting specified many
> >>> times.
> >>>
> >>> David Lang
> >>> _______________________________________________
> >>> rsyslog mailing list
> >>> http://lists.adiscon.net/mailman/listinfo/rsyslog
> >>> http://www.rsyslog.com/professional-services/
> >>> What's up with rsyslog? Follow https://twitter.com/rgerhards
> >>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> >>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> >>> DON'T LIKE THAT.
> >>>
> >>>
> >>
> >>
> >>
