Hello, and thank you for the example configuration for reparse(), which would help normalization efforts when parsing Windows event messages for different event IDs.
How would the refactoring for reparse() affect the new RainerScript functions wrap() and replace()? mmnormalize can also work with a variable; are there any example configurations for these functions? The re_extract() function emits a warning that it is being deprecated, so I resorted to using the legacy template syntax from the regex generator tool and assigning the result to a variable with the exec_template() function. This allows a regex to be used without putting it into the mmnormalize rulebase, but different regexes would be needed to reparse subsections of the messages, depending on their contents or on the event.

Another challenge is that the spacing changes depending on the contents of the message or on event states that affect the formatting. The syslog agents currently sending these messages lose the structure, but it could possibly be restored with wrap() and replace() to reformat the message parts, and then mmjsonparse or reparse() could turn the key/value pairs into a different format: key=value, JSON, CEE, CEF, or CSV. A rulebase is able to define most sections of the message, but the "iptables" type for key=value pairs is limited. This awesome feature would be far more capable if one could specify the separators, or had a way to define the key:value wrappers. Can anyone speak to example use cases for the new features of mmnormalize? Would directing output to separate JSON paths logically circumvent the caveats about unknown results when multiple instances with different rulebases are executed against a message, and how might reparse() take this into account?

Another odd thing when working with dynafiles and variables is that $! variables appear to work only with lowercase letters. Rulebase fields with uppercase names that are used in the omfile output name need to be copied to another, lowercase variable, or the output file name is not populated.
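For illustration, a minimal sketch of that lowercase workaround. The field name EventID, the template name PerEventFile, and the path are placeholders I made up for this example, and it assumes mmnormalize has already run and populated $!EventID:

```
# Hypothetical names throughout: $!EventID, PerEventFile, and the path are placeholders.
# Copy the mixed-case rulebase field into an all-lowercase variable first.
set $!eventid = $!EventID;

# Build the dynamic file name from the lowercase copy only.
template(name="PerEventFile" type="string"
         string="/var/log/winevents/%$!eventid%.log")

# Write each message to the file named by the template.
action(type="omfile" dynaFile="PerEventFile")
```

This only restates the workaround described above; it does not explain why the mixed-case name fails to populate the file name.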
With tens of thousands of clients, it is not practical to change anything about what comes into the central logging service. The raw data output by Windows has nested structures defined by tabs, carriage returns, and newlines, which get replaced with 4, 3, or 2 spaces:

"An account was logged off.\r\n\r\nSubject:\r\n\tSecurity ID:\t\tS-1-5-21-1343760832-931058557-1943201436-1000\r\n\tAccount Name:\t\tkgreen\r\n\tAccount Domain:\t\tdell\r\n\tLogon ID:\t\t0x86c35c\r\n\r\nLogon Type:\t\t\t7\r\n\r\nThis event is generated when a logon session is destroyed. It may be positively correlated with a logon event using the Logon ID value. Logon IDs are only unique between reboots on the same computer."

Windows Syslog Agent sends this as:

"An account was logged off. Subject: Security ID: S-1-5-21-1343760832-931058557-1943201436-1000 Account Name: kgreen Account Domain: dell Logon ID: 0x86c35c Logon Type: 7 This event is generated when a logon session is destroyed. It may be positively correlated with a logon event using the Logon ID value. Logon IDs are only unique between reboots on the same computer."

A disadvantage of mmnormalize is that the msg object is still included as part of the JSON structure after its contents have been parsed into fields, essentially duplicating the unstructured mess of a log inside the structured data. This goes back to a previous thread about which properties to include. I think the best option would be to define all properties to include in a template, covering input and output properties and constants, so that mmnormalize works only on that message rather than on msg or the raw message (userawmsg). I assume the new features for defining a JSON path, and for templates that output to a field name, are related to this. Any illustration would be appreciated.

I am looking for a workaround to output only the JSON paths related to the parsed logs without needing to know and define the name of every field, or for an option to omit a field (%msg/$msg) from the $! path, or to output $!all!objects. It makes sense for the JSON paths to be, for example, %$!Subject!Security ID% and %$!Subject!Account Name%, output by %!Subject%.

I want to experiment more with the JSON paths and variables, as well as literal types in the rulebase, to find the best way to go about Windows event log normalization, but most of all, feedback and community insight on this topic would be appreciated.

Thank you!

Best Regards,
Kendall Green

On Fri, Dec 19, 2014 at 8:49 AM, Rainer Gerhards wrote:
> David,
>
> I had hope to get to this before departing, but looks bad now. It's a more
> complex request, as probably the complete workflow around parsers needs to
> be refactored. Currently, the design is that the parser stage is run *once*
> *before* message processing begins.
>
> It would be great if you could open a github issue tracker, so that we
> don't forget about this case (piles of mail tend to be not good for
> tracking, at least for me ;)).
>
> Rainer
>
> 2014-12-06 0:27 GMT+01:00 David Lang:
> >
> > On Sat, 6 Dec 2014, singh.janmejay wrote:
> >
> >> David, can you please elaborate using an pseudo message and pseudo config?
> >> I think I get what you are saying, and was wondering about this a little
> >> while ago myself, but this will ensure all of us are on the same page.
> >>
> >
> > so, for the example from today's messages
> >
> > If the format sent is:
> >
> > <pri>syslogtag message
> >
> > this will get parsed in interesting ways depending on if syslogtag looks
> > like a valid hostname or not.
> >
> > the fix I suggested was
> >
> > $format, fixup1="<%pri%>%fromhost% %hostname% %syslogtag%%msg%\n"
> > if $fromhost-ip == '1.2.3.4' then {
> >   @remote;fixup1
> >   stop
> > }
> >
> > if you want to output to a local file as well, this becomes:
> >
> > $format, fixup1="<%pri%>%timestamp% %fromhost% %hostname% %syslogtag%%msg%\n"
> > $format, fixup2="%timestamp% %fromhost% %hostname% %syslogtag%%msg%\n"
> > if $fromhost-ip == '1.2.3.4' then {
> >   @remote;fixup1
> >   /var/log/messages;fixup2
> >   stop
> > }
> >
> > If you have multiple different types of fixups, are doing complicated
> > things with the logs this gets _really_ ugly.
> >
> > And if you can't have multiple outputs to the same thing you end up
> > needing to do something like:
> >
> > set $.mymessage = format();
> >
> > in many places, then
> >
> > $format myformat="$.mymessage"
> >
> > and then use myformat in your outputs
> >
> > Instead I am suggesting allowing
> >
> > if $fromhost-ip == '1.2.3.4' then {
> >   set $.myraw = format("<%pri%>%timestamp% %fromhost% %hostname% %syslogtag%%msg%\n");
> >   reparse($.myraw)
> > }
> >
> > This would take the contents of $.myraw (crafted by the admin), put it in
> > $raw, clear $. and $! and then run the parser stack against the 'new' raw
> > message (correctly populating the derived properties)
> >
> > so after the reparse() call, $hostname would have the data previously in
> > $fromhost, $syslogtag would have the data previously in $hostname, etc.
> >
> > is this clearer?
> >
> > David Lang
> >
> >
> > On Fri, Dec 5, 2014 at 11:35 PM, David Lang wrote:
> >>>
> >>> the question about how to fix up a message prompted a thought. This is a
> >>> pretty common problem, and it can be dealt with by creating a custom
> >>> parsing module, or a custom message modification module, but most of the
> >>> time the fixups that are needed are pretty simple.
> >>>
> >>> so how about adding a reparse($!var) function that would take the contents
> >>> of $!var, put it in $rawmsg and run the parser stack against it?
> >>>
> >>> This would allow people to do a lot of the common fixups with a few normal
> >>> rsyslog commands and then let the normal parsers populate all the
> >>> variables.
> >>>
> >>> With this approach, there would be a fixup section at the top of the
> >>> config that would clean up the messages, and then clean logic to output the
> >>> messages. Currently when you have this sort of thing, you end up with a
> >>> bunch of sections to handle individual broken types of messages with a
> >>> bunch of custom templates, so outputs end up getting specified many times.
> >>>
> >>> David Lang
> >>> _______________________________________________
> >>> rsyslog mailing list
> >>> http://lists.adiscon.net/mailman/listinfo/rsyslog
> >>> http://www.rsyslog.com/professional-services/
> >>> What's up with rsyslog? Follow https://twitter.com/rgerhards
> >>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> >>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> >>> DON'T LIKE THAT.

