Thank you, everyone, for your thoughtful responses. After thinking more about normalization, and given the recent contributions that should help with the much-needed to-string rulebase type, it looks like these problems will soon be resolved.

The new feature gives the mmnormalize rulebase functionality similar to how the field() function can split on a single character or a string separator. It also provides an alternate or optional value, for example a 'quoted-string' field that falls back to the 'word' data type when the quotes are missing and the value is a single word. I would rather have one long, messy rule definition than potentially hundreds of rules for one message format (one that changes depending on the contents of its fields). These rules are long and difficult to read anyway, because there are many fields and it takes many rules to cover the formatting variations that depend on whether each field is populated, which is the root of the problem.

Although most of those who responded to the related thread dislike the idea of an 'or' type, I am still interested in this feature for select data type combinations. If an alternate type can be set on a field, what about an alternate tag too? A hyphen as the alternate tag could then automatically exclude the field when it holds the alternate value, or some specific value. This would conflict with the linear syntax model, but perhaps something like the tag1 example below, which excludes the field if the value is a hyphen:

    %tag1:quoted-string:|:-:char:\x2d%

    %<tag>:<type>:<extraData>:|:<alternateTag>:<alternateType>:<alternateData>%

I know that is ugly, but imagine an ugly log with a lot of fields. Take just three key/value pairs, each of which may hold something different when it is not populated: a single character (hyphen or space) or the string "Null". For example:

    Username: NT Authority Domain: Null IP Address: -
    Username: kgreen Domain: kendallarium IP Address: 10.0.0.10
    Username: Domain: Null IP Address: 10.0.0.1

If the rulebase included a char type and a string type, to match and capture under the field name/tag, or to exclude the field if the tag is set to a hyphen...

    Username: %User:to-string:Domain:|-:char:\x20% Domain: %Domain:word:|:-:string:Null% IP Address: %IP:ipv4:|:-:char:\x2d%

These examples emphasize excluding data by means of the hyphen tag, although there are plenty of cases where one may want to extract a field only if it is matched. The char and string literal types do not exist today and would be nice to have, but the main feature is the '|' ("or") between a word, number, IP address, quoted string, or character/string and a specific char or to-string. In some cases this could be substituted by an array of key-value pairs with arbitrary wrappers, covering whether the data is quoted and whether it is separated by equals signs, colons, commas, and/or spaces, with the format defined by the user as extra data, or maybe even by a template name with a string to define it, extending the iptables type.

That would not give the same control over which fields, data types, and alternates are selected, nor the ability to exclude on an alternate, as the previously described features do. What it does do is shorten the rulebase when there is a well-defined key-value pattern (kvp). It could possibly take advantage of formats that use colons, spaces, and equals signs, with or without quotes, similar to mmjsonparse or mmfields, but as a string format, without requiring cee declarations...
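
To make the key-value idea a bit more concrete, here is a minimal Python sketch of a parser whose wrappers are user-defined: the quote around keys and values, the separator between key and value, and the delimiter between pairs. This is only an illustration of the concept, not rsyslog or liblognorm code; the function name parse_kvp and its parameters are made up for this example, and the two-space delimiter in the second call is an assumption about how the sample log is laid out.

```python
import re


def parse_kvp(text, quote="", sep="=", delim=", "):
    """Parse text into a dict using user-defined wrappers (hypothetical helper).

    quote: string wrapped around each key and value ("" for none)
    sep:   string between a key and its value
    delim: string between one pair and the next
    """
    q = re.escape(quote)
    # Key and value are matched lazily so they stop at the first wrapper.
    pattern = re.compile(
        q + r"(.*?)" + q + re.escape(sep) + q + r"(.*?)" + q
        + r"(?:" + re.escape(delim) + r"|$)"
    )
    # Note: finditer() silently skips text that does not fit the pattern;
    # a real parser type would more likely fail the rule instead.
    return {m.group(1): m.group(2) for m in pattern.finditer(text)}


# Quoted pairs, '=' between key and value, ', ' between pairs.
quoted = '"Username"="kgreen", "Domain"="kendallarium", "IP Address"="10.0.0.10"'
print(parse_kvp(quoted, quote='"', sep="=", delim=", "))

# Unquoted pairs, ': ' between key and value, two spaces between pairs (assumed).
plain = "Username: NT Authority  Domain: Null  IP Address: -"
print(parse_kvp(plain, quote="", sep=": ", delim="  "))
```

The point of passing the wrappers as parameters is that one definition covers both layouts, which is what a single kvp rule would buy over a separate rule per field combination.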

The wrappers would somehow need to be defined for the style of pair, key: "value" or key=value. On that level, with lexical parsing of the same three keys as tag/label and value, and abstracting from the iptables type, suppose we could define the wrappers in literal form. Perhaps something like this better illustrates a key-value pair (kvp) rulebase.

Example from above, as format [Key: Value ]:

    %kvp:key\x3a\x20value\x20\x20%

The kvp type would need extra data that includes the key and value and defines the format of the wrappers, in this case "key: value key: value". The three chosen fields would be automatically extracted and labeled in the output, the same way the iptables type does:

    "Username"="NT Authority", "Domain"="Null", "IP Address"="-"
    "Username"="kgreen", "Domain"="kendallarium", "IP Address"="10.0.0.10"
    "Username"=" ", "Domain"="Null", "IP Address"="10.0.0.1"

Example using the output right above, to better illustrate format wrappers, as format ["Key"="Value", ]:

    %kvp:\x22key\x22\x3d\x22value\x22\x2c\x20%

Well, I hope this helps better explain these concepts for normalization features. They are complementary to the new to-string and alternate data type options, which would probably resolve the majority of the obstacles anyway.

I appreciate your time and your willingness to entertain these ideas.

Thank you,
Kendall Green

On Thu, Jan 8, 2015 at 4:37 PM, Eugene Istomin <[email protected]> wrote:

> > Is anyone out there taking in raw windows, proxy, or other complex logs and
> > breaking them apart for either simplification or volume reduction?
>
> We are using CEE modification on syslog gates like:
>
> template(name="cee" type="list") {
>     constant(value="<") property(name="pri") constant(value=">")
>     property(name="timereported" dateFormat="rfc3339")
>     constant(value=" ") property(name="$myhostname")
>     constant(value=" ") property(name="programname")
>     constant(value=" ")
>     constant(value="@cee: {")
>     #SYSLOG
>     constant(value="\"using_cee_relp\":\"yes\", ")
> ....
>
> On rsyslog-elasticsearch side ->
>
> ##STORE RULESETS
> ruleset(name="store") {
>     if $parsesuccess == "OK" then {
>
>         #ES - views
>         if ( strlen( $!msg_class) >= 1 and strlen( $!msg_view) >= 1 and $!msg_class != "msg") then {
>             set $.dynafile = "1.parsed";
>             if $.relp_server == '127.0.0.1' then {
>                 call store_es
>                 call file_json
>
>             } else {
>                 call file_raw
>             }
>             stop
>         }
> ....
>
> Do you have a more detailed question? Rsyslog is an awesome message
> queuing engine =)
> /---/
> */Best regards,/*
> /Eugene Istomin/
>
> > this is an interesting discussion, I'd be curious to see what people are
> > doing to parse/normalize messages as they are coming in.
> >
> > Several projects that I'm associated with tend to revolve around
> > extrapolating properties of a message and then assigning them (typically
> > post-receipt and generating more load.)
> >
> > Is anyone out there taking in raw windows, proxy, or other complex logs and
> > breaking them apart for either simplification or volume reduction? (for
> > example eliminating the description field of windows events while enhancing
> > the event properties before sending on down the wire?)
> >
> > On Tue Dec 30 2014 at 3:20:12 PM David Lang <[email protected]> wrote:
> > > On Tue, 30 Dec 2014, Kendall Green wrote:
> > > > Hello, I would like to share experience with normalization of windows
> > > > event logs with rsyslog and have critique of configuration for the
> > > > latest syntax directives and supported functions. In response to a
> > > > previous message regarding the reparse() feature enhancement, there
> > > > appears to be imminent refactoring of parser modules.
> > >
> > > parser modules are not the same as the mmnormalize rulebase, parser
> > > modules are applied to messages as they arrive at the rsyslog server and
> > > populate the standard properties, mmnormalize is intended to populate
> > > other variables..
> > >
> > > > Is it possible to output mmnormalize rulebase to json path and output on
> > > > template which does not include the msg/userawmsg field?
> > >
> > > if you have JSON, you should use mmjsonparse to extract the variables, but
> > > once you have the variables parsed out, you can use them in any template.
> > >
> > > To give you more information, I would need a better idea of what you are
> > > trying to do.
> > >
> > > > Thank you for any recommendations or examples related to new
> > > > normalization modules.
> > >
> > > While there may be enhancements to the normalization, that is completely
> > > separate from the parser modules (I know, it's a bit confusing)
> > >
> > > David Lang
> > > _______________________________________________
> > > rsyslog mailing list
> > > http://lists.adiscon.net/mailman/listinfo/rsyslog
> > > http://www.rsyslog.com/professional-services/
> > > What's up with rsyslog? Follow https://twitter.com/rgerhards
> > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> > > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> > > DON'T LIKE THAT.

