I think any proper solution first requires (re)design of the sample language.
Rainer

2015-01-28 11:57 GMT+01:00 David Lang <[email protected]>:

> you have two separate things here that I'll reply to separately.
>
> 1. placeholder instead of value (i.e. apache putting in - instead of IP
> address)
>
> I don't think this needs to be as general as you are making it
>
>> %<tag>:<type>:<extraData>:|:<alternateTag>:<alternateType>:
>> <alternateData>%
>
> why would you need to call it something different (an alternate tag)
>
> If we want to do this on a per-tag basis which is a reasonable option, can
> we put in as an extraData tag and have it apply to all types? There's
> already manipulation there (lowercase, etc), have alternateconstant:value
> or something like that.
>
> The other approach I was thinking of would require a change to the ruleset
> parser to allow something like
>
> set nullValue = '-'
> rule
> rule
> rule
> set nullValue = ':'
> rule
> rule
> rule
> set nullValue =''
> rule
> rule
>
> rather than specifying it for every tag. I'm making the assumption that a
> given application is going to be consistent in what its null value is, and
> you will probably still have multiple rules for that log type.
>
> David Lang
>
> On Tue, 27 Jan 2015, Kendall Green wrote:
>
>> Date: Tue, 27 Jan 2015 23:26:28 -0700
>> From: Kendall Green <[email protected]>
>> Reply-To: rsyslog-users <[email protected]>
>> To: rsyslog-users <[email protected]>
>> Subject: Re: [rsyslog] rsyslog normalization
>>
>> Thank you everyone for your thoughtful responses. After contemplating
>> normalization and recent contributions that will be great in helping with
>> much needed to-string rule base type, it looks like these problems will
>> soon be resolved.
>>
>> The new feature provides functionality to mmnormalize rulebase, similar to
>> how field() function can parse separators on a char or string, and also
>> provide alternates or optional value for 'quoted-string' as 'word' data
>> type if missing quote as single word.
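A minimal sketch of the scoped `set nullValue` idea quoted above, written in Python rather than rulebase syntax since no such directive exists in the rulebase parser today. Everything here is an assumption made for illustration: the directive tuples, the rule names (`apache_access`, `other_app`, `plain`), the helper names, and the reading of `set nullValue =''` as "no placeholder handling".

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical directive stream: ("set", value) changes the placeholder for
# all following rules; ("rule", name) stands in for a real rulebase rule.
Directive = Tuple[str, str]


def assign_null_values(directives: Iterable[Directive]) -> Dict[str, Optional[str]]:
    """Map each rule name to the nullValue in effect where it was declared."""
    current: Optional[str] = None
    scoped: Dict[str, Optional[str]] = {}
    for kind, arg in directives:
        if kind == "set":
            # Assumption: an empty string switches placeholder handling off.
            current = arg or None
        elif kind == "rule":
            scoped[arg] = current
        else:
            raise ValueError(f"unknown directive: {kind!r}")
    return scoped


def drop_null_fields(fields: Dict[str, str], null_value: Optional[str]) -> Dict[str, str]:
    """Return a copy of fields without entries equal to the placeholder."""
    if null_value is None:
        return dict(fields)
    return {k: v for k, v in fields.items() if v != null_value}


rules = assign_null_values([
    ("set", "-"), ("rule", "apache_access"),
    ("set", ":"), ("rule", "other_app"),
    ("set", ""), ("rule", "plain"),
])
parsed = {"clientip": "-", "verb": "GET"}
print(drop_null_fields(parsed, rules["apache_access"]))
```

The point of the sketch is only that the placeholder is looked up per rule from the nearest preceding `set`, so it does not have to be repeated on every tag.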
>>
>> I'd rather have a mess of a long rule definition rather than potentially
>> hundreds of rules for one message format (that changes depending on the
>> contents of the fields). These rules are long and difficult to read anyway
>> because of having a lot of fields and so many rules to compose the
>> variations of formatting depending on if fields are populated or not, which
>> is the cause of the situation. -- Although most who responded to the
>> related message thread dislike the idea of an 'or' type, I'm still interested
>> in this feature for select data type combinations.
>>
>> If setting an alternate type on the field, what about alternate tag too,
>> which could also be used to place the hyphen in the tag to auto exclude the
>> field if it were the alternate value, or specific value, such as the tag1
>> example, which would conflict with the linear syntax model, but perhaps, if
>> tag1 and exclude the tag if 'hyphen', %tag1:quoted-string:|:-:char:\x2d%
>>
>> %<tag>:<type>:<extraData>:|:<alternateTag>:<alternateType>:
>> <alternateData>%
>>
>> I know that is ugly, but imagine an ugly log, with a lot of fields, and let's
>> take just three key values, which exhibit the potential of output something
>> different, if not populated, a char, (hyphen/space) or string "Null" for
>> example:
>>
>> Username: NT Authority Domain: Null IP Address: -
>> Username: kgreen Domain: kendallarium IP Address: 10.0.0.10
>> Username: Domain: Null IP Address: 10.0.0.1
>>
>> If rulebase included char type and string type, to match and capture on
>> fieldname/tag or exclude if is set to hyphen...
>>
>> Username: %User:to-string:Domain:|-:char:\x20% Domain:
>> %Domain:word:|:-:string:Null% IP Address: %IP:ipv4:|:-:char:\x2d%
>>
>> These examples emphasize excluding the data with hyphen tag, although there
>> are plenty of cases where one may want to extract a field only if it is
>> matched, simply use the nonexistent char and string literal types would be
>> nice to have, but
>> the main feature of '|' or word, number, ip address, quoted string, or
>> character/string to specific char/to-string, and in some cases could be
>> substituted for array of key value pairs which have arbitrary wrappers for
>> if the data is quoted, with equal symbols, or colons, comma, and/or space
>> separated, for user defined format as extradata, or maybe even template
>> name with string to define, to extend the iptables type. This would not
>> give such control to select which fields, and data types, and alternates,
>> nor ability to exclude on alternate, as previously described features do...
>>
>> What this does do, is shorten rule base when there are well defined key
>> value pattern (kvp), which could possibly leverage what formats use colons
>> and spaces and equals with or without quotes, similar to mmjsonparse or
>> mmfields, but with cee declarations not required, as string format...
>> Somehow to define the wrappers for the type of key: "value" or key=value
>>
>> So on that level, with lexical parsing of the same 3 key as tag/label and
>> values, abstracting from the iptables type, as if we had ability to define
>> wrap, in a literal form, perhaps something like this better illustrates key
>> value pairs (kvp) rulebase:
>>
>> Example from above as format [Key: Value ]
>> %kvp:key\x3a\x20value\x20\x20%
>>
>> kvp type would need extra data that includes key and value define format of
>> wrappers, in this case key: value key: value
>> -the 3 chosen fields output would be auto field extracted and labeled, same
>> as iptables type does:
>> "Username"="NT Authority", "Domain"="Null", "IP Address"="-"
>> "Username"="kgreen", "Domain"="kendallarium", "IP Address"="10.0.0.10"
>> "Username"=" ", "Domain"="Null", "IP Address"="10.0.0.1"
>>
>> Example from right above, to better illustrate format wrappers
>> ["Key"="Value", ]
>> %kvp:\x22key\x22\x3d\x22value\x22\x2c\x20%
>>
>> Well, I hope this helps better explain the concepts for normalization
>> features, complementary the new to-string and alt data type options which
>> would probably resolve majority of the obstacles anyway. I appreciate your
>> time to convey and entertaining these ideas,
>>
>> Thank you,
>> Kendall Green
>>
>> On Thu, Jan 8, 2015 at 4:37 PM, Eugene Istomin <[email protected]> wrote:
>>
>>>> Is anyone out there taking in raw windows, proxy, or other complex logs and
>>>> breaking them apart for either simplification or volume reduction?
>>>
>>> We are using CEE modification on syslog gates like:
>>>
>>> template(name="cee" type="list") {
>>>     constant(value="<") property(name="pri") constant(value=">")
>>>     property(name="timereported" dateFormat="rfc3339")
>>>     constant(value=" ") property(name="$myhostname")
>>>     constant(value=" ") property(name="programname")
>>>     constant(value=" ")
>>>     constant(value="@cee: {")
>>>     #SYSLOG
>>>     constant(value="\"using_cee_relp\":\"yes\", ")
>>>     ....
>>>
>>> On rsyslog-elasticsearch side ->
>>>
>>> ##STORE RULESETS
>>> ruleset(name="store") {
>>>     if $parsesuccess == "OK" then {
>>>
>>>         #ES - views
>>>         if ( strlen( $!msg_class) >= 1 and strlen( $!msg_view) >= 1 and $!msg_class != "msg") then {
>>>             set $.dynafile = "1.parsed";
>>>             if $.relp_server == '127.0.0.1' then {
>>>                 call store_es
>>>                 call file_json
>>>
>>>             } else {
>>>                 call file_raw
>>>             }
>>>             stop
>>>         }
>>>         ....
>>>
>>> Do you have a more detailed question? Rsyslog is an awesome message
>>> queuing engine =)
>>> /---/
>>> */Best regards,/*
>>> /Eugene Istomin/
>>>
>>>> this is an interesting discussion, I'd be curious to see what people are
>>>> doing to parse/normalize messages as they are coming in.
>>>>
>>>> Several projects that I'm associated with tend to revolve around
>>>> extrapolating properties of a message and then assigning them (typically
>>>> post-receipt and generating more load.)
>>>>
>>>> Is anyone out there taking in raw windows, proxy, or other complex logs and
>>>> breaking them apart for either simplification or volume reduction? (for
>>>> example eliminating the description field of windows events while enhancing
>>>> the event properties before sending on down the wire?)
>>>>
>>>> On Tue Dec 30 2014 at 3:20:12 PM David Lang <[email protected]> wrote:
>>>>
>>>>> On Tue, 30 Dec 2014, Kendall Green wrote:
>>>>>
>>>>>> Hello, I would like to share experience with normalization of windows event
>>>>>> logs with rsyslog and have critique of configuration for the latest syntax
>>>>>> directives and supported functions. In response to a previous message
>>>>>> regarding the reparse() feature enhancement, there appears to be imminent
>>>>>> refactoring of parser modules.
>>>>>
>>>>> parser modules are not the same as the mmnormalize rulebase, parser modules are
>>>>> applied to messages as they arrive at the rsyslog server and populate the
>>>>> standard properties, mmnormalize is intended to populate other variables..
>>>>>
>>>>>> Is it possible to output mmnormalize rulebase to json path and output on
>>>>>> template which does not include the msg/userawmsg field?
>>>>>
>>>>> if you have JSON, you should use mmjsonparse to extract the variables, but once
>>>>> you have the variables parsed out, you can use them in any template.
>>>>>
>>>>> To give you more information, I would need a better idea of what you are trying
>>>>> to do.
>>>>>
>>>>>> Thank you for any recommendations or examples related to new normalization
>>>>>> modules.
>>>>>
>>>>> While there may be enhancements to the normalization, that is completely
>>>>> separate from the parser modules (I know, it's a bit confusing)
>>>>>
>>>>> David Lang
>>>>> _______________________________________________
>>>>> rsyslog mailing list
>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>> http://www.rsyslog.com/professional-services/
>>>>> What's up with rsyslog?
>>>>> Follow https://twitter.com/rgerhards
>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>>> DON'T LIKE THAT.
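For readers following the quoted `kvp` proposal, here is a rough Python sketch of how a wrapper spec such as `%kvp:\x22key\x22\x3d\x22value\x22\x2c\x20%` could drive field extraction. It is not how mmnormalize or liblognorm work; the `kvp` type is only a suggestion in this thread, and `parse_kvp`, `_decode` and the rule that the trailing comma/space separator may be missing on the last pair are assumptions made for illustration.

```python
import re
from typing import Dict


def _decode(text: str) -> str:
    # Turn backslash-x hex escapes, as used in the quoted samples, into characters.
    return re.sub(r"\\x([0-9a-fA-F]{2})", lambda m: chr(int(m.group(1), 16)), text)


def parse_kvp(spec: str, line: str) -> Dict[str, str]:
    """Extract key/value pairs from line using a hypothetical wrapper spec.

    The spec holds the literal words 'key' and 'value'; whatever surrounds
    them is treated as the wrapper (prefix, middle, suffix).
    """
    if "key" not in spec or "value" not in spec.partition("key")[2]:
        raise ValueError("spec must contain 'key' followed by 'value'")
    # Split on the raw spec first so decoded characters cannot be mistaken
    # for the 'key' and 'value' markers.
    prefix, _, rest = spec.partition("key")
    middle, _, suffix = rest.partition("value")
    prefix, middle, suffix = _decode(prefix), _decode(middle), _decode(suffix)
    # Assumption: the final pair may omit the trailing comma/space separator.
    last_suffix = suffix.rstrip(", ")
    pattern = re.compile(
        re.escape(prefix) + r"(?P<key>.+?)" + re.escape(middle)
        + r"(?P<value>.*?)"
        + r"(?:" + re.escape(suffix) + r"|" + re.escape(last_suffix) + r"$)"
    )
    return {m.group("key"): m.group("value") for m in pattern.finditer(line)}


spec = r"\x22key\x22\x3d\x22value\x22\x2c\x20"
line = '"Username"="NT Authority", "Domain"="Null", "IP Address"="-"'
print(parse_kvp(spec, line))
```

Keys and values are matched non-greedily, so a key like `IP Address` with an embedded space still works as long as the wrapper characters do not occur inside the data. That restriction is exactly the kind of thing a redesigned sample language would have to spell out.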

