I think any proper solution first requires (re)design of the sample
language.

Rainer

2015-01-28 11:57 GMT+01:00 David Lang <[email protected]>:

> you have two separate things here that I'll reply to separately.
>
> 1. placeholder instead of value (i.e. apache putting in - instead of IP
> address)
>
> I don't think this needs to be as general as you are making it
>
>> %<tag>:<type>:<extraData>:|:<alternateTag>:<alternateType>:<alternateData>%
>
> why would you need to call it something different (an alternate tag)?
>
> If we want to do this on a per-tag basis, which is a reasonable option, can
> we put it in as an extraData tag and have it apply to all types? There's
> already manipulation there (lowercase, etc), so have alternateconstant:value
> or something like that.
>
> The other approach I was thinking of would require a change to the ruleset
> parser to allow something like
>
> set nullValue = '-'
> rule
> rule
> rule
> set nullValue = ':'
> rule
> rule
> rule
> set nullValue =''
> rule
> rule
>
> rather than specifying it for every tag. I'm making the assumption that a
> given application is going to be consistent in what its null value is, and
> you will probably still have multiple rules for that log type.
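The ruleset-level idea above can be sketched outside rsyslog. This is only a Python illustration, assuming a hypothetical `set nullValue = '...'` directive that applies to every rule after it until the next directive. Neither the directive nor the helper names (`assign_null_values`, `drop_nulls`) exist in liblognorm or mmnormalize.

```python
import re

# Hypothetical directive from the proposal; not real rulebase syntax.
SET_NULL = re.compile(r"set\s+nullValue\s*=\s*'([^']*)'")


def assign_null_values(lines):
    """Pair every rule with the nullValue in effect where the rule appears."""
    current = None  # no null value until the first directive is seen
    paired = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        directive = SET_NULL.fullmatch(line)
        if directive:
            current = directive.group(1)
        else:
            paired.append((line, current))
    return paired


def drop_nulls(fields, null_value):
    """Remove parsed fields whose value is the placeholder for 'no value'."""
    if null_value is None:
        return dict(fields)
    return {k: v for k, v in fields.items() if v != null_value}


# Placeholder rule names standing in for real rulebase lines.
rulebase = [
    "set nullValue = '-'",
    "rule_a",
    "rule_b",
    "set nullValue = ':'",
    "rule_c",
    "set nullValue =''",
    "rule_d",
]
for rule, null_value in assign_null_values(rulebase):
    print(rule, repr(null_value))
```

The point of the sketch is that the placeholder is state carried by the rulebase reader, so individual rules stay unchanged.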
>
> David Lang
>
>
> On Tue, 27 Jan 2015, Kendall Green wrote:
>
>> Date: Tue, 27 Jan 2015 23:26:28 -0700
>> From: Kendall Green <[email protected]>
>> Reply-To: rsyslog-users <[email protected]>
>> To: rsyslog-users <[email protected]>
>> Subject: Re: [rsyslog] rsyslog normalization
>>
>>
>> Thank you everyone for your thoughtful responses. After contemplating
>> normalization and the recent contributions, which will help greatly with
>> the much-needed to-string rulebase type, it looks like these problems will
>> soon be resolved.
>>
>> The new feature adds functionality to the mmnormalize rulebase similar to
>> how the field() function can split on a character or string separator. It
>> also provides an alternate or optional value for 'quoted-string', falling
>> back to the 'word' data type when a single word appears without quotes.
>>
>> I'd rather have one long, messy rule definition than potentially hundreds
>> of rules for one message format (that changes depending on the contents of
>> the fields). These rules are long and difficult to read anyway, because
>> there are a lot of fields and so many rules are needed to cover the
>> formatting variations that depend on whether fields are populated, which
>> is the cause of the situation. Although most who responded to the related
>> message thread dislike the idea of an 'or' type, I'm still interested in
>> this feature for select data type combinations.
>>
>> If an alternate type can be set on a field, what about an alternate tag
>> too? Placing a hyphen in the alternate tag could automatically exclude the
>> field when it holds the alternate value, or a specific value. This would
>> conflict with the linear syntax model, but perhaps, to capture tag1 and
>> exclude it when the value is a hyphen: %tag1:quoted-string:|:-:char:\x2d%
>>
>> %<tag>:<type>:<extraData>:|:<alternateTag>:<alternateType>:<alternateData>%
>>
>> I know that is ugly, but imagine an ugly log with a lot of fields, and
>> let's take just three key/value pairs, each of which may output something
>> different when not populated: a char (hyphen or space) or the string
>> "Null", for example:
>>
>> Username: NT Authority  Domain: Null  IP Address: -
>> Username: kgreen  Domain: kendallarium  IP Address: 10.0.0.10
>> Username:    Domain: Null  IP Address: 10.0.0.1
>>
>> If the rulebase included a char type and a string type, to match and
>> capture under the field name/tag, or to exclude the field if the tag is
>> set to a hyphen...
>>
>> Username: %User:to-string:Domain:|:-:char:\x20%  Domain:
>> %Domain:word:|:-:string:Null%  IP Address: %IP:ipv4:|:-:char:\x2d%
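To make the "primary type or placeholder, and drop the field on placeholder" behaviour concrete, here is a rough Python emulation using the three sample lines above. It is not liblognorm syntax or behaviour. The `FIELDS` table, `build_pattern`, and `normalize` are illustrative names, and the regexes are simple stand-ins for the to-string, word, and ipv4 types.

```python
import re

# (literal prefix, tag, regex for the primary type, placeholder literal)
# The layout mirrors the three-field sample; all names are illustrative.
FIELDS = [
    ("Username: ", "User", r".+?", " "),
    ("  Domain: ", "Domain", r"\S+", "Null"),
    ("  IP Address: ", "IP", r"(?:\d{1,3}\.){3}\d{1,3}", "-"),
]


def build_pattern(fields):
    """Compile one regex where each field is 'primary type OR placeholder'."""
    parts = []
    for prefix, tag, primary, placeholder in fields:
        alt = f"(?P<{tag}>{primary}|{re.escape(placeholder)})"
        parts.append(re.escape(prefix) + alt)
    return re.compile("".join(parts))


def normalize(line, fields=FIELDS):
    """Return captured fields, leaving out any that hold their placeholder."""
    match = build_pattern(fields).fullmatch(line)
    if match is None:
        return None  # the line does not fit this rule at all
    placeholders = {tag: ph for _, tag, _, ph in fields}
    return {t: v for t, v in match.groupdict().items() if v != placeholders[t]}


print(normalize("Username: NT Authority  Domain: Null  IP Address: -"))
print(normalize("Username: kgreen  Domain: kendallarium  IP Address: 10.0.0.10"))
```

One rule covers all three sample lines here, which is the reduction in rule count the proposal is after.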
>>
>> These examples emphasize excluding data with the hyphen tag, although
>> there are plenty of cases where one may want to extract a field only if it
>> matches. The not-yet-existing char and string literal types would be nice
>> to have for that, but the main feature is the '|' (or) between word,
>> number, IP address, quoted string, or character/string and a specific
>> char/to-string. In some cases this could be replaced by an array of
>> key/value pairs with arbitrary wrappers (quoted data, equals signs or
>> colons, comma and/or space separated), with a user-defined format given as
>> extradata, or maybe even a template name with a string to define it, as an
>> extension of the iptables type. That would not give the same control over
>> which fields, data types, and alternates are selected, nor the ability to
>> exclude on an alternate, as the previously described features do...
>>
>> What this does do is shorten the rulebase when there is a well-defined
>> key/value pattern (kvp). It could cover formats that use colons, spaces,
>> and equals signs, with or without quotes, similar to mmjsonparse or
>> mmfields, but as a string format with no cee declaration required. It
>> needs some way to define the wrappers for forms such as key: "value" or
>> key=value.
>>
>> So on that level, with lexical parsing of the same three keys as
>> tags/labels and their values, abstracting from the iptables type as if we
>> could define the wrappers in literal form, perhaps something like this
>> better illustrates a key/value pair (kvp) rulebase:
>>
>> Example from above as format [Key: Value  ]
>> %kvp:key\x3a\x20value\x20\x20%
>>
>> The kvp type would need extra data that defines the wrapper format around
>> key and value, in this case key: value  key: value. The three chosen
>> fields would be extracted and labeled automatically, the same as the
>> iptables type does:
>> "Username"="NT Authority", "Domain"="Null", "IP Address"="-"
>> "Username"="kgreen", "Domain"="kendallarium", "IP Address"="10.0.0.10"
>> "Username"=" ", "Domain"="Null", "IP Address"="10.0.0.1"
>>
>> Example from right above, to better illustrate format wrappers
>> ["Key"="Value", ]
>> %kvp:\x22key\x22\x3d\x22value\x22\x2c\x20%
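The proposed kvp type can be approximated in a few lines of Python to show what the extra data would have to encode. This is a sketch under assumptions: the spec is read as the literal words `key` and `value` surrounded by `\xNN`-escaped separators, and `decode_escapes` and `parse_kvp` are made-up helper names. Any wrapper text before `key` is ignored, so only the first format (`key: value  `) is handled. The quoted `"Key"="Value", ` form would need extra handling for the leading and closing quotes.

```python
import re


def decode_escapes(spec):
    """Turn \\xNN escapes in a format spec into the characters they name."""
    return re.sub(r"\\x([0-9a-fA-F]{2})", lambda m: chr(int(m.group(1), 16)), spec)


def parse_kvp(line, spec):
    """Extract key/value pairs using the separators implied by the spec."""
    layout = decode_escapes(spec)
    # Text between 'key' and 'value' separates them; text after 'value'
    # terminates a pair (the last pair may end at end of line instead).
    _, _, rest = layout.partition("key")
    kv_sep, _, pair_end = rest.partition("value")
    pattern = re.compile(
        "(.+?)" + re.escape(kv_sep) + "(.*?)(?:" + re.escape(pair_end) + "|$)"
    )
    # A blank value comes back as an empty string in this sketch.
    return {k.strip(): v.strip() for k, v in pattern.findall(line)}


SPEC = r"key\x3a\x20value\x20\x20"
print(parse_kvp("Username: NT Authority  Domain: Null  IP Address: -", SPEC))
```

Because the pair terminator is two spaces, a value such as "NT Authority" with a single embedded space survives intact.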
>>
>> Well, I hope this helps better explain these concepts for normalization
>> features, complementary to the new to-string and alt data type options,
>> which would probably resolve the majority of the obstacles anyway. I
>> appreciate your time in considering and entertaining these ideas,
>>
>> Thank you,
>> Kendall Green
>>
>> On Thu, Jan 8, 2015 at 4:37 PM, Eugene Istomin <[email protected]> wrote:
>>
>>>> Is anyone out there taking in raw windows, proxy, or other complex logs
>>>> and breaking them apart for either simplification or volume reduction?
>>>
>>> We are using CEE modification on syslog gates like:
>>>
>>> template(name="cee" type="list") {
>>>     constant(value="<") property(name="pri") constant(value=">")
>>>     property(name="timereported" dateFormat="rfc3339")
>>>     constant(value=" ") property(name="$myhostname")
>>>     constant(value=" ") property(name="programname")
>>>     constant(value=" ")
>>>     constant(value="@cee: {")
>>>     #SYSLOG
>>>     constant(value="\"using_cee_relp\":\"yes\", ")
>>> ....
>>>
>>>
>>> On rsyslog-elasticsearch side ->
>>>
>>> ##STORE RULESETS
>>> ruleset(name="store") {
>>>         if $parsesuccess == "OK" then {
>>>
>>>                 #ES - views
>>>                 if ( strlen( $!msg_class) >= 1 and strlen( $!msg_view) >= 1 and $!msg_class != "msg") then {
>>>                         set $.dynafile = "1.parsed";
>>>                         if $.relp_server == '127.0.0.1' then {
>>>                                 call store_es
>>>                                 call file_json
>>>
>>>                         } else {
>>>                                 call file_raw
>>>                         }
>>>                         stop
>>>                 }
>>> ....
>>>
>>>
>>>
>>> Do you have a more detailed question? Rsyslog is an awesome message
>>> queuing engine =)
>>> /---/
>>> */Best regards,/*
>>> /Eugene Istomin/
>>>
>>>
>>>
>>>> this is an interesting discussion, I'd be curious to see what people are
>>>> doing to parse/normalize messages as they are coming in.
>>>>
>>>> Several projects that I'm associated with tend to revolve around
>>>> extrapolating properties of a message and then assigning them (typically
>>>> post-receipt and generating more load.)
>>>>
>>>> Is anyone out there taking in raw windows, proxy, or other complex logs
>>>> and breaking them apart for either simplification or volume reduction?
>>>> (for example eliminating the description field of windows events while
>>>> enhancing the event properties before sending on down the wire?)
>>>>
>>>> On Tue Dec 30 2014 at 3:20:12 PM David Lang <[email protected]> wrote:
>>>>
>>>>> On Tue, 30 Dec 2014, Kendall Green wrote:
>>>>>
>>>>>> Hello, I would like to share experience with normalization of windows
>>>>>> event logs with rsyslog and have critique of configuration for the
>>>>>> latest syntax directives and supported functions. In response to a
>>>>>> previous message regarding the reparse() feature enhancement, there
>>>>>> appears to be imminent refactoring of parser modules.
>>>>>
>>>>> parser modules are not the same as the mmnormalize rulebase, parser
>>>>> modules are applied to messages as they arrive at the rsyslog server and
>>>>> populate the standard properties, mmnormalize is intended to populate
>>>>> other variables..
>>>>>
>>>>>> Is it possible to output mmnormalize rulebase to json path and output
>>>>>> on template which does not include the msg/userawmsg field?
>>>>>
>>>>> if you have JSON, you should use mmjsonparse to extract the variables,
>>>>> but once you have the variables parsed out, you can use them in any
>>>>> template.
>>>>>
>>>>> To give you more information, I would need a better idea of what you are
>>>>> trying to do.
>>>>>
>>>>>> Thank you for any recommendations or examples related to new
>>>>>> normalization modules.
>>>>>
>>>>> While there may be enhancements to the normalization, that is completely
>>>>> separate from the parser modules (I know, it's a bit confusing)
>>>>>
>>>>> David Lang
>>>>> _______________________________________________
>>>>> rsyslog mailing list
>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>> http://www.rsyslog.com/professional-services/
>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>>> DON'T LIKE THAT.
