Thank you, everyone, for your thoughtful responses. After thinking about
normalization and the recent contributions that will help with the
much-needed to-string rulebase type, it looks like these problems will
soon be resolved.

The new feature gives the mmnormalize rulebase functionality similar to
how the field() function can split on a separator character or string, and
also provides an alternate or optional type for a value, for example
parsing a 'quoted-string' as a 'word' when the quotes are missing and the
value is a single word.

I'd rather have one messy, long rule definition than potentially
hundreds of rules for one message format (which changes depending on the
contents of the fields). These rules are long and difficult to read anyway,
because there are a lot of fields and so many rules are needed to cover the
formatting variations that depend on whether fields are populated or not;
that is the cause of the situation. Although most who responded to the
related message thread dislike the idea of an 'or' type, I'm still
interested in this feature for select data type combinations.

If an alternate type can be set on the field, what about an alternate tag
too? It could be used to place the hyphen in the tag, to automatically
exclude the field when it holds the alternate value or a specific value.
This would conflict with the linear syntax model, but perhaps, to capture
tag1 and exclude the field if it is a hyphen:
%tag1:quoted-string:|:-:char:\x2d%

%<tag>:<type>:<extraData>:|:<alternateTag>:<alternateType>:<alternateData>%
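
To make the proposed descriptor layout concrete, here is a rough Python
sketch that splits such a descriptor into its primary and alternate parts.
It is only an illustration of the proposal, not liblognorm code: the ':|:'
marker is the hypothetical syntax from this message, and the names Part,
Field, parse_part and parse_descriptor are made up for the example.

```python
from typing import NamedTuple, Optional


class Part(NamedTuple):
    tag: str    # field name, or "-" to discard the match
    type: str   # data type name, e.g. "word" or "char"
    extra: str  # raw extra data, left undecoded (may be empty)


class Field(NamedTuple):
    primary: Part
    alternate: Optional[Part]


def parse_part(text: str) -> Part:
    # Split into at most three pieces so colons inside the extra data survive.
    pieces = text.split(":", 2)
    pieces += [""] * (3 - len(pieces))
    return Part(*pieces)


def parse_descriptor(descriptor: str) -> Field:
    # Assumption: a descriptor is wrapped in '%' and uses ':|:' (hypothetical)
    # to separate the primary part from the alternate part.
    if len(descriptor) < 2 or not (descriptor.startswith("%") and descriptor.endswith("%")):
        raise ValueError("descriptor must be wrapped in '%'")
    primary, marker, alternate = descriptor[1:-1].partition(":|:")
    return Field(parse_part(primary), parse_part(alternate) if marker else None)
```

A descriptor without the ':|:' marker simply has no alternate, so existing
single-type fields would parse the same way.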

I know that is ugly, but imagine an ugly log with a lot of fields, and let's
take just three key values, each of which may output something different
when not populated: a char (hyphen/space) or the string "Null", for
example:

Username: NT Authority  Domain: Null  IP Address: -
Username: kgreen  Domain: kendallarium  IP Address: 10.0.0.10
Username:    Domain: Null  IP Address: 10.0.0.1

If the rulebase included char and string types, it could match and capture
under the field name/tag, or exclude the field if the tag is set to hyphen...

Username: %User:to-string:Domain:|:-:char:\x20%  Domain: %Domain:word:|:-:string:Null%  IP Address: %IP:ipv4:|:-:char:\x2d%
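
As a rough illustration of the intended behavior (plain Python, not
liblognorm code), the sketch below walks a line field by field, tries the
alternatives for each field in order, and drops any match whose tag is "-".
Two things are assumptions of this sketch and not part of the proposal: the
literal alternate is tried before the general type, since otherwise 'word'
would always win over the string "Null", and each type is approximated by a
regular expression. RULE and parse are made-up names.

```python
import re

IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"

# Each entry: literal text before the field, then (tag, regex) alternatives.
# A tag of "-" means "match but do not emit the field".
RULE = [
    ("Username: ", [("-", r" (?=  Domain: )"),         # char \x20 alternate
                    ("User", r".+?(?=  Domain: )")]),  # to-string up to next literal
    ("  Domain: ", [("-", r"Null(?=  )"),              # string "Null" alternate
                    ("Domain", r"\S+")]),              # word
    ("  IP Address: ", [("-", r"-$"),                  # char \x2d alternate
                        ("IP", IPV4)]),                # ipv4
]


def parse(line):
    """Return the extracted fields, or None if the line does not match."""
    pos = 0
    out = {}
    for literal, alternatives in RULE:
        if not line.startswith(literal, pos):
            return None
        pos += len(literal)
        for tag, pattern in alternatives:
            m = re.compile(pattern).match(line, pos)
            if m:
                if tag != "-":
                    out[tag] = m.group(0)
                pos = m.end()
                break
        else:
            return None  # no alternative matched this field
    return out if pos == len(line) else None


samples = [
    "Username: NT Authority  Domain: Null  IP Address: -",
    "Username: kgreen  Domain: kendallarium  IP Address: 10.0.0.10",
    "Username: " + " " + "  Domain: Null  IP Address: 10.0.0.1",
]
for s in samples:
    print(parse(s))
```

With this ordering, one rule covers all three sample lines, and only the
populated fields end up in the result.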

These examples emphasize excluding data by using the hyphen tag, although
there are plenty of cases where one may want to extract a field only if it
matches. The currently nonexistent char and string literal types would be
nice to have, but the main feature is the '|' (or) between a word, number,
IP address, quoted-string, or character/string and a specific
char/to-string. In some cases this could be replaced by an array of
key-value pairs with arbitrary wrappers, depending on whether the data is
quoted and whether it is separated by equals signs, colons, commas, and/or
spaces, with the user-defined format given as extra data, or maybe even as
a template name with a string to define it, to extend the iptables type.
That would not give the same control over which fields, data types, and
alternates to select, nor the ability to exclude on the alternate, as the
previously described features do...

What this does do is shorten the rulebase when there is a well-defined
key-value pattern (kvp). It could possibly handle formats that use colons,
spaces, and equals signs, with or without quotes, similar to mmjsonparse or
mmfields, but without requiring cee declarations, using a format string...
somehow defining the wrappers for the type, such as key: "value" or
key=value.

So on that level, with lexical parsing of the same three keys as tags/labels
and their values, abstracting from the iptables type, as if we had the
ability to define the wrappers in literal form, perhaps something like this
better illustrates a key-value pair (kvp) rulebase:

Example from above as format [Key: Value  ]
%kvp:key\x3a\x20value\x20\x20%

The kvp type would need extra data that includes key and value to define the
format of the wrappers, in this case key: value  key: value
- The three chosen fields would be automatically extracted and labeled in
the output, the same as the iptables type does:
"Username"="NT Authority", "Domain"="Null", "IP Address"="-"
"Username"="kgreen", "Domain"="kendallarium", "IP Address"="10.0.0.10"
"Username"=" ", "Domain"="Null", "IP Address"="10.0.0.1"

Example from right above, to better illustrate format wrappers
["Key"="Value", ]
%kvp:\x22key\x22\x3d\x22value\x22\x2c\x20%
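
To show how such a wrapper format could drive the extraction, here is a
minimal Python sketch (again, not liblognorm code). It assumes the extra
data contains the placeholder words "key" and "value" exactly once each,
that values are non-empty, and that the last pair on a line may be missing
part or all of the trailing wrapper. The function names decode_escapes,
compile_kvp and parse_kvp are made up for this example.

```python
import re


def decode_escapes(extra):
    # Turn rulebase-style \xNN escapes into literal characters.
    return extra.encode("ascii").decode("unicode_escape")


def compile_kvp(fmt):
    """Build a regex from a wrapper format such as 'key: value  '."""
    prefix, rest = fmt.split("key", 1)
    sep, suffix = rest.split("value", 1)
    if not sep or not suffix:
        raise ValueError("format needs text after both 'key' and 'value'")
    # At end of line, accept any shortened form of the suffix (or none).
    tails = "|".join(re.escape(suffix[:i]) for i in range(len(suffix) - 1, -1, -1))
    return re.compile(
        re.escape(prefix) + "(.+?)" + re.escape(sep) + "(.+?)"
        + "(?:" + re.escape(suffix) + "|(?:" + tails + ")$)"
    )


def parse_kvp(extra, line):
    # findall yields (key, value) tuples; dict keeps them in order of appearance.
    return dict(compile_kvp(decode_escapes(extra)).findall(line))


colon_fmt = r"key\x3a\x20value\x20\x20"               # key: value
quoted_fmt = r"\x22key\x22\x3d\x22value\x22\x2c\x20"  # "key"="value",

print(parse_kvp(colon_fmt, "Username: kgreen  Domain: kendallarium  IP Address: 10.0.0.10"))
print(parse_kvp(quoted_fmt, '"Username"="kgreen", "Domain"="kendallarium", "IP Address"="10.0.0.10"'))
```

Both calls should yield the same three fields, which is the point of the
kvp idea: one short format string in place of one rule per field layout.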

Well, I hope this helps better explain the concepts for these normalization
features, which complement the new to-string and alternate data type options
that would probably resolve the majority of the obstacles anyway. I
appreciate your time in considering and entertaining these ideas.

Thank you,
Kendall Green

On Thu, Jan 8, 2015 at 4:37 PM, Eugene Istomin <[email protected]> wrote:

> > Is anyone out there taking in raw windows, proxy, or other complex logs
> > and breaking them apart for either simplification or volume reduction?
>
> We are using CEE modification on syslog gates like:
>
> template(name="cee" type="list") {
>     constant(value="<") property(name="pri") constant(value=">")
>     property(name="timereported" dateFormat="rfc3339")
>     constant(value=" ") property(name="$myhostname")
>     constant(value=" ") property(name="programname")
>     constant(value=" ")
>     constant(value="@cee: {")
>     #SYSLOG
>     constant(value="\"using_cee_relp\":\"yes\", ")
> ....
>
>
> On rsyslog-elasticsearch side ->
>
> ##STORE RULESETS
> ruleset(name="store") {
>         if $parsesuccess == "OK" then {
>
>                 #ES - views
>                 if ( strlen( $!msg_class) >= 1 and
>                      strlen( $!msg_view) >= 1 and $!msg_class != "msg") then {
>                         set $.dynafile = "1.parsed";
>                         if $.relp_server == '127.0.0.1' then {
>                                 call store_es
>                                 call file_json
>
>                         } else {
>                                 call file_raw
>                         }
>                         stop
>                 }
> ....
>
>
>
> Do you have a more detailed question? Rsyslog is an awesome message
> queuing engine =)
> /---/
> */Best regards,/*
> /Eugene Istomin/
>
>
>
> > this is an interesting discussion, I'd be curious to see what people are
> > doing to parse/normalize messages as they are coming in.
> >
> > Several projects that I'm associated with tend to revolve around
> > extrapolating properties of a message and then assigning them (typically
> > post-receipt and generating more load.)
> >
> > Is anyone out there taking in raw windows, proxy, or other complex logs
> > and breaking them apart for either simplification or volume reduction?
> > (for example eliminating the description field of windows events while
> > enhancing the event properties before sending on down the wire?)
> >
> > On Tue Dec 30 2014 at 3:20:12 PM David Lang <[email protected]> wrote:
> > > On Tue, 30 Dec 2014, Kendall Green wrote:
> > > > Hello, I would like to share experience with normalization of windows
> > > > event logs with rsyslog and have critique of configuration for the
> > > > latest syntax directives and supported functions. In response to a
> > > > previous message regarding the reparse() feature enhancement, there
> > > > appears to be imminent refactoring of parser modules.
> > >
> > > parser modules are not the same as the mmnormalize rulebase, parser
> > > modules are applied to messages as they arrive at the rsyslog server
> > > and populate the standard properties, mmnormalize is intended to
> > > populate other variables..
> > >
> > > > Is it possible to output mmnormalize rulebase to json path and output
> > > > on template which does not include the msg/userawmsg field?
> > >
> > > if you have JSON, you should use mmjsonparse to extract the variables,
> > > but once you have the variables parsed out, you can use them in any
> > > template.
> > >
> > > To give you more information, I would need a better idea of what you
> > > are trying to do.
> > >
> > > > Thank you for any recommendations or examples related to new
> > > > normalization modules.
> > >
> > > While there may be enhancements to the normalization, that is
> > > completely separate from the parser modules (I know, it's a bit
> > > confusing)
> > >
> > > David Lang
> > > _______________________________________________
> > > rsyslog mailing list
> > > http://lists.adiscon.net/mailman/listinfo/rsyslog
> > > http://www.rsyslog.com/professional-services/
> > > What's up with rsyslog? Follow https://twitter.com/rgerhards
> > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
> > > myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST
> > > if you DON'T LIKE THAT.
> >
>