Date: Tue, 27 Jan 2015 23:26:28 -0700
From: Kendall Green <[email protected]>
Reply-To: rsyslog-users <[email protected]>
To: rsyslog-users <[email protected]>
Subject: Re: [rsyslog] rsyslog normalization
Thank you, everyone, for your thoughtful responses. After thinking more
about normalization and the recent contributions, in particular the
much-needed to-string rulebase type, it looks like these problems will
soon be resolved.
The new feature gives the mmnormalize rulebase functionality similar to
how the field() function can split on a character or string separator. It
also provides an alternate or optional type for a field, for example
falling back from 'quoted-string' to 'word' when the value is a single
word with no quotes.
I'd rather have one long, messy rule definition than potentially hundreds
of rules for one message format (one that changes depending on the
contents of the fields). These rules are long and difficult to read anyway,
because there are a lot of fields and because so many rules are needed to
cover the formatting variations that depend on whether fields are
populated, which is the root of the problem. Although most of those who
responded to the related message thread dislike the idea of an 'or' type,
I'm still interested in this feature for select data type combinations.
If an alternate type can be set on a field, what about an alternate tag
too? A hyphen as the alternate tag could then automatically exclude the
field when it matches the alternate (or some specific) value. This would
conflict with the linear syntax model, but perhaps something like the
following, which captures tag1 as a quoted string and excludes the field
if the value is a hyphen:
%tag1:quoted-string:|:-:char:\x2d%
with the general form:
%<tag>:<type>:<extraData>:|:<alternateTag>:<alternateType>:<alternateData>%
I know that is ugly, but imagine an ugly log with a lot of fields. Let's
take just three key/value pairs, each of which may contain something
different when it is not populated: a single character (hyphen or space)
or the string "Null", for example:
Username: NT Authority Domain: Null IP Address: -
Username: kgreen Domain: kendallarium IP Address: 10.0.0.10
Username: Domain: Null IP Address: 10.0.0.1
If the rulebase included a char type and a string type, each field could
be matched and captured under its field name/tag, or excluded when the
tag is set to a hyphen:
Username: %User:to-string:Domain:|-:char:\x20% Domain: %Domain:word:|:-:string:Null% IP Address: %IP:ipv4:|:-:char:\x2d%
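
To make the intended behavior concrete, here is a minimal Python sketch of
what such a rule would do with the three sample lines. It is only an
illustration of the idea, not liblognorm syntax or code: the regular
expression, the PLACEHOLDERS table, and the extract() name are assumptions
made up for this example, and the IP field is not validated as an address.

```python
import re

# One pattern for the whole line; User runs up to "Domain:" (like to-string).
LINE = re.compile(
    r"Username:\s*(?P<User>.*?)\s*"
    r"Domain:\s*(?P<Domain>\S+)\s+"
    r"IP Address:\s*(?P<IP>\S+)"
)

# Hypothetical "alternate" values: a field holding one of these is dropped,
# mirroring the '-' alternate tag in the proposed rule.
PLACEHOLDERS = {"User": "", "Domain": "Null", "IP": "-"}


def extract(line):
    """Return the populated fields of a line, or None if it does not match."""
    match = LINE.match(line)
    if match is None:
        return None
    return {
        tag: value
        for tag, value in match.groupdict().items()
        if value != PLACEHOLDERS[tag]
    }


if __name__ == "__main__":
    for sample in (
        "Username: NT Authority Domain: Null IP Address: -",
        "Username: kgreen Domain: kendallarium IP Address: 10.0.0.10",
        "Username: Domain: Null IP Address: 10.0.0.1",
    ):
        print(extract(sample))
```

With one rule, the first line yields only User, the second yields all
three fields, and the third yields only IP.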
These examples emphasize excluding data via the hyphen tag, although there
are plenty of cases where you may want to extract a field only if it
matches. The char and string literal types, which do not exist yet, would
be nice to have for that, but the main feature is the '|' alternation
between types: word, number, IP address, quoted string, or a specific
character/string (char/to-string). In some cases this could be replaced by
an array of key/value pairs with arbitrary wrappers, covering data that is
quoted, uses equals signs or colons, or is comma and/or space separated.
The format would be user defined as extra data, or maybe even given as the
name of a template with a string definition, as an extension of the
iptables type. This would not give the same control over which fields,
data types, and alternates are selected, nor the ability to exclude on an
alternate, as the features described above do.
What it does do is shorten the rulebase when there is a well-defined
key/value pattern (kvp). It could handle formats that use colons, spaces,
or equals signs, with or without quotes, similar to mmjsonparse or
mmfields but without requiring cee declarations, using a string format.
There would need to be some way to define the wrappers for a given style,
such as key: "value" or key=value.
At that level, lexically parsing the same three keys (as tags/labels) and
their values, and generalizing from the iptables type as if we could
define the wrappers in literal form, perhaps something like this better
illustrates a key/value pair (kvp) rulebase:
Example from above as format [Key: Value ]
%kvp:key\x3a\x20value\x20\x20%
The kvp type would need extra data containing the words key and value,
which defines the format of the wrappers; in this case, key: value key: value.
The three chosen fields would be extracted and labeled automatically, the
same way the iptables type does:
"Username"="NT Authority", "Domain"="Null", "IP Address"="-"
"Username"="kgreen", "Domain"="kendallarium", "IP Address"="10.0.0.10"
"Username"=" ", "Domain"="Null", "IP Address"="10.0.0.1"
Example using the lines right above as input, to better illustrate the
format wrappers:
["Key"="Value", ]
%kvp:\x22key\x22\x3d\x22value\x22\x2c\x20%
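
As a rough sketch of how such a wrapper definition could be interpreted,
the Python below decodes the \xNN escapes in the extra data, splits the
result around the words key and value, and builds a matcher from the
literal pieces. None of this is existing liblognorm or rsyslog behavior;
decode_hex_escapes(), compile_kvp(), and parse_kvp() are hypothetical
names, and letting the last pair on a line omit the trailing ", " is my
own assumption.

```python
import re


def decode_hex_escapes(spec):
    """Replace each \\xNN escape in the spec with the character it names."""
    return re.sub(
        r"\\x([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        spec,
    )


def compile_kvp(spec):
    """Build a regex from a wrapper spec containing 'key' and 'value'."""
    literal = decode_hex_escapes(spec)
    before, rest = literal.split("key", 1)
    between, after = rest.split("value", 1)
    # Assumption: the final pair may drop the trailing separator characters.
    last = after.rstrip(" ,")
    return re.compile(
        re.escape(before) + r"(?P<key>.+?)" + re.escape(between)
        + r"(?P<value>.*?)"
        + r"(?:" + re.escape(after) + r"|" + re.escape(last) + r"$)"
    )


def parse_kvp(pattern, line):
    """Return every key/value pair found in the line as a dict."""
    return {m.group("key"): m.group("value") for m in pattern.finditer(line)}


if __name__ == "__main__":
    pattern = compile_kvp(r"\x22key\x22\x3d\x22value\x22\x2c\x20")
    print(parse_kvp(pattern, '"Username"="kgreen", "Domain"="kendallarium", '
                             '"IP Address"="10.0.0.10"'))
```

The same compile_kvp() call should accept other wrapper styles, such as
key=value pairs, as long as the spec contains the words key and value
exactly once each.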
Well, I hope this helps better explain the concepts behind these
normalization features, which would complement the new to-string and
alternate data type options that will probably resolve the majority of the
obstacles anyway. I appreciate your time in considering and entertaining
these ideas.
Thank you,
Kendall Green
On Thu, Jan 8, 2015 at 4:37 PM, Eugene Istomin <[email protected]> wrote:
Is anyone out there taking in raw windows, proxy, or other complex logs
and breaking them apart for either simplification or volume reduction?
We are using CEE modification on syslog gates like:
template(name="cee" type="list") {
    constant(value="<") property(name="pri") constant(value=">")
    property(name="timereported" dateFormat="rfc3339")
    constant(value=" ") property(name="$myhostname")
    constant(value=" ") property(name="programname")
    constant(value=" ")
    constant(value="@cee: {")
    #SYSLOG
    constant(value="\"using_cee_relp\":\"yes\", ")
    ....
On rsyslog-elasticsearch side ->
##STORE RULESETS
ruleset(name="store") {
    if $parsesuccess == "OK" then {
        #ES - views
        if ( strlen($!msg_class) >= 1 and strlen($!msg_view) >= 1 and $!msg_class != "msg" ) then {
            set $.dynafile = "1.parsed";
            if $.relp_server == '127.0.0.1' then {
                call store_es
                call file_json
            } else {
                call file_raw
            }
            stop
        }
        ....
Do you have a more detailed question? Rsyslog is an awesome message
queuing engine =)
---
Best regards,
Eugene Istomin
This is an interesting discussion, I'd be curious to see what people are
doing to parse/normalize messages as they are coming in.
Several projects that I'm associated with tend to revolve around
extrapolating properties of a message and then assigning them (typically
post-receipt, which generates more load).
Is anyone out there taking in raw windows, proxy, or other complex logs
and breaking them apart for either simplification or volume reduction?
(For example, eliminating the description field of windows events while
enhancing the event properties before sending on down the wire?)
On Tue Dec 30 2014 at 3:20:12 PM David Lang <[email protected]> wrote:
On Tue, 30 Dec 2014, Kendall Green wrote:
Hello, I would like to share experience with normalization of windows
event logs with rsyslog and have critique of configuration for the latest
syntax directives and supported functions. In response to a previous
message regarding the reparse() feature enhancement, there appears to be
imminent refactoring of parser modules.

Parser modules are not the same as the mmnormalize rulebase. Parser
modules are applied to messages as they arrive at the rsyslog server and
populate the standard properties; mmnormalize is intended to populate
other variables.

Is it possible to output mmnormalize rulebase to json path and output on
template which does not include the msg/userawmsg field?

If you have JSON, you should use mmjsonparse to extract the variables, but
once you have the variables parsed out, you can use them in any template.
To give you more information, I would need a better idea of what you are
trying to do.

Thank you for any recommendations or examples related to new normalization
modules.

While there may be enhancements to the normalization, that is completely
separate from the parser modules (I know, it's a bit confusing).
David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
DON'T LIKE THAT.