we really need mmscrubnames or similar
1. change all names to lower case
2. replace characters that rsyslog doesn't allow in names with something
3. allow other characters to be added to the list to be replaced
4. change names that are foo!bar into multi-layer structures
5. handle the case where these changes create multiple objects with the same
name (probably by appending a string until there are no longer conflicts)
#1 may be able to go away in a decade or so if we allow case sensitive names as
an option
#4 is needed because project lumberjack defined names to be able to be foo!bar
instead of foo { bar } to allow things that don't understand multi-dimensional
structures to still use them (and I need it to handle a similar problem from my
mmnormalize rulesets). We have talked about having this be a post-parser pass in
liblognorm, but it really should be available no matter where the field names
originate (currently mmnormalize, mmjsonparse, or mmexternal I believe)
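As a rough illustration of #4, the foo!bar to foo { bar } conversion could look something like the Python sketch below. The function name nest_names and the choice to raise an error on a clash are assumptions made for this example only; nothing like this exists in rsyslog today.

```python
def nest_names(flat, sep="!"):
    """Expand names such as 'foo!bar' into nested dicts: {'foo': {'bar': ...}}.

    Hypothetical helper for illustration; 'sep' could be '.' for the
    Elasticsearch case discussed in this thread.
    """
    nested = {}
    for name, value in flat.items():
        *parents, leaf = name.split(sep)
        node = nested
        for part in parents:
            # Walk down, creating intermediate levels as needed.
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                # Assumption: refuse to guess when 'foo' is already a scalar
                # and 'foo!bar' shows up as well.
                raise ValueError(f"{name!r} clashes with scalar field {part!r}")
        if isinstance(node.get(leaf), dict):
            # The reverse clash: 'foo!bar' was seen first, then a scalar 'foo'.
            raise ValueError(f"{name!r} clashes with an existing sub-structure")
        node[leaf] = value
    return nested


lumberjack = nest_names({"foo!bar": 1, "foo!baz": 2, "qux": 3})
dotted = nest_names({"resp.duration_ms": 10000, "resp.code": 200}, sep=".")
```

Here lumberjack ends up with foo holding bar and baz, and dotted ends up with a single resp object holding duration_ms and code.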
#2 needs to be done on the actual variable names, not just on the ES output so
that the variables can be accessed and manipulated in rsyslog
#5 needs to be done or {"Foo":1,"foo":2} will lose data
and finally, #4 is needed to allow the work-around for problems like ES has.
#1,2,3,5 are all extremely similar, once the code is available to do one, all
the others become trivial. #4 is a bit more complex, but I think it's similar
enough to do in the same module.
David Lang
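Steps 1, 2, 3 and 5 from the list above can be sketched in a few lines of Python. This is only an illustration of the idea, not a proposal for the module's interface: the names scrub_name and scrub_fields, the "_" replacement, and the set of characters treated as allowed are all assumptions (the characters rsyslog really accepts in names may differ).

```python
import string

# Assumption: only these characters are treated as safe in a name.
ALLOWED = set(string.ascii_lowercase + string.digits + "_-")


def scrub_name(name, extra_bad="", replacement="_"):
    """Lower-case a name (step 1) and replace unwanted characters.

    Characters outside ALLOWED are replaced (step 2), as are any
    characters listed in extra_bad (step 3).
    """
    return "".join(
        ch if ch in ALLOWED and ch not in extra_bad else replacement
        for ch in name.lower()
    )


def scrub_fields(fields, extra_bad="", replacement="_", suffix="_"):
    """Scrub every name in a (possibly nested) dict of fields.

    If two names become identical after scrubbing, the later one gets
    'suffix' appended until it no longer conflicts (step 5).
    """
    result = {}
    for name, value in fields.items():
        if isinstance(value, dict):
            value = scrub_fields(value, extra_bad, replacement, suffix)
        new_name = scrub_name(name, extra_bad, replacement)
        while new_name in result:
            new_name += suffix
        result[new_name] = value
    return result


case_clash = scrub_fields({"Foo": 1, "foo": 2})
odd_chars = scrub_fields({"resp.code": 200, "Size(bytes)": 5})
```

With these assumptions case_clash keeps both values (as foo and foo_), and odd_chars contains resp_code and size_bytes_. Because "!" is not in the allowed set, a foo!bar conversion (#4) would have to run before this pass.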
On Sat, 5 Dec 2015, Rainer Gerhards wrote:
Date: Sat, 5 Dec 2015 22:42:17 +0100
From: Rainer Gerhards <[email protected]>
Reply-To: rsyslog-users <[email protected]>
To: rsyslog-users <[email protected]>
Subject: Re: [rsyslog] elasticsearch 2.0 and field names
Can you file a feature request for omelasticsearch on github? I guess a
quick but useful hack could be done there.
Rainer
Am 05.12.2015 16:15 schrieb "Brian Knox" <[email protected]>:
David - yes, that exactly describes the situation that I'm in. If I can't
find a short term solution with existing capabilities, I may look into
providing a load balanced pool of sanitization workers that I connect to
over the zeromq plugins I've been working on as a more near term solution.
Ideally, I'd like to be able to handle the sanitization within rsyslog
itself.
For a quick hack, a template on my output from my aggregators replacing "."
characters with "_" might work and I'll give that a spin. I still have an
elasticsearch 1.5 cluster that is our production cluster in parallel with
the new 2.1 cluster, so I have some room to experiment.
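One thing to watch with a blanket "." to "_" replacement on the rendered output is that it also changes dots inside values, not just inside field names. The Python sketch below contrasts the two approaches; the rename_keys helper and the client.ip field are made up for the example.

```python
import json


def rename_keys(value, old=".", new="_"):
    """Replace 'old' with 'new' in field names only, leaving values alone.

    Note: names that become identical after the replacement would
    overwrite each other here; this sketch does not handle that.
    """
    if isinstance(value, dict):
        return {k.replace(old, new): rename_keys(v, old, new)
                for k, v in value.items()}
    if isinstance(value, list):
        return [rename_keys(v, old, new) for v in value]
    return value


record = json.loads(
    '{"resp.duration_ms": 10000, "resp.code": 200, "client.ip": "10.0.0.7"}'
)

# Blanket replacement on the serialized text: the IP address is damaged too.
naive = json.dumps(record).replace(".", "_")

# Replacement limited to the field names: the value survives.
careful = json.dumps(rename_keys(record))
```

In naive the address comes out as 10_0_0_7, while careful keeps 10.0.0.7 and only renames the keys.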
As an aside - does anyone have a link to a config example using a regex
replace on a property using the new v8 template format?
Peter - I'd be very interested if you have an approach to this problem that
works with existing syslog capability.
Cheers,
Brian
On Fri, Dec 4, 2015 at 3:28 PM, Peter Portante <[email protected]
wrote:
On Fri, Dec 4, 2015 at 3:00 PM, David Lang <[email protected]> wrote:
On Fri, 4 Dec 2015, Peter Portante wrote:
On Fri, Dec 4, 2015 at 12:40 PM, Brian Knox <[email protected]>
wrote:
In my case, I have "flat" ( 1 level deep ) CEE JSON logs with field names
that are dot delimited ( @cee { "resp.duration_ms" : 10000, "resp.code" :
200 } ).
So if you have a "flat" namespace where the fields include dots in them,
then if you move to a hierarchical namespace then won't the field name
references still work?
the problem he's having is that the field names in his incoming logs are
not hierarchical. He's not hand-crafting the structure the way you are,
he's parsing incoming logs and then outputting $! to ES (or something
similar)
As such, he's pretty much stuck with the names on the incoming data.
We are using rsyslog to normalize the data. I'll post an example config
file for what we are doing shortly (prolly on github).
-peter
Rsyslog hasn't had a requirement before now to change/sanitize the field
names, so there's nothing setup to do this.
the work-around that I can think of basically involved re-parsing the
message after manipulating it.
you could use mmexternal to pass the json data to an external script that
can muck with the names and pass them back. unfortunately this interface
can't delete fields, just alter or add them, so you would want to do
something along the lines of moving everything down a level so instead of
$!blah you have $!fixed!blah (or in json instead of { "blah": "value",
"foo": "value" } you would have { "fixed": { "blah": "value", "foo":
"value" } })
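The external-script workaround described above might look roughly like the following Python sketch. It assumes an interface along the lines of mmexternal's full-JSON mode: one JSON object per message on stdin with the message variables under a "$!" key, and a one-line JSON reply whose "$!" subtree is merged back into the message. Those details, and the process_line name, are assumptions for illustration and should be checked against the module documentation.

```python
import json


def process_line(line, wrapper="fixed", old=".", new="_"):
    """Build the reply for one message.

    Since fields cannot be deleted through this interface, cleaned
    copies of all fields are added one level down, under $!<wrapper>.
    """
    message = json.loads(line)
    # Assumption: the message variables arrive under the "$!" key.
    fields = message.get("$!", {})
    cleaned = {name.replace(old, new): value for name, value in fields.items()}
    # Assumption: the reply is a JSON object describing what to add or alter.
    return json.dumps({"$!": {wrapper: cleaned}})


reply = process_line('{"msg": "x", "$!": {"resp.code": 200, "blah": "value"}}')
```

A real script would call process_line for every line read from stdin and write each reply, followed by a newline, to stdout, flushing after each one. The output template would then use $!fixed instead of $! as the source of the fields.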
another possibility would be to do something in rsyslog where you use a
template to replace all '.' with some other character, and then parse the
result with mmnormalize, but this is ugly as well.
We've got a few cases where field names just don't work (case sensitivity,
() in field names, etc), so it may be a good idea for someone to write a
mm (message modification) module that goes through all the field names and
sanitizes them, with several options as to what to do (and especially what
to do if the sanitized version already exists, overwrite, try a different
name, ??)
David Lang
Given my lack of control over the incoming logs, I think the simplest
solution to this issue would be a way to change the attribute names
themselves ( "resp_duration_ms", "resp_code" ).
Given that I don't know the total space of all possible keys, I'd like this
to work with the $!all-json property.
If there's not already a way to do this that I'm missing, I think given the
change in elasticsearch and that the suggested solution to this problem is
"use logstash", I'd like to look at the possibility of adding a property
formatter that could handle this sanitization.
On Fri, Dec 4, 2015 at 11:37 AM, Peter Portante <
[email protected]>
wrote:
We are using sub-objects:
# this is for index names to be like: logstash-YYYY.MM.DD
# WARNING: any rsyslog collecting host MUST be running UTC
# if the proper index is to be chosen to hold the
# log entry. If you are running EDT, e.g., then
# the previous day's index will be chosen even
# though the UTC value is the current day, because
# the pattern logic does not convert "timereported"
# to a UTC value before pulling data out of it.
template(name="logstash-index-pattern" type="list") {
constant(value="logstash-")
property(name="timereported" dateFormat="rfc3339"
position.from="1" position.to="4")
constant(value=".")
property(name="timereported" dateFormat="rfc3339"
position.from="6" position.to="7")
constant(value=".")
property(name="timereported" dateFormat="rfc3339"
position.from="9" position.to="10")
}
# this is for formatting our syslog data in JSON with @timestamp using
# a "hierarchical" metadata namespace
template(name="com-redhat-rsyslog-hier"
type="list") {
constant(value="{")
constant(value="\"@timestamp\":\"")
property(name="timereported" dateFormat="rfc3339")
constant(value="\",\"@version\":\"2015.09.24-0")
constant(value="\",\"message\":\"")
property(name="$.msg" format="json")
constant(value="\",\"hostname\":\"")
property(name="$.hostname")
constant(value="\",\"level\":\"")
property(name="$.level")
constant(value="\",\"pid\":\"")
property(name="$.pid")
constant(value="\",\"tags\":\"")
property(name="$.tags")
constant(value="\",\"CEE\":")
property(name="$!all-json")
constant(value=",\"systemd\":")
property(name="$.systemd")
constant(value=",\"rsyslog\":")
property(name="$.rsyslog")
constant(value="}\n")
}
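The UTC warning in the comments of the index-name template can be made concrete with a small Python sketch. It compares cutting the date out of the timestamp text, as the position.from/position.to settings do, with converting to UTC first. The index_name function and the sample timestamp are made up for the example.

```python
from datetime import datetime, timezone


def index_name(timereported, prefix="logstash-"):
    """Return (sliced, utc) index names for an RFC 3339 timestamp string."""
    # What the template does: take characters 1-4, 6-7 and 9-10 as they are.
    sliced = f"{prefix}{timereported[0:4]}.{timereported[5:7]}.{timereported[8:10]}"
    # What a UTC-based index would need: convert before formatting.
    as_utc = datetime.fromisoformat(timereported).astimezone(timezone.utc)
    return sliced, prefix + as_utc.strftime("%Y.%m.%d")


# 21:30 on a host running EDT (UTC-4) is already the next day in UTC.
sliced, utc = index_name("2015-09-24T21:30:00-04:00")
```

For this timestamp sliced is logstash-2015.09.24 while utc is logstash-2015.09.25, which is the mismatch the warning describes.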
On Fri, Dec 4, 2015 at 10:44 AM, Brian Knox <[email protected]
wrote:
I found out today that elasticsearch 2.x does not allow field names to have
the period character in them. This is making my life interesting as I use
elasticsearch with rsyslog end to end (no logstash), and a lot of our field
names have "." as a delimiter in them.
In a perfect world, I'd like an "elasticsearch" property formatter that
could look for and replace "." in field names with "_", that would also
work with the all-json property, something like:
property(name="$!all-json" format="elasticsearch")
Or, if this is too ES specific for rsyslog core, perhaps we could add this
functionality to the omelasticsearch output itself (I'll look over the code
today).
I'd like to not have to introduce logstash to my environment just to regex
a character in field names. I'm open to other ideas as well, just wanted
to start the conversation.
Cheers,
Brian
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
DON'T LIKE THAT.