David - yes, that exactly describes the situation I'm in. If I can't
find a solution using existing capabilities, I may look into
providing a load-balanced pool of sanitization workers that I connect to
over the zeromq plugins I've been working on, as a near-term fix.
Ideally, I'd like to be able to handle the sanitization within rsyslog
itself.
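
To make the worker idea concrete, here is a minimal sketch of what each
sanitization worker might do to a record, assuming Python and leaving out
the zeromq transport entirely. The names sanitize_keys and sanitize_line,
the "_" separator, and the collision policy are all hypothetical choices
for illustration, not anything rsyslog provides today:

```python
import json


def sanitize_keys(obj, old=".", new="_"):
    """Return a copy of obj with `old` replaced by `new` in every dict key."""
    if isinstance(obj, dict):
        out = {}
        for key, value in obj.items():
            clean = key.replace(old, new)
            # Collision policy (an assumption): keep appending the separator
            # until the name is free, so an existing key is never overwritten.
            while clean in out or (clean != key and clean in obj):
                clean += new
            out[clean] = sanitize_keys(value, old, new)
        return out
    if isinstance(obj, list):
        return [sanitize_keys(item, old, new) for item in obj]
    # Values (strings, numbers, booleans, null) are left untouched.
    return obj


def sanitize_line(line):
    """Parse one JSON record, sanitize its keys, and serialize it again."""
    return json.dumps(sanitize_keys(json.loads(line)), sort_keys=True)
```

Since the full key space is not known in advance, the function walks
whatever it is given rather than relying on a fixed list of field names.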

For a quick hack, a template on the output from my aggregators that replaces
"." characters with "_" might work; I'll give that a spin.  I still have an
elasticsearch 1.5 cluster that is our production cluster in parallel with
the new 2.1 cluster, so I have some room to experiment.
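
One caveat with replacing every "." in the serialized output is that it
does not distinguish field names from values. The Python sketch below only
simulates that kind of blanket replacement on a JSON string (it is an
assumption that a template-level replace would behave the same way, and
the helper names are hypothetical):

```python
import json


def blunt_replace(record):
    """Serialize a record, then replace every '.' in the resulting text."""
    return json.dumps(record, sort_keys=True).replace(".", "_")


def is_valid_json(text):
    """Report whether text still parses as JSON."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


# Integer values survive, because they contain no '.' to replace.
ok = blunt_replace({"resp.code": 200})

# String values such as IP addresses are rewritten along with the keys.
mangled = blunt_replace({"client.ip": "10.0.0.1"})

# A fractional number turns into a token that is no longer valid JSON.
broken = blunt_replace({"resp.duration_ms": 10.5})
```

So the quick hack looks safe only for records whose values contain no
periods; anything with floats, IP addresses, or hostnames in the values
would need a key-aware approach.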

As an aside - does anyone have a link to a config example using a regex
replace on a property using the new v8 template format?

Peter - I'd be very interested if you have an approach to this problem that
works with existing rsyslog capabilities.

Cheers,
Brian




On Fri, Dec 4, 2015 at 3:28 PM, Peter Portante <[email protected]>
wrote:

> On Fri, Dec 4, 2015 at 3:00 PM, David Lang <[email protected]> wrote:
>
> > On Fri, 4 Dec 2015, Peter Portante wrote:
> >
> > On Fri, Dec 4, 2015 at 12:40 PM, Brian Knox <[email protected]>
> >> wrote:
> >>
> >>> In my case, I have "flat" (1 level deep) CEE JSON logs with field names
> >>> that are dot-delimited ( @cee { "resp.duration_ms" : 10000,
> >>> "resp.code" : 200 } ).
> >>>
> >>>
> >> So if you have a "flat" namespace where the fields include dots in them,
> >> then if you move to a hierarchical namespace then won't the field name
> >> references still work?
> >>
> >
> > the problem he's having is that the field names in his incoming logs are
> > not hierarchical. He's not hand-crafting the structure the way you are,
> > he's parsing incoming logs and then outputting $! to ES (or something
> > similar)
> >
> > As such, he's pretty much stuck with the names on the incoming data.
> >
>
> We are using rsyslog to normalize the data.  I'll post an example config
> file for what we are doing shortly (prolly on github).
>
> -peter
>
>
> >
> > Rsyslog hasn't had a requirement before now to change/sanitize the field
> > names, so there's nothing set up to do this.
> >
> > the work-around that I can think of basically involves re-parsing the
> > message after manipulating it.
> >
> > you could use mmexternal to pass the json data to an external script that
> > can muck with the names and pass them back. unfortunately this interface
> > can't delete fields, just alter or add them, so you would want to do
> > something along the lines of moving everything down a level so instead of
> > $!blah you have $!fixed!blah (or in json instead of { "blah": "value",
> > "foo": "value" } you would have { "fixed": { "blah": "value", "foo":
> > "value" } })
> >
> > another possibility would be to do something in rsyslog where you use a
> > template to replace all '.' with some other character, and then parse the
> > result with mmnormalize, but this is ugly as well.
> >
> > We've got a few cases where field names just don't work (case sensitivity,
> > () in field names, etc), so it may be a good idea for someone to write a
> > mm (message modification) module that goes through all the field names and
> > sanitizes them, with several options as to what to do (and especially what
> > to do if the sanitized version already exists: overwrite, try a different
> > name, ??)
> >
> > David Lang
> >
> >
> >
> >>> Given my lack of control over the incoming logs, I think the simplest
> >>> solution to this issue would be a way to change the attribute names
> >>> themselves ( "resp_duration_ms", "resp_code" ).
> >>> Given that I don't know the total space of all possible keys, I'd like
> >>> this to work with the $!all-json property.
> >>>
> >>> If there's not already a way to do this that I'm missing, I think given
> >>> the change in elasticsearch and that the suggested solution to this
> >>> problem is "use logstash", I'd like to look at the possibility of adding
> >>> a property formatter that could handle this sanitization.
> >>>
> >>>
> >>> On Fri, Dec 4, 2015 at 11:37 AM, Peter Portante <
> >>> [email protected]>
> >>> wrote:
> >>>
> >>> We are using sub-objects:
> >>>>
> >>>> # this is for index names to be like: logstash-YYYY.MM.DD
> >>>> # WARNING: any rsyslog collecting host MUST be running UTC
> >>>> #          if the proper index is to be chosen to hold the
> >>>> #          log entry. If you are running EDT, e.g., then
> >>>> #          the previous day's index will be chosen even
> >>>> #          though the UTC value is the current day, because
> >>>> #          the pattern logic does not convert "timereported"
> >>>> #          to a UTC value before pulling data out of it.
> >>>> template(name="logstash-index-pattern" type="list") {
> >>>>     constant(value="logstash-")
> >>>>     property(name="timereported" dateFormat="rfc3339" position.from="1" position.to="4")
> >>>>     constant(value=".")
> >>>>     property(name="timereported" dateFormat="rfc3339" position.from="6" position.to="7")
> >>>>     constant(value=".")
> >>>>     property(name="timereported" dateFormat="rfc3339" position.from="9" position.to="10")
> >>>>     }
> >>>> # this is for formatting our syslog data in JSON with @timestamp using
> >>>> # a "hierarchical" metadata namespace
> >>>> template(name="com-redhat-rsyslog-hier"
> >>>>          type="list") {
> >>>>     constant(value="{")
> >>>>     constant(value="\"@timestamp\":\"")
> >>>>     property(name="timereported" dateFormat="rfc3339")
> >>>>     constant(value="\",\"@version\":\"2015.09.24-0")
> >>>>     constant(value="\",\"message\":\"")
> >>>>     property(name="$.msg" format="json")
> >>>>     constant(value="\",\"hostname\":\"")
> >>>>     property(name="$.hostname")
> >>>>     constant(value="\",\"level\":\"")
> >>>>     property(name="$.level")
> >>>>     constant(value="\",\"pid\":\"")
> >>>>     property(name="$.pid")
> >>>>     constant(value="\",\"tags\":\"")
> >>>>     property(name="$.tags")
> >>>>     constant(value="\",\"CEE\":")
> >>>>     property(name="$!all-json")
> >>>>     constant(value=",\"systemd\":")
> >>>>     property(name="$.systemd")
> >>>>     constant(value=",\"rsyslog\":")
> >>>>     property(name="$.rsyslog")
> >>>>     constant(value="}\n")
> >>>>     }
> >>>>
> >>>>
> >>>>
> >>>> On Fri, Dec 4, 2015 at 10:44 AM, Brian Knox <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> I found out today that elasticsearch 2.x does not allow field names to
> >>>>> have the period character in them.  This is making my life interesting
> >>>>> as I use elasticsearch with rsyslog end to end (no logstash), and a lot
> >>>>> of our field names have "." as a delimiter in them.
> >>>>>
> >>>>> In a perfect world, I'd like an "elasticsearch" property formatter that
> >>>>> could look for and replace "." in field names with "_", that would also
> >>>>> work with the all-json property, something like:
> >>>>>
> >>>>> property(name="$!all-json" format="elasticsearch")
> >>>>>
> >>>>> Or, if this is too ES-specific for rsyslog core, perhaps we could add
> >>>>> this functionality to the omelasticsearch output itself (I'll look over
> >>>>> the code today).
> >>>>>
> >>>>> I'd like to not have to introduce logstash to my environment just to
> >>>>> regex a character in field names.  I'm open to other ideas as well, just
> >>>>> wanted to start the conversation.
> >>>>>
> >>>>> Cheers,
> >>>>> Brian
> >>>>> _______________________________________________
> >>>>> rsyslog mailing list
> >>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
> >>>>> http://www.rsyslog.com/professional-services/
> >>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
> >>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> >>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> >>>>> DON'T LIKE THAT.
