David - yes, that exactly describes the situation I'm in. If I can't find a short-term solution with existing capabilities, I may look into running a load-balanced pool of sanitization workers that I connect to over the zeromq plugins I've been working on. Ideally, though, I'd like to be able to handle the sanitization within rsyslog itself.
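To make the worker idea a bit more concrete, here is a rough Python sketch of just the renaming step such a worker might perform. It is only an illustration under my own assumptions: the name sanitize_keys, the "_" replacement and the numeric-suffix policy for David's "what if the sanitized version already exists" question are all hypothetical, and the zeromq transport is left out entirely.

```python
import json


def sanitize_keys(value, old=".", new="_"):
    """Return a copy of value with `old` replaced by `new` in every dict key.

    Hypothetical helper, not part of rsyslog or elasticsearch.
    Assumes JSON-style data: dicts with string keys, lists, and scalars.
    """
    if isinstance(value, list):
        return [sanitize_keys(item, old, new) for item in value]
    if not isinstance(value, dict):
        return value

    # Keys that are already clean are reserved first, so a renamed key
    # can never overwrite a field that did not need renaming.
    taken = {key for key in value if old not in key}
    cleaned = {}
    for key, item in value.items():
        name = key
        if old in key:
            base = key.replace(old, new)
            name = base
            suffix = 2
            # Assumed collision policy: append _2, _3, ... until unique.
            while name in taken:
                name = f"{base}{new}{suffix}"
                suffix += 1
            taken.add(name)
        cleaned[name] = sanitize_keys(item, old, new)
    return cleaned


if __name__ == "__main__":
    record = json.loads('{"resp.duration_ms": 10000, "resp.code": 200}')
    print(json.dumps(sanitize_keys(record)))
```

Since the whole record is walked recursively, this works without knowing the key space in advance, which is the property I need from anything that operates on $!all-json.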
For a quick hack, a template on the output from my aggregators that replaces "." characters with "_" might work, and I'll give that a spin. I still have an elasticsearch 1.5 cluster running as our production cluster in parallel with the new 2.1 cluster, so I have some room to experiment.

As an aside - does anyone have a link to a config example using a regex replace on a property using the new v8 template format?

Peter - I'd be very interested if you have an approach to this problem that works with existing syslog capability.

Cheers,
Brian

On Fri, Dec 4, 2015 at 3:28 PM, Peter Portante <[email protected]> wrote:

> On Fri, Dec 4, 2015 at 3:00 PM, David Lang <[email protected]> wrote:
>
> > On Fri, 4 Dec 2015, Peter Portante wrote:
> >
> >> On Fri, Dec 4, 2015 at 12:40 PM, Brian Knox <[email protected]> wrote:
> >>
> >>> In my case, I have "flat" ( 1 level deep ) CEE JSON logs with field names
> >>> that are dot delimited ( @cee { "resp.duration_ms" : 10000, "resp.code" : 200 } ).
> >>
> >> So if you have a "flat" namespace where the fields include dots in them,
> >> then if you move to a hierarchical namespace then won't the field name
> >> references still work?
> >
> > The problem he's having is that the field names in his incoming logs are
> > not hierarchical. He's not hand-crafting the structure the way you are,
> > he's parsing incoming logs and then outputting $! to ES (or something
> > similar).
> >
> > As such, he's pretty much stuck with the names on the incoming data.
>
> We are using rsyslog to normalize the data. I'll post an example config
> file for what we are doing shortly (prolly on github).
>
> -peter
>
> > Rsyslog hasn't had a requirement before now to change/sanitize the field
> > names, so there's nothing set up to do this.
> >
> > The work-around that I can think of basically involves re-parsing the
> > message after manipulating it.
> >
> > You could use mmexternal to pass the json data to an external script that
> > can muck with the names and pass them back. Unfortunately this interface
> > can't delete fields, just alter or add them, so you would want to do
> > something along the lines of moving everything down a level, so instead of
> > $!blah you have $!fixed!blah (or in json, instead of
> > { "blah": "value", "foo": "value" } you would have
> > { "fixed": { "blah": "value", "foo": "value" } }).
> >
> > Another possibility would be to do something in rsyslog where you use a
> > template to replace all '.' with some other character, and then parse the
> > result with mmnormalize, but this is ugly as well.
> >
> > We've got a few cases where field names just don't work (case sensitivity,
> > () in field names, etc), so it may be a good idea for someone to write a
> > mm (message modification) module that goes through all the field names and
> > sanitizes them, with several options as to what to do (and especially what
> > to do if the sanitized version already exists: overwrite, try a different
> > name, ??)
> >
> > David Lang
> >
> >>> Given my lack of control over the incoming logs, I think the simplest
> >>> solution to this issue would be a way to change the attribute names
> >>> themselves ( "resp_duration_ms", "resp_code" ).
> >>> Given that I don't know the total space of all possible keys, I'd like this
> >>> to work with the $!all-json property.
> >>>
> >>> If there's not already a way to do this that I'm missing, I think given the
> >>> change in elasticsearch and that the suggested solution to this problem is
> >>> "use logstash", I'd like to look at the possibility of adding a property
> >>> formatter that could handle this sanitization.
> >>>
> >>> On Fri, Dec 4, 2015 at 11:37 AM, Peter Portante <[email protected]> wrote:
> >>>
> >>>> We are using sub-objects:
> >>>>
> >>>> # this is for index names to be like: logstash-YYYY.MM.DD
> >>>> # WARNING: any rsyslog collecting host MUST be running UTC
> >>>> #          if the proper index is to be chosen to hold the
> >>>> #          log entry. If you are running EDT, e.g., then
> >>>> #          the previous day's index will be chosen even
> >>>> #          though the UTC value is the current day, because
> >>>> #          the pattern logic does not convert "timereported"
> >>>> #          to a UTC value before pulling data out of it.
> >>>> template(name="logstash-index-pattern" type="list") {
> >>>>     constant(value="logstash-")
> >>>>     property(name="timereported" dateFormat="rfc3339"
> >>>>              position.from="1" position.to="4")
> >>>>     constant(value=".")
> >>>>     property(name="timereported" dateFormat="rfc3339"
> >>>>              position.from="6" position.to="7")
> >>>>     constant(value=".")
> >>>>     property(name="timereported" dateFormat="rfc3339"
> >>>>              position.from="9" position.to="10")
> >>>> }
> >>>> # this is for formatting our syslog data in JSON with @timestamp using
> >>>> # a "hierarchical" metadata namespace
> >>>> template(name="com-redhat-rsyslog-hier"
> >>>>          type="list") {
> >>>>     constant(value="{")
> >>>>     constant(value="\"@timestamp\":\"")
> >>>>     property(name="timereported" dateFormat="rfc3339")
> >>>>     constant(value="\",\"@version\":\"2015.09.24-0")
> >>>>     constant(value="\",\"message\":\"")
> >>>>     property(name="$.msg" format="json")
> >>>>     constant(value="\",\"hostname\":\"")
> >>>>     property(name="$.hostname")
> >>>>     constant(value="\",\"level\":\"")
> >>>>     property(name="$.level")
> >>>>     constant(value="\",\"pid\":\"")
> >>>>     property(name="$.pid")
> >>>>     constant(value="\",\"tags\":\"")
> >>>>     property(name="$.tags")
> >>>>     constant(value="\",\"CEE\":")
> >>>>     property(name="$!all-json")
> >>>>     constant(value=",\"systemd\":")
> >>>>     property(name="$.systemd")
> >>>>     constant(value=",\"rsyslog\":")
> >>>>     property(name="$.rsyslog")
> >>>>     constant(value="}\n")
> >>>> }
> >>>>
> >>>> On Fri, Dec 4, 2015 at 10:44 AM, Brian Knox <[email protected]> wrote:
> >>>>
> >>>>> I found out today that elasticsearch 2.x does not allow field names to have
> >>>>> the period character in them. This is making my life interesting as I use
> >>>>> elasticsearch with rsyslog end to end (no logstash), and a lot of our field
> >>>>> names have "." as a delimiter in them.
> >>>>>
> >>>>> In a perfect world, I'd like an "elasticsearch" property formatter that
> >>>>> could look for and replace "." in field names with "_", that would also
> >>>>> work with the all-json property, something like:
> >>>>>
> >>>>> property(name="$!all-json" format="elasticsearch")
> >>>>>
> >>>>> Or, if this is too ES specific for rsyslog core, perhaps we could add this
> >>>>> functionality to the omelasticsearch output itself (I'll look over the code
> >>>>> today).
> >>>>>
> >>>>> I'd like to not have to introduce logstash to my environment just to regex
> >>>>> a character in field names. I'm open to other ideas as well, just wanted
> >>>>> to start the conversation.
> >>>>>
> >>>>> Cheers,
> >>>>> Brian
> >>>>> _______________________________________________
> >>>>> rsyslog mailing list
> >>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
> >>>>> http://www.rsyslog.com/professional-services/
> >>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
> >>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> >>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> >>>>> DON'T LIKE THAT.
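P.S. On the UTC warning in Peter's logstash-index-pattern template quoted above: the template slices the year, month and day straight out of the local-time "timereported" string, which is why a non-UTC host can pick the wrong day's index. A sanitization worker could sidestep that by converting before formatting. The Python sketch below only illustrates that idea; logstash_index is a hypothetical helper name of mine, not something rsyslog provides, and it assumes the timestamp carries an explicit UTC offset.

```python
from datetime import datetime, timezone


def logstash_index(timereported, prefix="logstash-"):
    """Build an index name like logstash-YYYY.MM.DD, using the UTC date.

    Hypothetical helper for illustration; `timereported` is assumed to be
    an RFC 3339 string that includes a UTC offset.
    """
    # Older Python versions reject a trailing "Z" in fromisoformat(),
    # so rewrite it as an explicit +00:00 offset first.
    if timereported.endswith(("Z", "z")):
        timereported = timereported[:-1] + "+00:00"
    stamp = datetime.fromisoformat(timereported)
    if stamp.tzinfo is None:
        raise ValueError("timestamp has no UTC offset")
    # Convert to UTC before extracting the date parts.
    return prefix + stamp.astimezone(timezone.utc).strftime("%Y.%m.%d")


if __name__ == "__main__":
    # 22:30 at UTC-05:00 on Dec 4 is already Dec 5 in UTC.
    print(logstash_index("2015-12-04T22:30:00-05:00"))
```

For that late-evening example, slicing the string directly would give logstash-2015.12.04, while the converted value lands in logstash-2015.12.05.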

