On Tue, Dec 8, 2015 at 6:38 AM, Brian Knox <[email protected]> wrote:
> As a short term solution I'm working on a small service (in golang) that
> accepts logs over tcp, can replace characters in JSON field names in a @cee
> syslog line, and then forward the line to another syslog destination. In
> tests on my laptop it handles modifying ~ 50,000 reasonably sized log lines
> a second per connection. It gracefully handles tcp connection issues and
> I'll test it under adverse circumstances to make sure it's reasonably
> robust. I personally find this preferable to deploying logstash just to
> substitute one character. I'll release it open source this week in case
> anyone else needs an immediate solution to this problem like I do.
>
> It's less than ideal - ideally elasticsearch would support JSON rather than
> a subset of characters JSON allows - but it solves the immediate problem
> for us.

Have you brought this up with the ElasticSearch community to see what they say?

> Cheers,
> Brian
>
> On Sun, Dec 6, 2015 at 2:51 PM, David Lang <[email protected]> wrote:
>
> > On Sat, 5 Dec 2015, Peter Portante wrote:
> >
> > On Sat, Dec 5, 2015 at 5:03 PM, David Lang <[email protected]> wrote:
> >>
> >> we really need mmscrubnames or similar
> >>>
> >>> 1. change all names to lower case
> >>> 2. replace characters that rsyslog doesn't allow in names with something
> >>> 3. allow other characters to be added to the list to be replaced
> >>> 4. change names that are foo!bar into multi-layer structures
> >>> 5. handle the case where these changes create multiple objects with the
> >>> same name (probably by appending a string until there are no longer
> >>> conflicts)
> >>>
> >>> #1 may be able to go away in a decade or so if we allow case sensitive
> >>> names as an option
> >>>
> >> Don't we need to make this go away sooner than later? If rsyslog is the
> >> link in the chain that prevents someone from getting the key names they
> >> expect into ES, won't they find something else to replace that link?
> >>
> >> I have made available RPMs for EPEL 7 (which should work on RHEL 7 and
> >> CentOS 7), and Fedora 21, 22, and 23. Why not make the effort to find out
> >> what breaks, and put in a switch so that folks can opt-in to
> >> case-sensitive names in config files? I'd be happy to implement the
> >> switch, but would need help verifying existing configurations work.
> >>
> >
> > this will break some existing configs, won't it? If someone has something
> > that's assuming everything is squished to lower case, and it becomes case
> > sensitive, won't that break?
> >
> > We can add the new case sensitivity as an option quickly, but can't make
> > it the default for quite a while (a cycle or two of the enterprise distros)
> >
> >>> #2 needs to be done on the actual variable names, not just on the ES
> >>> output so that the variables can be accessed and manipulated in rsyslog
> >>>
> >> Why do we need to do this? Is this because we need to reference them in
> >> the configuration files? If so, why not provide an escape syntax for the
> >> configuration file?
> >>
> >> Do we really want rsyslog in the position where it adds restrictions to
> >> the data handling pipeline because of how it operates? I think we all agree
> >> that an mmscrubnames module would be good to help put rsyslog in the
> >> position of transforming data from one source to another in the overall
> >> pipeline.
> >>
> >
> > AFAIK, JSON imposes no limits on field names, so any strange character (or
> > unicode character, or even control character) could be part of a field
> > name. And even if the JSON spec imposes some limits, do the libraries
> > impose such limits in practice?
> >
> > I don't think it makes sense to support all of this in rsyslog, I think
> > it's reasonable to impose something sane. Other log handling software does
> > this (for example, logstash doesn't allow '.'
> > in the name, but also is case insensitive :-)
> >
> >>> and finally, #4 is needed to allow the work-around for problems like ES
> >>> has.
> >>>
> >> I am not sure I follow why this allows us to work-around problems like ES
> >> has.
> >>
> >> The dots in field names are confusing and ambiguous in ES because you can
> >> reference a hierarchical set of objects in the json objects indexed. So if
> >> one has a field name with dots in it in one document and another document
> >> in the index has a hierarchy with sub objects, then it is ambiguous which
> >> we are dealing with, if I understand the problem correctly.
> >>
> >
> > Ok, that explains why this is an issue, it makes sense. We have the same
> > problem with '!'. It's a problem in ES because it's a new requirement,
> > breaking existing input.
> >
> > But #4 would let us say that '.' is an illegal character, along with
> > control characters, anything above plain ASCII, and other punctuation
> > characters we don't allow and get them replaced by something we do allow.
> >
> >> Folks can stay with ES 1.7 if they need the dots in names.
> >>
> >
> > not long term.
> >
> > David Lang
> >
> > _______________________________________________
> > rsyslog mailing list
> > http://lists.adiscon.net/mailman/listinfo/rsyslog
> > http://www.rsyslog.com/professional-services/
> > What's up with rsyslog? Follow https://twitter.com/rgerhards
> > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> > DON'T LIKE THAT.
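
For anyone who wants to experiment with the name-scrubbing rules quoted in this thread, here is a rough Go sketch of items 1, 2 and 5 from the mmscrubnames list (lower-casing, replacing disallowed characters, and resolving name conflicts by appending a string). It is not Brian's service and not an rsyslog module. The function names scrubName and scrubFields, the allowed character set [a-z0-9_], and the choice of '_' as the replacement are all assumptions made for illustration. Item 4 (splitting foo!bar into nested objects) is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// scrubName lower-cases a field name and replaces every character outside
// [a-z0-9_] with repl. The allowed set is an assumption for this sketch,
// not the set rsyslog or Elasticsearch actually enforces.
func scrubName(name string, repl rune) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= '0' && r <= '9', r == '_':
			return r
		default:
			return repl
		}
	}, strings.ToLower(name))
}

// scrubFields returns a copy of fields with every name scrubbed, recursing
// into nested JSON objects. When two names collapse to the same scrubbed
// name, repl is appended until the name is unique.
func scrubFields(fields map[string]interface{}, repl rune) map[string]interface{} {
	// Sort the keys so that conflict resolution is deterministic.
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	out := make(map[string]interface{}, len(fields))
	for _, k := range keys {
		v := fields[k]
		if nested, ok := v.(map[string]interface{}); ok {
			v = scrubFields(nested, repl)
		}
		name := scrubName(k, repl)
		for {
			if _, taken := out[name]; !taken {
				break
			}
			name += string(repl)
		}
		out[name] = v
	}
	return out
}

func main() {
	// A made-up @cee line; the field names are hypothetical.
	line := `@cee: {"Host.Name": "web1", "host_name": "web2", "App": {"req.id": 7}}`
	payload := strings.TrimPrefix(line, "@cee:")

	var fields map[string]interface{}
	if err := json.Unmarshal([]byte(payload), &fields); err != nil {
		panic(err)
	}
	scrubbed, err := json.Marshal(scrubFields(fields, '_'))
	if err != nil {
		panic(err)
	}
	fmt.Println("@cee: " + string(scrubbed))
}
```

Sorting the keys before scrubbing means the same input always produces the same renamed fields, which matters if the output feeds an index with a fixed mapping.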

