On Wed, Sep 18, 2013 at 2:02 PM, David Lang <[email protected]> wrote:

> As far as I know, this is not available in rsyslog. It's something that
> comes up once in a while as a need, but usually it's been fairly easy to
> work-around by interposing an external program between rsyslog and the
> output.
>
> Rsyslog doesn't know or care about character sets internally, it deals
> with straight C strings that can contain arbitrary non-null bytes.
>
> Currently rsyslog has the control character escaping code because most of
> what it has traditionally dealt with has been ascii text data
>
> What I think you are needing is the ability to define a sed like filter to
> an output so that you can define a mapping of iso8859 characters to UTF8
> characters.
>
> Another option would be to create a message modification module to make
> the changes to the message that rsyslog is processing instead of to the
> output.
>
> a third option would be to create a new function that would take the value
> of a variable/property, transform it (per additional parameters), and
> assign the result to another variable
>
> Since you are dealing with elasticsearch output, a fourth option (for your
> use case) would be to create a string generator module that created the
> string format that you need for elasticsearch, and cleaned up character
> sets in the meantime
>
>
> I think the fourth would be the quickest to get implemented, but the most
> limited
>
> the third would be the most flexible, but the most complicated to use
>
> thinking about the second, I wonder how hard it would be to make a mm
> module that was little more than a wrapper around an external program to be
> able to offload the work to something already optimized for the work.
>
> Rainer, any thoughts as to which option would be easiest to implement
> (both for this specific character set conversion and the more general
> conversion problem)?
>

I need to think a bit more about this problem, especially as it boils up
every now and then. I remember that one proposed solution was the ability
to assign a character set to inputs and make the database outputs aware of
the charset of the message in question (which could change very
frequently). I haven't explored this in enough depth right now, but if it
works, it sounds like a good solution (with medium time-to-implement
footprint). So let's add this as #5 ;)

From what you gave, I think #1 or #2 is probably the quickest to implement.

@Risto: what would be the minimal solution to solve your problem? What
would be the best one?

Rainer

>
> I have not had to deal with character set conversion very much, so I'm not
> familiar with what tools are available to do this sort of conversion
> already.
>
> David Lang
>
>
> On Wed, 18 Sep 2013, Risto Vaarandi wrote:
>
> hi folks,
>>
>> I've been using the omelasticsearch output module for quite some time,
>> and I am happy with it. However, there is one issue I haven't been able to
>> tackle. Since I am writing data to Elasticsearch from wide variety of
>> sources, I am accidentally running into syslog messages which contain some
>> iso8859 characters. Unfortunately, when trying to write them into
>> Elasticsearch as-is, you would get back the following error:
>>
>> org.elasticsearch.index.mapper.MapperParsingException: failed to parse
>> [@message]
>> ...
>> ...
>> ...
>> Caused by: org.elasticsearch.common.jackson.core.JsonParseException:
>> Invalid UTF-8 start byte 0x99
>>
>> Apparently, the 'json' property replacer is not able to detect and remove
>> (or replace) such characters.
>>
>> As a solution, I have tried to add space-cc or drop-cc property replacer
>> to json, for example:
>>
>> \"@message\":\"%rawmsg:::space-cc,json%\"
>>
>> but they have no effect (in addition, I have specified
>> $EscapeControlCharactersOnReceive off
>> as recommended by rsyslog documentation).
>>
>> Is there any way to handle this problem? So far, I've been happy with
>> rsyslog+Elasticsearch setup, and I wouldn't like to add any Java based tool
>> into the processing pipeline.
>>
>> kind regards,
>> risto
>> _______________________________________________
>> rsyslog mailing list
>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>> http://www.rsyslog.com/professional-services/
>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>> DON'T LIKE THAT.
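
For anyone who wants to try the first option from the thread (an external program interposed between rsyslog and the output), the transcoding step could look roughly like the Python sketch below. It is only an illustration and not part of rsyslog: the names `to_utf8` and `filter_stream` are made up for this example, and the choice of ISO-8859-1 as the fallback charset is an assumption based on the "iso8859" mentioned above. How the program is hooked up to rsyslog and how its output reaches Elasticsearch is deliberately left out.

```python
import sys


def to_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return raw unchanged if it is already valid UTF-8; otherwise
    reinterpret the bytes in the fallback charset and re-encode as UTF-8.

    The fallback charset is an assumption; nothing in the message itself
    says which 8-bit encoding the sender used.
    """
    try:
        raw.decode("utf-8")  # validation only, the result is discarded
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")


def filter_stream(src, dst, fallback: str = "iso-8859-1") -> None:
    """Copy binary lines from src to dst, transcoding each one as needed."""
    for line in src:
        dst.write(to_utf8(line, fallback))
        dst.flush()  # keep latency low when used in a pipeline


# To run this as a stdin/stdout filter, call:
#   filter_stream(sys.stdin.buffer, sys.stdout.buffer)
```

One caveat about the error quoted above: in ISO-8859-1 the byte 0x99 is a C1 control character, so the sketch turns it into valid UTF-8 that is still a control character. If the senders are really using Windows-1252, passing `fallback="cp1252"` maps it to a printable character instead, but Python's cp1252 codec rejects a few undefined bytes, so that variant would need its own error handling.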

