On Wed, Sep 18, 2013 at 2:02 PM, David Lang <[email protected]> wrote:

> As far as I know, this is not available in rsyslog. It's something that
> comes up once in a while as a need, but usually it's been fairly easy to
> work-around by interposing an external program between rsyslog and the
> output.
>
> Rsyslog doesn't know or care about character sets internally; it deals
> with straight C strings that can contain arbitrary non-null bytes.
>
> Currently rsyslog has the control character escaping code because most of
> what it has traditionally dealt with has been ASCII text data.
>
> What I think you need is the ability to attach a sed-like filter to
> an output so that you can define a mapping of iso8859 characters to UTF-8
> characters.
>
> Another option would be to create a message modification module to make
> the changes to the message that rsyslog is processing instead of to the
> output.
>
> A third option would be to create a new function that would take the value
> of a variable/property, transform it (per additional parameters), and
> assign the result to another variable.
>
> Since you are dealing with elasticsearch output, a fourth option (for your
> use case) would be to create a string generator module that created the
> string format that you need for elasticsearch, and cleaned up character
> sets in the meantime.
>
>
> I think the fourth would be the quickest to get implemented, but the most
> limited.
>
> The third would be the most flexible, but the most complicated to use.
>
> Thinking about the second, I wonder how hard it would be to make an mm
> module that was little more than a wrapper around an external program, so
> the work could be offloaded to something already optimized for it.
>
> Rainer, any thoughts as to which option would be easiest to implement
> (both for this specific character set conversion and the more general
> conversion problem)?
>
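To make the "external program" idea behind #1 and #2 a bit more concrete,
here is a minimal sketch in Python of what such a helper might do. It is an
illustration only, not rsyslog code: the function names are made up, and it
assumes that any line which is not already valid UTF-8 is in a single known
fallback charset (ISO-8859-1 by default).

```python
import io


def to_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return raw unchanged if it is valid UTF-8, else transcode it."""
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Assumption: everything that is not valid UTF-8 uses the fallback
        # charset. Bytes undefined in that charset become U+FFFD.
        return raw.decode(fallback, errors="replace").encode("utf-8")


def filter_stream(src, dst, fallback: str = "iso-8859-1") -> None:
    """Copy the binary stream src to dst line by line, fixing each line."""
    for line in src:
        dst.write(to_utf8(line, fallback))


# Small demonstration with in-memory streams instead of a real pipe.
src = io.BytesIO(b"plain ascii\ncaf\xe9\n")
dst = io.BytesIO()
filter_stream(src, dst)
print(dst.getvalue())
```

Used as a real filter, src and dst would be sys.stdin.buffer and
sys.stdout.buffer. Note that the 0x99 byte from the reported error is a C1
control character in ISO-8859-1 but the trademark sign in Windows-1252, so
it may be worth checking which charset the sources actually use before
picking the fallback.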

I need to think a bit more about this problem, especially as it comes up
every now and then. I remember that one proposed solution was the ability
to assign a character set to inputs and make the database outputs aware of
the charset of the message in question (which could change very
frequently). I haven't explored this in enough depth yet, but if it
works, it sounds like a good solution (with a medium time-to-implement
footprint). So let's add this as #5 ;)
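
As a rough illustration of the #5 idea (again Python, not rsyslog code), each
message could carry the charset of the input it arrived on, and the output
would decode accordingly. The input names, the INPUT_CHARSETS table and the
Message type below are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical per-input charset configuration.
INPUT_CHARSETS = {"imudp-legacy": "iso-8859-1", "imtcp-modern": "utf-8"}


@dataclass
class Message:
    raw: bytes      # message bytes exactly as received
    charset: str    # charset assigned by the receiving input


def receive(input_name: str, raw: bytes) -> Message:
    """Tag the raw bytes with the charset configured for this input."""
    return Message(raw, INPUT_CHARSETS.get(input_name, "utf-8"))


def for_output(msg: Message) -> str:
    """Decode using the message's own charset; bad bytes become U+FFFD."""
    return msg.raw.decode(msg.charset, errors="replace")


legacy = receive("imudp-legacy", b"caf\xe9")
modern = receive("imtcp-modern", b"caf\xc3\xa9")
# Both inputs end up as the same text on the output side.
assert for_output(legacy) == for_output(modern) == "caf\u00e9"
```

Since the charset travels with each message, it can differ from one message
to the next without the output needing any global setting.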

From what you gave, I think #1 or #2 is probably the quickest to implement.

@Risto: what would be the minimal solution to solve your problem? What
would be the best one?

Rainer


>
>
> I have not had to deal with character set conversion very much, so I'm not
> familiar with what tools are available to do this sort of conversion
> already.
>
> David Lang
>
>
>
>
>
> On Wed, 18 Sep 2013, Risto Vaarandi wrote:
>
>> hi folks,
>>
>> I've been using the omelasticsearch output module for quite some time,
>> and I am happy with it. However, there is one issue I haven't been able to
>> tackle. Since I am writing data to Elasticsearch from a wide variety of
>> sources, I occasionally run into syslog messages which contain some
>> iso8859 characters. Unfortunately, writing them into Elasticsearch as-is
>> produces the following error:
>>
>> org.elasticsearch.index.mapper.MapperParsingException: failed to parse
>> [@message]
>> ...
>> ...
>> ...
>> Caused by: org.elasticsearch.common.jackson.core.JsonParseException:
>> Invalid UTF-8 start byte 0x99
>>
>> Apparently, the 'json' property replacer is not able to detect and remove
>> (or replace) such characters.
>>
>> As a solution, I have tried to add space-cc or drop-cc property replacer
>> to json, for example:
>>
>> \"@message\":\"%rawmsg:::space-cc,json%\"
>>
>> but they have no effect (in addition, I have specified
>> $EscapeControlCharactersOnReceive off
>> as recommended by rsyslog documentation).
>>
>> Is there any way to handle this problem? So far, I've been happy with
>> rsyslog+Elasticsearch setup, and I wouldn't like to add any Java based tool
>> into the processing pipeline.
>>
>> kind regards,
>> risto
>> _______________________________________________
>> rsyslog mailing list
>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>> http://www.rsyslog.com/professional-services/
>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>> DON'T LIKE THAT.
>>