As far as I know, this is not available in rsyslog. The need comes up once in a
while, but usually it has been fairly easy to work around by interposing an
external program between rsyslog and the output.
Rsyslog doesn't know or care about character sets internally; it deals with
straight C strings that can contain arbitrary non-null bytes.
Currently rsyslog has the control character escaping code because most of what
it has traditionally dealt with has been ASCII text data.
What I think you need is the ability to attach a sed-like filter to an output
so that you can define a mapping of ISO 8859 characters to UTF-8 characters.
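To make the mapping idea concrete, here is a minimal Python sketch of what such a filter would have to do to each message. It is not rsyslog code; the function name to_utf8 and the choice of ISO 8859-1 (latin-1) as the fallback character set are assumptions for illustration only.

```python
def to_utf8(raw: bytes, fallback: str = "latin-1") -> bytes:
    """Return raw as valid UTF-8 bytes.

    Assumption: anything that is not already valid UTF-8 is encoded in
    the fallback character set (ISO 8859-1 by default).
    """
    try:
        # Already valid UTF-8: pass the bytes through unchanged.
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Reinterpret the whole message in the fallback charset and
        # re-encode it as UTF-8.
        return raw.decode(fallback).encode("utf-8")
```

Note that this works on the whole message: a message that mixes valid UTF-8 with stray ISO 8859 bytes is treated entirely as the fallback character set, which may not be what you want.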
Another option would be to create a message modification module to make the
changes to the message that rsyslog is processing instead of to the output.
A third option would be to create a new function that would take the value of a
variable/property, transform it (per additional parameters), and assign the
result to another variable.
Since you are dealing with elasticsearch output, a fourth option (for your
use case) would be to create a string generator module that creates the string
format that you need for elasticsearch and cleans up character sets in the
process.
I think the fourth would be the quickest to get implemented, but the most
limited.
The third would be the most flexible, but the most complicated to use.
Thinking about the second, I wonder how hard it would be to make an mm module
that is little more than a wrapper around an external program, so that the
work can be offloaded to something already optimized for it.
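As a rough sketch of the kind of external program such a wrapper could hand messages to, the Python below reads one message per line and writes one cleaned message per line. The line-per-message protocol and the name filter_stream are assumptions of mine, not an existing rsyslog interface. Unlike a real character set conversion, it simply replaces any byte sequence that is not valid UTF-8 with U+FFFD.

```python
def filter_stream(inp, out):
    """Copy messages from inp to out, one per line, forcing valid UTF-8.

    inp and out are binary file-like objects. Invalid UTF-8 sequences
    are replaced with U+FFFD rather than converted from another charset.
    """
    for line in inp:
        text = line.rstrip(b"\n").decode("utf-8", errors="replace")
        out.write(text.encode("utf-8") + b"\n")
        # Flush after every message so the caller never waits on a buffer.
        out.flush()
```

To run it as a stdin/stdout filter, it could be called as filter_stream(sys.stdin.buffer, sys.stdout.buffer) after importing sys.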
Rainer, any thoughts as to which option would be easiest to implement (both for
this specific character set conversion and the more general conversion problem)?
I have not had to deal with character set conversion very much, so I'm not
familiar with what tools are available to do this sort of conversion already.
David Lang
On Wed, 18 Sep 2013, Risto Vaarandi wrote:
hi folks,
I've been using the omelasticsearch output module for quite some time, and I
am happy with it. However, there is one issue I haven't been able to tackle.
Since I am writing data to Elasticsearch from a wide variety of sources, I
occasionally run into syslog messages which contain some ISO 8859
characters. Unfortunately, when trying to write them into Elasticsearch
as-is, you get back the following error:
org.elasticsearch.index.mapper.MapperParsingException: failed to parse
[@message]
...
...
...
Caused by: org.elasticsearch.common.jackson.core.JsonParseException: Invalid
UTF-8 start byte 0x99
Apparently, the 'json' property replacer is not able to detect and remove (or
replace) such characters.
As a solution, I have tried to add space-cc or drop-cc property replacer to
json, for example:
\"@message\":\"%rawmsg:::space-cc,json%\"
but they have no effect (in addition, I have specified
$EscapeControlCharactersOnReceive off
as recommended by rsyslog documentation).
Is there any way to handle this problem? So far, I've been happy with
rsyslog+Elasticsearch setup, and I wouldn't like to add any Java based tool
into the processing pipeline.
kind regards,
risto
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T
LIKE THAT.