On Wed, 18 Sep 2013, Rainer Gerhards wrote:
On Wed, Sep 18, 2013 at 2:02 PM, David Lang wrote:
As far as I know, this is not available in rsyslog. It's something that
comes up once in a while as a need, but usually it's been fairly easy to
work-around by interposing an external program between rsyslog and the
output.
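
To make the "external program" workaround concrete, here is a minimal sketch of such a filter in Python. It assumes one message per line on stdin and assumes that anything that is not valid UTF-8 is ISO-8859-1; the names to_utf8 and filter_stream are made up for illustration, and how the program gets wired to rsyslog (a pipe, or a module such as omprog) is left open.

```python
import sys


def to_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return raw unchanged if it is valid UTF-8, else re-encode it.

    Assumption: input that fails UTF-8 decoding is in the fallback charset.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")


def filter_stream(src, dst) -> None:
    """Copy binary lines from src to dst, converting each one to UTF-8."""
    for line in src:
        dst.write(to_utf8(line))
        dst.flush()  # do not hold messages back in a buffer


# To run as a filter between rsyslog and the output:
#     filter_stream(sys.stdin.buffer, sys.stdout.buffer)
```

The try-first approach leaves messages that are already UTF-8 untouched, so the filter should be safe to apply to a mixed stream.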
Rsyslog doesn't know or care about character sets internally; it deals
with straight C strings that can contain arbitrary non-null bytes.
Currently rsyslog has the control character escaping code because most of
what it has traditionally dealt with has been ASCII text data.
What I think you need is the ability to attach a sed-like filter to
an output so that you can define a mapping of ISO 8859 characters to UTF-8
characters.
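
As an illustration of what such a mapping could look like, the sketch below builds the table for ISO-8859-1 specifically, where each byte value equals its Unicode code point, so every byte from 0x80 to 0xFF becomes a fixed two-byte UTF-8 sequence. The function names are hypothetical, and other ISO 8859 parts would need their own tables.

```python
def latin1_to_utf8_table() -> dict:
    """Map each high ISO-8859-1 byte to its two-byte UTF-8 encoding."""
    table = {}
    for b in range(0x80, 0x100):
        # Code points U+0080..U+00FF encode as 110000xx 10xxxxxx.
        table[b] = bytes([0xC0 | (b >> 6), 0x80 | (b & 0x3F)])
    return table


TABLE = latin1_to_utf8_table()


def apply_mapping(raw: bytes, table: dict = TABLE) -> bytes:
    """Replace every mapped byte; pass all other bytes through unchanged."""
    return b"".join(table.get(b, bytes([b])) for b in raw)


print(apply_mapping(b"caf\xe9"))  # b'caf\xc3\xa9'
```

Note that a blind table like this maps every high byte, so input that is already UTF-8 would end up double-encoded; it only works if the filter sees nothing but ISO-8859-1 data.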
Another option would be to create a message modification module to make
the changes to the message that rsyslog is processing instead of to the
output.
A third option would be to create a new function that would take the value
of a variable/property, transform it (per additional parameters), and
assign the result to another variable.
Since you are dealing with elasticsearch output, a fourth option (for your
use case) would be to create a string generator module that creates the
string format that you need for elasticsearch and cleans up character
sets along the way.
I think the fourth would be the quickest to implement, but the most
limited.
The third would be the most flexible, but the most complicated to use.
Thinking about the second, I wonder how hard it would be to make an mm
module that is little more than a wrapper around an external program, so
that the work can be offloaded to something already optimized for it.
Rainer, any thoughts as to which option would be easiest to implement
(both for this specific character set conversion and the more general
conversion problem)?
I need to think a bit more about this problem, especially as it boils up
every now and then. I remember that one proposed solution was the ability
to assign a character set to inputs and make the database outputs aware of
the charset of the message in question (which could change very
frequently). I haven't explored this in enough depth right now, but if it
works, it sounds like a good solution (with medium time-to-implement
footprint). So let's add this as #5 ;)
Here's a very ugly situation that could develop from this approach:
- you receive data from input1 with character set 1
- you create a global variable
- you now receive data from input2 with character set 2
- you try to use that global variable with your current message
Which character set do you tell the output?
I think the only way to address this problem is to have a conversion function
that can be used in variable assignment. This could be either an explicit
conversion, or the ability to generate a string with a template (to use option
#1 for the conversion). There are other cases where using template-style string
generation to create a variable would be very nice (being able to control
timestamp formatting, for example).
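
The mixing problem is easy to reproduce outside of rsyslog. The sketch below uses plain byte strings with made-up payloads (nothing here is rsyslog code): a value saved raw from an ISO-8859-1 input and combined with a UTF-8 message cannot be labelled with either charset, while converting at assignment time keeps everything consistent.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


EXPECTED = "caf\u00e9 gr\u00f6\u00dfe"

# Hypothetical payloads: input1 sends ISO-8859-1, input2 sends UTF-8.
from_input1 = "caf\u00e9".encode("iso-8859-1")
from_input2 = "gr\u00f6\u00dfe".encode("utf-8")

# Case 1: the global variable keeps input1's raw bytes.
saved_raw = from_input1
mixed = saved_raw + b" " + from_input2

# Case 2: the value is converted once, when the variable is assigned.
saved_utf8 = from_input1.decode("iso-8859-1").encode("utf-8")
clean = saved_utf8 + b" " + from_input2

print(is_valid_utf8(mixed))                      # False
print(mixed.decode("iso-8859-1") == EXPECTED)    # False (decodes, but garbled)
print(is_valid_utf8(clean))                      # True
print(clean.decode("utf-8") == EXPECTED)         # True
```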
David Lang
From what you gave, I think #1 or 2 is probably the quickest to implement.
@Risto: what would be the minimal solution to solve your problem? What
would be the best one?
Rainer
I have not had to deal with character set conversion very much, so I'm not
familiar with what tools are available to do this sort of conversion
already.
David Lang
On Wed, 18 Sep 2013, Risto Vaarandi wrote:
hi folks,
I've been using the omelasticsearch output module for quite some time,
and I am happy with it. However, there is one issue I haven't been able to
tackle. Since I am writing data to Elasticsearch from a wide variety of
sources, I occasionally run into syslog messages which contain some
ISO 8859 characters. Unfortunately, when I try to write them into
Elasticsearch as-is, I get back the following error:
org.elasticsearch.index.mapper.MapperParsingException: failed to parse
[@message]
...
...
...
Caused by: org.elasticsearch.common.jackson.core.JsonParseException:
Invalid UTF-8 start byte 0x99
Apparently, the 'json' property replacer is not able to detect and remove
(or replace) such characters.
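
For anyone wondering why 0x99 in particular is rejected: in UTF-8, bytes in the range 0x80 to 0xBF are continuation bytes and can never begin a character. The short Python sketch below (the helper names are only illustrative) classifies the byte and shows what it decodes to in two single-byte charsets. That 0x99 is a C1 control code in ISO-8859-1 but the trademark sign in Windows-1252 may hint that the data is really Windows-1252, though that is only a guess.

```python
def classify_utf8_byte(b: int) -> str:
    """Say what role a single byte value can play in a UTF-8 stream."""
    if b <= 0x7F:
        return "ascii"
    if b <= 0xBF:
        return "continuation"  # 10xxxxxx, never valid as a first byte
    if 0xC2 <= b <= 0xDF:
        return "lead-2"
    if 0xE0 <= b <= 0xEF:
        return "lead-3"
    if 0xF0 <= b <= 0xF4:
        return "lead-4"
    return "invalid"  # 0xC0, 0xC1 and 0xF5..0xFF never appear in UTF-8


def fails_as_utf8(raw: bytes) -> bool:
    """Return True if raw cannot be decoded as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


print(classify_utf8_byte(0x99))               # continuation
print(fails_as_utf8(b"log \x99 entry"))       # True
print(ascii(b"\x99".decode("iso-8859-1")))    # '\x99'
print(ascii(b"\x99".decode("cp1252")))        # '\u2122'
```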
As a solution, I have tried to add space-cc or drop-cc property replacer
to json, for example:
\"@message\":\"%rawmsg:::space-cc,json%\"
but they have no effect (in addition, I have specified
$EscapeControlCharactersOnReceive off
as recommended by rsyslog documentation).
Is there any way to handle this problem? So far, I've been happy with
rsyslog+Elasticsearch setup, and I wouldn't like to add any Java based tool
into the processing pipeline.
kind regards,
risto
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
DON'T LIKE THAT.