On Thu, Oct 3, 2013 at 9:09 AM, Risto Vaarandi <[email protected]> wrote:

> On 09/27/2013 02:33 PM, Risto Vaarandi wrote:
>
>> On 09/20/2013 06:29 PM, Rainer Gerhards wrote:
>>
>>> On Fri, Sep 20, 2013 at 12:01 PM, Axel Rau <[email protected]> wrote:
>>>
>>>>> Obviously, there are many ways to improve that module. But I thought
>>>>> I'd at least get it started and gather some feedback. If time permits,
>>>>> I'll add some more functionality later today. But the basic need should
>>>>> be solved (if I understood correctly ;)).
>>>>>
>>>> I think the "basic need" would be to replace only invalid UTF-8
>>>> sequences, not every character with code < 32 or > 126.
>>>>
>>>>
>>> Thanks. I have just committed a version that does this (by default) to
>>> the master branch.
>>>
>>>
>>>> Does the restriction to IPv4 apply only to expressions like
>>>>          if $fromhost-ip == "10.0.0.1" then
>>>> or in general?
>>>>
>>>>
>>> That was a leftover from the doc I used as basis, mmutf8fix doesn't care
>>> about IPvx. It's removed now.
>>>
>>> @risto: you now need to specify mode="controlcharacters" to replace all
>>> non-printable US-ASCII characters, as proper UTF-8 checks are default
>>> now.
>>> I thought that's more appropriate for this module ;)
>>>
>>> See last sample in doc:
>>> http://www.rsyslog.com/doc/mmutf8fix.html
>>>
>>> Feedback on this module is appreciated.
>>>
>>
>> I have had it running for about a week and so far it has been able to
>> write all log messages into Elasticsearch without issues (many millions
>> of messages per day). Looks like it is working just as expected.
>> kind regards,
>> risto
>>
>>
> Actually, yesterday there was one write failure into elasticsearch.
> I had rsyslogd running with the
>
> action(type="mmutf8fix" replacementChar="_")
>
> statement, which accepts not only US-ASCII but also all UTF-8 characters.
>
> The log message was badly malformed and a number of replacements were done
> (my replacement character is "_", as you can see from the statement above).
> However, at the very end of the log message the final byte was a non-UTF-8
> character, and it was left unreplaced. The last 4 bytes of the message
> look as follows:
>
> <us-ascii char><us-ascii char><replaced char><unreplaced non-utf8 char>
>
> The error message produced by Elasticsearch is as follows:
> Invalid UTF-8 middle byte 0x22
>
> The issue is not very urgent, because I mostly care about US-ASCII
> characters and can thus enable mode="controlcharacters" as a workaround.
>
>
I finally fixed this border case; the patch is here:

http://git.adiscon.com/?p=rsyslog.git;a=commitdiff;h=97bda43e372a506671cb7007b6041e4160a02b04

I would appreciate it if you could apply and test the patch, as I only had
time to do a very quick test (plus, obviously, nothing beats high-volume real
traffic ;)).
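
For anyone who wants to reproduce the border case outside of rsyslog, here is
a minimal Python sketch of the intended behavior. It is not the C code from
the patch: fix_utf8, _is_valid and their parameters are hypothetical names
used only for illustration. It also assumes that the unreplaced final byte
was a multi-byte lead byte whose continuation bytes were missing, which is an
inference from the "Invalid UTF-8 middle byte 0x22" error (0x22 is the
double quote that would follow the message in the JSON sent to
Elasticsearch).

```python
def _is_valid(seq: bytes) -> bool:
    """Return True if seq is well-formed UTF-8 (strict decoding)."""
    try:
        seq.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def fix_utf8(msg: bytes, repl: bytes = b"_") -> bytes:
    """Replace every byte that is not part of a valid UTF-8 sequence.

    Hypothetical helper, not the mmutf8fix implementation.
    """
    out = bytearray()
    i = 0
    while i < len(msg):
        b = msg[i]
        # Expected sequence length, derived from the lead byte.
        if b < 0x80:
            length = 1
        elif 0xC2 <= b <= 0xDF:
            length = 2
        elif 0xE0 <= b <= 0xEF:
            length = 3
        elif 0xF0 <= b <= 0xF4:
            length = 4
        else:
            length = 0  # stray continuation byte or invalid lead byte

        seq = msg[i:i + length]
        # len(seq) < length means the sequence is cut off by the end of
        # the message; that is the border case discussed in this thread.
        if length and len(seq) == length and _is_valid(seq):
            out += seq
            i += length
        else:
            out += repl
            i += 1
    return bytes(out)


# Two ASCII bytes, one invalid byte, then a lead byte with nothing after it.
print(fix_utf8(b"ab\xff\xc3"))
```

With this sketch the truncated lead byte at the end of the message is
replaced as well, so the result always decodes as UTF-8.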

Thanks,
Rainer