On Mon Jan 25 03:12 AM, Rainer Gerhards wrote:
> So I don't think it would serve the non-US-ASCII world well to process 
> the transformation formats. I guess that's a good option if you have a 
> US-ASCII based system that only very occasionally needs to process a 
> foreign language string (and even then, you need to parse the message
> *each* time you access it, specifically when obtaining substrings...).
> 
> My conclusion is that rsyslog needs to do a UTF to UCS conversion on 
> entry to the system and then uses UCS internally (and converts back 
> when messages are output). Many software systems do so, and, as I 
> said, IMHO do so for good reasons.
> 

What about adding a property option, something like 'normalize-utf8', under
which invalid UTF-8 bytes would be escaped?

$template dbFormat,"insert into text_logs (utf8_message) values ('%msg:::normalize-utf8%')",stdsql

I can probably dig through the PostgreSQL source to find the code that
detects invalid UTF-8 bytes.
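
To make the idea concrete, here is a minimal C sketch of what such an option
might do. It is not taken from rsyslog or PostgreSQL. The function names
(utf8_seq_len, normalize_utf8) and the "\xHH" escape form are assumptions made
up for illustration; the validity rules follow the well-formed byte sequences
of RFC 3629 (no overlong forms, no surrogates, nothing above U+10FFFF).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the length (1..4) of the well-formed UTF-8
 * sequence starting at s, or 0 if the bytes at s do not form one.
 * n is the number of bytes available at s. */
static size_t utf8_seq_len(const unsigned char *s, size_t n)
{
    size_t len, i;
    unsigned long cp;
    unsigned char c;

    if (n == 0)
        return 0;
    c = s[0];
    if (c < 0x80)
        return 1;                      /* plain US-ASCII */
    if (c >= 0xC2 && c <= 0xDF) {
        len = 2; cp = c & 0x1F;
    } else if ((c & 0xF0) == 0xE0) {
        len = 3; cp = c & 0x0F;
    } else if (c >= 0xF0 && c <= 0xF4) {
        len = 4; cp = c & 0x07;
    } else {
        return 0;                      /* stray continuation, 0xC0/0xC1, 0xF5..0xFF */
    }
    if (n < len)
        return 0;                      /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                  /* not a continuation byte */
        cp = (cp << 6) | (unsigned long)(s[i] & 0x3F);
    }
    if (len == 3 && cp < 0x800)
        return 0;                      /* overlong 3-byte form */
    if (len == 4 && (cp < 0x10000 || cp > 0x10FFFF))
        return 0;                      /* overlong or out of range */
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                      /* UTF-16 surrogate */
    return len;
}

/* Hypothetical 'normalize-utf8': copy in[0..in_len) to out, replacing every
 * byte that is not part of a well-formed sequence with the four characters
 * \xHH. Returns the number of bytes written (excluding the terminating NUL),
 * or (size_t)-1 if out_cap is too small. */
static size_t normalize_utf8(const char *in, size_t in_len,
                             char *out, size_t out_cap)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t i = 0, o = 0;

    assert(in != NULL && out != NULL);
    while (i < in_len) {
        size_t len = utf8_seq_len(s + i, in_len - i);
        if (len > 0) {
            if (o + len >= out_cap)    /* keep room for the NUL */
                return (size_t)-1;
            memcpy(out + o, s + i, len);
            o += len;
            i += len;
        } else {
            if (o + 4 >= out_cap)
                return (size_t)-1;
            snprintf(out + o, 5, "\\x%02X", (unsigned)s[i]);
            o += 4;
            i += 1;
        }
    }
    if (o >= out_cap)
        return (size_t)-1;
    out[o] = '\0';
    return o;
}
```

Valid input, including pure US-ASCII, is copied through unchanged, so the
output never grows unless a byte actually has to be escaped. Note that this
sketch only deals with encoding validity; any SQL quoting would still have to
be applied separately.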

I'm not sure I understood correctly: are you suggesting that all input to
rsyslog be converted to UCS internally?
That seems like a huge performance penalty to pay when most people (?) log
US-ASCII or UTF-8 data.


_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
