On Thu, 2013-03-14 at 14:04 +0100, Nicolas HAHN wrote:
> Rainer Gerhards <[email protected]> wrote:
>
> > I think this is probably a good short-term solution. Would a
> > parameter for ompgsql be sufficient (for your cases)? If I need to add
> > it to the input, that's a much larger change, as I would need to
> > carry that encoding all through the engine.
>
> I think yes.
>

OK, that simplifies things. Please ping me if you have not heard back by
end of next week.

> >> Why re-implement a translation module doing the same directly in
> >> Rsyslog, as it is handled by PostgreSQL?
> > Well, obviously this is just a pg solution, so in the long term, doing it
> > generally would make sense IMHO.
>
> Well, the database engines rsyslog is able to work with so far (MySQL,
> PostgreSQL, Oracle) all allow the definition of the character set.
>
> In PostgreSQL this is SET client_encoding='value';
> In MySQL this is SET NAMES 'value', which combines three other MySQL
> instructions:
>
> SET character_set_client = 'value';
> SET character_set_results = 'value';
> SET character_set_connection = 'value';
>
> The SET NAMES syntax is SQL92 standard.
>
> In Oracle, I may be wrong, but it should be SET NLS_LANG='value'
>
> So generally, any database should have this instruction available in one
> way or another.
>

That's good to know, but it still limits this to databases. A very important
use case IMHO is relaying, to solve exactly the problem you mentioned: a
single sender sending in multiple character sets.

Rainer

> Saying that, having both possibilities (let the DB do the job by setting the
> correct SET instruction, and implementing something directly in Rsyslog in
> the longer term) would be a good idea.
>
> KR.
> Nicolas
> -
> United Nations International Computing Center
> Geneva
>
> >> I don't see the interest except...
> >>
> >> ...if the client sending logs to the rsyslog server is so crappy
> >> that it does mix several encoding types
> > well... I would consider this to be the regular use case in relay
> > chains (for the upper-level relays). To prevent it, the leaf-level
> > relays would need to do the code translation.
> >
> > Rainer
> >> in the same log line or even in the same block of log lines. Here,
> >> yes, you would need something "more granular" like a property
> >> replacer module... Maybe...
> >>
> >> KR.
> >> Nicolas
> >> -
> >> United Nations International Computing Center
> >> Geneva
> >>
> >> That would solve 90% of users' issues related to encoding transformations.
> >>
> >> David Lang <[email protected]> wrote:
> >>
> >> > I haven't seen any changes mentioned that would affect this, so I
> >> > doubt that anything has changed in newer versions.
> >> >
> >> > If you know that your data is in LATIN1, then I would suggest that
> >> > you look at getting a plugin written that converts all log data from
> >> > LATIN1 to UTF8. This could either be a parser module or a message
> >> > manipulation module. This should be a fairly simple module to write.
> >> >
> >> > David Lang
> >> >
> >> > On Wed, 13 Mar 2013, Nicolas HAHN wrote:
> >> >
> >> >> Hello there,
> >> >>
> >> >> I suppose plenty of you have already discussed the problem I describe
> >> >> below, because I've found discussions about it from Jan. 2010.
> >> >>
> >> >> In fact, as Rsyslog doesn't handle any encoding conversions (from
> >> >> SQL_ASCII to UTF8 or whatever) when configured to send logs to a
> >> >> PostgreSQL database, and it all depends on the original sender of
> >> >> the logs, that can generate a lot of errors like this in PostgreSQL:
> >> >>
> >> >> ERROR: invalid byte sequence for encoding "UTF8": 0xa0
> >> >>
> >> >> The side effect is that in this case Rsyslog stops sending logs to
> >> >> the DB and starts to spool in a directory (if configured to do so).
> >> >>
> >> >> In the various threads I've seen about this problem, I
> >> >> found that at the time Rainer proposed to implement something in
> >> >> Rsyslog to handle such specific cases in one way or another. Like a
> >> >> property replacer or whatever.
> >> >>
> >> >> In parallel, there were also several suggestions like running
> >> >> postgresql with SQL-ASCII encoded databases for example.
> >> >> Or find a way in the OMPGSQL module, when the connection is opened
> >> >> to the DB, to set the client_encoding variable, which is a simple
> >> >> SQL command, for the duration of the session.
> >> >>
> >> >> I don't know if some of you found a way to fix this kind of issue
> >> >> when it's just impossible to have the DB in SQL-ASCII, as it is for us.
> >> >> Furthermore, SQL-ASCII is clearly deprecated in favor of UTF-8.
> >> >>
> >> >> The first question:
> >> >> is the latest version of Rsyslog able to deal with encoding
> >> >> translations?
> >> >>
> >> >> The workaround I use personally:
> >> >> for me it's not possible to set the DB as SQL-ASCII. I have to keep it
> >> >> as UTF-8.
> >> >> So I had to find a way for rsyslog to get the translation
> >> >> from one encoding to another done directly.
> >> >> Here is what I did: I've simply added the SET
> >> >> client_encoding='SOMETHING' in the SQL template, just before the
> >> >> INSERT command. Here is an example:
> >> >> Originally, the SQL template was like this in my config:
> >> >>
> >> >> $template mail_pgsql,"INSERT INTO es_systemevents(Message, Facility,
> >> >> FromHost, Priority, DeviceReportedTime, ReceivedAt, InfoUnitID,
> >> >> SysLogTag) values ('%msg%', %syslogfacility%, '%HOSTNAME%',
> >> >> %syslogpriority%, '%timereported:::date-rfc3339%',
> >> >> '%timegenerated:::date-rfc3339%', %iut%, '%syslogtag%')", STDSQL
> >> >>
> >> >> Now, the workaround is like this:
> >> >>
> >> >> $template es2_mail_pgsql,"set client_encoding='LATIN1';INSERT INTO
> >> >> es_systemevents(Message, Facility, FromHost, Priority,
> >> >> DeviceReportedTime, ReceivedAt, InfoUnitID, SysLogTag) values
> >> >> ('%msg%', %syslogfacility%, '%HOSTNAME%', %syslogpriority%,
> >> >> '%timereported:::date-rfc3339%', '%timegenerated:::date-rfc3339%',
> >> >> %iut%, '%syslogtag%')", STDSQL
> >> >>
> >> >> This is the way I fixed the conversion issue while keeping the
> >> >> database as UTF8.
> >> >> I used LATIN1 because I know my incoming data is
> >> >> in this format. You can use the same but be careful to use the
> >> >> correct encoding for your installation.
> >> >>
> >> >> Of course, it would be much better to have an option
> >> >> available directly at the level of the ompgsql module to set the
> >> >> client_encoding option. Something like:
> >> >> output(type="ompgsql" client_encoding="LATIN1")
> >> >> in the /etc/rsyslog.conf file (that's a suggestion :-))
> >> >>
> >> >> And now my second question: if some of you have fixed this issue
> >> >> another way (except by having a DB directly with SQL-ASCII
> >> >> encoding), could you please share how you did it?
> >> >>
> >> >> Thanks a lot for your answers.
> >> >>
> >> >> KR.
> >> >> Nicolas
> >> >> -
> >> >> United Nations International Computing Center
> >> >> Geneva
> >> >>
> >> >> ----------------------------------------------------------------
> >> >> This message was sent using IMP, the Internet Messaging Program.
> >> >> _______________________________________________
> >> >> rsyslog mailing list
> >> >> http://lists.adiscon.net/mailman/listinfo/rsyslog
> >> >> http://www.rsyslog.com/professional-services/
> >> >> What's up with rsyslog? Follow https://twitter.com/rgerhards
> >> >> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
> >> >> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT
> >> >> POST if you DON'T LIKE THAT.
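For illustration only, the conversion David Lang suggests in the thread (turning LATIN1 log data into UTF8 before it reaches the database) can be sketched in a few lines of Python. This is not rsyslog code and not an existing rsyslog module; the function name `to_utf8` and its `assumed_encoding` parameter are made up for this sketch. It assumes, as Nicolas does for his installation, that any payload that is not valid UTF-8 is LATIN1.

```python
def to_utf8(raw: bytes, assumed_encoding: str = "latin-1") -> bytes:
    """Return a UTF-8 version of a raw log payload.

    Hypothetical helper: payloads that already decode as UTF-8 are passed
    through untouched; anything else is assumed to be in `assumed_encoding`
    (LATIN1 by default) and is re-encoded as UTF-8.
    """
    try:
        # Already valid UTF-8 (this includes plain ASCII): nothing to do.
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Bytes such as a lone 0xa0 end up here, the same kind of input
        # that PostgreSQL rejects as an invalid byte sequence for "UTF8".
        return raw.decode(assumed_encoding).encode("utf-8")


if __name__ == "__main__":
    # A LATIN1 non-breaking space (0xa0) inside an otherwise ASCII message.
    sample = b"disk usage:\xa095%"
    print(to_utf8(sample))
```

The "valid UTF-8 passes through" check is only a heuristic: a LATIN1 byte sequence that happens to also be valid UTF-8 would be left as-is. It also does not cover the relay case Rainer raises, where a single sender mixes several character sets and the source encoding would have to be known per message.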

