On Thu, 2013-03-14 at 11:26 +0100, Nicolas HAHN wrote:
> Hello David,
>
> Well, my opinion is that we should avoid reinventing the wheel when possible.
>
> What I mean is that encoding translation is supported by the PostgreSQL
> engine on the fly, in a transparent manner, between tens of encoding schemes.
> So the simplest and fastest thing would be to simply have the OMPGSQL module
> accept an additional parameter (client_encoding), and each time OMPGSQL opens
> a connection to the database, the first SQL command it sends to the SQL
> backend should be "SET client_encoding='value'".
>

I think this is probably a good short-term solution. Would a parameter for
ompgsql be sufficient (for your cases)? If I need to add it to the input,
that's a much larger change, as I would need to carry that encoding all
through the engine.

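To make the connection-time idea above concrete, here is a minimal Python
sketch. It is not ompgsql code (ompgsql is written in C); it only shows the
ordering: open the session, send SET client_encoding first, then do the
INSERTs. It assumes a DB-API 2.0 style connection object. The names
client_encoding_statement and open_session are hypothetical, and the encoding
name is validated because it is interpolated into the SQL text.

```python
import re

# Hypothetical helpers for illustration; not part of rsyslog or ompgsql.
_ENCODING_NAME = re.compile(r"[A-Za-z0-9_-]+")


def client_encoding_statement(encoding: str) -> str:
    """Build the statement to send first on a new session."""
    # Only accept plain identifier-like names, since the value ends up
    # inside the SQL string.
    if not _ENCODING_NAME.fullmatch(encoding):
        raise ValueError(f"suspicious encoding name: {encoding!r}")
    return f"SET client_encoding = '{encoding}'"


def open_session(connect, encoding=None):
    """Open a connection and optionally set client_encoding before any INSERT.

    `connect` is any zero-argument callable returning a DB-API 2.0
    connection (assumption: the driver exposes cursor()/execute()).
    """
    conn = connect()
    if encoding is not None:
        cur = conn.cursor()
        cur.execute(client_encoding_statement(encoding))
        cur.close()
    return conn
```

With such a parameter, the SET would be issued once per session instead of
being repeated in front of every INSERT in the template.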
> Why re-implement a translation module doing the same directly in Rsyslog,
> when it is already handled by PostgreSQL?

Well, obviously this is just a pg solution, so in the long term, doing it
generally would make sense IMHO.

> I don't see the interest except...
>
> ...if the client sending logs to the rsyslog server is so crappy that it does
> mix several encoding types

well... I would consider this to be the regular use case in relay chains
(for the upper-level relays). To prevent it, the leaf-level relays would
need to do the code translation.

Rainer

> in the same log line or even in the same block of log lines. Here, yes, you
> would need something "more granular" like a property replacer module...
> Maybe...
>
> KR.
> Nicolas
> -
> United Nations International Computing Center
> Geneva
>
> That would solve 90% of users' issues related to encoding transformations.
>
> David Lang <[email protected]> wrote:
>
> > I haven't seen any changes mentioned that would affect this, so I
> > doubt if anything has changed in newer versions.
> >
> > If you know that your data is in LATIN1, then I would suggest that
> > you look at getting a plugin written that converts all log data from
> > LATIN1 to UTF8. This could either be a parser module or a message
> > manipulation module. This should be a fairly simple module to write.
> >
> > David Lang
> >
> > On Wed, 13 Mar 2013, Nicolas HAHN wrote:
> >
> >> Hello there,
> >>
> >> I suppose plenty of you have already discussed the problem I describe
> >> below, because I found discussions about it dating from Jan.
> >> 2010.
> >>
> >> In fact, as Rsyslog doesn't handle any encoding conversions (from
> >> SQL_ASCII to UTF8 or whatever) when configured to send logs to a
> >> PostgreSQL database, and it all depends on the original sender of
> >> the logs, that can generate a lot of errors like this in PostgreSQL:
> >>
> >> ERROR: invalid byte sequence for encoding "UTF8": 0xa0
> >>
> >> The side effect is that in this case Rsyslog stops sending logs to
> >> the DB and starts to spool in a directory (if configured to do so).
> >>
> >> In the various threads I've seen about this problem, I
> >> found that at the time Rainer proposed to implement something in
> >> Rsyslog to handle such specific cases in one way or another. Like a
> >> property replacer or whatever.
> >>
> >> In parallel, there were also several suggestions, like running
> >> PostgreSQL with SQL_ASCII encoded databases for example. Or finding a
> >> way in the OMPGSQL module, when the connection is opened to the DB,
> >> to set the client_encoding variable, which is a simple SQL command,
> >> for the duration of the session.
> >>
> >> I don't know if some of you found a way to fix this kind of issue
> >> when it's just impossible to have the DB in SQL_ASCII, as is the case
> >> for us. Furthermore, SQL_ASCII is clearly deprecated in favor of UTF-8.
> >>
> >> The first question:
> >> is the latest version of Rsyslog able to deal with encoding translations?
> >>
> >> The workaround I use personally:
> >> for me it's not possible to set the DB as SQL_ASCII. I have to keep it
> >> as UTF-8.
> >> So I had to find a way for rsyslog to trigger the translation
> >> from one encoding to another directly.
> >> Here is what I did: I simply added SET
> >> client_encoding='SOMETHING' to the SQL template, just before the
> >> INSERT command.
> >> Here is an example.
> >> Originally, the SQL template in my config looked like this:
> >>
> >> $template mail_pgsql,"INSERT INTO es_systemevents(Message, Facility,
> >> FromHost, Priority, DeviceReportedTime, ReceivedAt, InfoUnitID,
> >> SysLogTag) values ('%msg%', %syslogfacility%, '%HOSTNAME%',
> >> %syslogpriority%, '%timereported:::date-rfc3339%',
> >> '%timegenerated:::date-rfc3339%', %iut%, '%syslogtag%')", STDSQL
> >>
> >> Now, the workaround looks like this:
> >>
> >> $template es2_mail_pgsql,"set client_encoding='LATIN1';INSERT INTO
> >> es_systemevents(Message, Facility, FromHost, Priority,
> >> DeviceReportedTime, ReceivedAt, InfoUnitID, SysLogTag) values
> >> ('%msg%', %syslogfacility%, '%HOSTNAME%', %syslogpriority%,
> >> '%timereported:::date-rfc3339%', '%timegenerated:::date-rfc3339%',
> >> %iut%, '%syslogtag%')", STDSQL
> >>
> >> This is the way I fixed the conversion issue while keeping the
> >> database as UTF8. I used LATIN1 because I know my incoming data is
> >> in this format. You can do the same, but be careful to use the
> >> correct encoding for your installation.
> >>
> >> Of course, it would be much better to have an option
> >> available directly at the level of the ompgsql module to set the
> >> client_encoding option. Something like:
> >> output(type="ompgsql" client_encoding="LATIN1")
> >> in the /etc/rsyslog.conf file (that's a suggestion :-))
> >>
> >> And now my second question: if some of you have fixed this issue
> >> another way (except by having a DB directly with SQL_ASCII
> >> encoding), could you please share how you did it?
> >>
> >> Thanks a lot for your answers.
> >>
> >> KR.
> >> Nicolas
> >> -
> >> United Nations International Computing Center
> >> Geneva
> >>
> >>
> >>
> >> ----------------------------------------------------------------
> >> This message was sent using IMP, the Internet Messaging Program.
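
As an illustration of why PostgreSQL rejects the byte 0xa0 in a UTF8 session,
and of what a LATIN1-to-UTF8 conversion step along the lines David suggests
would do, here is a small Python sketch. The function name to_utf8 and the
"try UTF-8 first, fall back to Latin-1" heuristic are assumptions made for
illustration, not rsyslog behaviour. The heuristic works per message, which
matters when relays mix encodings, but it can misjudge Latin-1 input that
happens to be valid UTF-8.

```python
def to_utf8(raw: bytes, fallback: str = "latin-1") -> bytes:
    """Return `raw` as valid UTF-8, re-encoding from `fallback` if needed.

    Hypothetical helper for illustration only.
    """
    try:
        # Already valid UTF-8: pass the message through unchanged.
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so with the default
        # fallback this decode cannot fail.
        return raw.decode(fallback).encode("utf-8")


# A lone 0xa0 (no-break space in Latin-1) is not a valid UTF-8 sequence,
# which is what the PostgreSQL error above complains about.
latin1_message = b"disk\xa0full"
utf8_message = to_utf8(latin1_message)
```

Setting client_encoding to LATIN1 asks the server to do the equivalent
conversion for the whole session, which is only correct if every message
really is LATIN1.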
> >> _______________________________________________
> >> rsyslog mailing list
> >> http://lists.adiscon.net/mailman/listinfo/rsyslog
> >> http://www.rsyslog.com/professional-services/
> >> What's up with rsyslog? Follow https://twitter.com/rgerhards
> >> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
> >> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT
> >> POST if you DON'T LIKE THAT.

