On Thu, 2013-03-14 at 14:04 +0100, Nicolas HAHN wrote:
> Rainer Gerhards <[email protected]> wrote:
> 
> > I think this is probably a good short-term solution. Would a 
> > parameter for ompgsql be sufficient (for your cases)? If I need to add 
> > it to the input, that's a much larger change, as I would need to 
> > carry that encoding all through the engine.
> 
> I think yes.
> 
OK, that simplifies things. Please ping me if you have not heard back by
end of next week.

> >> Why re-implementing a translation module doing the same directly in 
> >> Rsyslog, as it is handled by PostgreSQL?
> > Well, obviously this is just a pg solution, so in long term, doing it
> > generally would make sense IMHO.
> 
> Well, the database engines rsyslog is able to work with so far (MySQL, 
> PostgreSQL, Oracle) all allow the character set to be defined.
> 
> In PostgreSQL this is SET client_encoding='value';
> In MySQL this is SET NAMES 'value', which combines three other MySQL 
> statements:
> 
> SET character_set_client = 'value'; SET character_set_results = 'value'; 
> SET character_set_connection = 'value';
> 
> The SET NAMES syntax is SQL92 standard. 
> 
> In Oracle, I may be wrong, but it should be SET NLS_LANG='value' 
> 
> So generally, any database should have this instruction available in one 
> way or another. 
> 
That's good to know, but still limits it to databases. A very important
use-case IMHO is relaying, to solve exactly the problem you mentioned
that a single sender sends in multiple character sets.
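
A rough sketch of what such a conversion step at a relay could look
like, written in Python rather than as rsyslog code. The function name
to_utf8 and the LATIN1 fallback are assumptions for illustration only;
this is not an existing rsyslog feature. Valid UTF-8 is passed through
untouched, anything else is re-encoded from the assumed fallback
encoding:

```python
def to_utf8(raw: bytes, fallback: str = "latin-1") -> bytes:
    """Return raw as UTF-8 bytes.

    Hypothetical helper: if raw already decodes as UTF-8 it is returned
    unchanged; otherwise it is assumed to be in the fallback encoding
    (LATIN1 here) and is re-encoded.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")


if __name__ == "__main__":
    # 0xa0 alone is the byte PostgreSQL rejects as invalid UTF-8;
    # in LATIN1 it is a no-break space.
    print(to_utf8(b"\xa0"))
    print(to_utf8(b"caf\xe9"))
```

Such a check works per message, so it copes with a sender that switches
character sets between messages. It cannot repair a single line that
mixes encodings, and LATIN1 input that happens to be valid UTF-8 would
pass through unconverted.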

Rainer
> That said, having both possibilities (letting the DB do the job by issuing 
> the correct SET instruction, and implementing something directly in Rsyslog 
> in the longer term) would be a good idea. 
> 
> KR.
> Nicolas
> -
> United Nations International Computing Center
> Geneva
> 
> >> I don't see the interest except...
> >>
> >> ...if the client sending logs to the rsyslog server is so crappy 
> >> that it does mix several encoding types
> > well... I would consider this to be the regular use case in relay 
> > chains (for the upper-level relays). To prevent it, the leaf-level 
> > relays would need to do the code translation.
> >
> > Rainer
> >>  in the same log line or even in the same block of log lines. Here, 
> >> yes, you would need something "more granular" like a property 
> >> replacer module... Maybe...
> >>
> >> KR.
> >> Nicolas
> >> -
> >> United Nations International Computing Center
> >> Geneva
> >>
> >> That would solve 90% of users' issues related to encoding transformations.
> >>
> >> David Lang <[email protected]> wrote:
> >>
> >> > I haven't seen any changes mentioned that would affect this, so I
> >> > doubt if anything has changed in newer versions.
> >> >
> >> > If you know that your data is in LATIN1, then I would suggest that
> >> > you look at getting a plugin written that converts all log data from
> >> > LATIN1 to UTF8. This could either be a parser module or a message
> >> > manipulation module. This should be a fairly simple module to write.
> >> >
> >> > David Lang
> >> >
> >> > On Wed, 13 Mar 2013, Nicolas HAHN wrote:
> >> >
> >> >> Hello there,
> >> >>
> >> >> I suppose many of you have already discussed the problem I
> >> >> describe below, because I found threads about it from Jan.
> >> >> 2010.
> >> >>
> >> >> Rsyslog doesn't perform any encoding conversion (from
> >> >> SQL_ASCII to UTF8 or whatever) when configured to send logs to a
> >> >> PostgreSQL database, so the encoding depends entirely on the
> >> >> original sender of the logs. That can generate a lot of errors
> >> >> like this in PostgreSQL:
> >> >>
> >> >> ERROR:  invalid byte sequence for encoding "UTF8": 0xa0
> >> >>
> >> >> The side effect is that in this case Rsyslog stops sending logs to
> >> >> the DB and starts spooling to a directory (if configured to do so).
> >> >>
> >> >> In the various threads I've seen about this problem, I found
> >> >> that at the time Rainer proposed implementing something in
> >> >> Rsyslog to handle this specific case one way or another, such
> >> >> as a property replacer.
> >> >>
> >> >> In parallel, there were also several suggestions, such as running
> >> >> PostgreSQL with SQL_ASCII-encoded databases, or finding a way in
> >> >> the ompgsql module to set the client_encoding variable (a simple
> >> >> SQL command) for the duration of the session when the connection
> >> >> to the DB is opened.
> >> >>
> >> >> I don't know whether any of you have found a way to fix this kind
> >> >> of issue when, as for us, it is simply impossible to have the DB
> >> >> in SQL_ASCII. Furthermore, SQL_ASCII is clearly deprecated in
> >> >> favor of UTF-8.
> >> >>
> >> >> The first question:
> >> >> is the latest version of Rsyslog able to deal with encoding 
> >> >> translations?
> >> >>
> >> >> The workaround I use personally:
> >> >> for me it's not possible to set the DB to SQL_ASCII. I have to
> >> >> keep it as UTF-8.
> >> >> So I had to find a way for rsyslog to get the translation from
> >> >> one encoding to another done directly.
> >> >> Here is what I did: I simply added SET
> >> >> client_encoding='SOMETHING' to the SQL template, just before the
> >> >> INSERT command. Here is an example.
> >> >> Originally, the SQL template in my config looked like this:
> >> >>
> >> >> $template mail_pgsql,"INSERT INTO es_systemevents(Message, Facility,
> >> >> FromHost, Priority, DeviceReportedTime, ReceivedAt, InfoUnitID,
> >> >> SysLogTag) values ('%msg%', %syslogfacility%, '%HOSTNAME%',
> >> >> %syslogpriority%, '%timereported:::date-rfc3339%',
> >> >> '%timegenerated:::date-rfc3339%', %iut%, '%syslogtag%')", STDSQL
> >> >>
> >> >> Now, the workaround looks like this:
> >> >>
> >> >> $template es2_mail_pgsql,"set client_encoding='LATIN1';INSERT INTO
> >> >> es_systemevents(Message, Facility, FromHost, Priority,
> >> >> DeviceReportedTime, ReceivedAt, InfoUnitID, SysLogTag) values
> >> >> ('%msg%', %syslogfacility%, '%HOSTNAME%', %syslogpriority%,
> >> >> '%timereported:::date-rfc3339%', '%timegenerated:::date-rfc3339%',
> >> >> %iut%, '%syslogtag%')", STDSQL
> >> >>
> >> >> This is how I fixed the conversion issue while keeping the
> >> >> database as UTF8. I used LATIN1 because I know my incoming data is
> >> >> in this encoding. You can do the same, but be careful to use the
> >> >> correct encoding for your installation.
> >> >>
> >> >> Of course, it would be much better to have an option
> >> >> available directly in the ompgsql module to set the
> >> >> client_encoding option. Something like:
> >> >> output(type="ompgsql" client_encoding="LATIN1")
> >> >> in the /etc/rsyslog.conf file (that's a suggestion :-))
> >> >>
> >> >> And now my second question: if some of you have fixed this issue
> >> >> another way (except by having a DB directly with SQL-ASCII
> >> >> encoding), could you please share how you did it?
> >> >>
> >> >> Thanks a lot for your answers.
> >> >>
> >> >> KR.
> >> >> Nicolas
> >> >> -
> >> >> United Nations International Computing Center
> >> >> Geneva
> >> >>
> >> >>
> >> >>
> >> >> ----------------------------------------------------------------
> >> >> This message was sent using IMP, the Internet Messaging Program.
> >> >> _______________________________________________
> >> >> rsyslog mailing list
> >> >> http://lists.adiscon.net/mailman/listinfo/rsyslog
> >> >> http://www.rsyslog.com/professional-services/
> >> >> What's up with rsyslog? Follow https://twitter.com/rgerhards
> >> >> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
> >> >> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT
> >> >> POST if you DON'T LIKE THAT.
> >> >>
> >> >
> >>
> >>
> >
> 
> 
> 
