On Wednesday, 20 January 2010 19:20:31, Jakob Haufe wrote:
> On Wed, 6 Jan 2010 16:14:59 +0100
> 
> Marc Schiffbauer <[email protected]> wrote:
> > which encoding should be chosen for the database when using postgres?
> 
> As far as I understand the syslog protocol (at least the legacy one), it
> has no concept of character encodings at all.  So if you simply want to
> make sure that everything ends up in the database "as is", then choose
> SQL_ASCII.

This is what I did in the end, and it works well now.

> 
> > My rsyslog version is 4.4.3.
> > 
> > Which client_encoding does rsyslog use in ompgsql?
> 
> Right now, it does not set an encoding by itself, so the database default
> applies. If I'm not mistaken, you can even set that per user from inside of
> postgres. So I would rather vote against another configuration parameter
> here.

ACK

> 
> > I currently have set UTF-8 on the database. It worked for a while until
> > some special message arrived at the server where postgres denies the
> > INSERT:
> > 
> > 2010-01-06 16:13:11 CET syslog syslog ERROR:  invalid byte sequence for
> > encoding "UTF8": 0xd220
> > 2010-01-06 16:13:11 CET syslog syslog HINT:  This error can also happen
> > if the byte sequence does not match the encoding expected by the server,
> > which is controlled by "client_encoding".
> 
> Were you able to isolate the message? Or find out which program was sending
> it?

I was able to identify it: some servers sent data about strings found in the
system BIOS (read by dmidecode or something like that).
It was just some strange characters in a model or device name string set by a
hardware vendor (Compaq IIRC).
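
For illustration, a minimal Python sketch of why postgres rejects the byte
pair 0xd220 from the error above. This is not rsyslog or postgres code; the
helper name is_valid_utf8 is made up for this example.

```python
# The two bytes postgres complained about in the error message: 0xd2 0x20.
raw = b"\xd2\x20"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0xd2 announces a two-byte sequence, but 0x20 (a space) is not a valid
# continuation byte, so strict UTF-8 decoding fails.
print(is_valid_utf8(raw))        # False
# Plain ASCII is always valid UTF-8.
print(is_valid_utf8(b"Compaq"))  # True
```

A single non-ASCII byte from some legacy 8-bit encoding followed by ordinary
text is enough to trigger this, which fits a vendor string from the BIOS.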


> 
> > Now rsyslog is not able to log anything... it is currently spooling to
> > disk because it "hangs" at this message not being accepted by postgres.
> 
> This is bad, because if the machine is an open syslog server that simply
> collects everything it gets, we have a potential DoS vector here.
> 

True.


> I can think of three options:
> 
> * Drop the message and report that we did so. That would be rather easy,
>   but might not be what people want.
> 

But this might be the best option I guess. Maybe the original message could 
then be written to a special logfile on disk.

> * Re-insert the message after converting it from ASCII to UTF-8 or whatever
>   the DB encoding is. But this might/will produce garbage if the input is
>   not ASCII. It also creates more load on the system if these messages are
>   frequent. Guessing the input encoding is hard or even impossible,
>   depending on the set you guess from.

Yes, but this would still be an option. I would vote for logging a warning
message in these cases as well.
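
A rough Python sketch of that idea, just to make it concrete. It is not how
ompgsql works today. The function name to_utf8_text and the choice of Latin-1
as the fallback are assumptions made for this example; as noted above, the
real input encoding cannot be guessed reliably.

```python
import logging


def to_utf8_text(raw: bytes, fallback: str = "latin-1"):
    """Decode a raw syslog message for a UTF-8 database (hypothetical helper).

    Returns (text, converted). converted is True when the message was not
    valid UTF-8 and had to be re-decoded with the assumed fallback encoding.
    """
    try:
        return raw.decode("utf-8"), False
    except UnicodeDecodeError:
        # Warn so the admin knows the stored text may not match the sender's
        # intent; Latin-1 maps every byte to a character, so this never fails.
        logging.warning("message is not valid UTF-8, re-decoding as %s", fallback)
        return raw.decode(fallback), True


text, converted = to_utf8_text(b"model \xd2 x")
print(repr(text), converted)
```

The result can always be inserted into a UTF-8 database, but if the sender
did not actually use Latin-1 the stored characters are garbage, which is the
drawback already mentioned in the quoted text.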

> 
> * Make the database SQL_ASCII. This will silently accept anything but will
>   create nonsense from UTF/UCS encoded messages. Also might create trouble
>   for programs like phplogcon that analyze the logs.
> 

This is what I did, and phplogcon had no problems at all displaying everything
as expected. Even those strange messages that were not accepted by postgres
look the same as in the original message that came in via syslog.

This might only work if Apache and the browser both "speak" UTF-8.



> For me, this sums up to one question:
> 
> Can we make ompgsql UTF/UCS-clean and at the same time not choke on
> non-UTF8 strings? Everyone is trying to be UTF-8 clean these days, so it
> would be bad if ompgsql could not keep up.

I think this is a special case because rsyslog is not the originator of those
messages. It "just" transports them. And because the syslog protocol does not
define anything like an encoding, the best thing to do is to leave those
strings "as-is" and make the database behind it do so as well with
SQL_ASCII.

I think everything else will be error-prone in some way. The rsyslog
documentation should carry a big fat NOTE that the database must be SQL_ASCII,
as otherwise the strings might not be accepted.


-Marc

> 
> Comments please.
> 
> Regards,
> Jakob Haufe (sur5r)
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
