RE: Syslog Internationalization - Message size

Rainer Gerhards Thu, 17 Jul 2003 05:45:16 -0700

Hi Andrew,

> Can anyone come up with a way that the initial message header
> can be read in any character set and still make sense?


I strongly think the header should NOT be changed. In my view, i18n
should apply to the payload only. Besides that, US-ASCII is always
present in foreign character sets. I know this for sure for European
languages as well as Japanese, Korean and Chinese. With European
alphabets, some characters may be reassigned, but these are seldom-used
ones like "~" and "^". This should be no problem at all as far as
the header is concerned.
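
To illustrate the point about the ASCII subset, here is a minimal Python
sketch. The sample header string and the list of codecs are my own
assumptions, picked only for illustration; the sample deliberately avoids
characters such as "~" and "^" that national variants may reassign.

```python
# Sample header in the usual <PRI>TIMESTAMP HOSTNAME layout.
# The concrete values are made up for this example.
HEADER = "<34>Jul 17 05:45:16 mymachine "

# A few ASCII-compatible codecs for European, Japanese, Korean and
# Chinese text (Python codec names; the selection is an assumption).
CODECS = ["ascii", "latin_1", "iso8859_15", "euc_jp", "euc_kr", "gb2312", "utf_8"]


def header_bytes_by_codec(header, codecs=CODECS):
    """Encode the same header once per codec and return the raw bytes."""
    return {name: header.encode(name) for name in codecs}


encodings = header_bytes_by_codec(HEADER)

# True when every codec produced exactly the plain ASCII byte sequence.
same = all(raw == HEADER.encode("ascii") for raw in encodings.values())
print(same)
```

If the check holds, a relay can parse the header without knowing which of
these character sets the payload uses.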

>
> If the header contained the message encoding method then we
> could choose to encode it in any method we wanted. UUE,
> Base64, UTF-8 etc. Final syslog server that writes the data
> to file or displays messages would simply decode data into
> users desired character set. Relays could just pass the data
> unaltered.

Yes, I agree that this would address the potential need (?) for
different encodings. However, I would stay away from the header and
tend to define a payload format that introduces a "payload header"
which then:

A) designates this as an i18n enabled message
B) specifies the charset
C) has another payload section which then includes the actual message

Of course, all of this needs to be backwards compatible. We can't risk
breaking existing implementations...
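
A rough Python sketch of how such a payload header could look. Nothing
here is a defined format: the "@i18n" marker, the "charset=" field and the
function names are all hypothetical, chosen only to show points A) to C)
and the fallback for legacy messages.

```python
from typing import Tuple

# Hypothetical marker that designates an i18n-enabled payload (point A).
MARKER = "@i18n"


def build_payload(message: str, charset: str = "utf-8") -> bytes:
    """Prefix the encoded message with an ASCII-only payload header."""
    # Point B: the charset is named in plain ASCII so any receiver can read it.
    header = f"{MARKER} charset={charset} ".encode("ascii")
    # Point C: the actual message follows, encoded in the named charset.
    return header + message.encode(charset)


def parse_payload(payload: bytes) -> Tuple[str, str]:
    """Return (charset, text); payloads without the marker are treated as legacy."""
    prefix = MARKER.encode("ascii") + b" charset="
    if not payload.startswith(prefix):
        # Legacy message: no payload header, assume plain US-ASCII text.
        return "us-ascii", payload.decode("ascii", errors="replace")
    charset_bytes, _, body = payload[len(prefix):].partition(b" ")
    charset = charset_bytes.decode("ascii")
    return charset, body.decode(charset)
```

A legacy syslogd that knows nothing about this would simply see the marker
and charset name as ordinary ASCII text at the start of the message, so
the traditional header parsing is not affected.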
>
> Bearing in mind that the syslog messages were originally
> intended to be read by an operator at some stage. It is no
> use if they remain encoded at the end point. :)

That is a beauty of UTF-7: it remains at least partly readable. The
actual storage is not part of the protocol; I think it should be left to
the syslogd implementation.
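
A small Python sketch of the readability point, using the standard
"utf-7" codec. The sample message is made up for illustration.

```python
# Sample message mixing ASCII with two non-ASCII characters (assumed text).
message = "Grüße from host"

# UTF-7 keeps most ASCII characters as-is and wraps everything else
# in a "+...-" section of modified Base64.
encoded = message.encode("utf-7")

print(encoded)                  # the ASCII part stays human readable
print(encoded.decode("utf-7"))  # decodes back to the original text
```

An operator looking at the raw bytes can still read the ASCII part of the
message, and the whole line consists of 7-bit characters.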

>
> Maybe after the <PRI> code, we use a couple of bytes to
> indicate the message encoding type. Better still, make it the
> first two bytes that determine the whole message encoding
> system. FF 01 = uue, FF 02=UTF-8 etc. Avoid using any 00
> bytes so as not to confuse C strings.

I strongly object to this - it would break all current RFCs/IDs ;)

>
> My 2c worth on TCP vs UDP.
>
> UDP is great for a last minute "I am dying" type message from
> a device. TCP would require ACK back and a connection setup
> if not already connected. If we go for something like BEEP
> with the extra handshaking we are in for a longer wait before
> sending the message.
>
> Question is, how many "I am dying" type messages do we get vs
> normal "user logged in" type messages? Probably not enough to
> warrant staying with UDP.
>

I posted a syslog-like protocol draft that details TCP transport. It
also nicely shows a number of shortcomings this approach has. I'll dig
out the URL and post it here for those interested.

In general, I (now ;)) think that RFC 3195 is the right way ... We just
need to get the small libs up and running (I have to admit that my
ambitions in this regard have been severely hit by other work). It is
the right long-term solution. If you really think about a reliable
protocol, you'll end up with mtr's point that if you do it properly, you
end up with a protocol very similar to BEEP ;)

Anyhow, I wouldn't object to participating if there is some support for
an interim solution or at least a documentation of the current state. As
Andrew and others pointed out, there *are* a good number of existing
syslog/tcp applications out there, and at least 4 of them (PIX, Kiwi,
syslog-ng and WinSyslog) are interoperable.

Looking forward to comments!

Rainer
