Hi all,
I have just written down some specifics on how the message could be
encoded and flagged. I hope this more formal spec will prompt some
additional comments and better describe what I have in mind:
----
The CONTENT part of an I18N syslog message has the following ABNF [12]
definition (I have filled in only those details that I have made up my
mind on so far):
CONTENT = HDR-I18N SP CONTENT-I18N
HDR-I18N = COOKIE ENCODING
COOKIE = "@#" %d73 "18" %d110 ; that is: "@#I18n"
; note the capital "I" and lower case "n"
ENCODING = ??? ; to be decided, e.g. USASCII, but prefer
; "MIME-binding" NO SP allowed
CONTENT-I18N = 1*(%d33-126 / SP)
SP = %d32
As can be seen, an I18N content message is embedded into an RFC3164
[12] content field. The I18N content is distinguished from plain
RFC3164 content by the presence of a HDR-I18N COOKIE. If the COOKIE
is present, the ENCODING part of the HDR-I18N tells which encoding is
used. After a space, the actual content appears.
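The detection step described above could be sketched roughly as follows in Python. Since ENCODING is not decided yet, this sketch simply assumes it is a non-empty run of non-space characters that directly follows the COOKIE; the function name split_i18n_content and the "USASCII" label are made up for illustration only.

```python
COOKIE = "@#I18n"  # "@#" %d73 "18" %d110 from the ABNF above


def split_i18n_content(content):
    """Return (encoding, payload) for I18N content, or (None, content).

    Assumption: ENCODING is a non-empty run of non-space characters that
    directly follows the COOKIE and is terminated by a single SP.
    """
    if not content.startswith(COOKIE):
        return None, content  # plain RFC 3164 content
    encoding, sep, payload = content[len(COOKIE):].partition(" ")
    if not encoding or not sep or not payload:
        return None, content  # malformed header: treat as plain content
    return encoding, payload
```

With these assumptions, split_i18n_content("@#I18nUSASCII 'su root' failed") returns ("USASCII", "'su root' failed"), while content without the COOKIE is handed back unchanged with None as the encoding.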
I have not yet given much thought to what the actual encoding
may look like. My tentative opinion is that we can borrow something
from MIME. For example, we could support base64 and quoted-printable
encodings and have different charsets. UTF-7 is also a good choice. I
need to do more homework reading the other specs to come up with a
real suggestion. Comments in this regard are extremely welcome.
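To illustrate why the MIME transfer encodings look attractive here, the following Python sketch encodes a non-ASCII string with both quoted-printable and base64 and checks that the result stays within SP and %d33-126. The choice of ISO-8859-1 as charset and the helper name fits_content_range are assumptions for this illustration, not part of the proposal.

```python
import base64
import quopri

# Assumption: the text is first converted to ISO-8859-1 octets; the draft
# does not fix a charset yet. The escapes spell "Gruess Gott" with umlaut
# and sharp s.
octets = "Gr\u00fc\u00df Gott".encode("iso-8859-1")

as_qp = quopri.encodestring(octets)   # quoted-printable form
as_b64 = base64.b64encode(octets)     # base64 form


def fits_content_range(data):
    """True if every octet is SP (%d32) or in the range %d33-126."""
    return all(32 <= b <= 126 for b in data)
```

The raw octets fail fits_content_range, while both encoded forms pass it and decode back to the same octets, which is what would allow them to travel inside an unmodified RFC 3164 content field.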
The following examples are given.
Example 1
<34>Oct 11 22:14:15 mymachine su: @#I18n??USASCII?? 'su root' failed for lonvick on /dev/pts/8
In this example, as it was originally described in RFC 3164, the
message CONTENT actually is in US-ASCII, so it could also be sent in a
plain RFC3164 message. To remove uncertainty, it was specifically
flagged as being US-ASCII.
Example 2
<165>Aug 24 05:34:00 10.1.1.1 myproc[10]: @#I18n??QUOTED-PRINTABLE?? Gr=FC=DF Gott
In this example, we have non-US-ASCII characters. The CONTENT part
contains "Gruess Gott", which is the Bavarian way of saying hello. I
am using a replacement spelling to make this readable in US-ASCII.
The actual string in ABNF is %x47.72.fc.df.20.47.6f.74.74. As
the content encoding notation and specification is not yet decided, I
used QUOTED-PRINTABLE encoding in this sample.
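For illustration, a receiver could recover the original octets of this sample along the following lines (Python). This is only a sketch: the example does not name a charset, so ISO-8859-1 is assumed here because the octets FC and DF are the umlaut u and sharp s in that charset.

```python
import quopri

payload = b"Gr=FC=DF Gott"  # content of Example 2 after the header

# Undo the quoted-printable transfer encoding to get the raw octets.
octets = quopri.decodestring(payload)

# Assumption: the octets are ISO-8859-1; the example does not name a charset.
text = octets.decode("iso-8859-1")
```

Here octets holds the nine bytes 47 72 FC DF 20 47 6F 74 74, and text is the greeting with the real umlaut and sharp s.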
----
Please note that I also changed where the i18n header occurs: it is no
longer at the start of the MSG part but rather at the start of the
CONTENT part itself. I think this is more natural, as the TAG value
should have no problem with being in US-ASCII. At the least, it would
greatly ease compatibility with existing systems and recordings. On the
other hand, it may introduce uncertainties if there is a good reason for
non-US-ASCII characters in the TAG. Maybe we already have them in some
environments...
I hope this clarifies what I am currently thinking. The samples in
particular should give a good view of how I think i18n can be embedded in
existing syslog without any change to existing components (relays
will not need to be touched at all).
I am looking forward to any comments.
Rainer
> Hi Rainer,
>
> Having an unusual header in the MSG part of the syslog message would be
> fine, but also adding some readable code to determine the encoding type
> would be good.
>
> @@$$##USASCII or @@$$##Base64 etc.
>
> Syslog messages are meant to be read by humans after all :)
>
> Even if we can't decode the text that follows, we will know that it is
> encoded and in what form.
>
> Cheers
>
> Andrew
>
>
>
> -----Original Message-----
> From: Rainer Gerhards [mailto:[EMAIL PROTECTED]
> Sent: Friday, 18 July 2003 1:10 a.m.
> To: Andrew Ross
> Cc: [EMAIL PROTECTED]
> Subject: RE: Syslog Internationalization - Message size
>
>
> > Any one have any thoughts on the syntax to specify the encoding?
> >
> > Encoding=USASCII
> > Encoding=Base64
>
> How about borrowing something from MIME?
>
> Anyhow, I think the sequence must start with some "unusual sequence" so
> that it most probably cannot mistakenly occur within a non-i18n
> message. For example, something like (without quotes) "@@$$##". This at
> the very beginning (byte 0) of the MSG part means that it is an
> i18n-enabled payload.
>
> Does this make sense?
>
> Rainer