All:

This is a late reply, as I was on the road.

I agree with Anton. I probably know where some of the concerns stem
from. There are some cultures in the world where Unicode is not
well accepted (some attribute this to the way Unicode was standardized,
but there is no point in elaborating on this). This may indeed cause
some difficulties. However, I think the usage of Unicode in the protocol
should not cause so much harm that it outweighs the trouble of
supporting a multitude of different character encodings. Of course, for
local storage and visualization other encodings might be used. But this
is beyond the scope of syslog-protocol. If we allowed multiple encodings
in the protocol, syslog senders and receivers would need to be capable
of understanding all of them. This will obviously introduce multiple
interoperability issues.
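
To illustrate the interoperability point, here is a minimal Python sketch.
It is not part of syslog-protocol; the sample word, the helper name
decode_with and the three candidate encodings are assumptions chosen purely
for illustration. It shows that the same octets yield different text under
different single-byte encodings, so a receiver would have to be told, or
guess, which one the sender used:

```python
def decode_with(payload: bytes, encodings):
    """Decode the same octets under each candidate encoding (hypothetical helper)."""
    return {name: payload.decode(name) for name in encodings}

# What a hypothetical Latin-1 sender would put on the wire for this word.
payload = "\u00dcberlauf".encode("latin-1")

results = decode_with(payload, ["latin-1", "cp1251", "koi8_r"])
for name, text in results.items():
    print(name, "->", text)

# The octets are not valid UTF-8 either: 0xDC opens a two-byte sequence
# but is followed by an ASCII letter instead of a continuation byte.
try:
    payload.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print("valid UTF-8:", utf8_ok)
```

With UTF-8 as the only encoding on the wire, the receiver never has to make
that choice.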

As such, I, too, strongly advise against changing the Unicode encoding of
the messages.

Rainer 

> -----Original Message-----
> From: Anton Okmianski (aokmians) [mailto:[EMAIL PROTECTED] 
> Sent: Saturday, May 28, 2005 12:25 AM
> To: Steve Chang (schang99); Alexander Clemm (alex); 
> [EMAIL PROTECTED]; Rainer Gerhards
> Cc: syslog-sec@employees.org
> Subject: RE: [Syslog-sec] Syslog protocol - UTF-8 encoding
> 
> Steve:
> 
> I am not sure I understand which octet you were talking about.  Sorry if
> I missed the earlier discussion.
> 
> UTF-8 is based on Unicode.  Unicode assigns a constant integer (code
> point) to each character or symbol.  UTF-8 encodes those integers as
> variable-length byte sequences.  ASCII characters take one byte; other
> symbols take two or more bytes.
> 
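As a rough illustration of the variable-length encoding described above,
the following Python sketch prints the UTF-8 length of a few sample
characters. The samples and the helper name utf8_length are assumptions
chosen for illustration; they do not come from the draft.

```python
def utf8_length(ch: str) -> int:
    """Number of octets UTF-8 needs for a single character (hypothetical helper)."""
    return len(ch.encode("utf-8"))

# Latin capital A, e with acute accent, euro sign, and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} octet(s): {encoded.hex()}")

lengths = [utf8_length(ch) for ch in samples]
```

ASCII stays at one octet; everything else takes between two and four
octets.
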
> In order to display Unicode, you need a viewer which can handle Unicode.
> You do not need to know locale information to display UTF-8 in a Unicode
> viewer.  This was the whole idea behind Unicode instead of the legacy of
> a gazillion locale-specific encodings.
> 
> If you are suggesting some indication to the parser of whether the
> message uses UTF-8 or just the strict ASCII subset, then I think that
> indication is already there. You can determine it by looking at the
> first (high-order) bit of every byte. Basic (non-extended) ASCII does
> not use it. If the bit is set, you have got extended symbols in your
> data and need a UTF-8 aware parser.
> 
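The high-bit check described above can be sketched as follows in Python.
The name is_pure_ascii and the sample messages are hypothetical and are
not defined by syslog-protocol.

```python
def is_pure_ascii(data: bytes) -> bool:
    """Return True if no octet has its high-order bit (0x80) set."""
    return all((octet & 0x80) == 0 for octet in data)

ascii_msg = b"connection closed"
utf8_msg = "Verbindung f\u00fcr Benutzer geschlossen".encode("utf-8")

print(is_pure_ascii(ascii_msg))
print(is_pure_ascii(utf8_msg))

# Every octet of a multi-byte UTF-8 sequence has the high bit set, so a
# non-ASCII character can never be mistaken for an ASCII one.
euro_octets = "\u20ac".encode("utf-8")
all_high = all(octet >= 0x80 for octet in euro_octets)
```

A receiver that only handles ASCII can use such a check to detect messages
it cannot fully interpret, without any extra flag in the header.
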
> As far as passing locale info goes: if there is consensus that it will
> be widely used, we can define a standard structured data tag for it with
> defined semantics.  However, making specific tags required on senders
> would require more discussion. I am not sure all senders will know their
> locale. So, requiring it would be tough.
> 
> Thanks,
> Anton. 
> 
> > -----Original Message-----
> > From: Steve Chang (schang99) 
> > Sent: Friday, May 27, 2005 5:57 PM
> > To: Anton Okmianski (aokmians); Alexander Clemm (alex); 
> > [EMAIL PROTECTED]; Rainer Gerhards
> > Cc: syslog-sec@employees.org
> > Subject: RE: [Syslog-sec] Syslog protocol - UTF-8 encoding
> > 
> > Hi, Anton:
> > 
> > The suggested octet may not seem necessary from the sender's perspective.
> > But as Alex pointed out, the receiving end syslog server/application
> > can do the decoding more easily with the help of that "encoding type"
> > octet before the structured data and message body.
> > 
> > Besides, this octet can be helpful to allow other encodings, not
> > limited to languages, if needed.  And if some specific value of the
> > octet is reserved, it can help future extension of this specification
> > and help ease the extension-related migration issues.
> > 
> > Regards,
> > 
> > Steve
> > 
> > 
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED] [mailto:syslog-sec-[EMAIL PROTECTED] On Behalf Of
> > > Anton Okmianski (aokmians)
> > > Sent: Friday, May 27, 2005 2:46 PM
> > > To: Alexander Clemm (alex); [EMAIL PROTECTED]; Rainer Gerhards
> > > Cc: syslog-sec@employees.org
> > > Subject: RE: [Syslog-sec] Syslog protocol - UTF-8 encoding
> > > 
> > > Alex:
> > > 
> > > We had discussions and proposals to support various locale-specific
> > > encodings early in the process.  We decided against it as UTF-8 really
> > > covers representation of all languages.  It is also the general
> > > direction of IETF for various protocols.  And the compatibility with
> > > ASCII helps too.
> > > I think it is a pretty good choice.
> > > 
> > > Thanks,
> > > Anton.
> > > 
> > > > -----Original Message-----
> > > > From: [EMAIL PROTECTED]
> > > > [mailto:[EMAIL PROTECTED] On Behalf Of 
> > > > Alexander Clemm (alex)
> > > > Sent: Friday, May 27, 2005 4:58 PM
> > > > To: [EMAIL PROTECTED]; Rainer Gerhards
> > > > Cc: syslog-sec@employees.org
> > > > Subject: RE: [Syslog-sec] Syslog protocol - UTF-8 encoding
> > > >
> > > > Andrew, David,
> > > >
> > > > thank you.  I was a bit too quick sending out the earlier message; I
> > > > was confused.  With ASCII being effectively a subset of UTF-8, issue
> > > > 1 goes away, and as far as issue 2 is concerned, this does allay my
> > > > concerns, at least as far as the sender side is concerned.  I am
> > > > still wondering if for the receiver side it might still be useful to
> > > > know what encoding to expect - full UTF-8, or just the ASCII subset.
> > > > It would be interesting to hear the perspective of someone on the
> > > > receiver side, but from my point, my concerns are addressed.  As for
> > > > other encodings being of interest, while I would not rule it out I'm
> > > > not aware of any.
> > > >
> > > > Kind regards
> > > > --- Alex
> > > >
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: David B Harrington [mailto:[EMAIL PROTECTED]
> > > > Sent: Wednesday, May 25, 2005 8:10 PM
> > > > To: [EMAIL PROTECTED]; Alexander Clemm (alex); 'Rainer Gerhards'
> > > > Cc: syslog-sec@employees.org
> > > > Subject: RE: [Syslog-sec] Syslog protocol - UTF-8 encoding
> > > >
> > > > Hi,
> > > >
> > > > In reading my response, it seems a bit too succinct.
> > > >
> > > > The relevant text from STD63 is:
> > > > "UTF-8, the object of this memo, has a one-octet encoding unit.  It
> > > >    uses all bits of an octet, but has the quality of preserving the
> > > >    full US-ASCII [US-ASCII] range: US-ASCII characters are encoded in
> > > >    one octet having the normal US-ASCII value, and any octet with such
> > > >    a value can only stand for a US-ASCII character, and nothing else."
> > > >
> > > > Hope this allays your concerns.
> > > >
> > > > David Harrington
> > > > [EMAIL PROTECTED]
> > > >
> > > > > -----Original Message-----
> > > > > From: [EMAIL PROTECTED]
> > > > > [mailto:[EMAIL PROTECTED] On Behalf Of David B Harrington
> > > > > Sent: Wednesday, May 25, 2005 10:58 PM
> > > > > To: 'Alexander Clemm (alex)'; 'Rainer Gerhards'
> > > > > Cc: syslog-sec@employees.org
> > > > > Subject: RE: [Syslog-sec] Syslog protocol - UTF-8 encoding
> > > > >
> > > > > Hi,
> > > > >
> > > > > According to STD63, UTF-8 has the characteristic of preserving the
> > > > > full US-ASCII range.
> > > > >
> > > > > David Harrington
> > > > > [EMAIL PROTECTED]
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: [EMAIL PROTECTED]
> > > > > > [mailto:[EMAIL PROTECTED] On Behalf Of Alexander
> > > > > > Clemm (alex)
> > > > > > Sent: Wednesday, May 25, 2005 8:56 PM
> > > > > > To: Rainer Gerhards
> > > > > > Cc: syslog-sec@employees.org
> > > > > > Subject: [Syslog-sec] Syslog protocol - UTF-8 encoding
> > > > > >
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > 2 questions/suggestions concerning the UTF-8 encoding in the syslog
> > > > > > protocol:
> > > > > >
> > > > > > 1) Is the " " (white space) after the header to be encoded in ASCII or
> > > > > > UTF-8?  The spec currently seems open in that respect (although it
> > > > > > would seem logical for it to be still in ASCII); this should be
> > > > > > clarified.
> > > > > >
> > > > > > 2) Concerning the UTF-8 encoding, depending on where you send syslog
> > > > > > messages there are many scenarios in which it would be beneficial to
> > > > > > have an option NOT to use UTF-8 encoding but to also allow
> > > > > > for other encodings, in particular plain ASCII.  Such an option would
> > > > > > also allow for quicker adoption of this specification, as it eases
> > > > > > the migration.  To provide for that, it seems it would make sense to
> > > > > > allow for a flag in the header part of the message - at the tail end
> > > > > > (that is known to be still ASCII encoded), right before the structured
> > > > > > data, that indicates which encoding is used - that is, whether UTF-8
> > > > > > is in effect, or if another encoding is used - e.g. ASCII, or
> > > > > > even proprietary.
> > > >
> > > > > >
> > > > > > (Apologies in case this aspect was discussed in the past and I
> > > > > > am beating a dead horse; but this appears important enough to bring
> > > > > > up.)
> > > > > >
> > > > > >
> > > > > > --- Alex
> > > > > > _______________________________________________
> > > > > > Syslog-sec mailing list
> > > > > > Syslog-sec@www.employees.org
> > > > > > http://www.employees.org/mailman/listinfo/syslog-sec
> > > > > >
