I agree.

Besides, it seems to me that even if we prohibit something like this,
implementations would still need to be aware of it.  In fact, they will
need to check for it in order to be compliant with the RFC, and log a
diagnostic message if they receive something non-compliant.
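
A rough sketch of what such a check might look like in Java. The class
and method names (NulCheck, firstNul, replaceNul) are made up for
illustration and are not taken from the draft or from any syslog
library; the "?" replacement is just one of the options mentioned in
this thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of the draft or of any syslog library.
final class NulCheck {
    /** Returns the index of the first 0x00 octet in msg, or -1 if there is none. */
    static int firstNul(byte[] msg) {
        for (int i = 0; i < msg.length; i++) {
            if (msg[i] == 0x00) {
                return i;
            }
        }
        return -1;
    }

    /** Returns a copy of msg with every 0x00 octet replaced by the given octet. */
    static byte[] replaceNul(byte[] msg, byte replacement) {
        byte[] out = Arrays.copyOf(msg, msg.length);
        for (int i = 0; i < out.length; i++) {
            if (out[i] == 0x00) {
                out[i] = replacement;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] msg = { 0x41, 0x42, 0x00, 0x43 };
        int pos = firstNul(msg);
        if (pos >= 0) {
            // Stand-in for the diagnostic message; a real receiver would use its own logging.
            System.err.println("non-compliant message: 0x00 at octet " + pos);
            msg = replaceNul(msg, (byte) '?');
        }
        System.out.println(new String(msg, StandardCharsets.US_ASCII));
    }
}
```

Whether the receiver replaces the octet, drops the message, or passes it
through unchanged would be a local policy decision; the scan itself is
the same either way.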

Also, most "modern" languages, such as C# and Java, have built-in
support for Unicode. At least in Java, the 0x00 character has no
special meaning in a string. Example:

 public class NulInString {
     public static void main(String[] args) {
         char[] str = new char[] { 0x41, 0x42, 0x00, 0x43 };
         String string = new String(str);
         System.out.println("Length: " + string.length());
         System.out.println("String: [" + string + "]");
     }
 }

Output (how the 0x00 character is displayed depends on the terminal):

 Length: 4
 String: [AB C]
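
The same can be checked at the octet level. The sketch below (the class
name Utf8NulDemo and the method countNulOctets are made up for this
example) encodes strings with Java's standard UTF-8 charset and counts
the 0x00 octets: U+0000 itself becomes a single 0x00 octet, while the
multi-octet characters contribute none.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
final class Utf8NulDemo {
    /** Counts the 0x00 octets in the UTF-8 encoding of s. */
    static int countNulOctets(String s) {
        int count = 0;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (b == 0x00) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // U+0000 is encoded as the single octet 0x00.
        System.out.println(countNulOctets("AB\u0000C"));    // prints 1
        // U+00E9 (2 octets) and U+20AC (3 octets) contain no 0x00 octet.
        System.out.println(countNulOctets("\u00e9\u20ac")); // prints 0
    }
}
```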

Anton.

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Harrington, David
> Sent: Friday, February 06, 2004 12:49 PM
> To: Rainer Gerhards; Anton Okmianski; [EMAIL PROTECTED]
> Subject: RE: -international: trailer
>
>
> Hi,
>
> I see our messages crossed in transit. You've already
> researched whether 0x00 occurs in UTF-8.
>
> I have a concern about making C-compatibility a requirement
> of -protocol. I understand the concern about the amount of
> work implementors may need to do, and its potential impact on
> adoption. However, I think this is a red herring. All an
> implementor has to do is put in one piece of code that looks
> at the incoming message and looks for 0x00 octets, if they
> care, and handle however they choose in their implementation.
> I do not believe it is appropriate for the standard to impose
> a specific solution for this implementation-dependent issue.
>
> Referring again to my SNMP experience, we have found that
> CLRs - crappy little rules - constantly come back to bite us
> in unforeseen ways. We should write the standard, without a
> lot of crappy little rules to deal with corner cases, and let
> the implementors solve their own implementation problems.
>
> dbh
>
> > -----Original Message-----
> > From: Rainer Gerhards [mailto:[EMAIL PROTECTED]
> > Sent: Friday, February 06, 2004 12:14 PM
> > To: Anton Okmianski; Harrington, David; [EMAIL PROTECTED]
> > Subject: RE: -international: trailer
> >
> > Anton:
> >
> > > > I am still tempted to allow only octets in the range of 1..255. ;)
> > >
> > > I think at least technically this restriction is possible
> > > because 0x00 never appears as part of any characters encoded
> > > as multi-octet characters in UTF-8.  See table here:
> > > http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
> >
> > Technically it is possible ... my concern is more existing code.
> > I am still in the hope that existing syslog code can be upgraded to
> > -protocol. Sure, it requires some change, but it is not that bad.
> > However, most of the code I have seen uses ANSI C strings, that is,
> > strings that are terminated by NUL (0x00). If we allow 0x00 in the
> > message, all of this code needs to be re-written to use
> > byte-counting string libraries. Far from trivial.
> >
> > If we forbid 0x00, things are much easier. All that needs to be
> > done is that the receiver process (subsystem) parses the message as
> > it arrives. If there is 0x00 in it, the receiver can log a
> > diagnostic message and do whatever else it is configured to do in
> > this case. For example, it could replace the 0x00 with a
> > pre-configured replacement char ("?" maybe?). Or it could drop the
> > message altogether. Whatever it does, it can ensure that the message
> > passed on to its upper layers is a valid C string (and, yes, it
> > should do this sanity check, otherwise things may become wild).
> >
> > This is strictly speaking not a protocol issue, eventually not even
> > one that the IETF should care about (algorithms and implementations
> > seem to be out of scope ;)). However, I think this is an important
> > real-world issue. I even think this can be an issue that is
> > important for the overall success of -protocol. Let's say we require
> > that 0x00 is a valid character. How many implementors would be
> > tempted to say "we don't care, we don't allow it no matter what the
> > RFC says"? If we have the impression that a fair amount would
> > actually take this route, it may be wise to forbid 0x00 inside the
> > RFC. It may look ugly (especially for the Java guys) ... but
> > sometimes it is better to do things in a slightly ugly way than to
> > have a hell of interop problems later on.
> >
> > Having written all this, I'd say I will forbid 0x00 in -protocol-03
> > and allow everything else if nobody violently objects.
> >
> > Rainer
> >
>
>


