Hi,

I see our messages crossed in transit. You've already researched whether
0x00 occurs in UTF-8.

I have a concern about making C-compatibility a requirement of
-protocol. I understand the concern about the amount of work
implementors may need to do, and its potential impact on adoption.
However, I think this is a red herring. All an implementor has to do is
add one piece of code that scans the incoming message for 0x00 octets,
if they care, and handles them however they choose in their
implementation. I do not believe it is appropriate for the standard to
impose a specific solution for this implementation-dependent issue.
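
To illustrate how small such a check can be, here is a minimal sketch in
C. It is only one possible approach, not something taken from the draft:
the names sanitize_nul, nul_policy, NUL_REPLACE and NUL_DROP are
hypothetical, and it assumes the receiver knows the message length from
the transport and has room for one extra octet in its buffer.

```c
#include <stddef.h>

/* Hypothetical policy names for illustration; not defined by any draft. */
enum nul_policy { NUL_REPLACE, NUL_DROP };

/*
 * Scan a received message of `len` octets for 0x00.
 *
 * NUL_REPLACE: every 0x00 octet is overwritten with `replacement`.
 * NUL_DROP:    the function returns -1 as soon as a 0x00 octet is seen,
 *              and the caller is expected to discard the message.
 *
 * On success a terminating 0x00 is written at buf[len], so the caller
 * must provide a buffer of at least len + 1 octets. The return value is
 * the number of 0x00 octets that were replaced.
 */
int sanitize_nul(unsigned char *buf, size_t len,
                 enum nul_policy policy, unsigned char replacement)
{
    int found = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0x00)
            continue;
        if (policy == NUL_DROP)
            return -1;
        buf[i] = replacement;
        found++;
    }
    buf[len] = '\0';  /* safe to hand to C string code from here on */
    return found;
}
```

An implementation that already uses counted strings would not need any
of this, which is why I see it as an implementation choice rather than
something for the standard to mandate.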

Referring again to my SNMP experience, we have found that CLRs - crappy
little rules - constantly come back to bite us in unforeseen ways. We
should write the standard without a lot of crappy little rules to deal
with corner cases, and let the implementors solve their own
implementation problems.

dbh

> -----Original Message-----
> From: Rainer Gerhards [mailto:[EMAIL PROTECTED]
> Sent: Friday, February 06, 2004 12:14 PM
> To: Anton Okmianski; Harrington, David; [EMAIL PROTECTED]
> Subject: RE: -international: trailer
>
> Anton:
>
> > > I am still tempted to allow only octets in the range of 1..255. ;)
> >
> > I think at least technically this restriction is possible because
> > 0x00 never appears as part of any character encoded as a multi-octet
> > character in UTF-8.  See table here:
> > http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
>
> Technically it is possible ... my concern is more about existing code.
> I still hope that existing syslog code can be upgraded to -protocol.
> Sure, it requires some change, but it is not that bad. However, most
> of the code I have seen uses ANSI C strings, that is, strings
> terminated by NUL (0x00). If we allow 0x00 in the message, all of this
> code needs to be rewritten to use byte-counting string libraries. Far
> from trivial.
>
> If we forbid 0x00, things are much easier. All that needs to be done
> is for the receiver process (subsystem) to parse the message as it
> arrives. If there is a 0x00 in it, the receiver can log a diagnostic
> message and do whatever else it is configured to do in this case. For
> example, it could replace the 0x00 with a pre-configured replacement
> char ("?" maybe?). Or it could drop the message altogether. Whatever
> it does, it can ensure that the message passed on to its upper layers
> is a valid C string (and, yes, it should do this sanity check,
> otherwise things may become wild).
>
> This is, strictly speaking, not a protocol issue, and possibly not
> even one that the IETF should care about (algorithms and
> implementations seem to be out of scope ;)). However, I think this is
> an important real-world issue. I even think it can be important for
> the overall success of -protocol. Let's say we require that 0x00 is a
> valid character. How many implementors would be tempted to say "we
> don't care, we don't allow it no matter what the RFC says"? If we have
> the impression that a fair number would actually take this route, it
> may be wise to forbid 0x00 inside the RFC. It may look ugly
> (especially for the Java guys) ... but sometimes it is better to do
> things in a slightly ugly way than to have a hell of a lot of interop
> problems later on.
>
> Having written all this, I'd say I will forbid 0x00 in -protocol-03
> and allow everything else if nobody violently objects.
>
> Rainer
>

