Rainer:

It is a tough one.  You almost convinced me.  But you are talking
only about the server implementation side. What about the client side?

If I write a Java client and 0x00 is a prohibited character, do you
think I will have to scan each string passed to me to make sure it
does not include 0x00?  That would seem like unnecessary overhead.
And then what should the client library do if it does get one?
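
(Just to make the cost concrete: staying with C, since that is what
most of this thread is about, the scan itself is a trivial loop over
a byte-counted buffer. The helper below is purely my own sketch, not
anything from the draft.)

    #include <stddef.h>  /* size_t */

    /* Hypothetical check a client library might run per message:
     * returns 1 if the buffer is free of 0x00 octets, 0 otherwise. */
    static int msg_is_clean(const unsigned char *buf, size_t len)
    {
        size_t i;
        for (i = 0; i < len; i++)
            if (buf[i] == 0x00)
                return 0;  /* and then what? reject? replace? */
        return 1;
    }

Trivial, yes, but it is still an extra pass over every message, and
it still leaves my question open: what do we do on a hit?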

We might be trading ease of writing a server for ease of writing a
client here.

I don't know why somebody would pass a 0x00 character to a logging
library, but as a library writer, it is not my job to decide.  I just
have to ensure compliance with the standard, right?

Have a good weekend everyone!

Anton.

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Rainer Gerhards
> Sent: Friday, February 06, 2004 4:02 PM
> To: Harrington, David; [EMAIL PROTECTED]
> Subject: RE: -international: trailer
>
>
> > I have a concern about making C-compatibility a requirement of
> > -protocol. I understand the concern about the amount of work
> > implementors may need to do, and its potential impact on adoption.
> > However, I think this is a red herring.
>
> I know that we do not discuss programming languages here in
> the IETF. I hope I will be granted a quick exception, because
> it is otherwise hard to show the importance of this point.
>
> One primary thing we need to keep in mind is that most syslog
> code today is written in C, and I guess this will remain the
> case for quite some time. So it may be worth taking a closer
> look at it...
>
> > All an implementor has to do is put in one piece of code that
> > looks at the incoming message and looks for 0x00 octets, if
> > they care, and handle it however they choose in their
> > implementation.
>
> Actually, this is the issue. It is *not* easy to do this. It
> requires architectural changes to the application and even
> means that the normal run-time string library can NOT be used.
>
> Why is that? We are talking about Unicode. As such, all octet
> values are defined; there is no single octet value that 0x00
> could be mapped to. It also cannot be mapped to a multi-byte
> sequence, because those, too, are all taken up by the "normal"
> encodings. So in order to support this in C, you must do one
> of the following:
>
> A) extend the character size, e.g. use not 16 or 32 bits per
> character but 24 or 40. This gives you room for the extra
> "flag bit" needed to escape the 0x00 value.
> B) use (or write) a non-standard string library that handles
> strings based on a byte counter (as Java does, and hopefully C#)
>
> I think A) is a totally impractical approach. B) works, but
> requires a complete re-design of (most) existing applications.
> It also forces developers to use a non-standard (but more
> secure!) approach. In the C community, there is a lot of
> objection to byte-counted strings. This alone can cause some
> acceptance problems.
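>
> (To illustrate what B) means in practice, here is a minimal
> sketch of a byte-counted string; the type and function names
> are made up for illustration only. The point is that every
> str*() call in an existing application has to be replaced by
> a counted equivalent.)
>
>     #include <stddef.h>  /* size_t */
>     #include <string.h>  /* memcpy */
>
>     struct bstr {
>         unsigned char *data; /* may legally contain 0x00 octets   */
>         size_t         len;  /* length is explicit, not NUL-based */
>     };
>
>     /* strcpy() replacement; assumes dst->data has enough room
>      * (a real library would also manage allocation).           */
>     static void bstr_copy(struct bstr *dst, const struct bstr *src)
>     {
>         memcpy(dst->data, src->data, src->len);
>         dst->len = src->len;
>     }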
>
> Then, some other systems/tools that are written in C may also
> misbehave if they have to deal with 0x00 characters. I think
> a number of *nix system tools qualify as victims. I have no
> idea about Perl, but I have a weak feeling that it may have
> problems handling 0x00 inside strings, too. If that were the
> case, it would be bad, too, because a lot of administrators
> use Perl to analyze their logs (just think of SWATCH).
>
> Granted, this is a programming issue - and parts of it are
> not even related to the on-the-wire protocol.
>
> I wouldn't care if 0x00 had *any* legitimate use. But it has
> NOT. I can hardly envision any legitimate use for 0x00 inside
> a message. For UTF-8, it is not needed (it lies in the
> reserved US-ASCII range). In US-ASCII, it is traditionally
> a) the C string terminator and b) a fill character (NUL),
> thrown into a string to give a slow tty time to catch up
> (e.g. after receiving a CR character) - remember those devices
> connected at 110 baud ;). So why should it be in a syslog
> message? It traditionally was never seen in syslog nor in any
> other message text.
>
> Besides that, 0x00 has a proven track record of causing
> security issues (of course, all boiling down to the "smart" C
> string handling, which is another issue in itself...).
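>
> (To make that track record concrete, a minimal self-contained
> example; the message content is invented purely for
> illustration. Everything after the embedded 0x00 is invisible
> to any check built on the standard str*() functions.)
>
>     #include <stdio.h>
>     #include <string.h>
>
>     int main(void)
>     {
>         /* a buffer with an embedded 0x00, as it might
>          * arrive off the wire                          */
>         const char msg[] = "user=admin\0;rest of the message";
>         size_t wire_len = sizeof(msg) - 1;
>
>         printf("octets on the wire: %zu\n", wire_len);    /* 31 */
>         printf("what strlen() sees: %zu\n", strlen(msg)); /* 10 */
>         return 0;
>     }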
>
> So this is my point: we know that 0x00 potentially causes
> security trouble, causes big implementation issues, and costs
> us acceptance - and we can't find a legitimate use for it. On
> the other hand (from my point of view), we have the point that
> allowing 0x00 is cleaner and less crippled.
>
> If I weigh both arguments, I come to the conclusion that it
> may be better to disallow 0x00: it may not be as clean, but it
> has some obvious advantages...
>
> Rainer
>
>


