Re: utf-8 encoding scheme

Henry Spencer Thu, 27 Jul 2000 21:45:55 -0700
On 21 Jul 2000, H. Peter Anvin wrote:
> > > One possible thing to do in a decoder is to emit U+FFFD REPLACEMENT
> > > CHARACTER on encountering illegal sequences.
> > Unless you are Bill Gates and have the power to decree that your users
> > *will* use your preferred decoder, this may be a mistake.  Remember that
> > the users of a decoder see no advantage from this behavior, since they are
> > canonicalizing anyway.
> 
> Um... not so...
> The user of the decoder is the user that gets bitten by these security
> holes...

Um, no, I think you've missed my point.  The user of a decoder is *not*
going to get bitten by these security holes, because he's *decoding*.  The
act of decoding transforms the input into a form where these holes do not
exist.  The potential for security holes comes when you attempt to use the
raw input, *without* decoding it.  It is the *non-decoding* users who are
vulnerable. 

This being so, decoding users -- who are not vulnerable -- may balk at
having their programs misbehave on inputs which do not threaten them anyway.
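
To make the distinction concrete, here is a minimal sketch in Python.  The
lenient_decode() function is hypothetical, written only for illustration:
it assumes a naive decoder that accepts overlong forms and does no other
validation.  The input hides "/" as the overlong pair C0 AF.  A check on
the raw bytes misses the slash; a check on the decoded output sees it.
For comparison, the last lines run Python's own strict decoder with
errors="replace", which substitutes U+FFFD instead.

```python
def lenient_decode(data: bytes) -> list:
    """Naive UTF-8 decoder that accepts overlong forms (illustrative only)."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        # Pick the payload bits and the number of continuation bytes.
        if b < 0x80:
            cp, n = b, 0
        elif b >> 5 == 0b110:
            cp, n = b & 0x1F, 1
        elif b >> 4 == 0b1110:
            cp, n = b & 0x0F, 2
        elif b >> 3 == 0b11110:
            cp, n = b & 0x07, 3
        else:
            raise ValueError("bad lead byte")
        tail = data[i + 1:i + 1 + n]
        if len(tail) != n or any(t >> 6 != 0b10 for t in tail):
            raise ValueError("bad continuation byte")
        for t in tail:
            cp = (cp << 6) | (t & 0x3F)
        # No minimum-value check: overlong forms decode to their code point.
        out.append(cp)
        i += 1 + n
    return out


# "../../etc" with each "/" hidden as the overlong sequence C0 AF.
raw = b"..\xc0\xaf..\xc0\xafetc"

# Non-decoding user: scans the raw bytes and finds no slash.
raw_check_sees_slash = b"/" in raw

# Decoding user: checks after decoding, so the alias has already collapsed.
decoded = lenient_decode(raw)
decoded_check_sees_slash = ord("/") in decoded

# Substituting decoder: the illegal bytes become U+FFFD, not "/".
substituted = raw.decode("utf-8", errors="replace")
```

The raw-byte check is the vulnerable one here; the decoding user ends up
with an ordinary "/" that the usual checks will catch.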

> Implicit aliases are very dangerous.

I agree, but the problem is to protect the non-decoding users, and doing
substitutions in decoders may not be the best way to do that. 

                                                          Henry Spencer
                                                       [EMAIL PROTECTED]

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/