On Thu, 12 Jun 2003, Murray Sargent wrote:
>A key point so far missing in this thread and as far as I can tell in
>Markus's paper is that UTF-8 is a subset of FSS-UTF and not the full
>FSS-UTF.
No, it is an evolved (and renamed) version of FSS-UTF, rather than a
distinct encoding.
>In particular, UTF-8 has the restrictions:
>1. Shortest UTF-8 form for a 32-bit value is always used; longer forms
>are illegal
This rule was already present in Ken Thompson's original FSS-UTF proposal.
(Its omission from some later documents is regrettable, but is essentially
an error in those documents.)
>2. The surrogate codes 0xD800 - 0xDFFF are illegal in UTF-8 form
>3. Only the first 17 planes are legal...
These are later, evolutionary changes in FSS-UTF/UTF-8 to match changes in
Unicode, not fundamental differences between two different encodings.
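
To make that concrete, here is a minimal sketch in Python. It is my own
illustration, not code from Thompson's proposal or from any standard, and
the names FORMS and decode_one are hypothetical. It assumes the one- to
six-byte bit layout of the original FSS-UTF and treats the three rules
quoted above as range checks applied after the same decoding step:

```python
# Sketch only: decodes exactly one sequence and applies the three rules.
# Each entry: (lead-byte mask, lead-byte pattern, sequence length,
#              smallest value that actually needs this length)
FORMS = [
    (0x80, 0x00, 1, 0x0),
    (0xE0, 0xC0, 2, 0x80),
    (0xF0, 0xE0, 3, 0x800),
    (0xF8, 0xF0, 4, 0x10000),
    (0xFC, 0xF8, 5, 0x200000),
    (0xFE, 0xFC, 6, 0x4000000),
]


def decode_one(data: bytes) -> int:
    """Decode a single sequence that occupies all of `data`."""
    if not data:
        raise ValueError("empty input")
    lead = data[0]
    for mask, pattern, length, minimum in FORMS:
        if (lead & mask) == pattern:
            break
    else:
        raise ValueError("invalid lead byte")
    if len(data) != length:
        raise ValueError("wrong sequence length")

    # Payload bits of the lead byte, then six bits per continuation byte.
    value = lead & ~mask & 0xFF
    for b in data[1:]:
        if (b & 0xC0) != 0x80:
            raise ValueError("invalid continuation byte")
        value = (value << 6) | (b & 0x3F)

    # Rule 1: only the shortest form is legal.
    if value < minimum:
        raise ValueError("overlong form")
    # Rule 2: surrogate code points are not legal.
    if 0xD800 <= value <= 0xDFFF:
        raise ValueError("surrogate")
    # Rule 3: nothing beyond the first 17 planes.
    if value > 0x10FFFF:
        raise ValueError("beyond plane 16")
    return value
```

With this sketch, decode_one(b"\xc0\xaf") is rejected as an overlong form
of "/" and decode_one(b"\xed\xa0\x80") as a surrogate, while the bit layout
used to decode them is unchanged in both cases.
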
Henry Spencer
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/