Edmund,

>
> Note also that "\xe0\x84\x80" is illegal, for example, as U+0100
> should be represented only by "\xc4\x80".

Likewise, \xF0\x80\x84\x80 would be illegal.  I had not considered that.

I guess I should also stop encoding spaces as \xC0\xA0  ;-}
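
For illustration, here is a rough sketch of the check in Python.  It is
not taken from any existing decoder: the names decode_one and
MIN_FOR_LENGTH are made up for this example, and it assumes sequences of
at most four bytes.  The idea is simply that every sequence length has a
smallest code point that genuinely needs it, and anything below that
minimum is an overlong form.

```python
# Smallest code point that legitimately needs a sequence of each length.
# (Hypothetical table for this sketch; limited to 1..4 byte sequences.)
MIN_FOR_LENGTH = {1: 0x00, 2: 0x80, 3: 0x800, 4: 0x10000}


def decode_one(seq: bytes) -> int:
    """Decode exactly one UTF-8 sequence, rejecting overlong forms.

    This sketch checks only structure and overlong encodings; it does
    not reject surrogates, noncharacters, or values above U+10FFFF.
    """
    if not seq:
        raise ValueError("empty sequence")
    lead = seq[0]
    if lead < 0x80:
        length, value = 1, lead
    elif 0xC0 <= lead <= 0xDF:
        length, value = 2, lead & 0x1F
    elif 0xE0 <= lead <= 0xEF:
        length, value = 3, lead & 0x0F
    elif 0xF0 <= lead <= 0xF7:
        length, value = 4, lead & 0x07
    else:
        raise ValueError("invalid lead byte 0x%02X" % lead)
    if len(seq) != length:
        raise ValueError("expected %d bytes, got %d" % (length, len(seq)))
    for b in seq[1:]:
        if b & 0xC0 != 0x80:
            raise ValueError("invalid continuation byte 0x%02X" % b)
        value = (value << 6) | (b & 0x3F)
    # The shortest-form rule: the value must actually need this length.
    if value < MIN_FOR_LENGTH[length]:
        raise ValueError("overlong encoding of U+%04X" % value)
    return value
```

With this, decode_one(b"\xc4\x80") returns 0x100, while
b"\xe0\x84\x80", b"\xf0\x80\x84\x80" and b"\xc0\xa0" all raise
ValueError as overlong.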

>
> Perhaps you want to exclude U+FFFF, too.
>

You are right.

Carl



-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
