> FdC> OK, let's go ahead and try to solve specific problems as they
> FdC> come up.  For example, can UTF-8 be used in the 7-bit
> FdC> environment? :-)
> 
> Easy.  A UTF-8 character is represented either by a GL code, in which
> case it can represent itself, or by a sequence
> 
>   x_1 ... x_k
> 
> where the x_i are eight-bit codes with the high bit set.  Such a
> character can be represented by
> 
>   SO x'_1 ... x'_k SI
> 
> where the x'_i are the x_i with the high bit stripped.
> 
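It was a trick question.  SO/SI only work on characters in the G1 range,
and C1 != G1: UTF-8 trail bytes run from 0x80 to 0xBF, so they cover the
whole C1 range (0x80-0x9F) and strip down to C0 controls.  If SO/SI
worked on C1 characters, then how would you tell the difference between
shifted and unshifted SI (0x8F with the high bit stripped is 0x0F)?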
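
To make the collision concrete, here is a minimal Python sketch.  It
assumes the scheme exactly as quoted (strip the high bit, wrap the
result in SO ... SI); naive_7bit and colliding_trail_bytes are made-up
names for illustration, not part of any library or standard.

```python
SO, SI = 0x0E, 0x0F  # ISO 2022 Shift Out / Shift In control codes


def naive_7bit(ch: str) -> bytes:
    """Encode one character with the quoted SO ... SI scheme (hypothetical)."""
    raw = ch.encode("utf-8")
    if len(raw) == 1:
        # A single-byte (7-bit) code represents itself.
        return raw
    # Multi-byte sequence: strip the high bit of every byte and bracket it.
    return bytes([SO]) + bytes(b & 0x7F for b in raw) + bytes([SI])


def colliding_trail_bytes() -> list:
    """UTF-8 trail bytes (0x80-0xBF) that turn into SO or SI when stripped."""
    return [b for b in range(0x80, 0xC0) if (b & 0x7F) in (SO, SI)]


# U+00CF is C3 8F in UTF-8; the trail byte 8F strips to 0F, which is SI.
encoded = naive_7bit("\u00cf")
print(" ".join(f"{b:02x}" for b in encoded))               # 0e 43 0f 0f
print([f"{b:02x}" for b in colliding_trail_bytes()])       # ['8e', '8f']
```

A decoder reading 0e 43 0f 0f cannot tell whether the first 0f is data
or the closing SI, and any character whose encoding contains 8E or 8F
hits the same problem.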

- Frank


-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
