On 26/04/06, Peter Tomlinson <[EMAIL PROTECTED]> wrote:
> Ludovic Rousseau wrote:
> >
> > According to [1] you may code some unicode characters on
> > 4 bytes.
> > [1] http://en.wikipedia.org/wiki/UTF-16
>
> You should consult ISO 10646 [1].
>
> The advice that I was given when having to incorporate multiple
> character sets into eURI [2] was that it is satisfactory to restrict an
> implementation to UTF-16, as that covers all commercially and government
> used written scripts. But designers should make a statement that UTF-16
> is used in their work (I'm not sure that I made that clear in eURI...).

I think I know why Microsoft and Java use UCS-2: Unicode 1.0 was only
16 bits wide [1], so every character fit in a single 16-bit unit.
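
To illustrate the surrogate mechanism described in [1], here is a minimal
sketch of how UTF-16 splits a code point above U+FFFF into two 16-bit
units (4 bytes). The function name to_surrogates and the choice of
U+1D11E (MUSICAL SYMBOL G CLEF) are mine, picked only for illustration;
a 16-bit-only UCS-2 implementation cannot represent such a code point as
one unit.

```python
def to_surrogates(code_point):
    """Split a code point above U+FFFF into a UTF-16 (high, low) surrogate pair.

    Hypothetical helper, for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


high, low = to_surrogates(0x1D11E)
print(f"U+1D11E -> {high:04X} {low:04X}")

# Cross-check against Python's own UTF-16 encoder (big-endian, no BOM).
encoded = chr(0x1D11E).encode("utf-16-be")
print(len(encoded), "bytes:", encoded.hex())
```

Code that assumes one 16-bit unit per character will see this character
as two separate units.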

But I don't see why UTF-16 is better than UTF-8 if the choice is made
_now_. Maybe because functions to manipulate UTF-8 are not available
in Windows and Java?
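
For comparison, a rough sketch of how many bytes each encoding needs for
a few sample characters. The samples and the helper name encoded_sizes
are my own choice, not taken from any of the references; it only shows
storage size and says nothing about which API is more convenient.

```python
def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count) for text, without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Sample characters chosen for illustration only.
samples = [
    ("LATIN CAPITAL LETTER A", "A"),
    ("LATIN SMALL LETTER E WITH ACUTE", "\u00e9"),
    ("EURO SIGN", "\u20ac"),
    ("MUSICAL SYMBOL G CLEF", "\U0001d11e"),
]

for name, char in samples:
    utf8_len, utf16_len = encoded_sizes(char)
    print(f"U+{ord(char):04X} {name}: UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")
```

Both encodings are variable-length once characters outside the 16-bit
range are involved, so neither one gives a fixed size per character.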

Bye,

[1] http://www.debian.org/doc/manuals/intro-i18n/ch-codes.en.html#s-surrogate

--
  Dr. Ludovic Rousseau

_______________________________________________
Muscle mailing list
[email protected]
http://lists.drizzle.com/mailman/listinfo/muscle