On 26/04/06, Peter Tomlinson <[EMAIL PROTECTED]> wrote:
> Ludovic Rousseau wrote:
> >
> > According to [1] you may code some unicode characters on
> > 4 bytes.
> > [1] http://en.wikipedia.org/wiki/UTF-16
>
> You should consult ISO 10646 [1].
>
> The advice that I was given when having to incorporate multiple
> character sets into eURI [2] was that it is satisfactory to restrict an
> implementation to UTF-16, as that covers all commercially and government
> used written scripts. But designers should make a statement that UTF-16
> is used in their work (I'm not sure that I made that clear in eURI...).

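
As a rough illustration of the 4-byte point quoted above, here is a minimal Python sketch. The sample characters and the helper names encoded_lengths and surrogate_pair are hypothetical choices made for this example, not anything taken from the thread or from ISO 10646. It only counts encoded bytes and splits one code point above U+FFFF into its UTF-16 surrogate pair.

```python
# Illustrative sketch only: the helper names and sample characters are
# assumptions made for this example, not part of any library or standard.

def encoded_lengths(ch):
    """Return (UTF-8 byte count, UTF-16 byte count) for one character."""
    # "utf-16-be" is used so that no byte-order mark is added to the count.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))


def surrogate_pair(ch):
    """Split a code point above U+FFFF into its UTF-16 high/low surrogates."""
    offset = ord(ch) - 0x10000
    if offset < 0:
        raise ValueError("code point is in the BMP and needs no surrogates")
    # The top 10 bits go into the high surrogate, the low 10 bits into the low one.
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


if __name__ == "__main__":
    # Three characters inside the 16-bit range and one outside it (U+1D11E).
    for ch in ["A", "\u00e9", "\u20ac", "\U0001D11E"]:
        u8, u16 = encoded_lengths(ch)
        print(f"U+{ord(ch):04X}: UTF-8 = {u8} bytes, UTF-16 = {u16} bytes")

    high, low = surrogate_pair("\U0001D11E")
    print(f"U+1D11E as surrogates: {high:04X} {low:04X}")
```

Characters that fit in 16 bits take 2 bytes in UTF-16 and 1 to 3 bytes in UTF-8, while a code point above U+FFFF takes 4 bytes in both encodings. Code that assumes one 16-bit unit per character (UCS-2 style) would miscount the last sample.
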
I think I know why Microsoft and Java use UCS-2: Unicode 1.0 was only
16 bits [1]. But I don't see why UTF-16 is better than UTF-8 if the
choice is made _now_. Maybe because functions to manipulate UTF-8 are
not available in Windows and Java?

Bye,

[1] http://www.debian.org/doc/manuals/intro-i18n/ch-codes.en.html#s-surrogate

-- 
Dr. Ludovic Rousseau
_______________________________________________
Muscle mailing list
http://lists.drizzle.com/mailman/listinfo/muscle
