Ludovic Rousseau wrote:

According to [1] you may code some Unicode characters on 4 bytes.

[1] http://en.wikipedia.org/wiki/UTF-16
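
To illustrate the 4-byte case mentioned above, here is a minimal Python sketch. It is not from the original post; the helper name utf16_byte_length and the two sample characters are chosen purely for illustration. It encodes one character inside the Basic Multilingual Plane and one outside it, and reports how many bytes each needs in UTF-16.

```python
def utf16_byte_length(ch: str) -> int:
    """Return how many bytes one character needs in UTF-16 (big-endian, no BOM)."""
    return len(ch.encode("utf-16-be"))


# Illustrative sample characters (assumptions, not taken from the post).
bmp_char = "\u20ac"         # EURO SIGN, inside the Basic Multilingual Plane
astral_char = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the BMP

print(utf16_byte_length(bmp_char))            # 2
print(utf16_byte_length(astral_char))         # 4
print(astral_char.encode("utf-16-be").hex())  # d834dd1e (a surrogate pair)
```

The second character is stored as two 16-bit code units (a surrogate pair), which is where the 4 bytes come from.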

You should consult ISO 10646 [1].

The advice I was given when I had to incorporate multiple character sets into eURI [2] was that it is satisfactory to restrict an implementation to UTF-16, as that covers all written scripts in commercial and government use. But designers should state explicitly that UTF-16 is used in their work (I'm not sure that I made that clear in eURI...).
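
If the restriction is read as "accept only characters that fit in a single 16-bit UTF-16 code unit" (that reading is an assumption, it is not spelled out above), an implementation could check its input roughly as follows. The function name fits_in_single_utf16_units is hypothetical.

```python
def fits_in_single_utf16_units(text: str) -> bool:
    """Return True if every character needs only one 16-bit UTF-16 code unit.

    Assumption: "restricted" means no code point above U+FFFF, i.e. no
    surrogate pairs are ever required to encode the text.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


print(fits_in_single_utf16_units("Peter \u20ac"))      # True
print(fits_in_single_utf16_units("clef \U0001D11E"))   # False
```

Whether to reject or transliterate the characters that fail this check is a design decision, and it is the kind of thing the statement about the encoding should cover.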

Peter

[1] ISO/IEC 10646:2003 Information technology - Universal Multiple-Octet Coded Character Set (UCS)

[2] CEN/ISSS CWA 13987:2003 User Related Information, available under 'CEN Workshop Agreements' at www.cenorm.be/isss


_______________________________________________
Muscle mailing list
[email protected]
http://lists.drizzle.com/mailman/listinfo/muscle
