Ludovic Rousseau wrote:
According to [1], some Unicode characters may be encoded in
4 bytes.
[1] http://en.wikipedia.org/wiki/UTF-16
You should consult ISO 10646 [1].
When I had to incorporate multiple character sets into eURI [2], I was
advised that it is satisfactory to restrict an implementation to UTF-16,
as that covers all written scripts in commercial and government use.
Designers should, however, state that UTF-16 is used in their work
(I'm not sure that I made that clear in eURI...).
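
To illustrate the 4-byte case mentioned in the quoted message, here is a
minimal sketch in Python (not taken from eURI or any MUSCLE code). The
helper name utf16_units and the two sample characters are my own choices
for illustration. It shows that a character inside the Basic Multilingual
Plane takes one 16-bit code unit in UTF-16, while a character outside it
takes two (a surrogate pair, i.e. 4 bytes).

```python
def utf16_units(ch):
    """Return the 16-bit UTF-16 code units for the string ch.

    Hypothetical helper for illustration only. Big-endian encoding is
    used so that no byte order mark is prepended.
    """
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


bmp = "\u20ac"         # EURO SIGN, inside the Basic Multilingual Plane
astral = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the BMP

print(len(bmp.encode("utf-16-be")))           # 2 bytes
print(len(astral.encode("utf-16-be")))        # 4 bytes
print([hex(u) for u in utf16_units(astral)])  # ['0xd834', '0xdd1e']
```

Code that assumes one 16-bit unit per character would mishandle the
second case, which is one reason to state the encoding explicitly.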
Peter
[1] ISO 10646:2003 Information Technology - Universal Multiple-Octet
Coded Character Sets (UCS)
[2] CEN/ISSS CWA 13987:2003 User Related Information, available under
'CEN Workshop Agreements' at www.cenorm.be/isss
_______________________________________________
Muscle mailing list
[email protected]
http://lists.drizzle.com/mailman/listinfo/muscle