Hello Lazarus-List, Thursday, May 5, 2011, 12:20:10 PM, you wrote:
MW> According to unicode.org, when UTF-16 got introduced, the USC2 standard
MW> was extended. So yes they are the same.
MW> In some cases when ppl refer to USC2 they mean the old unicode 1.1
MW> standard, but that is wrong. The name USC2 is misleading and should not
MW> be used anymore.

Maybe, but this is taken from the www.unicode.org glossary of terms:

UCS-2. ISO/IEC 10646 encoding form: Universal Character Set coded in 2 octets, limited to the Basic Multilingual Plane. (See Appendix C, Relationship to ISO/IEC 10646.)

UTF-16. A multibyte encoding for text that represents each Unicode character with 2 or 4 bytes; it is not backward-compatible with ASCII. It is the internal form of Unicode in many programming languages, such as Java, C#, and JavaScript, and in many operating systems. More technically: (1) The UTF-16 encoding form. (2) The UTF-16 encoding scheme. (3) Transformation format for 16 planes of Group 00, defined in Annex C of ISO/IEC 10646:2003; technically equivalent to the definitions in the Unicode Standard.

--------------------

I think the text that says UCS2 has been extended does not mean that UCS2 itself was extended. It says that UCS2 was extended into UTF-16, so UCS2 can no longer be considered Unicode, as noted in ISO 10646:

UCS-2. UCS-2 stands for Universal Character Set coded in 2 octets and is also known as the two-octet BMP form. It was documented in earlier editions of 10646 as the two-octet (16-bit) encoding consisting only of code positions for plane zero, the Basic Multilingual Plane. This documentation has been removed from ISO/IEC 10646:2011, and the term UCS-2 should now be considered obsolete. It no longer refers to an encoding form in either 10646 or the Unicode Standard.

--
Best regards,
 José
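
P.S. A small illustration of the practical difference between the two definitions quoted above. This is only a sketch in Python, not something taken from either standard, and the helper names utf16_code_units and fits_in_ucs2 are made up for the example. A code point inside the Basic Multilingual Plane takes a single 16-bit code unit, so UCS-2 and UTF-16 agree on it. A code point above U+FFFF needs two code units (a surrogate pair) in UTF-16 and has no UCS-2 representation at all.

```python
# Illustrative sketch only: helper names are hypothetical, not from any standard.

def utf16_code_units(ch: str) -> list:
    """Return the 16-bit code units of one character encoded as UTF-16 (big-endian)."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def fits_in_ucs2(ch: str) -> bool:
    """True if the code point lies in the BMP and is not a surrogate code point."""
    cp = ord(ch)
    return cp <= 0xFFFF and not (0xD800 <= cp <= 0xDFFF)


bmp_char = "\u00e9"            # U+00E9, inside the Basic Multilingual Plane
supplementary = "\U0001D11E"   # U+1D11E, outside the BMP

for ch in (bmp_char, supplementary):
    units = [hex(u) for u in utf16_code_units(ch)]
    print(f"U+{ord(ch):04X}: UTF-16 units {units}, fits in UCS-2: {fits_in_ucs2(ch)}")
```

For U+00E9 this gives one code unit (0xe9); for U+1D11E it gives the surrogate pair 0xd834, 0xdd1e, which a strict two-octet BMP-only form cannot express as a single character.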
