At 11:41 AM 7/20/00 -0800, Ken Krugler wrote:
>No. UCS-2 and UCS-4 have always been bigendian. Read ISO 10646-1:1993,
>section "6.3 Octet order" (page 7):
>
>   When serialized as octets, a more significant octet shall
>   precede less significant octets.

The section continues: "When not serialized as octets the order of octets 
may be specified by an agreement between sender and recipient (see clause 
17.1 and Annex F)"

Annex F introduces the BOM.

On the face of it, the two parts of clause 6.3 look somewhat 
self-contradictory and could stand some editorial clarification. On the 
whole, though, even ISO/IEC 10646 recognizes that other byte orders exist 
and suggests (in Annex F) how sender and recipient might communicate 
which one is in use.
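
To make the Annex F mechanism concrete, here is a minimal sketch of how a 
recipient might use a leading BOM to work out the byte order. This is only 
an illustration in Python: the function name detect_utf16_byte_order is 
made up for this message, and the sketch assumes the data is already known 
to be 16-bit Unicode (the bytes FF FE can also begin a UTF-32 
little-endian signature, which is not handled here).

```python
import codecs


def detect_utf16_byte_order(data: bytes) -> str:
    """Return 'big', 'little', or 'unknown' based on a leading BOM.

    Hypothetical helper for illustration; assumes 16-bit code units.
    """
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "big"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "little"
    # No BOM: byte order has to come from some other agreement
    # between sender and recipient.
    return "unknown"
```

Without a BOM the function deliberately returns "unknown" rather than 
guessing, since in that case the order rests on agreement between the two 
parties.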

Since this clause was written (1991), both the amount of data in the 
various byte orders and practical experience with Unicode have increased 
dramatically. The full discussion is available in

http://www.unicode.org/unicode/reports/tr17 Character Encoding Model

as well as the relevant sections of The Unicode Standard, Version 3.0.


Note that there is no such thing as UCS-2LE or UCS-2BE. These terms are not 
defined anywhere, but UTF-16LE and UTF-16BE are. Unicode has adopted the 
philosophy that an indication of subsets (e.g. whether surrogate-accessible 
characters are supported or not) does not belong in the designation of the 
encoding form.
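
As a rough illustration of the UTF-16LE / UTF-16BE distinction, the 
following sketch uses Python's built-in "utf-16-le" and "utf-16-be" 
codecs. The variable names are my own, and the choice of Python is just 
for convenience. It shows the same text serialized in both byte orders, 
and that a character outside the BMP comes out as a surrogate pair under 
the same label, with no separate name for the subset.

```python
text = "A\u00e9"  # LATIN CAPITAL LETTER A, LATIN SMALL LETTER E WITH ACUTE

# Same 16-bit code units, serialized in the two byte orders (no BOM added).
big_endian = text.encode("utf-16-be")
little_endian = text.encode("utf-16-le")

# A supplementary character (U+10000) is written as a surrogate pair,
# D800 DC00, under the very same encoding label.
supplementary_be = "\U00010000".encode("utf-16-be")

# Both serializations decode back to the original text.
round_trip_ok = (
    big_endian.decode("utf-16-be") == text
    and little_endian.decode("utf-16-le") == text
)
```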

A./
