On 2013/01/06 7:21, Costello, Roger L. wrote:

Does this mean that when exchanging Unicode data across the Internet the 
endianness is not relevant?

Are these stated correctly:

     When Unicode data is in a file we would say, for example, "The file contains 
UTF-32BE data."

     When Unicode data is in memory we would say, "There is UTF-32 data in 
memory."

     When Unicode data is sent across the Internet we would say, "The UTF-32 data 
was sent across the Internet."

The first is correct. The second is correct. The third is wrong. The Internet deals with data as a series of bytes, and by its nature has to pass data between big-endian and little-endian machines. Therefore, endianness is very important on the Internet. So you would say:

"The UTF-32BE data was sent across the Internet."
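To make the byte-order point concrete, here is a minimal Python sketch (not from the original thread; the sample character and the helper name `decodes_cleanly` are chosen only for illustration). It serializes the same code point as UTF-32BE and UTF-32LE and shows that a receiver assuming the wrong order cannot decode the bytes.

```python
# Illustrative sketch: one code point, two possible byte serializations.
text = "A"  # U+0041

be = text.encode("utf-32-be")  # most significant byte first
le = text.encode("utf-32-le")  # least significant byte first

print(be.hex())  # 00000041
print(le.hex())  # 41000000


def decodes_cleanly(data: bytes, encoding: str) -> bool:
    """Hypothetical helper: True if `data` is valid in `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# The sender wrote little-endian bytes; a receiver assuming big-endian
# would read 0x41000000, which is outside the Unicode code point range.
print(decodes_cleanly(le, "utf-32-le"))  # True
print(decodes_cleanly(le, "utf-32-be"))  # False
```

So the label on the wire has to say which of the two byte sequences was sent, which is what the BE/LE suffix does.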

Actually, as far as I'm aware, the labels UTF-16BE and UTF-16LE were first defined in the IETF; see http://tools.ietf.org/html/rfc2781#appendix-A.1.

Because byte order has to be dealt with for UTF-16 and UTF-32, Internet protocols mostly prefer UTF-8 over UTF-16 (or UTF-32), and the data actually exchanged is also predominantly UTF-8. So it would be better to say:

When Unicode data is sent across the Internet we would say, "The UTF-8 data was sent across the Internet."
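As a rough illustration of why UTF-8 avoids the question (a sketch only; the sample character is an arbitrary choice), UTF-8 is defined as a sequence of bytes, so there is just one serialization, whereas UTF-16 has two:

```python
# Illustrative sketch: UTF-8 has a single byte sequence per code point,
# while UTF-16 has one per byte order.
text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

utf8 = text.encode("utf-8")
utf16_be = text.encode("utf-16-be")
utf16_le = text.encode("utf-16-le")

print(utf8.hex())      # c3a9
print(utf16_be.hex())  # 00e9
print(utf16_le.hex())  # e900
```

There is no "UTF-8BE" or "UTF-8LE" to choose between, so the plain label "UTF-8" already describes the bytes on the wire completely.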

Regards,   Martin.
