Hi Nicholas,
UTF-8 data streams can contain a BOM. However, UTF-8 is byte-oriented and always has the same byte order, so a BOM can serve as a signature but makes no difference to the endianness of the byte stream.

I agree with you that it may help some applications identify the encoding form. The danger, though, is that some recipients of UTF-8 encoded data do not expect a BOM. Especially when UTF-8 is used in 8-bit environments, a BOM will interfere with any protocol or file format that expects specific ASCII characters at the beginning, such as the "#!" at the beginning of Unix shell scripts.

Cheers,
-Ozgur Sahoglu

________________________________
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 18, 2008 7:38 AM
To: [email protected]
Subject: BOM for UTF-8

Hello,

Has there been any discussion of, or thought given to, adding a BOM to a UTF-8 serialized file if the developer specifically sets the BOM feature? By default, the BOM should not be written, but it is pretty useful for helping certain editors correctly identify the underlying encoding if they are not parsing the first line.

Thanks,
Nicholas Thayer
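
To illustrate the point made in the reply above, here is a minimal Python sketch of how a UTF-8 BOM behaves as a signature and how it displaces a leading "#!". The helper names has_utf8_bom and strip_utf8_bom and the sample script text are made up for this example; codecs.BOM_UTF8 and the "utf-8-sig" codec come from the Python standard library. It is only a sketch, not what any particular serializer does.

```python
import codecs

# The UTF-8 BOM is U+FEFF encoded in UTF-8: always the bytes EF BB BF.
# It carries no byte-order information; it only acts as a signature.
UTF8_BOM = codecs.BOM_UTF8


def has_utf8_bom(data: bytes) -> bool:
    """Hypothetical helper: True if the data starts with the UTF-8 signature."""
    return data.startswith(UTF8_BOM)


def strip_utf8_bom(data: bytes) -> bytes:
    """Hypothetical helper: drop a leading UTF-8 BOM if one is present."""
    return data[len(UTF8_BOM):] if has_utf8_bom(data) else data


# Sample content, assumed for illustration only.
script = "#!/bin/sh\necho hello\n"

plain = script.encode("utf-8")         # no BOM
with_bom = script.encode("utf-8-sig")  # the "utf-8-sig" codec prepends a BOM

# A consumer that checks the first two bytes for "#!" is defeated by the BOM.
starts_with_shebang_plain = plain.startswith(b"#!")
starts_with_shebang_bom = with_bom.startswith(b"#!")

# A BOM-aware reader can strip the signature and recover the original bytes.
recovered = strip_utf8_bom(with_bom)
```

A reader that knows to expect the signature can also decode with "utf-8-sig", which discards a leading BOM if present, whereas decoding with plain "utf-8" leaves U+FEFF as the first character of the text.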
