Thanks for the link! The section on the BOM, written by Mark Davis, President of the Unicode Consortium, gives no indication that a UTF-8 stream should never have a BOM.

I quote:

Q: When a BOM is used, is it only in 16-bit Unicode text?
A: No, a BOM can be used as a signature no matter how the Unicode text is transformed: UTF-16, UTF-8, UTF-7, etc.

...and...

Q: Can a UTF-8 data stream contain the BOM character (in UTF-8 form)?
A: Yes, UTF-8 can contain a BOM.

...and...

Q: Why wouldn't I always use a protocol that requires a BOM?
A: Where the data is typed, such as a field in a database, a BOM is unnecessary.

If there is a file on disc called foo.txt, it is clearly not typed data. Thus, it appears to be Mr Davis' opinion that when such a file contains UTF-8 data, it is quite appropriate for there to be a BOM at the start.

If Mr Freytag still disagrees, I hope he will explain why.

Thanks!

- rick cameron

-----Original Message-----
From: Tom Gewecke [mailto:[EMAIL PROTECTED]]
Sent: Thursday, 14 February 2002 20:42
To: [EMAIL PROTECTED]
Subject: RE: Unicode and end users

>Can you please expand on your statement that UTF-8 should never have a
>BOM? Having one makes it very easy to distinguish a text file that
>contains UTF-8 from one that contains text in the system default MBCS
>encoding.
>
>You may not be surprised to learn that Microsoft (or, at least, one of
>its programmers) does not agree with you. When I save a file from
>Notepad on Windows XP in UTF-8, the file contains a BOM.

It seems there are quite a few answers to these questions in the Unicode utf-bom faq

http://www.unicode.org/unicode/faq/utf_bom.html

including mention of the Microsoft case and the fact that generally a BOM can be used with any UTF.
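
To illustrate the point made in this thread, that a BOM makes a UTF-8 text file easy to tell apart from one in the system default MBCS encoding, here is a minimal Python sketch. The function names and the cp1252 fallback are assumptions chosen for the example; they do not come from the FAQ and do not describe how Notepad works.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 signature EF BB BF."""
    return data.startswith(codecs.BOM_UTF8)


def decode_text(data: bytes, fallback: str = "cp1252") -> str:
    """Decode as UTF-8 when a BOM is present, otherwise use the fallback.

    The fallback stands in for "the system default MBCS encoding";
    cp1252 is only an assumed example value.
    """
    if has_utf8_bom(data):
        # Strip the three signature bytes before decoding.
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    return data.decode(fallback)
```

A file without the signature could of course still be UTF-8; this sketch only shows the easy case where the BOM settles the question.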