Hi, everyone,

It's almost unbelievable to me how many email postings are wasted on discussions such as this UTF-8 BOM issue ... I guess it means that there is a lot of BADLY WRITTEN software out there in the world ;-)
With regard to READING incoming UTF-8 text streams, surely any good software designer will do exactly as Michael Michka has suggested here:

> INCOMING TEXT: Trivial to simply check. I say (once again) its THREE
> BYTES.

With regard to EMITTING outgoing UTF-8 text streams, IMHO the default should be to do what is simplest, which is *not* to output the BOM. It is superfluous on UTF-8 streams. There's no harm in having a global option to turn BOM output on for the benefit of BRAIN-DEAD programs that are going to read the text:

> EMITTING: They could simply choose globally whether to emit the BOM or not.
> If they wanted to get "fancy" they could have a command line option which
> said whether to emit the bytes or not. But that is optional.

The whole issue is analogous to the CR\LF issue in ASCII texts across different platforms. Well-written software is able to READ the text properly regardless of whether lines end in CR, LF, or CR\LF.
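
For what it's worth, here is a rough sketch in Python of the approach described above: tolerate a BOM and any line-ending convention on input, and leave the BOM off on output unless asked for it. The function names (strip_bom, read_text, emit_text) and the emit_bom flag are made up for illustration and do not come from any particular program; treat this as one possible way to do it, not the definitive one.

```python
import codecs

# The UTF-8 BOM is exactly three bytes: EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


def read_text(data: bytes) -> str:
    """Decode UTF-8 input, accepting an optional BOM and CR, LF, or CR+LF line ends."""
    text = strip_bom(data).decode("utf-8")
    # Replace CR+LF first so a pair does not turn into two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def emit_text(text: str, emit_bom: bool = False) -> bytes:
    """Encode text as UTF-8; emit_bom is a hypothetical global or command-line option."""
    payload = text.encode("utf-8")
    return UTF8_BOM + payload if emit_bom else payload
```

A reader written this way does not care what the writer chose, so emit_text("hello", emit_bom=True) and emit_text("hello") both come back from read_text as the same string.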

