On Mon, Apr 4, 2011 at 17:51, Yitzchak Gale <[email protected]> wrote:

> malcolm.wallace wrote:
> >> BOM is not part of UTF8, because UTF8 is byte-oriented. But applications
> >> should be prepared to read and discard it, because some applications
> >> erroneously generate it.
>
> For maximum portability, the standard should require compilers
> to accept and discard an optional BOM as the first character of a
> source code file.
>
> Tako Schotanus wrote:
> > That's not what the official unicode site says in its FAQ:
> > http://unicode.org/faq/utf_bom.html#bom4 and
> > http://unicode.org/faq/utf_bom.html#bom5
>
> That FAQ clearly states that BOM is part of some "protocols".
> It carefully avoids stating whether it is part of the encoding.
>
> It is certainly not erroneous to include the BOM
> if it is part of the protocol for the applications being used.
> Applications can include whatever characters they'd like, and
> they can use whatever handshake mechanism they'd like to
> agree upon an encoding. The BOM mechanism is common
> on the Windows platform. It has since appeared in other
> places as well, but it is certainly not universally adopted.
>
> Python supports a pseudo-encoding called "utf8-bom" that
> automatically generates and discards the BOM in support
> of that handshake mechanism. But it isn't really an encoding,
> it's a convenience.
>
> Part of the source of all this confusion is some documentation
> that appeared in the past on Microsoft's site which was unclear
> about the fact that the BOM handshake is a protocol adopted
> by Microsoft, not a part of the encoding itself. Some people
> claim that this was intentional, part of the "extend and embrace"
> tactic Microsoft allegedly employed in those days in an effort
> to expand its monopoly.
>
> The wording of the Unicode FAQ is obviously trying to tip-toe
> diplomatically around this issue without arousing the ire of
> either pro-Microsoft or anti-Microsoft developers.

Some reliable sources for all this would be entertaining (although irrelevant for the rest of this discussion).
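
On the practical side, the "accept and discard an optional BOM" behaviour proposed above is easy to sketch. The Python below is only an illustration under my own assumptions, not a description of how any Haskell compiler handles source files; `strip_utf8_bom` is a name made up for this example. As far as I know, the codec Python actually ships for this is registered as `utf-8-sig`, which is presumably what "utf8-bom" refers to.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop one leading UTF-8 BOM (bytes EF BB BF) if present.

    Hypothetical helper for illustration; input without a BOM is
    returned unchanged.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# A source file as some editors write it: BOM first, then the text.
raw = codecs.BOM_UTF8 + 'main = putStrLn "hi"\n'.encode("utf-8")

# Manual handling: discard the BOM, then decode as plain UTF-8.
manual = strip_utf8_bom(raw).decode("utf-8")

# Built-in handling: the "utf-8-sig" codec discards a leading BOM on
# decode and writes one on encode.
builtin = raw.decode("utf-8-sig")

# Without either step, the BOM survives as a leading U+FEFF character.
naive = raw.decode("utf-8")
```

Decoding with plain `utf-8` leaves U+FEFF as the first character of the text, which is why a reader that does not discard it sees a stray character before the first token.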
Cheers,
-Tako
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe
