Justin Dearing wrote:
Thank you all for your quick responses.
On 10/30/07, Jesse Pelton <[EMAIL PROTECTED]> wrote:
Actually, the XML spec discusses the UTF-8 BOM. See
http://www.w3.org/TR/2006/REC-xml-20060816/#sec-guessing-no-ext-info.
Whether it makes sense is another question. I suppose it could be used
to quickly distinguish UTF-8 from ASCII and similar encodings. Since
conforming processors are required to handle UTF-8 and UTF-16, but no
other encodings, this might have some value.
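
As a rough illustration of the BOM-based guessing that section of the spec describes, here is a minimal Python sketch. The function name `detect_bom` and the returned encoding labels are made up for this example, and it only looks at the BOM; it does not implement the rest of the spec's autodetection (such as inspecting the `<?xml` declaration).

```python
import codecs

def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    # The UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE),
    # so the longer marks have to be tested first.
    candidates = [
        (codecs.BOM_UTF32_LE, "utf-32-le"),
        (codecs.BOM_UTF32_BE, "utf-32-be"),
        (codecs.BOM_UTF8, "utf-8"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
    ]
    for bom, name in candidates:
        if data.startswith(bom):
            return name
    return None
```

A file with no BOM yields `None`, in which case a processor has to fall back on the encoding declaration or assume UTF-8.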
It seems to me, and I'm not an expert on Unicode at all, that it would
make sense for the doc building tools to handle the UTF-8 BOM
if it sometimes gets inserted there. If, for the sake of
consistency, you want to explicitly forbid UTF-8 BOMs in the
documentation source but want to make the error clearer, you
could do the following: catch the exception that is currently thrown
and make it the InnerException of a new exception with the
message "This document has a UTF-8 BOM character. Xerces
Documentation source files should not contain a UTF-8 BOM character."
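
The wrapping idea could look roughly like the Python sketch below, where `raise ... from ...` plays the role of an InnerException. Everything here is hypothetical: `BomInSourceError`, `build_with_clear_error`, and the stand-in `picky_parse` are invented for illustration and are not part of the Xerces doc tools, whose actual failure mode is not shown in this thread.

```python
import codecs

MESSAGE = ("This document has a UTF-8 BOM character. Xerces Documentation "
           "source files should not contain a UTF-8 BOM character.")

class BomInSourceError(Exception):
    """Hypothetical error for doc sources that start with a UTF-8 BOM."""

def build_with_clear_error(data: bytes, parse):
    """Run parse(data); if it fails on BOM-prefixed input, re-raise clearly."""
    try:
        return parse(data)
    except Exception as original:
        if data.startswith(codecs.BOM_UTF8):
            # "from original" stores the first failure in __cause__,
            # so the underlying error is still available for debugging.
            raise BomInSourceError(MESSAGE) from original
        raise  # unrelated failure: let it propagate unchanged

def picky_parse(data: bytes):
    # Stand-in for a tool that rejects anything before the first "<".
    if not data.startswith(b"<"):
        raise ValueError("content is not allowed in prolog")
    return data.decode("utf-8")
```

Calling `build_with_clear_error(b"\xef\xbb\xbf<doc/>", picky_parse)` raises `BomInSourceError` with the original `ValueError` attached as its cause, while BOM-free input passes through untouched.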
Any XML-compliant processor should handle documents with a UTF-8 BOM.
However, documents with a UTF-8 BOM are not that common, so there may be a
bug somewhere. In fact, Xerces-C had a bug where we would not always
handle the BOM correctly.
I now know to change a checkbox in XML Copy Editor to prevent the
issue from affecting me. I will talk to Gerald about the default behavior
of XML Copy Editor. However, GVIM handles the file with or without
the BOM, and I couldn't figure out what the problem was until I
generated a diff, which rendered the BOM as 2 printable characters in
the text file since it was no longer the first byte sequence in the
file. TortoiseMerge and WinMerge both indicate that the first line has
changed but do not display the BOM character, which is a bug on
their part, so I will report that to their respective maintainers.
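
For anyone who hits the same invisible difference, a small Python sketch like the following can make the BOM visible and remove it. The helper names `reveal_bom` and `strip_leading_bom` are invented for this example, and it assumes the file is otherwise valid UTF-8.

```python
import codecs

def reveal_bom(data: bytes) -> str:
    """Decode UTF-8 text, showing every U+FEFF as a visible "<BOM>" tag."""
    # The plain "utf-8" codec keeps U+FEFF in the decoded text
    # (unlike "utf-8-sig", which silently drops a leading one).
    return data.decode("utf-8").replace("\ufeff", "<BOM>")

def strip_leading_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

Running `reveal_bom` on the first line of both versions of a file shows at a glance which one carries the mark.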
It would help to see the error message you're getting, and to know what
tool is issuing it.
Dave