Hello Henry,
Some comments on your specific questions, which may trigger some
additional discussion.
On 2013/12/12 1:43, Henry S. Thompson wrote:
I'm one of the editors of a proposed replacement for RFC3023 [1], the
media type registration for application/xml, text/xml and 3 others.
The draft replacement [2] includes several significant changes in the
handling of information about character encoding:
* In cases where conflicting information is supplied (from charset
param, BOM and/or XML encoding declaration) it gives a BOM, if
present, authoritative status;
I'm a bit uneasy about the fact that the precedence order is now BOM
(internal) - charset (external) - encoding (internal), i.e.
internal-external-internal, but I guess there is lots of experience in
HTML 5 with giving the BOM precedence. Also, it will be extremely rare to
have something that looks like a BOM but isn't, and this, combined with
the fact that XML balks at encoding errors, should make things quite robust.
* It recommends against the use of UTF-32.
UTF-32 has some (limited) appeal for internal representation, but none
really on the network, and media types are for network interchange, so
this should be fine, too.
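To make the precedence order from the first point concrete (BOM, then
charset parameter, then XML encoding declaration), here is a minimal
Python sketch. It is an illustration under stated assumptions, not text
from the draft: the name pick_encoding is hypothetical, only UTF-8 and
UTF-16 BOMs are checked (UTF-32 is left out, in line with the second
point), the declaration is only found when the document is in an
ASCII-compatible encoding, and the final fallback to UTF-8 follows XML's
own default rather than anything stated in this thread.

```python
import re
from typing import Optional

# BOMs checked in this sketch. UTF-32 is deliberately omitted, so a
# UTF-32LE BOM (FF FE 00 00) would be misread as UTF-16LE here.
_BOMS = (
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
)

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the body. Assumption: the bytes are ASCII-compatible.
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def pick_encoding(body: bytes, charset: Optional[str] = None) -> str:
    """Return an encoding name using BOM > charset > declaration order."""
    # 1. A BOM, if present, is authoritative.
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name
    # 2. Otherwise the external charset parameter wins.
    if charset:
        return charset.lower()
    # 3. Otherwise fall back to the in-document encoding declaration.
    match = _XML_DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    # 4. Assumed default when nothing else is available.
    return "utf-8"
```

For example, a body that starts with a UTF-8 BOM but is served with
charset=iso-8859-1 would be treated as UTF-8 under this ordering, and a
real XML parser would then reject it if the bytes turn out not to be
valid UTF-8.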
Regards, Martin.
The interoperability situation in this space is currently poor, with
some tools treating a charset parameter as authoritative, but the HTML
5 spec and most browsers preferring the BOM. The goal of the draft is
to specify an approach which will promote convergence, while
minimising the risk of damage from backward incompatibilities.
Since these changes overlap with a wide range of technologies, I'm
seeking review outside the relevant IETF mailing list
(apps-disc...@ietf.org) -- please take a look if you can, particularly
at Section 3 [3] and Appendix C [4].
Thanks,
ht
[1] http://tools.ietf.org/html/rfc3023
[2] http://tools.ietf.org/html/draft-ietf-appsawg-xml-mediatypes-06
[3] http://tools.ietf.org/html/draft-ietf-appsawg-xml-mediatypes-06#section-3
[4] http://tools.ietf.org/html/draft-ietf-appsawg-xml-mediatypes-06#appendix-C