Henry and Martin,

Martin J. Dürst, Wed, 18 Dec 2013 16:59:10 +0900, in reply to Henry S. Thompson:
>> * In cases where conflicting information is supplied (from charset
>>   param, BOM and/or XML encoding declaration) it give a BOM, if
>>   present, authoritative status;
>
> I'm a bit uneasy about the fact that we now have BOM (internal) -
> charset (external) - encoding (internal), i.e.
> internal-external-internal,

A better way of looking at it would be that we now get
External-Internal, where external is subdivided into charset parameter
and encoding signature [BOM], and internal is subdivided into encoding
declaration and default/fallback encoding.

Yeah, it might be that the lack of a clear classification of the BOM as
an external method is quite directly linked to the lack of
interoperability. Previously we had External-Limbo-Internal. However,
per XML, both BOM and charset param are external.[1] The draft makes a
point about this:[2]

  "[XML] further states that the BOM is an encoding signature, and is
  not part of either the markup or the character data of the XML
  document."

> but I guess there is lots of experience
> in HTML 5 for giving the BOM precedence.

Sorry for focusing on XML rather than XML media types, but I think both
specs should be edited. The way of looking at it that I propose above
also incorporates the fact that XML-capable Web browsers (the HTML 5
browsers) give precedence to the BOM, and do so without a fatal error
if there is a (conflicting) XML encoding declaration.

(Btw, I find it very odd that, up until now, the *charset* parameter
could override the encoding declaration, but if the BOM does the same
[that is: overrides the encoding declaration], *then* it is a fatal
error ...)

It makes sense to treat all external encoding declaration methods the
same. Currently only the external *transport* protocol may override the
internal mechanism. But the BOM should have the same "right".
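To make the External-Internal ordering concrete, here is a minimal Python sketch: BOM first, then the charset parameter, then the encoding declaration, then the UTF-8 default. It is only an illustration under stated assumptions. The names `detect_encoding`, `sniff_bom` and `sniff_declaration` are hypothetical, the declaration sniffing only works for ASCII-compatible encodings, and conflicts are reported as warnings rather than fatal errors, which is the behavior argued for here and not what XML 1.0 currently requires.

```python
import codecs
import re

# Encoding signatures, longest first so that a UTF-32 LE BOM
# (FF FE 00 00) is not mistaken for a UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Assumption: the declaration is readable as ASCII bytes. This does not
# hold for UTF-16/UTF-32 documents, where only the BOM is consulted here.
_DECL = re.compile(
    rb'<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_bom(data):
    """Return (encoding, bom_length), or (None, 0) if there is no BOM."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def sniff_declaration(body):
    """Return the name in the XML encoding declaration, or None."""
    match = _DECL.match(body)
    return match.group(1).decode("ascii") if match else None


def _family(name):
    """Normalize a label so that e.g. 'UTF-16' and 'utf-16-le' compare equal."""
    try:
        canonical = codecs.lookup(name).name
    except LookupError:
        canonical = name.lower()
    for suffix in ("-le", "-be"):
        if canonical.endswith(suffix):
            canonical = canonical[: -len(suffix)]
    return canonical


def detect_encoding(data, charset_param=None):
    """Pick an encoding: BOM > charset parameter > declaration > UTF-8.

    Returns (encoding, warnings). Conflicts are non-fatal by assumption.
    """
    bom_encoding, bom_len = sniff_bom(data)
    declared = sniff_declaration(data[bom_len:])
    candidates = [
        ("BOM", bom_encoding),                # external
        ("charset parameter", charset_param),  # external
        ("encoding declaration", declared),    # internal
    ]
    present = [(source, value) for source, value in candidates if value]
    if not present:
        return "utf-8", []  # internal default/fallback
    winner_source, winner = present[0]
    warnings = [
        f"{source} says {value!r}, overridden by {winner_source} ({winner!r})"
        for source, value in present[1:]
        if _family(value) != _family(winner)
    ]
    return winner, warnings
```

For example, a document that starts with a UTF-8 BOM but declares `encoding="iso-8859-1"` would be read as UTF-8 with one warning, instead of being rejected.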
Therefore I would suggest that the other spec, XML 1.0, section 4.3.3
[3] does this (see the <INS> element):

]]In the absence of information provided by an external transport
protocol (e.g. HTTP or MIME) <INS>OR BY THE BYTE ORDER MARK</INS>, it
is a fatal error for an entity including an encoding declaration to be
presented to the XML processor in an encoding other than that named in
the declaration,[[

It should still be an error, but not a fatal error, if the XML encoding
declaration conflicts with the external method (BOM or HTTP).

[1] http://www.w3.org/TR/REC-xml/#NT-document
[2] http://tools.ietf.org/html/draft-ietf-appsawg-xml-mediatypes-06#section-3.3
[3] http://www.w3.org/TR/REC-xml/#charencoding
--
leif halvard silli

