For XML there is in fact no problem at all: XML (like JSON) requires a single root element for its validity (in JSON, a single top-level value). If a BOM is followed by another element, the document is not conforming XML if that BOM is interpreted as part of a text element. If a BOM is followed by an XML declaration, it cannot be a text element (the XML declaration must come before any other element). The only possibility of ambiguity is an XML document that consists only of a single text element (possibly embedding comments), with no other element and no XML declaration. Such a document is in fact purely plain text (the only exception being the predefined named or numeric character entities, which start with "&" and end with ";").
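
To make the XML case concrete, here is a minimal Python sketch of a receiver that tolerates a leading UTF-8 BOM and a sender that never emits one. The function names `parse_liberal` and `emit_conservative` are hypothetical, chosen only for this illustration; the sketch assumes UTF-8 input and uses the standard `xml.etree.ElementTree` module.

```python
import codecs
import xml.etree.ElementTree as ET


def parse_liberal(data: bytes) -> ET.Element:
    """Parse UTF-8 XML bytes, accepting an optional leading BOM.

    Hypothetical helper: the BOM is dropped before parsing, since
    anything ahead of the XML declaration or the root element cannot
    be document content.
    """
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return ET.fromstring(data)


def emit_conservative(root: ET.Element) -> bytes:
    """Serialize to UTF-8 bytes; ElementTree does not write a BOM."""
    return ET.tostring(root, encoding="utf-8")


doc = b'<?xml version="1.0" encoding="UTF-8"?><root>hi</root>'
with_bom = codecs.BOM_UTF8 + doc

# Both forms parse to the same tree, and the output carries no BOM.
root = parse_liberal(with_bom)
out = emit_conservative(root)
```

Stripping the BOM by hand keeps the sketch independent of how a particular parser treats it; many parsers already accept a BOM on byte input.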
In summary, there is no problem at all for XML (or JSON, or other text-encoded syntaxes including JavaScript, where a leading ZWNBSP cannot be valid in the syntax). The theoretical ambiguity only exists with unstructured plain text (which has no defined syntax to restrict its validity), and for that reason plain text should include a MIME document type in its transport headers to define the behavior of the BOM. And if possible, when the text itself starts with a ZWNBSP, that ZWNBSP should be doubled, as part of the transport layer, to make sure it will be interpreted correctly.

But in practice, unstructured plain-text documents never need to start with ZWNBSP. The only exception is short individual plain-text database fields, and these rarely appear without a container. This includes CSV files, where text fields should be surrounded by quotation marks, or which start with a leading row defining column names that never need any leading ZWNBSP.

Being liberal does not really introduce a security issue, including for digitally signed texts. Signed plain texts also have other requirements related to the interpretation of line breaks and whitespace: the simple fix is to start the text with an empty line, and line breaks and whitespace are collapsed to a single space prior to computing the digital signature (hash / digest).

2015-06-28 14:31 GMT+02:00 Costello, Roger L. <[email protected]>:

> Hi Folks,
>
> Postel's Law says:
>
>     Be liberal in what you accept, and
>     conservative in what you send.
>
> How might Postel's Law be applied to web services that receive XML and
> sends out XML?
>
> Here's one idea: a web service is willing to receive UTF-8 XML documents
> containing a pseudo-BOM; the web service sends out UTF-8 XML documents
> without the pseudo-BOM.
>
> Can you think of Unicode errors in inbound XML documents that a web
> service might be willing to accept?
>
> /Roger
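
For the plain-text case discussed in the reply, the doubling of a leading ZWNBSP and the whitespace collapsing before signing could look roughly like the Python sketch below. All three function names are hypothetical, and SHA-256 is only an assumed choice of digest; the message does not name one.

```python
import hashlib
import re

ZWNBSP = "\ufeff"  # same code point as the BOM


def encode_for_transport(text: str) -> bytes:
    """Sender side (hypothetical): send no BOM, but double a leading
    ZWNBSP so that a receiver stripping one U+FEFF keeps the other."""
    if text.startswith(ZWNBSP):
        text = ZWNBSP + text
    return text.encode("utf-8")


def decode_from_transport(data: bytes) -> str:
    """Receiver side (hypothetical): treat one leading U+FEFF as a
    signature and drop it; anything after it is content."""
    text = data.decode("utf-8")  # plain "utf-8" keeps U+FEFF in the string
    return text[1:] if text.startswith(ZWNBSP) else text


def signing_digest(text: str) -> str:
    """Collapse every run of whitespace (line breaks included) to one
    space before hashing. SHA-256 is an assumption for this sketch."""
    canonical = re.sub(r"\s+", " ", text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


plain = "hello"
leading = ZWNBSP + "hello"

# Both texts survive the round trip unchanged.
roundtrip_plain = decode_from_transport(encode_for_transport(plain))
roundtrip_leading = decode_from_transport(encode_for_transport(leading))
```

With this convention a receiver can always drop one leading U+FEFF, whether the sender meant it as a BOM or not, and a text that really begins with ZWNBSP still arrives intact.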

