Re: [whatwg] Getting .innerHTML in XML well-formedness issues

Simon Pieters Tue, 10 Jul 2007 08:10:27 -0700

On Fri, 15 Jun 2007 01:02:49 +0200, Ian Hickson <[EMAIL PROTECTED]> wrote:

On Fri, 27 Oct 2006, Simon Pieters wrote:


The spec says that getting .innerHTML in XML must return a
namespace-well-formed XML representation of the element or document. [1]
But what should happen when the DOM isn't namespace-well-formed and it
can't be fixed by namespace prefix rewriting?

E.g., when the DOM contains any of the following?:

  * A ProcessingInstruction node containing ?>
  * A Comment node containing -- (or ending with -)
  * A CDATASection node containing ]]>
[ * A processing instruction with the target "xml"
    (in any case combination)? ]
[ * Or colons in local names or processing instruction targets? ]


...or a DOCTYPE whose publicId or systemId parts contain both " and '
characters.

I've made the spec say that you raise an exception in those six cases.
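
As a rough illustration of the six checks listed above, here is a minimal Python sketch. The function names and the InvalidStateError class are hypothetical stand-ins (the latter for the DOM's INVALID_STATE_ERR); the sketch works on plain strings rather than real DOM nodes and is not the spec's algorithm.

```python
class InvalidStateError(Exception):
    """Hypothetical stand-in for the DOM INVALID_STATE_ERR exception."""


def can_serialize_pi(target: str, data: str) -> bool:
    # PI data must not contain "?>", and the target must not be "xml"
    # (in any case combination) or contain a colon.
    return "?>" not in data and target.lower() != "xml" and ":" not in target


def can_serialize_comment(data: str) -> bool:
    # Comment data must not contain "--" or end with "-".
    return "--" not in data and not data.endswith("-")


def can_serialize_cdata(data: str) -> bool:
    # CDATA section data must not contain "]]>".
    return "]]>" not in data


def can_serialize_local_name(name: str) -> bool:
    # Local names must not contain a colon.
    return ":" not in name


def can_serialize_doctype_id(value: str) -> bool:
    # A publicId or systemId cannot be quoted if it contains both " and '.
    return not ('"' in value and "'" in value)


def require_serializable(ok: bool, what: str) -> None:
    # Raise the (assumed) exception when one of the checks above fails.
    if not ok:
        raise InvalidStateError(f"cannot serialize {what} as well-formed XML")
```

A getter would call something like require_serializable(can_serialize_comment(data), "comment") for each node before emitting any markup.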

DOM3 Core says that they "must generate a fatal error during
serialization" (or, for the CDATA case, "the cdata section must be
splitted before the serialization"). Does that mean raise a SYNTAX_ERR
exception?


I used INVALID_STATE_ERR, not SYNTAX_ERR (it's the reverse of a syntax
error).

What about when there are illegal characters?


The DOM doesn't let you create those cases.

Sure it does. For example, the DOM allows control characters in various
places where XML doesn't. I haven't looked into every production in XML
to see if it differs from the DOM, but I guess you can spec something
that is catch-all, like "if the node contains a character that isn't
allowed according to the corresponding XML production" or some such...
though listing all cases is nicer.
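
To make the catch-all idea concrete, here is a small Python sketch that tests a string against the XML 1.0 Char production only. The function name is hypothetical, and the narrower productions (Name, PubidChar and so on) would each need their own, stricter check.

```python
import re

# XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The pattern matches any single character OUTSIDE that set.
_NOT_XML_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def first_illegal_char(text: str) -> int:
    """Return the index of the first character XML 1.0 forbids, or -1."""
    match = _NOT_XML_CHAR.search(text)
    return match.start() if match else -1
```

A serializer following the catch-all wording could raise as soon as this returns anything other than -1 for a node's data.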

I'm tempted to allow the serialisation of PIs with the name "xml", and to
allow the splitting of CDATA blocks with ]]>. Opinions?


The former wouldn't result in well-formed XML, but the latter is cool.
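
For the CDATA case, one common way to do the split is to close the section between the "]]" and the ">" and immediately open a new one. A minimal Python sketch, with hypothetical function names; the round-trip helper uses the standard library parser only to check the output:

```python
import xml.etree.ElementTree as ET


def serialize_cdata(data: str) -> str:
    # Break every "]]>" so that "]]" ends one section and ">" starts the next.
    safe = data.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"


def round_trip(data: str) -> str:
    # Wrap the serialized section in a dummy element and parse it back.
    root = ET.fromstring("<r>" + serialize_cdata(data) + "</r>")
    return root.text or ""
```

The parsed text is the same as the original data, but the DOM that results from reparsing has two adjacent CDATASection nodes instead of one.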

--
Simon Pieters
