Michel Fortin wrote:

I've started a wiki page about the common subset:
<http://wiki.whatwg.org/wiki/Common_Subset>

I'd like to explore this from a different angle.

Libraries (like html5lib) will likely provide a means to serialize a DOM, and will presumably have unit tests.

The question is: does it make sense to standardize what such a method produces? HTML allows variations in the case of element names, single vs. double vs. no quoting of attribute values, etc.
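To make those variations concrete, here is a minimal sketch of what a canonicalizing serializer might do. It uses only Python's standard html.parser, not html5lib, and the names Canonicalizer and canonicalize are hypothetical. The choices it makes (lowercase names, double-quoted attribute values, attributes sorted by name) are assumptions for illustration, not a proposed standard. It drops comments and doctypes and does not handle self-closing syntax or raw-text elements such as <script>.

```python
from html import escape
from html.parser import HTMLParser


class Canonicalizer(HTMLParser):
    """Re-emit markup in one fixed spelling (illustrative assumptions only)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        rendered = []
        for name, value in sorted(attrs, key=lambda pair: pair[0]):
            quoted = escape(value or "", quote=True)
            rendered.append(' %s="%s"' % (name, quoted))
        self.out.append("<%s%s>" % (tag, "".join(rendered)))

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Text arrives unescaped, so escape it again on output.
        self.out.append(escape(data, quote=False))


def canonicalize(markup):
    """Parse a fragment and return it in the single canonical spelling."""
    parser = Canonicalizer()
    parser.feed(markup)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


variants = [
    "<P CLASS=note>Hi</P>",
    "<p class='note'>Hi</p>",
    '<p class="note">Hi</p>',
]
results = {canonicalize(v) for v in variants}
```

All three spellings in variants collapse to the same string, <p class="note">Hi</p>, which is the kind of agreement a standardized serialization would let unit tests rely on.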

If it were standardized, how would the HTML5 canonical serialization differ from the XHTML5 canonical serialization? In fact, must they be different at all?

In any case, a desirable feature of such a serialization would be the ability to round trip. For HTML5, this would apply only to valid HTML5 documents. As an example, one could artificially produce a DOM which contains an <h1> inside the <head> element; if such a DOM were serialized and then parsed by an HTML5 parser, the DOM produced would differ, as well it should.
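The round-trip property can be written down as a test: serialize a tree, parse the result, and compare the two trees. The sketch below is only an approximation under stated assumptions. It uses Python's standard xml.etree.ElementTree as a stand-in for the XHTML5 side, and same_tree and round_trips are hypothetical helper names. Because an XML parser performs no tree fix-ups, it would not show the <h1>-inside-<head> difference that an HTML5 parser would.

```python
import xml.etree.ElementTree as ET


def same_tree(a, b):
    """Compare two elements recursively: tag, attributes, text, children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "") != (b.text or "") or (a.tail or "") != (b.tail or ""):
        return False
    if len(a) != len(b):
        return False
    return all(same_tree(x, y) for x, y in zip(a, b))


def round_trips(root):
    """Serialize, reparse, and report whether the tree survived unchanged."""
    serialized = ET.tostring(root, encoding="unicode")
    reparsed = ET.fromstring(serialized)
    return same_tree(root, reparsed)


# Build a small document tree by hand rather than by parsing.
html = ET.Element("html")
head = ET.SubElement(html, "head")
title = ET.SubElement(head, "title")
title.text = "Example"
body = ET.SubElement(html, "body")
para = ET.SubElement(body, "p", {"class": "note"})
para.text = "a < b & c"

ok = round_trips(html)
```

A library's unit tests could run a check of this shape over a corpus of valid documents, with an HTML5 parser and serializer substituted for the XML ones.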

If there is no interest in standardizing a serialization (or separate standard serializations for HTML5 and XHTML5), then this discussion belongs on the [EMAIL PROTECTED] mailing list.

- Sam Ruby
