On Dec 2, 2006, at 14:02, Elliotte Harold wrote:

Lachlan Hunt wrote:

HTML and XML have significantly different parsing requirements and they absolutely must be treated as significantly different file formats. Any attempt to treat them as the same format is an extremely bad idea.

That's only true to the extent that some people seem to insist on making them needlessly different. HTML is tantalizingly close to well-formed XML. They both derive from SGML. They both use angle bracketed tags. They both define a tree structure. Indeed in many cases an HTML document is an XML document.

But the point is that the text/html processing model has to work with the real Web where not all documents are well-formed.
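To make the difference concrete, here is a minimal sketch in Python. It assumes the standard library's `html.parser` as a stand-in for a tolerant text/html tokenizer (it is not the text/html processing model itself) and `xml.etree.ElementTree` as the XML processor. The sample markup and the helper names `TagCollector`, `html_start_tags`, and `is_well_formed_xml` are made up for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample: unclosed <p> and <br>, unquoted attribute value.
TAG_SOUP = "<p class=intro>First<br><p>Second"


class TagCollector(HTMLParser):
    """Records every start tag in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup):
    """Tokenize markup leniently and return the start tags that were seen."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def is_well_formed_xml(markup):
    """Return True only if an XML processor accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True
```

With these helpers, `html_start_tags(TAG_SOUP)` still reports the `p`, `br`, and `p` start tags, while `is_well_formed_xml(TAG_SOUP)` returns `False` because the XML processor stops at the first well-formedness error.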

This enables the use of the very powerful XML toolchain for processing HTML.

You can use the toolchain, except for the XML processor itself, as I have explained before.

What I don't understand is why some members of this working group are so dead set on actively preventing HTML from being XML. The non-draconian error handling I understand. But why are you disappointed that <!DOCTYPE html> is well-formed XML? Why the active hostility to well-formedness?

So that a conformance checker does not accidentally let MIME type mistakes pass silently in some cases.
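A rough illustration of the concern, as a sketch only: it assumes Python's `xml.etree.ElementTree` as the XML processor, and the two sample documents and the helper name `parses_as_xml` are hypothetical.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(markup):
    """Return True if the XML processor accepts the markup without error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Hypothetical document written for text/html that happens to be well-formed.
WELL_FORMED = (
    "<!DOCTYPE html>"
    "<html><head><title>t</title></head><body><p>x</p></body></html>"
)

# Same doctype, but with unclosed elements, as is common on the real Web.
TAG_SOUP = "<!DOCTYPE html><html><head><title>t</title><body><p>x<br></html>"
```

Here `parses_as_xml(WELL_FORMED)` is `True`: the doctype raises no error, so a document sent to the XML side under the wrong MIME type gets no complaint from the XML processor itself. `parses_as_xml(TAG_SOUP)` is `False`, so the mistake is caught only when the document also happens to be ill-formed.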

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

