On Apr 16, 2008, at 10:47, Paul Libbrecht wrote:
> I would like to add a grain of salt here and would love HTML5 enthusiasts to answer:
>
> Why is the whole HTML5 effort not a movement towards a really enhanced parser instead of trying to fully redefine HTML successors?

text/html has immense network effects both from the deployed base of text/html content and the deployed base of software that deals with text/html. Failing to plug into this existing network would be an extremely bad strategy. In fact, the reason why the proportion of Web pages that get parsed as XML is negligible is that the XML approach totally failed to plug into the existing text/html network effects (except for Appendix C of XHTML 1.0, which lacks a migration strategy to actual XML and amounts to the emperor's new clothes).

> Being an enhanced parser (one that would use a lot of context information to be really supportive of hand-authoring), it would define how to better parse an XHTML 3 page, but also MathML and SVG as it does currently... It has the ability to specify very readable encodings of these pages.
>
> It could serve as a model for many other situations where XML parsing is useful but its strictness bites some.

Anne has been working on XML5, but being able to parse any well-formed stream to the same infoset as an XML 1.0 parser and being able to parse existing text/html content in a backwards-compatible way are mutually conflicting requirements. Hence, XML5 parsing won't be suitable for text/html.

> Currently HTML5 defines both the parsing and the model at the same time, and this is what can cause us to expect that XML is getting weaker. I believe that the whole model-definition work of XML is rich, has many libraries, and has empowered a lot of great developments, and that it is a bad idea to drop it instead of enriching it.

The dominant design of non-browser HTML5 parsing libraries is to expose the document tree through an XML parser API. The non-browser HTML5 libraries, therefore, plug into the network of XML libraries. For example, Validator.nu's internals operate on SAX events that look like the SAX events for an XHTML5 document. This allows Validator.nu to use libraries written for XML, such as oNVDL and Saxon.
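
To make the shape of that design concrete, here is a minimal sketch in Python. It is not Validator.nu's code, and the names SaxBridge, ElementCounter and count_elements are made up for this illustration. It assumes the standard library's html.parser as a stand-in tokenizer, which is not an HTML5 parser: it performs no tree construction, so unlike a real HTML5 parser it does not infer missing end tags and the resulting event stream is not guaranteed to be balanced. The point is only that the consumer is written against the ordinary SAX ContentHandler interface and does not need to know that the input was text/html.

```python
from html.parser import HTMLParser
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


class SaxBridge(HTMLParser):
    """Hypothetical adapter: forwards tag-soup parse events to a SAX handler."""

    def __init__(self, handler):
        super().__init__(convert_charrefs=True)
        self.handler = handler

    def handle_starttag(self, tag, attrs):
        # html.parser reports valueless attributes as None; SAX expects strings.
        sax_attrs = AttributesImpl({k: (v if v is not None else "") for k, v in attrs})
        self.handler.startElement(tag, sax_attrs)

    def handle_endtag(self, tag):
        self.handler.endElement(tag)

    def handle_data(self, data):
        self.handler.characters(data)


class ElementCounter(ContentHandler):
    """An ordinary SAX consumer; nothing here is specific to HTML."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(html_text):
    """Parse html_text and return a dict mapping element name to occurrences."""
    consumer = ElementCounter()
    consumer.startDocument()
    bridge = SaxBridge(consumer)
    bridge.feed(html_text)
    bridge.close()  # flush any buffered character data
    consumer.endDocument()
    return consumer.counts
```

Any other ContentHandler could be passed to SaxBridge in place of ElementCounter, which is the sense in which the HTML side plugs into existing XML tooling.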

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

