On Mon, 17 Jan 2011 01:45:43 -0000, Miller Medeiros <[email protected]> wrote:

I still believe that this analogy fits well. XML is stricter than HTML and has simpler rules (all tags open and close in a sane order), and because of that it is easier to parse.

A little off-topic: I've been implementing my own HTML and XML parsers, and I don't agree that XML is easier to parse.

The seemingly magic rules for optional tags in HTML are actually very simple to implement, and you can hardcode them instead of using a real DTD.

Handling empty elements comes down to looking up the tag name in a fixed list, versus two extra states in an XML parser, so the complexity is not very different. Optionally closed tags are a piece of cake to implement too (basically you implement part of XML error handling, except the line that stops the parser).
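To make that concrete, here is a rough sketch of the idea, not my actual parser. The names `VOID_ELEMENTS`, `CLOSED_BY` and `build_events` are made up for this illustration, and both tables are heavily abridged assumptions rather than the full HTML rules. It only looks at tags (no text, attributes or comments) and emits open/close events with the implied end tags made explicit.

```python
import re

# Empty elements: a fixed lookup list (abridged for illustration).
VOID_ELEMENTS = frozenset(
    {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}
)

# Hardcoded optional-end-tag rules (abridged, hypothetical table):
# the open element on the left is implicitly closed by any start tag on the right.
CLOSED_BY = {
    "p": {"p", "ul", "ol", "li", "div", "table", "h1", "h2"},
    "li": {"li"},
    "dt": {"dt", "dd"},
    "dd": {"dt", "dd"},
    "td": {"td", "th", "tr"},
    "th": {"td", "th", "tr"},
    "tr": {"tr"},
    "option": {"option"},
}

# Matches start and end tags only; comments and doctypes are skipped.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")


def build_events(html):
    """Return ("open" | "close", name) events with implied end tags added."""
    events = []
    stack = []  # names of currently open elements
    for match in TAG.finditer(html):
        closing, name = match.group(1), match.group(2).lower()
        if closing:
            # Same recovery an XML parser would need for mismatched tags,
            # minus the line that aborts: close everything up to the match.
            if name in stack:
                while stack:
                    top = stack.pop()
                    events.append(("close", top))
                    if top == name:
                        break
            continue  # a stray end tag is simply ignored
        # A start tag may imply the end of one or more open elements.
        while stack and name in CLOSED_BY.get(stack[-1], ()):
            events.append(("close", stack.pop()))
        events.append(("open", name))
        if name in VOID_ELEMENTS:
            events.append(("close", name))  # never pushed on the stack
        else:
            stack.append(name)
    # End of input closes whatever is still open.
    while stack:
        events.append(("close", stack.pop()))
    return events
```

For example, `build_events("<ul><li>one<br><li>two</ul>")` closes the first `li` when the second one starts, and treats `br` as opened and closed in one step. A real parser needs scoping rules on top of this, but the core is one set lookup and one table lookup.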

XML has huge additional complexity. Before you even start, you need to write an SGML DTD parser and fetch half a dozen files in order to be able to parse a typical XHTML file. The syntax is further complicated by arbitrarily nested entities that can contain markup, and by namespace indirection. Even XML's strict error handling is not helpful, because it means extra code paths and strict behaviors you have to add to the parser.
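As a small illustration of the entity point, here is a minimal sketch of just the expansion step, assuming the DTD has already been parsed into a plain dictionary. The function name `expand_entities` and the sample entities are hypothetical, and the sketch ignores predefined entities such as `&amp;`, character references, and external entities, all of which a conforming parser also has to handle.

```python
import re

# Matches general entity references such as &name; (not &#60; style references).
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.-]*);")


def expand_entities(text, entities, active=()):
    """Recursively expand entity references using a name -> replacement dict.

    `active` tracks the entities currently being expanded so that a
    self-referencing definition is rejected instead of looping forever.
    """

    def replace(match):
        name = match.group(1)
        if name in active:
            raise ValueError(f"recursive entity reference: &{name};")
        if name not in entities:
            raise ValueError(f"undefined entity: &{name};")
        # The replacement text may itself contain markup and more references.
        return expand_entities(entities[name], entities, active + (name,))

    return ENTITY_REF.sub(replace, text)


# Hypothetical entity table, as it might come out of a parsed DTD.
entities = {"inner": "<b>hi</b>", "outer": "<i>&inner;</i>"}
expanded = expand_entities("<r>&outer;</r>", entities)
```

Here `expanded` ends up as `<r><i><b>hi</b></i></r>`, so markup appears that was not in the document text at all, and both error branches are strict behaviors an HTML parser never needs.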

--
regards, porneL
