On Mon, 17 Jan 2011 01:45:43 -0000, Miller Medeiros
<[email protected]> wrote:
> I still believe that this analogy fits well.. XML is stricter than HTML
> and has simpler rules (all tags open and close in a sane order) and
> because of that is easier to parse..

A little off-topic: I've been implementing my own HTML and XML parsers,
and I don't agree that XML is easier to parse.

The seemingly magic rules for optional tags in HTML are actually very
simple to implement, and you can hardcode them instead of using a real DTD.
Handling empty elements is a matter of looking up the tag name in a fixed
list, versus two extra states in an XML parser, so the complexity is not
very different. Optionally closed tags are a piece of cake to implement too
(basically you implement part of XML's error handling, except for the line
that stops the parser).
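
To make the point about hardcoded rules concrete, here is a minimal Python sketch of a tree builder. It is not a real HTML parser: the names `VOID_ELEMENTS`, `IMPLIED_END` and `parse` are made up for illustration, the two tables are small subsets of the real HTML lists, and attributes, comments and entities are ignored.

```python
import re

# Hardcoded tables standing in for a real DTD (illustrative subsets only).
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}

# Opening the tag on the left implicitly closes an open element on the right.
IMPLIED_END = {
    "li": {"li"},
    "p": {"p"},
    "td": {"td", "th"},
    "th": {"td", "th"},
    "tr": {"tr", "td", "th"},
}

# Matches either a tag (optional slash, name, ignored attributes) or a text run.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)")


def parse(html):
    """Build a tree of (name, children) tuples; text nodes are plain strings."""
    root = ("#root", [])
    stack = [root]
    for match in TOKEN.finditer(html):
        closing, name, text = match.group(1), match.group(2), match.group(3)
        if text is not None:
            if text.strip():
                stack[-1][1].append(text.strip())
            continue
        name = name.lower()
        if closing:
            # Pop back to the matching open element. A stray end tag is
            # silently ignored; a strict XML parser would stop here instead.
            for i in range(len(stack) - 1, 0, -1):
                if stack[i][0] == name:
                    del stack[i:]
                    break
            continue
        # Optional end tags: close whatever this start tag implicitly ends.
        while len(stack) > 1 and stack[-1][0] in IMPLIED_END.get(name, ()):
            stack.pop()
        node = (name, [])
        stack[-1][1].append(node)
        # Empty elements: a single lookup in a fixed list, never pushed.
        if name not in VOID_ELEMENTS:
            stack.append(node)
    return root
```

Calling `parse("<ul><li>one<li>two<br>three</ul>")` yields two `li` children under `ul`, with `br` as a childless node inside the second one. The implied-end check only looks at the innermost open element, which is a simplification of what a browser does.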

XML has huge additional complexity. Before you even start, you need to
write an SGML-style DTD parser and fetch half a dozen files in order to be
able to parse a typical XHTML file. The syntax is further complicated by
entities that can nest to arbitrary depth and contain markup, and by
namespace indirection. Even XML's strict error handling doesn't help,
because it means extra code paths and strict behaviors you have to add to
the parser.
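
As a rough illustration of the nested-entity problem, the Python sketch below expands general entity references whose replacement text itself contains markup and further references. The names `ENTITY_REF` and `expand` are hypothetical, the `entities` dict stands in for declarations a DTD parser would have collected, and predefined entities, character references and parameter entities are all left out.

```python
import re

# Simplified pattern for a general entity reference such as &name;
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.-]*);")


def expand(text, entities, active=()):
    """Recursively replace entity references in text.

    `active` holds the entities currently being expanded, so a reference
    back to any of them is reported as an error instead of looping forever.
    """
    def replace(match):
        name = match.group(1)
        if name in active:
            raise ValueError(f"recursive entity reference: &{name};")
        if name not in entities:
            raise ValueError(f"undeclared entity: &{name};")
        # The replacement text may contain markup and more references,
        # so it has to be scanned again before it is spliced in.
        return expand(entities[name], entities, active + (name,))

    return ENTITY_REF.sub(replace, text)
```

With `entities = {"greeting": "<p>Hello, &who;!</p>", "who": "<b>&name;</b>", "name": "world"}`, expanding `"<doc>&greeting;</doc>"` takes three levels of substitution before the markup can even be tokenized. That rescanning and the two error paths are work an HTML parser with a fixed entity table never needs.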
--
regards, porneL