On 17 Jan 2008, at 01:44, Kevin Burton wrote:

Specifically, the probability that a naive non-XML parser can make
while indexing the content.

I'm not sure what you mean here, but I'd recommend against using an
XML parser on web content; use something like the
HTML5 parsing algorithm [#html5-parsing] instead.

Yes... I'm just trying to avoid using a full HTML parser (DOM or not)
to avoid garbage generation and processor overhead.

However, I think I'm losing that battle.

Once you start dealing with the joy of DOCTYPEs and the like, it becomes rather questionable whether XML parsers really are much simpler than HTML ones.
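
A minimal sketch of that trade-off, assuming Python and only its standard library: a strict XML parser rejects ordinary tag soup outright, while an event-based HTML tokenizer reads the same input without building a DOM. Note that html.parser is a tolerant streaming parser, not an implementation of the HTML5 parsing algorithm, and the names TextCollector, xml_parse_ok and stream_html are made up for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Typical web markup: unquoted attribute, void <br>, unclosed <p>.
SAMPLE = '<p class=hcard>Hello <br>world &amp; friends'


class TextCollector(HTMLParser):
    """Event-based collector: records tag names and text, builds no tree."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tags = []
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        # Text may arrive in several pieces; the caller joins them.
        self.chunks.append(data)


def xml_parse_ok(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def stream_html(markup):
    """Feed markup through the tolerant parser; return (tags, text)."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return parser.tags, "".join(parser.chunks)
```

Here xml_parse_ok(SAMPLE) fails on the unquoted attribute, whereas stream_html(SAMPLE) still yields the tags and text. Whether the callback approach actually saves enough garbage and CPU to matter would need measuring on real content.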


--
Geoffrey Sneddon
