On Nov 3, 2009, at 12:06 AM, Guido van Rossum wrote:
> Though I imagine what
> it really needs is a "quirks mode" parser that is compatible with the
> HTML dialect accepted by, say, IE6. Maybe a summer of code project?
Already exists: html5lib.
http://code.google.com/p/html5lib/
Or, if you want a faster (though I think less exact) HTML parser,
there is libxml2's HTML parser, available via lxml:
http://codespeak.net/lxml/parsing.html#parsing-html
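
For illustration, here is a rough sketch of feeding the same piece of
malformed markup to both parsers. The sample string and the helper
function names are made up for this example, and each helper simply
returns None if the corresponding third-party package is not installed.
Treat it as a sketch under those assumptions, not as the recommended
usage of either library.

```python
# Hypothetical sample: unclosed <p> and <b> tags, as often seen in the wild.
BROKEN = "<p>Unclosed paragraph<p>Second <b>bold text"


def paragraphs_html5lib(markup):
    """Return the text of each <p> as parsed by html5lib, or None if missing."""
    try:
        import html5lib
    except ImportError:
        return None
    # namespaceHTMLElements=False keeps tag names plain ("p" rather than
    # "{http://www.w3.org/1999/xhtml}p") in the resulting ElementTree.
    root = html5lib.parse(markup, treebuilder="etree",
                          namespaceHTMLElements=False)
    return ["".join(p.itertext()) for p in root.iter("p")]


def paragraphs_lxml(markup):
    """Return the text of each <p> as parsed by libxml2 via lxml, or None."""
    try:
        from lxml import etree
    except ImportError:
        return None
    # etree.HTML() uses libxml2's error-tolerant HTML parser.
    root = etree.HTML(markup)
    return ["".join(p.itertext()) for p in root.iter("p")]


if __name__ == "__main__":
    print("html5lib:", paragraphs_html5lib(BROKEN))
    print("lxml:    ", paragraphs_lxml(BROKEN))
```

Both parsers recover a tree from the broken input; the difference is
that html5lib follows the HTML5 parsing algorithm that browsers
implement, while libxml2 applies its own recovery rules, so the two
trees can differ on messier input than this.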
James
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev