On Nov 3, 2009, at 12:06 AM, Guido van Rossum wrote:
Though I imagine what
it really needs is a "quirks mode" parser that is compatible with the
HTML dialect accepted by, say, IE6. Maybe a summer of code project?

Already exists: html5lib.
http://code.google.com/p/html5lib/

Or if you want a faster (yet I think less exact) HTML parser, libxml2's HTML parser, via lxml:
http://codespeak.net/lxml/parsing.html#parsing-html

James
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to