On Mon, 2 Nov 2009 at 22:06, Guido van Rossum wrote:
On Mon, Nov 2, 2009 at 9:51 PM, sstein...@gmail.com <sstein...@gmail.com> wrote:
BeautifulSoup, which I use every day, is one such product. Since the crappy
old SGML parser's gone, BeautifulSoup uses the one that's left in Python 3
and it makes BeautifulSoup completely useless for my daily work.

This sounds like an area where some help might be useful. Perhaps the
quickest solution would simply be to copy the old crappy "sgml"-based
HTML parser into a new version of BeautifulSoup. Though I imagine what
it really needs is a "quirks mode" parser that is compatible with the
HTML dialect accepted by, say, IE6. Maybe a summer of code project?

It's not a matter of quirks.  It's a matter of being able to parse
truly broken HTML/XML, which browsers unfortunately do too well
for everyone else's sanity.

So, call it a "sloppy mode" parser, and then yes, that would solve the
problem.

--David (RDM)
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev