Ivan Krstić wrote:
> On Mar 4, 2009, at 12:32 PM, James Y Knight wrote:
>> I think html5lib would be a better candidate for an improved HTML
>> parser in the stdlib than BeautifulSoup.
>
> While we're talking about alternatives, Ian Bicking appears to swear by
> lxml:
>
> <http://blog.ianbicking.org/2008/12/10/lxml-an-underappreciated-web-scraping-library/>

I second that. ;)

And, BTW, I wouldn't mind getting lxml into the stdlib either.

Stefan

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev