[Python-Dev] Integrate BeautifulSoup into stdlib?

Vaibhav Mallya Mon, 02 Mar 2009 06:12:39 -0800

I haven't seen a lot of discussion on this - maybe I didn't search hard enough - but what are people's thoughts on including BeautifulSoup in stdlib? It's small, fast, and pretty widely liked by the people who know about it. Someone mentioned that web scraping needs are infrequent. My argument is that people ask questions about them less because they feel they can just reinvent the wheel really easily using urllib and regexes. It seems like this is similar to the CSV problem from a while back, actually, with everyone implementing their own parsers.

We do have HTMLParser, but that doesn't handle malformed pages well, and just isn't as nice as BeautifulSoup.
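
To make the "urllib and regexes" point concrete, here is a rough sketch of the kind of hand-rolled link extraction people tend to write, next to a stdlib parser version. It is only an illustration under some assumptions: it uses Python 3's html.parser module (the current home of HTMLParser), and SAMPLE, links_by_regex, LinkCollector and links_by_parser are names made up for this example, not anything from BeautifulSoup.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample markup: mixed-case tags, a single-quoted attribute,
# an unquoted attribute, and an unclosed <p>.
SAMPLE = '<p>See <a href="/docs">docs</a> and <A HREF=\'/faq\' class=x>FAQ</A>'


def links_by_regex(html):
    # Naive pattern: only matches a lowercase <a> tag whose first
    # attribute is a double-quoted href.
    return re.findall(r'<a href="([^"]*)"', html)


class LinkCollector(HTMLParser):
    """Collects href values from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser passes tag and attribute names already lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def links_by_parser(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


print(links_by_regex(SAMPLE))
print(links_by_parser(SAMPLE))
```

The regex version silently misses the second link because of the uppercase tag and single quotes, which is the sort of thing every home-grown scraper ends up rediscovering.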

In a not-entirely-unrelated vein, has there been any discussion on just throwing all of Mechanize into stdlib?


BeautifulSoup: http://www.crummy.com/software/BeautifulSoup/
mechanize: http://wwwsearch.sourceforge.net/mechanize/

Regards,
Vaibhav Mallya
