tags 434094 + upstream moreinfo
thanks

> The data processed should be http://www.cguelich.de/

Hi.

I've taken over this package and am cleaning up its bugs.

I've tried all the URLs listed on this page with the attached test script, and none of them trigger this bug. I'm assuming it's due to incorrect encoding detection, but it would help to have a test case.

I do see HTMLParseErrors, but those are due to the poor quality of HTMLParser, which BeautifulSoup 3.1 uses. Versions 3.0 and 3.2 don't have these issues.

It looks like other people have run into similar issues:
http://groups.google.com/group/beautifulsoup/search?q=concatenate+NoneType

SR

-- 
Stefano Rivera
  http://tumbleweed.org.za/
  H: +27 21 465 6908 C: +27 72 419 8559  UCT: x3127

#!/usr/bin/env python

import traceback
import urllib2

import BeautifulSoup

for url in ('http://www.vupp.cz/czvupp/',
            'http://www.cguelich.de/',
            'http://www.singular-tech.com/',
            'http://www.presse-citron.net/',
            'http://blogs.bnet.com/business-books/?p=327',
            'http://peaceclub.de/2007/11/26/z-grabstein/'):
    print url
    try:
        data = urllib2.urlopen(url).read()
        BeautifulSoup.BeautifulSoup(data)
    except Exception, e:
        traceback.print_exc()
        continue