Christopher Reimer via Python-list wrote:

> Greetings,
>
> I have a Python 3.6 script on Windows to scrape comment history from a
> website. It's currently set up this way:
>
> Requestor (threads) -> list -> Parser (threads) -> queue -> CVSWriter
> (single thread)
>
> It takes 15 minutes to process ~11,000 comments.
>
> When I replaced the list with a queue between the Requestor and Parser
> to speed things up, BeautifulSoup stopped working.
>
> When I changed BeautifulSoup(contents, "lxml") to
> BeautifulSoup(contents), I get the UserWarning that no parser was
> explicitly set and a reference to line 80 in threading.py (which puts it
> in the RLock factory function).
>
> When I switched back to using a list between the Requestor and Parser,
> the Parser worked again.
>
> BeautifulSoup doesn't work with a threaded input queue?

The documentation

https://www.crummy.com/software/BeautifulSoup/bs4/doc/#making-the-soup

says you can make the BeautifulSoup object from a string or file.
Can you give a few more details about where the queue comes into play?
A small code sample would be ideal...
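
For comparison, here is a minimal sketch of the kind of sample that would help, assuming the Parser threads pull HTML strings off a queue.Queue. This is a guess at the setup, not your actual code: the names parser_worker, run and SENTINEL, and the "comment" CSS class, are all made up for illustration. It uses the stdlib "html.parser" backend instead of "lxml" to avoid an extra dependency. The point it illustrates is that BeautifulSoup is handed the string returned by get(), not the queue object itself.

```python
import queue
import threading

from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

SENTINEL = None  # hypothetical end-of-work marker, not part of any library


def parser_worker(pages, results):
    """Pull HTML strings off `pages`, parse them, push comment text to `results`."""
    while True:
        contents = pages.get()
        if contents is SENTINEL:
            # Put the marker back so the other parser threads also stop.
            pages.put(SENTINEL)
            break
        # `contents` must be a str, bytes, or file object; passing the
        # Queue itself (or some wrapper around the page) will not parse.
        soup = BeautifulSoup(contents, "html.parser")
        # The "comment" class is an assumption made for this sketch.
        for tag in soup.find_all("p", class_="comment"):
            results.put(tag.get_text(strip=True))


def run(html_pages, n_parsers=2):
    """Feed pages through a queue to parser threads; return the sorted comments."""
    pages = queue.Queue()
    results = queue.Queue()
    workers = [
        threading.Thread(target=parser_worker, args=(pages, results))
        for _ in range(n_parsers)
    ]
    for worker in workers:
        worker.start()
    for html in html_pages:
        pages.put(html)
    pages.put(SENTINEL)
    for worker in workers:
        worker.join()
    # All producers have finished, so draining with empty() is safe here.
    comments = []
    while not results.empty():
        comments.append(results.get())
    return sorted(comments)
```

If something along these lines works for you but your real script does not, the difference between the two is probably where the problem is.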