Gilles Ganault wrote:
Hello

        I recently asked how to pull company IDs from an SQLite database,
have multiple instances of a Python script download each company's web
page from a remote server, e.g. www.acme.com/company.php?id=1, and use
regexes to extract some information from each page.

I need to run multiple instances to save time, since each page takes
about 10 seconds to be returned to the script/browser.

Since I've never written a multi-threaded Python script before, to
save time investigating, I was wondering if someone already has a
script that downloads web pages and saves some information into a
database.

Thank you for any tips.

You could put the URLs into a queue and have multiple worker threads
repeatedly get a URL from the queue, download the page, and then put the
page onto a second queue for a separate extraction thread to process.
This post might help:
This post might help:

http://mail.python.org/pipermail/python-list/2009-September/195866.html
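To make the pattern concrete, here is a minimal sketch of the
queue-plus-workers idea using the standard library's queue and threading
modules. The function name run_workers and the fetch parameter are my own
invention; in practice fetch would wrap urllib.request.urlopen, and the
regex extraction would consume the results queue. It is passed in as a
callable here so the pool logic itself stays testable without network
access.

```python
import queue
import threading

def run_workers(urls, fetch, num_workers=4):
    """Process each URL with a pool of worker threads.

    `fetch` is any callable taking a URL and returning the page text
    (e.g. a wrapper around urllib.request.urlopen).  Returns a dict
    mapping each URL to its fetched page.
    """
    url_queue = queue.Queue()    # work to do
    results = queue.Queue()      # downloaded pages, for the extraction step

    def worker():
        while True:
            url = url_queue.get()
            if url is None:              # sentinel: no more work for this thread
                url_queue.task_done()
                break
            results.put((url, fetch(url)))
            url_queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    for url in urls:
        url_queue.put(url)
    for _ in threads:
        url_queue.put(None)              # one sentinel per worker

    for t in threads:
        t.join()

    # Drain the results queue; a real script would instead run regexes
    # over each page here and write the matches to the database.
    pages = {}
    while not results.empty():
        url, page = results.get()
        pages[url] = page
    return pages
```

With roughly 10 seconds per page, a handful of workers like this should
cut the total wall-clock time close to proportionally, since the threads
spend almost all their time waiting on the network rather than holding
the GIL.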

--
http://mail.python.org/mailman/listinfo/python-list
