I want an execute button that starts a multi-threaded background
process to scrape data from 100 web pages. While it is scraping, I
want an "execute" web page that shows a running total of the number of
pages scraped and confirms when the process has completed.

Using threads directly in the execute controller does not work. If you
join the threads, the controller blocks until execution is complete,
and the execute view page only gets shown at the end of execution. If
you don't join the threads, web2py kills them all when it reaches the
return at the end of the controller.
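
To illustrate the blocking case, here is a minimal plain-Python sketch
that runs outside web2py. The names fake_scrape and controller_like are
hypothetical, and the sleep only stands in for a page download. The
function cannot return until every thread has been joined, which is why
the view is only rendered at the end.

```python
import threading
import time


def fake_scrape(page_number, results):
    """Hypothetical stand-in for scraping one page."""
    time.sleep(0.05)  # pretend to download the page
    results.append(page_number)  # list.append is thread-safe in CPython


def controller_like(page_count=5):
    """Mimics a controller that starts threads and then joins them."""
    results = []
    threads = [
        threading.Thread(target=fake_scrape, args=(n, results))
        for n in range(page_count)
    ]
    started = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # the caller is blocked here until each thread finishes
    elapsed = time.time() - started
    # Only at this point could a controller return and the view be rendered.
    return sorted(results), elapsed
```

Calling controller_like(5) returns all five page numbers together with
the time the caller spent waiting on the joins.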

The manual and a previous forum discussion suggest running a separate
web2py instance, i.e.

    subprocess.Popen("c:/python27/python %s/web2py.py -S scraper -M -R %s"
                     % (os.getcwd(),
                        os.getcwd() + "/applications/scraper/modules/execute.py -A "
                        + str(session.scraperid) + " " + str(processid)),
                     shell=True)

However, this requires a separate web2py process. Also, the subprocess
has no access to session variables, so you need to use the database to
communicate between the two. This means a database read every time the
web page polls for the latest status.
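
As a rough sketch of what "use the database to communicate" means here,
the following uses the standard-library sqlite3 module as a stand-in
for the web2py database layer, so it is not web2py code. The table name
scrape_status and the functions init_status, record_page and
read_status are all hypothetical.

```python
import sqlite3
from contextlib import closing


def init_status(db_path, process_id):
    """Create the status table and a fresh row for this run."""
    with closing(sqlite3.connect(db_path)) as conn, conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS scrape_status ("
            "process_id TEXT PRIMARY KEY, pages_done INTEGER, finished INTEGER)"
        )
        conn.execute(
            "INSERT OR REPLACE INTO scrape_status VALUES (?, 0, 0)",
            (process_id,),
        )


def record_page(db_path, process_id, total_pages):
    """Called by the background process after each page is scraped."""
    with closing(sqlite3.connect(db_path)) as conn, conn:
        # The SET expressions see the old pages_done value, hence the + 1.
        conn.execute(
            "UPDATE scrape_status SET pages_done = pages_done + 1, "
            "finished = (pages_done + 1 >= ?) WHERE process_id = ?",
            (total_pages, process_id),
        )


def read_status(db_path, process_id):
    """Called on each poll from the web page: one database read per poll."""
    with closing(sqlite3.connect(db_path)) as conn:
        row = conn.execute(
            "SELECT pages_done, finished FROM scrape_status "
            "WHERE process_id = ?",
            (process_id,),
        ).fetchone()
    return {"pages_done": row[0], "finished": bool(row[1])}
```

The background process would call record_page after each page, and the
polling request would call read_status, which is the per-poll read
mentioned above.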

An alternative would be an Ajax call from the execute view to a
controller that starts the threads and joins them. This would keep it
all within the same web2py instance and retain access to session
variables. It requires less code and is simpler.
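
A minimal sketch of the kind of function such a controller might call,
in plain Python with no web2py specifics. Progress, scrape_all, fetch
and worker_count are hypothetical names, and fetch stands for whatever
scrapes a single page. It is written for Python 3 (in Python 2.7 the
queue module is named Queue).

```python
import threading
from queue import Empty, Queue


class Progress:
    """Thread-safe running total of pages scraped."""

    def __init__(self):
        self._lock = threading.Lock()
        self.pages_done = 0

    def increment(self):
        with self._lock:
            self.pages_done += 1


def scrape_all(urls, fetch, worker_count=4):
    """Start worker threads, join them, and return the final progress."""
    progress = Progress()
    pending = Queue()
    for url in urls:
        pending.put(url)

    def worker():
        while True:
            try:
                url = pending.get_nowait()
            except Empty:
                return  # no pages left for this worker
            fetch(url)
            progress.increment()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # the calling request is held open until all pages are done
    return progress
```

The request that calls scrape_all stays open until the last join
returns, so the running total would have to be read by some other
request while this one is still in progress.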

Are there any downsides to the second alternative?
