On 10/03/2018 12:33 PM, paul sorenson wrote:
Mike,

Are there unique features of joblib that you need to use?

I was seduced by "Parallel". On reading the docs a little more
diligently it seems well suited to parallel computation with heavy
compute-bound stuff like scientific number crunching, and to disk
caching of results to prevent re-computing.

Scraping web pages is often a good candidate for asyncio-based models.

I think I'm being seduced by the "io" in the name. I do judge books by
their cover, so I think I'll read up on asyncio.
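
As a rough illustration of the asyncio model mentioned above, here is a
minimal sketch. It assumes Python 3.7 or later for asyncio.run(). The
database names and the scrape_db coroutine are hypothetical stand-ins,
and asyncio.sleep() takes the place of a real HTTP request, so this
shows only the concurrency pattern and not a working scraper.

```python
import asyncio

# Hypothetical stand-ins for the real list of database URLs.
databases = ["db-a", "db-b", "db-c"]


async def scrape_db(db):
    # Stand-in for network I/O; a real scraper would await an
    # asynchronous HTTP client call here instead of sleeping.
    await asyncio.sleep(0.1)
    return f"scraped {db}"


async def make_links():
    # gather() schedules all the coroutines concurrently and returns
    # their results in the same order as the inputs.
    return await asyncio.gather(*(scrape_db(db) for db in databases))


results = asyncio.run(make_links())
print(results)
```

The three simulated fetches overlap while waiting, so the total time is
roughly one sleep rather than three. All of this happens on one thread
and one core, which suits work that mostly waits on the network.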
Thanks Paul
Mike
cheers
On 03/08/2018 11:41 PM, Mike Dewhirst wrote:
https://media.readthedocs.org/pdf/joblib/latest/joblib.pdf
I'm trying to make the following code run in parallel on separate CPU
cores but haven't had any success.
    def make_links(self):
        for db in databases:
            link = create_useful_link(self, Link, db)
            if link:
                scrape_db(self, link, db)
This is a web scraper which works nicely in a leisurely sequential
manner. databases is a list of URLs with gaps to be filled by
create_useful_link(), which makes a link record from the Link class.
The self instance is a source of attributes for filling the URL gaps.
self is a chemical substance, and the link record's url field, when
clicked in a browser, brings up that external website with the
chemical substance selected for the viewer to research.
If link creation succeeds, we then fetch the external page, scrape a
bunch of interesting data from it and turn that into substance notes.
scrape_db() doesn't return anything but it does create up to nine
other records.
    from joblib import Parallel, delayed

    class Substance( etc ..
        ...
        def make_links(self):
            #Parallel(n_jobs=-2)(delayed(
            #    scrape_db(self, create_useful_link(self, Link, db), db)
                for db in databases
            #))
I'm getting a TypeError from Parallel delayed(): "can't pickle
generator objects".
So my question is how to write the commented code properly? I suspect
I haven't done enough comprehension.
Thanks for any help
Mike
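
One way the commented code above might be written is sketched below. It
assumes the TypeError arises because the generator expression sits
inside delayed() instead of being the argument to Parallel(...)().
delayed() wraps the function itself, and the call arguments go in a
second pair of parentheses. The URL templates, the bodies of
create_useful_link() and scrape_db(), and the helper name
link_and_scrape are hypothetical stand-ins, not the real code.
prefer="threads" (joblib 0.12 or later) is a choice made for this
sketch because scraping mostly waits on the network and threads avoid
pickling self; removing it falls back to joblib's default
process-based backend, which is what running on separate cores needs.

```python
from joblib import Parallel, delayed

# Hypothetical URL templates; the empty one simulates an unusable entry.
databases = [
    "https://example.org/a?q={name}",
    "",
    "https://example.org/b?q={name}",
]


class Link:
    """Hypothetical placeholder for the real Link class."""


def create_useful_link(substance, link_class, db):
    # Stand-in: fill the gap in the URL, or return None if there is no URL.
    return db.format(name=substance.name) if db else None


def scrape_db(substance, link, db):
    # Stand-in: the real function returns nothing and creates records.
    # Returning the link here just makes the result visible.
    return link


class Substance:
    def __init__(self, name):
        self.name = name

    def link_and_scrape(self, db):
        # One unit of work per database: build the link, then scrape it.
        link = create_useful_link(self, Link, db)
        if link:
            return scrape_db(self, link, db)
        return None

    def make_links(self):
        # The generator expression is passed to Parallel(...)(), and each
        # item it yields is delayed(function)(arguments).
        return Parallel(n_jobs=-2, prefer="threads")(
            delayed(self.link_and_scrape)(db) for db in databases
        )


results = Substance("benzene").make_links()
print(results)
```

Moving the "if link" test into a helper keeps each parallel task
self-contained, so a database that yields no link is skipped inside its
own task without affecting the others.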
_______________________________________________
melbourne-pug mailing list
melbourne-pug@python.org
https://mail.python.org/mailman/listinfo/melbourne-pug