Alexander Harrowell wrote:
I'm having to do this in order to fix problems with Viktorfeed and it's a pain in the arse. (Wikipedia has a table, but I'm making slow progress because some names have markup spooged into cells where there should only be data.)

Also, Wikipedia seems to have some sort of very unopen whine about downloading data with things that are not web browsers. I can of course use Python's webbrowser module, but *really*.

Really? I've never had any problems, and I've done some scraping in the past week or so, waiting 3 seconds between requests. Try changing your user agent to something other than 'libwww', in case they're blocking that.
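
For what it's worth, here is a minimal sketch of that approach in Python, assuming the standard urllib.request module. The user agent string and the function names are made up for illustration, so substitute something that describes your own script. The opener and sleep arguments are only there so the loop can be tried without touching the network.

```python
import time
import urllib.request

# Hypothetical identifier (an assumption, not a required format): replace it
# with something that describes your script and says how to reach you.
USER_AGENT = "ExampleFetcher/0.1 (+http://example.org/contact)"
DELAY_SECONDS = 3  # the pause between requests mentioned above


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that sends an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_pages(urls, delay=DELAY_SECONDS,
                opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch each URL in turn, pausing `delay` seconds between requests."""
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay)  # wait between requests, not before the first one
        with opener(build_request(url)) as response:
            pages.append(response.read())
    return pages
```

Calling fetch_pages with a short list of page URLs returns the raw bytes of each page in order.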

Paul

P.S. Yes, I know you can download the data in bulk as well, but for a tiny subset it's less work to fetch a few pages and parse them.
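
On the parsing side, a rough sketch of pulling plain text out of table cells that have markup mixed in, assuming Python's standard html.parser module. The class and function names are hypothetical, and it only drops the tags: footnote markers and similar text inside a cell are kept, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Collect the plain text of each table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
        # Any other tag (links, bold, spans) is ignored; only its text is kept.

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            if self._cell is not None and self._row is not None:
                # Join the fragments and collapse runs of whitespace.
                self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def table_rows(html):
    """Return the table rows found in `html` as lists of cell text."""
    parser = TableTextParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Feeding it the HTML of a fetched page gives a list of rows, each a list of strings, which is easier to clean up further than the raw markup.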

--
Paul Waring
http://www.pwaring.com

_______________________________________________
Mailing list [email protected]
Archive, settings, or unsubscribe:
https://secure.mysociety.org/admin/lists/mailman/listinfo/developers-public
