Alexander Harrowell wrote:
> I'm having to do this in order to fix problems with Viktorfeed and it's a pain
> in the arse. (Wikipedia has a table, but I'm making slow progress because some
> names have markup spooged into cells where there should only be data.)
> Also, Wikipedia seems to have some sort of very unopen whine about downloading
> data with things that are not web browsers. I can of course use Python's
> webbrowser module, but *really*.
Really? I've never had any problems, and I've done some scraping in the
past week or so, waiting 3 seconds between each request. Try changing
your user agent to something other than 'libwww', in case they're
blocking that.
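
For illustration, a minimal sketch of that approach in Python, assuming the
standard-library urllib.request module. The user agent string and the function
names are made up for this example, and the 3-second delay is simply the figure
mentioned above, not a documented limit.

```python
import time
import urllib.request

# Hypothetical identifier: use something that names your script and a contact.
USER_AGENT = "example-scraper/0.1 (contact: you@example.org)"
DELAY_SECONDS = 3  # pause between requests, as described above


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url):
    """Download one page and return its body as bytes."""
    with urllib.request.urlopen(build_request(url)) as response:
        return response.read()


def fetch_all(urls, delay=DELAY_SECONDS, fetch_page=fetch, sleep=time.sleep):
    """Fetch each URL in turn, sleeping `delay` seconds between requests."""
    pages = {}
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay)  # no pause is needed before the first request
        pages[url] = fetch_page(url)
    return pages
```

Calling fetch_all() with a short list of page URLs returns a dict mapping each
URL to its raw bytes. The fetch_page and sleep parameters exist only so the
loop can be tried out without touching the network.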
Paul
P.S. Yes, I know you can download data as well, but for a tiny subset
it's less work to download a few pages and parse them.
--
Paul Waring
http://www.pwaring.com