Alexander Harrowell wrote:
> I'm having to do this in order to fix problems with Viktorfeed and it's a pain in the arse. (Wikipedia has a table, but I'm making slow progress because some names have markup spooged into cells where there should only be data.)
Have you tried the Special:Export versions of the pages? They give you the raw wikitext wrapped in XML, which might make things a bit easier.
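
In case it helps, here is a rough sketch of what I mean, not tested against your actual pages. It assumes the export XML wraps each page's wikitext in a <text> element under <page>, and it matches tags by local name because the XML namespace version varies. The function names (export_url, extract_wikitext, strip_markup) are just mine, and the markup stripping only handles links and bold/italic quotes, so templates and <ref> tags would need more work.

```python
import re
import urllib.parse
import xml.etree.ElementTree as ET

# Assumed base URL for the English Wikipedia export page.
EXPORT_BASE = "http://en.wikipedia.org/wiki/Special:Export/"


def export_url(title):
    """Build a Special:Export URL for a page title (spaces become underscores)."""
    return EXPORT_BASE + urllib.parse.quote(title.replace(" ", "_"))


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_wikitext(export_xml):
    """Return a {page title: wikitext} dict from an export XML string."""
    root = ET.fromstring(export_xml)
    pages = {}
    for page in root.iter():
        if local_name(page.tag) != "page":
            continue
        title = None
        text = None
        for el in page.iter():
            name = local_name(el.tag)
            if name == "title":
                title = el.text
            elif name == "text":
                text = el.text or ""
        if title is not None and text is not None:
            pages[title] = text
    return pages


def strip_markup(cell):
    """Reduce a table cell to plain data: unwrap [[links]] and drop quote runs."""
    # [[Target|Label]] -> Label, [[Target]] -> Target
    cell = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", cell)
    # '''bold''' and ''italic'' markers
    cell = re.sub(r"'{2,}", "", cell)
    return cell.strip()
```

You would fetch export_url(...) once per page (with a sensible User-Agent and not in a tight loop), feed the response body to extract_wikitext, and then run strip_markup over each table cell.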
> Also, Wikipedia seems to have some sort of very unopen whine about downloading data with things that are not web browsers. I can of course use Python's webbrowser module, but *really*.
I believe they're just worried about keeping the site itself running smoothly, so they ask that if you want to download data, you get it from http://download.wikimedia.org/ rather than screen-scraping:
http://en.wikipedia.org/wiki/Wikipedia_database#Why_not_just_retrieve_data_from_wikipedia.org_at_runtime.3F

Seems fair enough to me.

ATB,
Matthew
