On Fri, Sep 5, 2014 at 10:21 AM, Jonas Öberg <[email protected]> wrote:
> It's possible to use Special:Redirect or thumb.php to get the
> thumbnail/URL, but both are actually PHP scripts that need running. So
> while perhaps not ideal, it seems to make the most sense here to
> generate the thumbnail URLs ourselves and hit the web server directly.

That can work if you don't mind getting errors in some percentage of cases
where the file format would require a more complex URL scheme. Otherwise,
you have three options:

- just use Special:Redirect. Depending on your request frequency, it might
  be fine. We can ask ops what speed limit would be reasonable; for bots
  using the API, the general recommendation is 12 requests per minute.
- scrape file description pages. The HTML page is cached in Varnish and it
  has links to various standard image sizes, so you won't hit PHP this way;
  of course, HTML scraping is not the most reliable way of retrieving data.
- use the API in batches. You can retrieve the information (including
  thumbnail URL) for 500 files in a single request (5000 if you get a bot
  flag):

  https://en.wikipedia.org/w/api.php?format=jsonfm&action=query&titles=File:30C3_Commons_Machinery_1.jpg|File:30C3_Commons_Machinery_2.jpg|File:30C3_Commons_Machinery_3.jpg&prop=imageinfo&iiprop=extmetadata|url&iiextmetadatafilter=ObjectName|Artist|LicenseShortName&iiurlwidth=640

IMO the last option is the cleanest one.
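For illustration, here is a rough Python sketch of the batched API approach. It only builds the request URLs and reads thumbnail URLs out of an already-decoded response; it does no networking. The helper names (`chunked`, `build_batch_urls`, `thumb_urls`) are made up for this example, it uses `format=json` rather than the `jsonfm` debugging format in the URL above, and it assumes the default JSON layout where `query.pages` is keyed by page ID and each `imageinfo` entry carries a `thumburl`. The batch size is a parameter: the default of 50 is a conservative assumption, so check the per-request limit on `titles` for your account before raising it.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL in the message.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def chunked(items, size):
    """Yield successive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_urls(titles, width=640, batch_size=50):
    """Return one imageinfo query URL per batch of file titles.

    batch_size=50 is an assumed conservative default, not a value
    from the message; adjust it to the limit that applies to you.
    """
    urls = []
    for batch in chunked(list(titles), batch_size):
        params = {
            "format": "json",
            "action": "query",
            "titles": "|".join(batch),  # multiple titles are pipe-separated
            "prop": "imageinfo",
            "iiprop": "extmetadata|url",
            "iiextmetadatafilter": "ObjectName|Artist|LicenseShortName",
            "iiurlwidth": width,  # requested thumbnail width in pixels
        }
        urls.append(API_ENDPOINT + "?" + urlencode(params))
    return urls


def thumb_urls(response):
    """Map page titles to thumbnail URLs in a decoded JSON response.

    Pages without imageinfo (e.g. missing files) are skipped.
    """
    result = {}
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        info = page.get("imageinfo")
        if info and "thumburl" in info[0]:
            result[page["title"]] = info[0]["thumburl"]
    return result
```

Each URL from `build_batch_urls` can be fetched with any HTTP client, spaced out to respect the request rate mentioned above, and the decoded JSON passed to `thumb_urls`.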
_______________________________________________
Multimedia mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/multimedia
