K. Peachey wrote:
> On Sat, Jun 27, 2009 at 5:40 PM, Lars Dɪᴇᴄᴋᴏᴡ 迪拉斯 <[email protected]> wrote:
>> The tools exist. Ample documentation exists. Both programmatic
>> interfaces and easy form-based interfaces exist.
>>
>> Screen scraping still happens not only because of laziness but also
>> because the correct way is not promoted. For example, if I access the
>> English Wikipedia main page with libwww-perl (a banned UA), the
>> response body says (among other things):
>>
>> Our servers are currently experiencing a technical problem. This is
>> probably temporary and should be fixed soon. Please _try again_ in a
>> few minutes.
>>
>> This is of course bullshit of the highest degree. It is a permanent
>> problem, and no amount of retrying will get me around the ban. The
>> response body should say something like this:
>>
>> Screen scraping is forbidden because it places undue burden on the
>> infrastructure and servers. Use the export feature to parse single
>> pages, or use the dump feature to parse a whole wiki.
>>
>> http://mediawiki.org/wiki/Special:Export
>> http://en.wikipedia.org/wiki/WP:Export
>>
>> http://download.wikimedia.org/
>> http://en.wikipedia.org/wiki/WP:Download
>>
>> Can anyone with access to the appropriate bug tracker file a bug on
>> this, please?
> The [[Special:Export]] function exports the page in XML format,
> whilst most people want a plain/static HTML dump of the page, and the
> download packages cover whole database collections, which is far too
> much for the people I'm suggesting this tool for: they just want
> several articles, not a multi-gigabyte collection of every article
> page.
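For illustration, here is a minimal libwww-perl sketch of the export route
Lars describes: fetching one page's wikitext through Special:Export while
sending a descriptive User-Agent instead of the banned stock one. The
'WikiExportDemo/0.1' identifier and the page title 'Perl' are hypothetical
examples, not anything prescribed in the thread.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;

    # Identify the client explicitly; the stock libwww-perl agent
    # string is the one that gets banned.
    my $ua = LWP::UserAgent->new(
        agent => 'WikiExportDemo/0.1 ([email protected])',
    );

    # Special:Export/<title> returns the page's wikitext wrapped in XML,
    # so it can be parsed without scraping the rendered HTML.
    my $res = $ua->get('http://en.wikipedia.org/wiki/Special:Export/Perl');
    die 'Export failed: ', $res->status_line, "\n" unless $res->is_success;
    print $res->decoded_content;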
You are looking for action=render, then. I don't think there is a
form-based interface for it, but creating one would be trivial.
Example: http://en.wikipedia.org/wiki/Tool?action=render

-- daniel
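As a sketch of how trivial such a form-based interface could be: a
single-file CGI script that shows a title field and redirects to the
rendered page. The script itself is hypothetical; it assumes only the
standard /w/index.php entry point that the example URL above maps to.

    #!/usr/bin/perl
    # Trivial form-based front end for action=render: show a form,
    # then redirect to the rendered HTML of the requested title.
    use strict;
    use warnings;
    use CGI qw(:standard);
    use URI::Escape qw(uri_escape);

    my $title = param('title');
    if (defined $title && length $title) {
        # action=render returns only the parsed article body, without
        # the site skin: the "static HTML of one page" case.
        print redirect('http://en.wikipedia.org/w/index.php?title='
                       . uri_escape($title) . '&action=render');
    } else {
        print header('text/html'),
              start_html('Render a page'),
              start_form(-method => 'GET'),
              textfield('title'), submit('Render'),
              end_form, end_html;
    }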
