Thank you, your answer helped me find the next step.

  Thomas

On 09.08.20 at 22:12, AntiCompositeNumber wrote:
This is not the best list for this.
<https://www.mediawiki.org/wiki/Mailing_lists/Wikitech-l> would be
better.

If you want to download articles to be able to read them, my
current recommendation is to use <https://www.kiwix.org/en/>. If
you're looking to do something else, a data dump from
<https://dumps.wikimedia.org/> may be a better idea.

If neither of those options is what you are looking for, you can find
the list of pages by number of views at
<https://pageviews.toolforge.org/topviews/?project=en.wikipedia.org&platform=all-access&date=last-year&excludes=>.
You should limit your requests to a reasonable rate, typically no more
than 60-75 requests/minute (a slower rate is appreciated if
possible). Make sure to also comply with
<https://meta.wikimedia.org/wiki/User-Agent_policy> so we can yell at
you specifically if there's a problem.
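
The rate limit and User-Agent advice above can be sketched in Python as
follows. This is only a minimal illustration using the standard library,
not an official client: the User-Agent string, the contact address, the
choice of en.wikipedia.org as the target, and the helper names
build_request, throttled and fetch_all are all assumptions made for the
example.

```python
import time
import urllib.parse
import urllib.request

# Hypothetical identification string: replace it with your own tool name
# and real contact details, as the User-Agent policy asks.
USER_AGENT = "ExampleArticleSync/0.1 (sync-admin@example.org)"
BASE_URL = "https://en.wikipedia.org/wiki/"  # assumed target wiki


def build_request(title, user_agent=USER_AGENT):
    """Return a Request for one article, carrying a descriptive User-Agent."""
    url = BASE_URL + urllib.parse.quote(title.replace(" ", "_"))
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def throttled(items, per_minute=60, clock=time.monotonic, sleep=time.sleep):
    """Yield items no faster than per_minute, sleeping between them as needed."""
    interval = 60.0 / per_minute
    earliest = None  # earliest time at which the next item may be released
    for item in items:
        if earliest is not None:
            wait = earliest - clock()
            if wait > 0:
                sleep(wait)
        earliest = clock() + interval
        yield item


def fetch_all(titles, per_minute=60):
    """Download each title in turn, staying at or below the given rate."""
    pages = {}
    for title in throttled(titles, per_minute=per_minute):
        with urllib.request.urlopen(build_request(title), timeout=30) as resp:
            pages[title] = resp.read()
    return pages
```

Calling fetch_all(["Albert Einstein", "Berlin"]) would fetch two pages
about one second apart. At 60 requests/minute, 100k articles take roughly
28 hours, so a dump or Kiwix is likely the gentler option at that scale.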

AntiCompositeNumber

On Sun, Aug 9, 2020 at 3:52 PM Thomas Güttler Lists
<[email protected]> wrote:
Hi,

I would like to sync the 100k most popular articles to my local machine.

Maybe I was blind, but I could not find suitable documentation about this.

Maybe the list with the 100k most popular articles could be the start.

If I have this list, I could download the articles with 100k HTTP calls.

But before I do this, I would like to ask you about the correct way to do
this.

I don't want you to think I am doing a DoS attack :-)

If there is a better place for this question, please tell me!

Regards,

    Thomas Güttler


--
Thomas Guettler http://www.thomas-guettler.de/
I am looking for feedback: https://github.com/guettli/programming-guidelines


_______________________________________________
Wikimedia Cloud Services mailing list
[email protected] (formerly [email protected])
https://lists.wikimedia.org/mailman/listinfo/cloud
