Thank you, your answer helped me find the next step.

Thomas

On 09.08.20 at 22:12, AntiCompositeNumber wrote:
This is not the best list for this. <https://www.mediawiki.org/wiki/Mailing_lists/Wikitech-l> would be better.

If you want to download articles so that you can read them, my current recommendation is to use <https://www.kiwix.org/en/>. If you're looking to do something else, a data dump from <https://dumps.wikimedia.org/> may be a better idea.

If neither of those options is what you are looking for, you can find the list of pages by number of views at <https://pageviews.toolforge.org/topviews/?project=en.wikipedia.org&platform=all-access&date=last-year&excludes=>. You should limit your requests to a reasonable rate, typically no more than 60-75 requests/minute (but a slower rate, if possible, is appreciated). Make sure to also comply with <https://meta.wikimedia.org/wiki/User-Agent_policy> so we can yell at you specifically if there's a problem.

AntiCompositeNumber

On Sun, Aug 9, 2020 at 3:52 PM Thomas Güttler Lists <[email protected]> wrote:

Hi,

I would like to sync the 100k most popular articles to my local machine.

Maybe I was blind, but I could not find suitable documentation about this.

Maybe the list of the 100k most popular articles could be the start. If I have this list, I could download the articles with 100k HTTP calls.

But before I do this, I would kindly ask you about the correct way to do it. I don't want you to think I am doing a DoS attack :-)

If there is a better place for this question, please tell me!

Regards,
Thomas Güttler

-- 
Thomas Guettler http://www.thomas-guettler.de/
I am looking for feedback: https://github.com/guettli/programming-guidelines

_______________________________________________
Wikimedia Cloud Services mailing list
[email protected] (formerly [email protected])
https://lists.wikimedia.org/mailman/listinfo/cloud
-- 
Thomas Guettler http://www.thomas-guettler.de/
I am looking for feedback: https://github.com/guettli/programming-guidelines
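
For anyone reading this thread in the archive, the rate limit and User-Agent advice quoted above could be sketched roughly as follows. This is only a minimal illustration, not a recommended or official client: the `USER_AGENT` string, the `BASE_URL` pattern, and the helper names `build_request`, `fetch`, and `throttled` are all assumptions made for the example, and the thread itself does not say which endpoint to use. Kiwix or the data dumps are probably still the better route for 100k articles.

```python
import time
import urllib.parse
import urllib.request

# Hypothetical identification string: replace with your own tool name and contact
# details, following the User-Agent policy linked in the thread.
USER_AGENT = "ExampleArticleSync/0.1 (https://example.org/contact; ops@example.org)"

# Assumed URL pattern, used for illustration only; the thread names no endpoint.
BASE_URL = "https://en.wikipedia.org/wiki/"


def build_request(title):
    """Build a request for one article that carries a descriptive User-Agent."""
    url = BASE_URL + urllib.parse.quote(title.replace(" ", "_"))
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch(title):
    """Download one article and return the response body as bytes."""
    with urllib.request.urlopen(build_request(title), timeout=30) as response:
        return response.read()


def throttled(items, handler, max_per_minute=60,
              sleep=time.sleep, clock=time.monotonic):
    """Call handler(item) for each item, starting at most max_per_minute calls
    per minute. sleep and clock are injectable so the pacing can be tested
    without waiting or touching the network."""
    interval = 60.0 / max_per_minute
    results = []
    next_start = clock()
    for item in items:
        delay = next_start - clock()
        if delay > 0:
            sleep(delay)
        # Schedule the next call one full interval after this one starts.
        next_start = clock() + interval
        results.append(handler(item))
    return results
```

Under these assumptions, `throttled(["Earth", "Moon"], fetch, max_per_minute=60)` would download two pages about one second apart; lowering `max_per_minute` gives the slower rate that the reply says is appreciated.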
