> are trying to rebuild our stale encyclopedia apps for offline usage but
> are space-limited and would only like to include the most likely pages
> that would be looked at that can fit within a size envelope that varies
> with the device in question (up to 100k article limit probably)

For this use case I would be careful about treating page view ranks as true popularity, since the top of the data is regularly affected by bot spikes (a known issue that we intend to fix). After you have your list of most popular pages, please take a second look: some, but not all, of the pages that are artificially inflated by bot traffic are fairly obvious (many special pages, for example).
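A minimal sketch of that second-look filtering, assuming you already have a ranked list of (title, views) pairs. The namespace prefixes below are just the obvious English-wiki offenders and are illustrative, not exhaustive; other wikis use localized prefixes:

# Rough sketch: drop titles that are obviously not encyclopedia articles
# (special pages and other non-article namespaces) from a ranked list of
# (title, views) pairs. English-wiki prefixes only; adjust per wiki.
NON_ARTICLE_PREFIXES = (
    "Special:", "Talk:", "User:", "User_talk:", "Wikipedia:",
    "File:", "Template:", "Category:", "Portal:", "Help:",
)

def filter_obvious_non_articles(ranked_pages, limit=100_000):
    """Keep the top `limit` titles that look like real articles."""
    kept = []
    for title, views in ranked_pages:
        if title.startswith(NON_ARTICLE_PREFIXES) or title in ("Main_Page", "-"):
            continue  # obviously not a candidate for the offline bundle
        kept.append((title, views))
        if len(kept) >= limit:
            break
    return kept

This only catches the obvious cases; pages inflated by bots that look like normal articles still need a manual pass, as noted above.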
On Mon, Apr 2, 2018 at 8:54 AM, Leila Zia <[email protected]> wrote:
>
> On Mon, Apr 2, 2018 at 7:47 AM, Dan Andreescu <[email protected]> wrote:
>>
>> Hi Srdjan,
>>
>> The data pipeline behind the API can't handle arbitrary skip or limit
>> parameters, but there's a better way for the kind of question you have.
>> We publish all the pageviews at
>> https://dumps.wikimedia.org/other/pagecounts-ez/; look at the
>> "Hourly page views per article" section. I would imagine for your use
>> case one month of data is enough, and you can get the top N articles
>> for all wikis this way, where N is anything you want.
>
> One suggestion here is that if you want to find articles that are
> consistently high-page-view (and not part of spike/trend views), you
> increase the time window to 6 months or longer.
>
> Best,
> Leila
>
> --
> Leila Zia
> Senior Research Scientist, Lead
> Wikimedia Foundation
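For reference, a rough sketch of getting the top N titles for one wiki from a monthly pagecounts-ez dump. The file name, project code ("en.z"), and column layout below are assumptions based on that directory and should be checked against its README before relying on them:

# Sketch: build a top-N article list from a pagecounts-ez monthly dump.
# Assumes each data line looks roughly like:
#   <project> <page_title> <total_views> <encoded hourly counts ...>
# (check https://dumps.wikimedia.org/other/pagecounts-ez/ for the exact
# layout; this is illustrative, not authoritative).
import bz2
import heapq

def top_n_articles(path, project="en.z", n=100_000):
    """Return the n most-viewed titles for one project from a .bz2 dump."""
    counts = {}
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as f:
        for line in f:
            parts = line.split()
            if len(parts) < 3 or parts[0] != project:
                continue
            title = parts[1]
            try:
                views = int(parts[2])
            except ValueError:
                continue
            counts[title] = counts.get(title, 0) + views
    return heapq.nlargest(n, counts.items(), key=lambda kv: kv[1])

# Example call (hypothetical filename):
# top = top_n_articles("pagecounts-2018-03-views-ge-5.bz2", project="en.z")

Summing several months of these files before taking the top N is one way to apply Leila's suggestion of a longer window.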
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
