Hi Srdjan,

The data pipeline behind the API can't handle arbitrary skip or limit
parameters, but there's a better way to answer the kind of question you
have.  We publish all the pageviews at
https://dumps.wikimedia.org/other/pagecounts-ez/; see the "Hourly page
views per article" section.  I would imagine one month of data is enough
for your use case, and from it you can get the top N articles for every
wiki, where N is anything you want.  These files are compressed, so when
you expand and process the data you'll see why we can't do this
dynamically: the data is huge and our cluster is limited.
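
To make the top-N idea concrete, here is a rough sketch in Python rather than
a tested recipe. It assumes each data line in a monthly file is
whitespace-separated with the project code, the article title and the monthly
total as the first three fields, and that comment lines start with "#";
please check that against the format notes published with the dumps. The
names top_articles and top_from_file are made up for this example.

```python
import bz2
import heapq
from collections import defaultdict


def top_articles(lines, n):
    """Return {project: [(views, title), ...]}, the n most-viewed titles per project.

    ASSUMPTION: each data line looks like "<project> <title> <monthly_total> ...".
    Lines that do not match (comments, short or non-numeric rows) are skipped.
    """
    if n <= 0:
        return {}
    heaps = defaultdict(list)  # one bounded min-heap per project
    for line in lines:
        if line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 3:
            continue
        project, title, count = parts[0], parts[1], parts[2]
        try:
            views = int(count)
        except ValueError:
            continue
        heap = heaps[project]
        if len(heap) < n:
            heapq.heappush(heap, (views, title))
        elif (views, title) > heap[0]:
            # Evict the current smallest entry so memory stays at n per project.
            heapq.heapreplace(heap, (views, title))
    # Sort each heap so the most-viewed title comes first.
    return {project: sorted(heap, reverse=True) for project, heap in heaps.items()}


def top_from_file(path, n):
    """Stream a bz2-compressed dump from disk; the path is whatever file you downloaded."""
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return top_articles(handle, n)
```

Because the file is streamed line by line and only N entries are kept per
project, memory use stays small even though the expanded data is large.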

On Sun, Apr 1, 2018 at 11:51 AM, Marko Obrovac <[email protected]>
wrote:

> (+Analytics-l)
>
> Hello Srdjan,
>
> The 1k limit is a hard one: only the top 1000 articles for a given day get
> loaded into the database. I added the folks from the Analytics team to this
> thread, they may be able to help you, as they generate and expose the data
> in question.
>
>
> Cheers,
> Marko Obrovac, PhD
> Senior Services Engineer
> Wikimedia Foundation
>
>
> On 30 March 2018 at 16:59, Srdjan Grubor <[email protected]> wrote:
>
>> Heya,
>> I asked this on IRC but didn't get any replies so I'm following it up
>> this way.
>> I have a question about the newer metrics REST v1 API: is there a way to
>> specify how many top articles to pull from
>> https://wikimedia.org/api/rest_v1/#!/Pageviews_data/get_metrics_pageviews_top_project_access_year_month_day
>> or is 1k hardcoded? Old metrics data was available
>> that had the most viewed pages but that disappeared with the change to the
>> new API.
>>
>> The reason I ask is because we (https://endlessos.com) are trying to
>> rebuild our stale encyclopedia apps for offline usage but are space-limited
>> and would only like to include the most likely pages that would be looked
>> at that can fit within a size envelope that varies with the device in
>> question (up to 100k article limit probably) but the new API doesn't
>> provide us with the tools to figure out the rankings cleanly (other than
>> rate-limiting on our side and checking every single article's metric
>> endpoint for counts).
>>
>> So the main question is: do we have a way to get this data out with the
>> current API? If this data is not available, can the
>> "metrics/pageviews/top" API be augmented to maybe have a `skip` and/or
>> `limit` params like other similar services that have this type of
>> filtering?
>>
>> Thanks,
>>
>> ............................................................
>> ..............
>>
>> Srdjan Grubor  |  +1.314.540.8328  |  Endless
>> <http://endlessm.com/>
>>
>> _______________________________________________
>> Services mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/services
>>
>>
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>